path: root/src/gallium/drivers/swr
Commit message | Author | Age | Files | Lines
* gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE | Nicolai Hähnle | 2017-09-18 | 1 | -0/+1
    To be able to properly distinguish between GL_ANY_SAMPLES_PASSED and
    GL_ANY_SAMPLES_PASSED_CONSERVATIVE.

    This patch goes through all drivers, having them treat the two query
    types identically, except:

    1. radeon incorrectly enabled conservative mode on
       PIPE_QUERY_OCCLUSION_PREDICATE. We now do it correctly, only on
       PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE.

    2. st/mesa uses the new query type.

    Fixes dEQP-GLES31.functional.fbo.no_attachments.*

    Reviewed-by: Marek Olšák <[email protected]>
* gallium: introduce PIPE_CAP_LOAD_CONSTBUF | Timothy Arceri | 2017-09-15 | 1 | -0/+1
    Reviewed-by: Marek Olšák <[email protected]>
* swr: use ARRAY_SIZE macro | Eric Engestrom | 2017-09-14 | 1 | -4/+6
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fetch compile state changes | Tim Rowley | 2017-09-13 | 3 | -6/+15
    Add InstanceStrideEnable field and rename InstanceDataStepRate to
    InstanceAdvancementState in INPUT_ELEMENT_DESC structure.

    Add stubs for handling InstanceStrideEnable in
    FetchJit::JitLoadVertices() and FetchJit::JitGatherVertices() and
    assert if they are triggered.

    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: adjust linux cpu topology identification code | Tim Rowley | 2017-09-13 | 1 | -43/+38
    Make more robust to handle strange configurations like a vmware
    exported 4-way numa X 1-core configuration.

    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Missed conversion to SIMD_T | Tim Rowley | 2017-09-13 | 1 | -1/+1
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: whitespace changes | Tim Rowley | 2017-09-13 | 1 | -0/+2
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: add graph write to jit debug putput | Tim Rowley | 2017-09-13 | 1 | -3/+3
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Migrate memory pointers to gfxptr_t type | Tim Rowley | 2017-09-13 | 9 | -36/+36
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Remove hardcoded clip/cull slot from clipper | Tim Rowley | 2017-09-13 | 1 | -14/+21
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Start to remove hardcoded clipcull_dist vertex attrib slot | Tim Rowley | 2017-09-13 | 3 | -8/+15
    Add new field in SWR_BACKEND_STATE::vertexClipCullOffset to specify
    the start of the clip/cull section of the vertex header.

    Removed use of hardcoded slot from binner.

    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Move clip/cull enables in API | Tim Rowley | 2017-09-13 | 9 | -40/+40
    Moved from SWR_RASTSTATE to SWR_BACKEND_STATE.

    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Add new API SwrStallBE | Tim Rowley | 2017-09-13 | 2 | -0/+17
    SwrStallBE stalls the backend threads until all work submitted before
    the stall has finished. The frontend threads can continue to make
    forward progress.

    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: FE/Clipper - unify SIMD8/16 functions using simdlib types | Tim Rowley | 2017-09-06 | 3 | -1189/+446
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Remove use of C++14 template variable | Tim Rowley | 2017-09-06 | 2 | -6/+14
    SWR rasterizer must remain C++11 compliant.

    Reviewed-by: Bruce Cherniak <[email protected]>
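The commit above does not show which variable template was removed, so the following is only a generic sketch of the usual C++11 substitute: a class template with a static constexpr member. The name `LaneCount` and the 32-byte register width are assumptions made for illustration, not identifiers from the SWR sources.

```cpp
#include <cstdint>

// C++14 only (variable template), shown for contrast:
//   template <typename T> constexpr uint32_t kLaneCount = 32 / sizeof(T);

// C++11-compatible equivalent: wrap the constant in a class template.
// `LaneCount` is a hypothetical name used only for this illustration.
template <typename T>
struct LaneCount
{
    // Number of elements of type T that fit in an assumed 32-byte register.
    static constexpr uint32_t value = 32 / sizeof(T);
};

// Out-of-class definition, required in C++11 if `value` is ever odr-used.
template <typename T>
constexpr uint32_t LaneCount<T>::value;

// Compile-time checks of the two common instantiations.
static_assert(LaneCount<float>::value == 8, "eight floats in 32 bytes");
static_assert(LaneCount<double>::value == 4, "four doubles in 32 bytes");
```

Call sites change from `kLaneCount<float>` to `LaneCount<float>::value`, which is more verbose but compiles under `-std=c++11`.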
* swr/rast: SIMD16 FE remove templated immediates workaround | Tim Rowley | 2017-09-06 | 1 | -90/+20
    Fixed properly in gcc-compatible fashion.

    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: SIMD16 PA - rename Assemble_simd16 to Assemble | Tim Rowley | 2017-09-06 | 3 | -31/+15
    For consistency and to support overloading.

    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: FE/Binner - unify SIMD8/16 functions using simdlib types | Tim Rowley | 2017-09-06 | 5 | -1739/+696
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Removed some trailing whitespace caught during review | Tim Rowley | 2017-09-06 | 3 | -10/+10
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr: set caps for VB 4-byte alignment | Tim Rowley | 2017-09-06 | 1 | -3/+6
    Needed to compensate for change to fetch jit requiring alignment.

    Fixes regressions in piglit: vertex-buffer-offsets and about another
    hundred of the vs-input*byte* tests.

    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Allow gather of floats from fetch shader with 2-4GB offsets | Tim Rowley | 2017-09-06 | 2 | -1/+7
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr: Report format max_samples=1 to maintain support for "fake" msaa. | Cherniak, Bruce | 2017-09-01 | 1 | -11/+11
    Accompanying patch "st/mesa: only try to create 1x msaa surfaces for
    'fake' msaa" requires driver to report max_samples=1 to enable "fake"
    msaa. Previously, 0 and 1 were treated equivalently in
    st_init_extensions() and either could enable "fake" msaa.

    This patch raises the swr default msaa_max_count from 0 to 1, so that
    swr_is_format_supported will report max_samples=1.

    Real msaa can still be enabled by exporting SWR_MSAA_MAX_COUNT with a
    pow2 value between 2 and 16.

    This patch is necessary to prevent an OpenSWR regression resulting
    from the st/mesa patch.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102038
    Acked-by: Brian Paul <[email protected]>
    Reviewed-By: George Kyriazis <[email protected]>
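The rule described above (default of 1, or a power of two between 2 and 16 read from SWR_MSAA_MAX_COUNT) can be sketched as below. This is not the driver's code: the helper names are hypothetical, and falling back to 1 on an invalid value is an assumption, since the commit message does not say how the driver treats bad input.

```cpp
#include <cstdlib>

// Hypothetical helper: true for 1 ("fake" msaa) or a power of two from 2 to 16.
inline bool IsValidMsaaMaxCount(int n)
{
    const bool isPow2 = n > 0 && (n & (n - 1)) == 0;
    return isPow2 && n <= 16;
}

// Hypothetical helper: read SWR_MSAA_MAX_COUNT from the environment.
// Assumption: unset or invalid values fall back to the default of 1.
inline int ReadMsaaMaxCount()
{
    const char* env = std::getenv("SWR_MSAA_MAX_COUNT");
    if (env == nullptr)
    {
        return 1;
    }
    const int n = std::atoi(env);
    return IsValidMsaaMaxCount(n) ? n : 1;
}
```

With this shape, running an application as `SWR_MSAA_MAX_COUNT=4 ./app` would yield 4, while leaving the variable unset keeps the single-sample default.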
* swr: limit pipe_draw_info->restart_index usage | Tim Rowley | 2017-08-23 | 1 | -1/+4
    Only copy this value when in restart drawing mode. Eliminates
    valgrind errors when running trivial programs.

    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fix invalid casting for calls to Interlocked* functions | Tim Rowley | 2017-08-16 | 3 | -7/+7
    CID: 1416243, 1416244, 1416255
    CC: [email protected]
    Reviewed-by: Bruce Cherniak <[email protected]>
* gallium: introduce PIPE_CAP_MEMOBJ | Timothy Arceri | 2017-08-03 | 1 | -0/+1
    This can be used to guard support for EXT_memory_object and related
    extensions.

    v2: update gallium docs

    v3 (Timothy Arceri):
    - add cap to nv50

    Signed-off-by: Andres Rodriguez <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* swr/rast: fix core / knights split of AVX512 intrinsics | Tim Rowley | 2017-08-02 | 4 | -55/+69
    Move AVX512BW specific intrinsics to be Core-only.

    Move some AVX512F intrinsics back to common implementation file.

    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: simplify knob default value setup | Tim Rowley | 2017-08-02 | 2 | -14/+11
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: split gen_knobs templates into .h/.cpp | Tim Rowley | 2017-08-02 | 5 | -118/+166
    Switch to a 1:1 mapping template:generated for future maintenance.

    Reviewed-by: Emil Velikov <[email protected]>
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: gen_knobs template code style | Tim Rowley | 2017-08-02 | 1 | -2/+2
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: switch gen_knobs.cpp license | Tim Rowley | 2017-08-02 | 1 | -12/+17
    Unintentionally added with an apache2 license; relicense to match the
    rest of the tree.

    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: fix scons gen_knobs.h dependency | Tim Rowley | 2017-08-02 | 1 | -1/+1
    Copy/paste error was duplicating a gen_knobs.cpp rule.

    Fixes: 5079c277b57 ("swr: [scons] Fix windows build")
    Reviewed-by: Emil Velikov <[email protected]>
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: constify swr rasterizer | Tim Rowley | 2017-08-02 | 18 | -323/+339
    Add "const" as appropriate in method/function signatures.

    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: SIMD16 shaders - widen fetch and vertex shaders | Tim Rowley | 2017-08-02 | 6 | -5/+238
    Work in progress, disabled by default.

    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: vmask() implementations for KNL | Tim Rowley | 2017-08-02 | 1 | -0/+14
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: rename frontend pVertexStore | Tim Rowley | 2017-08-02 | 1 | -6/+9
    Rename to reflect global nature.

    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: fix movemask_ps / movemask_pd on AVX512 | Tim Rowley | 2017-08-02 | 1 | -2/+7
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: stop using MSFT types in platform independent code | Tim Rowley | 2017-08-02 | 14 | -31/+35
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: enable USE_SIMD16_FRONTEND by default | Tim Rowley | 2017-08-02 | 1 | -1/+1
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: disable AVX512 optimization of SSE / AVX code | Tim Rowley | 2017-08-02 | 1 | -0/+4
    Disable an optimization which implemented sse/avx operations on
    avx512 using avx512 intrinsics (to avoid switching between lane
    widths).

    Compile with SIMD_OPT_128_AVX512 / SIMD_OPT_256_AVX512 defined to
    enable these optimizations.

    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: fix USE_SIMD16_FRONTEND issues | Tim Rowley | 2017-08-02 | 14 | -74/+49
    Fix problems found when enabling USE_SIMD16_FRONTEND, mostly related
    to vMask / movemask_ps(pd).

    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: simdlib better separation of core vs knights avx512 | Tim Rowley | 2017-08-02 | 15 | -245/+911
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: threadID via portable std::this_thread::get_id() | Tim Rowley | 2017-08-02 | 1 | -9/+11
    Replace use of Win32 GetCurrentThreadId() with portable
    std::this_thread::get_id().

    Reviewed-by: Bruce Cherniak <[email protected]>
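As a rough illustration of the portable call named above: `std::this_thread::get_id()` returns an opaque `std::thread::id`, and the standard `std::hash<std::thread::id>` specialization is one way to turn it into an integer. The helper name `CurrentThreadTag` is hypothetical, and how the rasterizer actually consumes the id is not shown in this log.

```cpp
#include <cstddef>
#include <functional>
#include <thread>

// Hypothetical helper: derive a numeric tag from the portable thread id.
// The result is only meaningful for comparison or as a lookup key; it is
// not the OS-level thread id that GetCurrentThreadId() would return.
inline std::size_t CurrentThreadTag()
{
    return std::hash<std::thread::id>()(std::this_thread::get_id());
}
```

Code that only needs to compare identities can skip the hash and compare `std::thread::id` values directly with `==`.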
* gallium: add PIPE_QUERY_SO_OVERFLOW_ANY_PREDICATE and corresponding cap | Nicolai Hähnle | 2017-08-02 | 1 | -0/+1
    v2: rename cap to PIPE_CAP_QUERY_SO_OVERFLOW and be a bit more
        explicit in the documentation

    Reviewed-by: Roland Scheidegger <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* gallium: add PIPE_CAP_NIR_SAMPLERS_AS_DEREF | Nicolai Hähnle | 2017-07-31 | 1 | -0/+1
    Reviewed-by: Marek Olšák <[email protected]>
* swr: fix transform feedback logic | George Kyriazis | 2017-07-27 | 4 | -8/+71
    The shader that is used to copy vertex data out of the vs/gs shaders
    to the user-specified buffer (streamout or SO shader) was not using
    the correct offsets.

    Adjust the offsets that are used just for the SO shader:
    - Make sure that position is handled in the same special way
      as in the vs/gs shaders
    - Use the correct offset to be passed in the core
    - consolidate register slot mapping logic into one function, since
      it's been calculated in 2 different places (one for calculating
      the slot mask, and one for the register offsets themselves)

    Also make room for all attributes in the backend vertex area.

    Fixes:
    - all vtk GL2PS tests
    - 18 piglit tests (16 ext_transform_feedback tests,
      arb-quads-follow-provoking-vertex and primitive-type gl_points)

    v2:
    - take care of more SGV slots in slot mapping logic
    - trim feState.vsVertexSize
    - fix GS interface and incorporate GS while calculating vsVertexSize

    Note that vsVertexSize is used in the core as the one parameter that
    controls vertex size between all stages, so it has to be adjusted
    appropriately for the whole vs/gs/fs pipeline.

    Also note that GS and SO is not fully implemented. This will be
    addressed later.

    fixes:
    - fixes total of 20 piglit tests

    CC: 17.2 <[email protected]>
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: non-regex knob fallback code for gcc < 4.9 | Tim Rowley | 2017-07-27 | 1 | -0/+21
    gcc prior to 4.9 didn't implement <regex>, causing a startup crash in
    the swr knob parameter reading code.

    CC: <[email protected]>
    Reviewed-by: Bruce Cherniak <[email protected]>
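A compile-time guard of the kind this commit implies might look like the sketch below. The macro name `KNOB_NO_REGEX` and the helper `TrimWhitespace` are hypothetical, and the log does not say what the knob code used `<regex>` for, so whitespace trimming stands in purely as an example of a regex-free replacement.

```cpp
#include <string>

// Detect a GCC older than 4.9. Clang also defines __GNUC__, so exclude it.
// Note this checks the compiler, which is only a proxy for the libstdc++
// version that actually provides <regex>.
#if defined(__GNUC__) && !defined(__clang__) && \
    (__GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 9))
#define KNOB_NO_REGEX 1 // hypothetical macro name
#else
#define KNOB_NO_REGEX 0
#endif

// Hypothetical regex-free helper: strip leading and trailing whitespace
// using only std::string searches, which work on every toolchain.
inline std::string TrimWhitespace(const std::string& s)
{
    const char* ws = " \t\r\n";
    const std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos)
    {
        return std::string(); // input was empty or all whitespace
    }
    const std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}
```

A caller would branch on `KNOB_NO_REGEX` with `#if` and use the string-search path only on the affected compilers.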
* swr: use the correct variable for no undefined symbols | Emil Velikov | 2017-07-24 | 1 | -1/+1
    The variable name was missing a leading LD_, which resulted in a
    missing check for unresolved symbols in the backend binaries.

    With the link addressed with earlier patches, we can correct the
    typo.

    Thanks to Laurent for the help spotting this.

    v2: Split from a larger patch.

    Cc: [email protected]
    Cc: Bruce Cherniak <[email protected]>
    Cc: Tim Rowley <[email protected]>
    Cc: Laurent Carlier <[email protected]>
    Fixes: 9475251145174882b532 "swr: standardize linkage and check for unresolved symbols"
    Reviewed-by: Eric Engestrom <[email protected]>
    Reported-by: Laurent Carlier <[email protected]>
    Signed-off-by: Emil Velikov <[email protected]>
* swr: don't forget to link KNL/SKX against pthreads | Emil Velikov | 2017-07-24 | 1 | -0/+8
    Analogous to previous commit but for the KNL/SKX backends.

    Cc: Bruce Cherniak <[email protected]>
    Cc: Tim Rowley <[email protected]>
    Cc: Laurent Carlier <[email protected]>
    Fixes: 1cb5a6061ce ("configure/swr: add KNL and SKX architecture targets")
    Signed-off-by: Emil Velikov <[email protected]>
* swr: don't forget to link AVX/AVX2 against pthreads | Emil Velikov | 2017-07-24 | 1 | -0/+8
    Seems like the backends have been using pthreads since day one, yet
    we've been missing the link.

    With later commit we'll fix a typo, hence the libraries will be built
    with -Wl,no-undefined, aka failing the build on unresolved symbols.

    v2: Split from a larger patch.

    Cc: [email protected]
    Cc: Bruce Cherniak <[email protected]>
    Cc: Tim Rowley <[email protected]>
    Cc: Laurent Carlier <[email protected]>
    Fixes: c6e67f5a9373e916a8d2 "gallium/swr: add OpenSWR rasterizer"
    Reviewed-by: Eric Engestrom <[email protected]>
    Signed-off-by: Emil Velikov <[email protected]>
* swr/rast: quit using linux-specific gettid() | Tim Rowley | 2017-07-21 | 2 | -4/+3
    Linux-specific gettid() syscall shouldn't be used in portable code.
    Fix does assume a 1:1 thread:LWP architecture, but works for our
    current target platforms and can be revisited later if needed.

    Fixes unresolved symbol in linux scons builds.

    v2: add comment in code about the 1:1 assumption.

    Cc: [email protected]
    Reviewed-by: Bruce Cherniak <[email protected]>