path: root/src/gallium/drivers/swr/rasterizer
Commit message (Collapse) | Author | Age | Files | Lines
* swr/rast: Repair simd8 frontend code rot | Tim Rowley | 2017-11-20 | 1 | -1/+1
  Keep non-default simd8 frontend code running for comparison purposes.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Implement AVX-512 GATHERPS in SIMD16 fetch shader | Tim Rowley | 2017-11-20 | 4 | -29/+220
  Disabled for now.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Simplify GATHER* jit builder api | Tim Rowley | 2017-11-20 | 3 | -47/+47
  General cleanup, and prep work for possibly moving to llvm masked gather intrinsic.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Add alignment to transpose targets | Tim Rowley | 2017-11-20 | 1 | -8/+8
  Needed to ensure alignment for avx512. Fixes address sanitizer crash.
  Reviewed-by: Bruce Cherniak <[email protected]>
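  Note on the alignment fix: aligned 512-bit vector loads and stores need 64-byte-aligned addresses, so a buffer used as a transpose target has to be declared with that alignment. The sketch below is only an illustration of the idea; `TransposeTarget` and `IsAligned` are hypothetical names, not taken from the SWR source, and the buffer shape is an assumption.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a transpose target buffer (not the SWR type).
// alignas(64) makes every instance start on a 64-byte boundary, which is
// what an aligned 512-bit (64-byte) vector store requires.
struct alignas(64) TransposeTarget
{
    float lanes[4][16]; // assumed layout: 4 components x 16 lanes
};

// Returns true when p is a multiple of `alignment` bytes.
inline bool IsAligned(const void* p, std::size_t alignment)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

static_assert(alignof(TransposeTarget) == 64,
              "transpose target must be 64-byte aligned for 512-bit stores");
```

  A debug build could call `IsAligned(&target, 64)` before an aligned vector copy to catch the kind of crash the commit describes.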
* swr/rast: Cache eventmanager | Tim Rowley | 2017-11-20 | 3 | -0/+9
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Enable AVX-512 targets in the jitter | Tim Rowley | 2017-11-20 | 2 | -10/+0
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Points with clipdistance can't go through simplepoints path | Tim Rowley | 2017-11-20 | 1 | -1/+2
  Fixes piglit glsl-1.20:vs-clip-vertex-primitives and glsl-1.30:vs-clip-distance-primitives.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Code style change (NFC) | Tim Rowley | 2017-11-20 | 1 | -2/+7
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Widen fetch shader to SIMD16 | Tim Rowley | 2017-11-20 | 5 | -3/+151
  Widen fetch shader to SIMD16, enable SIMD16 types in the jitter, and provide EXTRACT/INSERT SIMD8 <-> SIMD16 utility functions.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Support flexible vertex layout for DS output | Tim Rowley | 2017-11-20 | 2 | -0/+3
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Faster emulated simd16 permute | Tim Rowley | 2017-11-14 | 1 | -23/+11
  Speed up simd16 frontend (default) on avx/avx2 platforms; fixes performance regression caused by switch to simdlib.
  Reviewed-by: Bruce Cherniak <[email protected]>
  Cc: [email protected]
* swr/rast: Use gather instruction for i32gather_ps on simd16/avx512 | Tim Rowley | 2017-11-14 | 1 | -11/+1
  Speed up avx512 platforms; fixes performance regression caused by switch to simdlib.
  Reviewed-by: Bruce Cherniak <[email protected]>
  Cc: [email protected]
* swr/rast: Add api to override draws in flight | Tim Rowley | 2017-10-19 | 4 | -19/+31
  Allow draws in flight to be overridden via SWR_CREATECONTEXT_INFO. Patch by Jan Zielinski.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Widen fetch shader to SIMD16 (disabled for now) | Tim Rowley | 2017-10-19 | 1 | -13/+428
  Refactored the gather operation to process 16 elements at a time via paired SIMD8 operations.
  Reviewed-by: Bruce Cherniak <[email protected]>
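  Note on "paired SIMD8 operations": the actual change lives in the fetch shader JIT, which emits LLVM IR rather than plain C++. As a rough scalar model of the idea only, a 16-wide gather can be expressed as two 8-wide gathers over the low and high halves. All type and function names below (`Lanes8`, `Offsets16`, `Gather8`, `Gather16`) are hypothetical and chosen for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical 8-wide and 16-wide lane containers (not SWR types).
using Lanes8   = std::array<float, 8>;
using Offsets8 = std::array<std::int32_t, 8>;
struct Lanes16   { Lanes8 lo;   Lanes8 hi;   };
struct Offsets16 { Offsets8 lo; Offsets8 hi; };

// Scalar model of an 8-wide gather: lane i reads one float located
// byteOffsets[i] bytes past base.
inline Lanes8 Gather8(const std::uint8_t* base, const Offsets8& byteOffsets)
{
    Lanes8 out{};
    for (std::size_t i = 0; i < out.size(); ++i)
    {
        std::memcpy(&out[i], base + byteOffsets[i], sizeof(float));
    }
    return out;
}

// 16 elements at a time, built from a pair of 8-wide gathers.
inline Lanes16 Gather16(const std::uint8_t* base, const Offsets16& byteOffsets)
{
    return Lanes16{Gather8(base, byteOffsets.lo), Gather8(base, byteOffsets.hi)};
}
```

  Splitting into halves like this lets existing 8-wide code be reused unchanged while the surrounding logic moves to 16 lanes.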
* swr/rast: Change DS memory allocation | Tim Rowley | 2017-10-19 | 2 | -2/+3
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fix indentation | Tim Rowley | 2017-10-19 | 1 | -1/+1
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Miscellaneous viewport array code changes | Tim Rowley | 2017-10-19 | 5 | -38/+71
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Minor changes for os-x | Tim Rowley | 2017-10-19 | 1 | -2/+4
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: use proper alignment for debug transposedPrims | Tim Rowley | 2017-10-06 | 1 | -2/+2
  Causing a crash in ParaView waveletcontour.py test when _DEBUG defined due to vector aligned copy with unaligned address.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: do not crash on NULL strings returned by getenv | Emil Velikov | 2017-10-02 | 1 | -1/+2
  The current convenience function GetEnv feeds the results of getenv directly into std::string(). That is a bad idea, since the variable may be unset, thus we feed NULL into the C++ construct. The latter of which is not allowed and leads to a crash.
  v2: Better variable name, implicit char* -> std::string conversion (Eric)
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101832
  Fixes: a25093de718 ("swr/rast: Implement JIT shader caching to disk")
  Cc: Tim Rowley <[email protected]>
  Cc: Laurent Carlier <[email protected]>
  Cc: Bernhard Rosenkraenzer <[email protected]>
  [Emil Velikov: make an actual commit from the misc diff]
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]> (v1)
  Reviewed-by: Laurent Carlier <[email protected]> (v1)
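  Note on the getenv fix: `std::getenv` returns a null pointer when the variable is unset, and constructing a `std::string` from a null `const char*` is undefined behavior. A minimal sketch of a null-safe wrapper in the spirit of the commit message follows; `GetEnvOrEmpty` is a hypothetical name and this is not the actual patch.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper modeled on the GetEnv described above.
// Returns the variable's value, or an empty string when it is unset.
inline std::string GetEnvOrEmpty(const std::string& name)
{
    // std::getenv yields a null pointer for an unset variable.
    const char* value = std::getenv(name.c_str());

    // Never hand a null pointer to std::string; fall back to "" instead.
    // The const char* result converts implicitly to std::string.
    return value ? value : "";
}
```

  Callers that need to distinguish "unset" from "set but empty" would need a different return type; this sketch deliberately treats both the same.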
* swr/rast: Handle instanceID offset / Instance Stride enable | Tim Rowley | 2017-09-25 | 1 | -7/+39
  Supported in JitGatherVertices(); FetchJit::JitLoadVertices() may require similar changes, which will need to be addressed if it is determined that this path is still in use. Handle Force Sequential Access in FetchJit::Create.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Remove code supporting legacy llvm (<3.9) | Tim Rowley | 2017-09-25 | 3 | -105/+15
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fix allocation of DS output data for USE_SIMD16_FRONTEND | Tim Rowley | 2017-09-25 | 1 | -10/+6
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Slightly more efficient blend jit | Tim Rowley | 2017-09-25 | 1 | -20/+10
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Properly sized null GS buffer | Tim Rowley | 2017-09-25 | 1 | -1/+1
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Move SWR_GS_CONTEXT from thread local storage to stack | Tim Rowley | 2017-09-25 | 1 | -12/+11
  Move structure, as the size is significantly reduced due to dynamic allocation of the GS buffers.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fetch compile state changes | Tim Rowley | 2017-09-25 | 2 | -1/+12
  Add ForceSequentialAccessEnable and InstanceIDOffsetEnable bools to FETCH_COMPILE_STATE.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: New GS state/context API | Tim Rowley | 2017-09-25 | 2 | -114/+168
  One piglit regression, which was a false pass: [email protected]@execution@geometry@dynamic_input_array_index
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Add support for R10G10B10_FLOAT_A2_UNORM pixel format | Tim Rowley | 2017-09-25 | 3 | -17/+28
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: remove llvm fence/atomics from generated files | Tim Rowley | 2017-09-22 | 1 | -0/+8
  We currently don't use these instructions, and since their API changed in llvm-5.0 having them in the autogen files broke the mesa release tarballs which ship with generated autogen files.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102847
  CC: [email protected]
  Tested-by: Laurent Carlier <[email protected]>
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr: use ARRAY_SIZE macro | Eric Engestrom | 2017-09-14 | 1 | -4/+6
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fetch compile state changes | Tim Rowley | 2017-09-13 | 2 | -5/+14
  Add InstanceStrideEnable field and rename InstanceDataStepRate to InstanceAdvancementState in INPUT_ELEMENT_DESC structure. Add stubs for handling InstanceStrideEnable in FetchJit::JitLoadVertices() and FetchJit::JitGatherVertices() and assert if they are triggered.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: adjust linux cpu topology identification code | Tim Rowley | 2017-09-13 | 1 | -43/+38
  Make more robust to handle strange configurations like a vmware exported 4-way numa X 1-core configuration.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Missed conversion to SIMD_T | Tim Rowley | 2017-09-13 | 1 | -1/+1
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: whitespace changes | Tim Rowley | 2017-09-13 | 1 | -0/+2
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: add graph write to jit debug putput | Tim Rowley | 2017-09-13 | 1 | -3/+3
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Migrate memory pointers to gfxptr_t type | Tim Rowley | 2017-09-13 | 4 | -6/+7
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Remove hardcoded clip/cull slot from clipper | Tim Rowley | 2017-09-13 | 1 | -14/+21
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Start to remove hardcoded clipcull_dist vertex attrib slot | Tim Rowley | 2017-09-13 | 2 | -8/+12
  Add new field in SWR_BACKEND_STATE::vertexClipCullOffset to specify the start of the clip/cull section of the vertex header. Removed use of hardcoded slot from binner.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Move clip/cull enables in API | Tim Rowley | 2017-09-13 | 8 | -32/+32
  Moved from SWR_RASTSTATE to SWR_BACKEND_STATE.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Add new API SwrStallBE | Tim Rowley | 2017-09-13 | 2 | -0/+17
  SwrStallBE stalls the backend threads until all work submitted before the stall has finished. The frontend threads can continue to make forward progress.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: FE/Clipper - unify SIMD8/16 functions using simdlib types | Tim Rowley | 2017-09-06 | 3 | -1189/+446
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Remove use of C++14 template variable | Tim Rowley | 2017-09-06 | 2 | -6/+14
  SWR rasterizer must remain C++11 compliant.
  Reviewed-by: Bruce Cherniak <[email protected]>
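  Note on C++11 compliance: variable templates are a C++14 feature, so code that must build as C++11 usually expresses the same per-type constant through a class template with a static member. The log does not say which variable was removed or how it was rewritten; the sketch below only illustrates the general substitution, and `Simd8Tag`, `Simd16Tag`, and `SimdWidth` are hypothetical names.

```cpp
// Hypothetical tag types standing in for SIMD flavors (not SWR types).
struct Simd8Tag {};
struct Simd16Tag {};

// C++14 form, rejected by a strict C++11 compiler:
//   template <typename SimdT> constexpr int kSimdWidth = 8;
//   template <> constexpr int kSimdWidth<Simd16Tag> = 16;

// C++11-compatible form: a class template with a static constexpr member,
// specialized once per tag type.
template <typename SimdT>
struct SimdWidth;

template <>
struct SimdWidth<Simd8Tag>
{
    static constexpr int value = 8;
};

template <>
struct SimdWidth<Simd16Tag>
{
    static constexpr int value = 16;
};

// Usable in constant expressions exactly like the variable template would be.
static_assert(SimdWidth<Simd16Tag>::value == 2 * SimdWidth<Simd8Tag>::value,
              "16-wide is twice the 8-wide lane count");
```

  The call sites become slightly more verbose (`SimdWidth<T>::value` instead of `kSimdWidth<T>`), which is consistent with the small net line increase such a change tends to produce.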
* swr/rast: SIMD16 FE remove templated immediates workaround | Tim Rowley | 2017-09-06 | 1 | -90/+20
  Fixed properly in gcc-compatible fashion.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: SIMD16 PA - rename Assemble_simd16 to Assemble | Tim Rowley | 2017-09-06 | 3 | -31/+15
  For consistency and to support overloading.
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: FE/Binner - unify SIMD8/16 functions using simdlib types | Tim Rowley | 2017-09-06 | 5 | -1739/+696
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Removed some trailing whitespace caught during review | Tim Rowley | 2017-09-06 | 3 | -10/+10
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Allow gather of floats from fetch shader with 2-4GB offsets | Tim Rowley | 2017-09-06 | 2 | -1/+7
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fix invalid casting for calls to Interlocked* functions | Tim Rowley | 2017-08-16 | 3 | -7/+7
  CID: 1416243, 1416244, 1416255
  CC: [email protected]
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: fix core / knights split of AVX512 intrinsics | Tim Rowley | 2017-08-02 | 4 | -55/+69
  Move AVX512BW specific intrinsics to be Core-only. Move some AVX512F intrinsics back to common implementation file.
  Reviewed-by: Bruce Cherniak <[email protected]>