path: root/src/gallium/drivers/swr
Commit message | Author | Age | Files | Lines
* swr/rast: Repair simd8 frontend code rot | Tim Rowley | 2017-11-20 | 1 | -1/+1
| | | | | | Keep non-default simd8 frontend code running for comparison purposes. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Implement AVX-512 GATHERPS in SIMD16 fetch shader | Tim Rowley | 2017-11-20 | 4 | -29/+220
| | | | | | Disabled for now. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Simplify GATHER* jit builder api | Tim Rowley | 2017-11-20 | 4 | -48/+48
| | | | | | | General cleanup, and prep work for possibly moving to llvm masked gather intrinsic. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Add alignment to transpose targets | Tim Rowley | 2017-11-20 | 1 | -8/+8
| | | | | | | | Needed to ensure alignment for avx512. Fixes address sanitizer crash. Reviewed-by: Bruce Cherniak <[email protected]>
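The alignment requirement mentioned in this commit can be illustrated with a minimal sketch. `TransposeTarget` and `IsAligned` are hypothetical names, not the rasterizer's actual types; the only assumption is that a buffer written with aligned 512-bit stores must start on a 64-byte boundary.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a transpose destination buffer.
// 16 floats = 64 bytes, the width of one AVX-512 register, so an
// aligned 512-bit store needs the buffer to start on a 64-byte boundary.
struct alignas(64) TransposeTarget
{
    float data[16];
};

static_assert(alignof(TransposeTarget) == 64, "target must be 64-byte aligned");
static_assert(sizeof(TransposeTarget) == 64, "target holds exactly 16 floats");

// Returns true when p is a multiple of the requested alignment.
inline bool IsAligned(const void* p, std::uintptr_t alignment)
{
    return (reinterpret_cast<std::uintptr_t>(p) % alignment) == 0;
}
```

Without `alignas`, a stack array of floats is only guaranteed 4-byte alignment, which is the kind of mismatch an address sanitizer or an aligned store instruction would trip over.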
* swr/rast: Cache eventmanager | Tim Rowley | 2017-11-20 | 3 | -0/+9
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Enable AVX-512 targets in the jitter | Tim Rowley | 2017-11-20 | 2 | -10/+0
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Points with clipdistance can't go through simplepoints path | Tim Rowley | 2017-11-20 | 1 | -1/+2
| | | | | | | Fixes piglit glsl-1.20:vs-clip-vertex-primitives and glsl-1.30:vs-clip-distance-primitives. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Code style change (NFC) | Tim Rowley | 2017-11-20 | 1 | -2/+7
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Widen fetch shader to SIMD16 | Tim Rowley | 2017-11-20 | 5 | -3/+151
| | | | | | | Widen fetch shader to SIMD16, enable SIMD16 types in the jitter, and provide EXTRACT/INSERT SIMD8 <-> SIMD16 utility functions. Reviewed-by: Bruce Cherniak <[email protected]>
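As a rough illustration of what SIMD8 <-> SIMD16 extract/insert helpers do, here is a scalar sketch. It assumes plain arrays in place of vector registers, and the names `Extract8` and `Insert8` are hypothetical; the jitter's real helpers emit LLVM IR rather than operating on arrays.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using Simd8  = std::array<float, 8>;   // stand-in for an 8-wide register
using Simd16 = std::array<float, 16>;  // stand-in for a 16-wide register

// Copy out the lower (half == 0) or upper (half == 1) 8 lanes of v.
inline Simd8 Extract8(const Simd16& v, std::size_t half)
{
    Simd8 out{};
    for (std::size_t i = 0; i < 8; ++i)
        out[i] = v[half * 8 + i];
    return out;
}

// Return a copy of v with the selected 8-lane half replaced by part.
inline Simd16 Insert8(Simd16 v, const Simd8& part, std::size_t half)
{
    for (std::size_t i = 0; i < 8; ++i)
        v[half * 8 + i] = part[i];
    return v;
}
```

Helpers of this shape let existing 8-wide code operate on each half of a 16-wide value in turn.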
* swr/rast: Support flexible vertex layout for DS output | Tim Rowley | 2017-11-20 | 2 | -0/+3
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Faster emulated simd16 permute | Tim Rowley | 2017-11-14 | 1 | -23/+11
| | | | | | | | Speed up simd16 frontend (default) on avx/avx2 platforms; fixes performance regression caused by switch to simdlib. Reviewed-by: Bruce Cherniak <[email protected]> Cc: [email protected]
* swr/rast: Use gather instruction for i32gather_ps on simd16/avx512 | Tim Rowley | 2017-11-14 | 1 | -11/+1
| | | | | | | | Speed up avx512 platforms; fixes performance regression caused by switch to simdlib. Reviewed-by: Bruce Cherniak <[email protected]> Cc: [email protected]
* swr: Fixed an uncommon freed-memory access during state validation | Bruce Cherniak | 2017-11-10 | 2 | -17/+25
| | | | | | | | | | | | | | | | | | | | | State validation is performed during clear and draw calls. Validation during clear was still accessing vertex buffer state. When the currently set vertex buffers are client arrays, this could lead to accessing freed memory. Such is the case with the VMD application. Previously, vertex buffer validation depended on a dirty bit or the draw info indicating an indexed draw. This required special handling for clears. But, vertex buffer validation still occurred which was unnecessary and wrong. Now, only minimal validation is performed during clear, deferring the remainder to the next draw. And, by setting the dirty bit in swr_draw_vbo for indexed draws, vertex buffer validation is only dependent upon a single dirty bit. This fixes a bug exposed by the VMD application when changing models. Reviewed-By: George Kyriazis <[email protected]>
* util: move os_time.[ch] to src/util | Nicolai Hähnle | 2017-11-09 | 2 | -2/+2
| | | | Reviewed-by: Marek Olšák <[email protected]>
* swr: Replace the check for c++11 by the unified version | Gert Wollny | 2017-11-08 | 1 | -2/+2
| | | | Reviewed-by: Emil Velikov <[email protected]>
* gallium: add PIPE_CAP_SIGNED_VERTEX_BUFFER_OFFSET | Marek Olšák | 2017-11-06 | 1 | -0/+1
* gallium: add cap for driver specified max combined shader resources. | Dave Airlie | 2017-11-01 | 1 | -0/+1
| | | | | | | | Some hw (evergreen) has a limit on how many combined resources (images/buffers/mrts) a fragment shader can access. Reviewed-by: Ilia Mirkin <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* swr: Rework scratch space allocation | George Kyriazis | 2017-10-19 | 2 | -30/+23
| | | | | | | | | | | | | | | | | Remove allocation of > 2kbyte buffers into context memory in swr_copy_to_scatch_space() (which is used to copy small vertex/index buffers and shader constants to a scratch space to be used by the upcoming draw.) Large shader constant allocations need to be done in the circular scratch buffer instead of context memory, because their values persist across render calls. Also lower SCRATCH_SINGLE_ALLOCATION_LIMIT to 8k, since allocations of larger buffers will get too large for the circular scratch space. Fixes render issues with CEI Ensight. Reviewed-by: Bruce Cherniak <[email protected]>
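The idea of copying small buffers into a circular scratch space can be sketched as follows. This is a simplified illustration, not the driver's allocator: `ScratchRing` and its members are hypothetical, and the sketch omits the synchronization a real driver needs so that wrapping around never overwrites data still in use by an in-flight draw.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical circular scratch allocator: copies are placed one after
// another and the write position wraps to the start when space runs out.
class ScratchRing
{
public:
    explicit ScratchRing(std::size_t capacity) : storage_(capacity), head_(0) {}

    // Copy size bytes from src into the ring and return their new address,
    // or nullptr when the request can never fit.
    void* Copy(const void* src, std::size_t size)
    {
        if (size > storage_.size())
            return nullptr;              // too large for the ring
        if (head_ + size > storage_.size())
            head_ = 0;                   // wrap around to the beginning
        void* dst = storage_.data() + head_;
        std::memcpy(dst, src, size);
        head_ += size;
        return dst;
    }

private:
    std::vector<unsigned char> storage_;
    std::size_t head_;
};
```

A per-allocation size cap, like the limit the commit lowers, keeps any single copy from consuming most of the ring and forcing an early wrap.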
* swr: knob overrides for Intel Xeon Phi | Tim Rowley | 2017-10-19 | 5 | -1/+37
| | | | | | | | Architecture benefits from having more threads/work outstanding. Patch by Jan Zielinski. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Add api to override draws in flight | Tim Rowley | 2017-10-19 | 4 | -19/+31
| | | | | | | | Allow draws in flight to be overridden via SWR_CREATECONTEXT_INFO. Patch by Jan Zielinski. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Widen fetch shader to SIMD16 (disabled for now) | Tim Rowley | 2017-10-19 | 1 | -13/+428
| | | | | | | Refactored the gather operation to process 16 elements at a time via paired SIMD8 operations. Reviewed-by: Bruce Cherniak <[email protected]>
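Processing 16 elements through paired 8-wide operations can be sketched in scalar form as below. The array types and the `Gather8`/`Gather16` names are illustrative assumptions; the real fetch shader is generated as LLVM IR and handles byte offsets and lane masks, which this sketch leaves out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec8f  = std::array<float, 8>;
using Vec8i  = std::array<std::int32_t, 8>;
using Vec16f = std::array<float, 16>;
using Vec16i = std::array<std::int32_t, 16>;

// Scalar emulation of an 8-wide gather: out[i] = base[index[i]].
inline Vec8f Gather8(const float* base, const Vec8i& index)
{
    Vec8f out{};
    for (std::size_t i = 0; i < 8; ++i)
        out[i] = base[index[i]];
    return out;
}

// 16-wide gather built from two 8-wide gathers, one per half.
inline Vec16f Gather16(const float* base, const Vec16i& index)
{
    Vec16f out{};
    for (std::size_t half = 0; half < 2; ++half)
    {
        Vec8i idx{};
        for (std::size_t i = 0; i < 8; ++i)
            idx[i] = index[half * 8 + i];      // split the index vector
        const Vec8f part = Gather8(base, idx); // reuse the 8-wide path
        for (std::size_t i = 0; i < 8; ++i)
            out[half * 8 + i] = part[i];       // merge the result back
    }
    return out;
}
```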
* swr/rast: Change DS memory allocation | Tim Rowley | 2017-10-19 | 2 | -2/+3
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fix indentation | Tim Rowley | 2017-10-19 | 1 | -1/+1
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Miscellaneous viewport array code changes | Tim Rowley | 2017-10-19 | 5 | -38/+71
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Minor changes for os-x | Tim Rowley | 2017-10-19 | 1 | -2/+4
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr: simd16 shaders work in progress | Tim Rowley | 2017-10-11 | 3 | -2/+21
| | | | | | | | Start building vertex shaders as simd16. Disabled by default, set USE_SIMD16_SHADERS in knobs.h to experiment. Reviewed-by: Bruce Cherniak <[email protected]>
* gallium: Create a new PIPE_CAP_TILE_RASTER_ORDER for vc4. | Eric Anholt | 2017-10-10 | 1 | -0/+1
| | | | | | | | | | | | | | | | Because vc4 can control the order that tiles are rasterized in, we can use it to implement overlapping blits using normal drawing and GL_ARB_texture_barrier, as long as we can tell the kernel what order to render the tiles in. This commit introduces the core gallium support, vc4 changes will follow. v2: Fix on the simulator. v3: Add the cap (disabled) to other drivers, add rst docs for the cap. v4: Rebase on PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS v5: Drop vc4 changes from this commit, for clarity. Reviewed-by: Nicolai Hähnle <[email protected]> (v3)
* swr/rast: use proper alignment for debug transposedPrims | Tim Rowley | 2017-10-06 | 1 | -2/+2
| | | | | | | | The misalignment was causing a crash in the ParaView waveletcontour.py test when _DEBUG is defined, due to a vector aligned copy with an unaligned address. Reviewed-by: Bruce Cherniak <[email protected]>
* gallium: add PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS | Marek Olšák | 2017-10-06 | 1 | -0/+1
| | | | | Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: Remove util_format_s3tc_init() | Matt Turner | 2017-10-02 | 1 | -2/+0
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* gallium: Remove util_format_s3tc_enabled | Matt Turner | 2017-10-02 | 1 | -4/+0
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* swr/rast: do not crash on NULL strings returned by getenv | Emil Velikov | 2017-10-02 | 1 | -1/+2
| | | | | | | | | | | | | | | | | | | | The current convenience function GetEnv feeds the results of getenv directly into std::string(). That is a bad idea, since the variable may be unset, thus we feed NULL into the C++ construct. The latter of which is not allowed and leads to a crash. v2: Better variable name, implicit char* -> std::string conversion (Eric) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101832 Fixes: a25093de718 ("swr/rast: Implement JIT shader caching to disk") Cc: Tim Rowley <[email protected]> Cc: Laurent Carlier <[email protected]> Cc: Bernhard Rosenkraenzer <[email protected]> [Emil Velikov: make an actual commit from the misc diff] Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> (v1) Reviewed-by: Laurent Carlier <[email protected]> (v1)
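The failure mode described in this commit, constructing a `std::string` from a null `getenv` result, can be avoided with a check like the one below. `SafeGetEnv` is a hypothetical name used for illustration and is not the actual `GetEnv` helper from the rasterizer.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical helper: returns the variable's value, or an empty string
// when the variable is not set in the environment.
inline std::string SafeGetEnv(const std::string& name)
{
    const char* value = std::getenv(name.c_str());
    // Constructing std::string from a null pointer is undefined behaviour,
    // so the null case has to be handled before the conversion.
    return value ? std::string(value) : std::string();
}
```

Callers that need to distinguish "unset" from "set to empty" would need a different return type; this sketch deliberately folds both into an empty string.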
* swr: Remove unneeeded comparison | George Kyriazis | 2017-09-26 | 1 | -2/+1
| | | | | | No need to check if screen->pipe != pipe, so we can just assign it. Just do it. Reviewed-by: Bruce Cherniak <[email protected]>
* swr: Handle resource across context changes | George Kyriazis | 2017-09-26 | 4 | -10/+70
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Swr caches fb contents in tiles. Those tiles are stored on a per-context basis. When switching contexts that share resources we need to make sure that the tiles of the old context are being stored and the tiles of the new context are being invalidated (marked as invalid, hence contents need to be reloaded). The context does not get any dirty bits to identify this case. This has to be, then, coordinated by the resources that are being shared between the contexts. Add a "curr_pipe" hook in swr_resource that will allow us to identify a MakeCurrent of the above form during swr_update_derived(). At that time, we invalidate the tiles of the new context. The old context will need to have already stored its tiles by that time, which happens during glFlush(). glFlush() is called at the beginning of MakeCurrent. So, the sequence of operations is: - At the beginning of glXMakeCurrent(), glFlush() will store the tiles of all bound surfaces of the old context. - After the store, a fence will guarantee that all tile stores make it to the surface. - During swr_update_derived(), when we validate the new context, we check all resources to see what changed, and if so, we invalidate the current tiles. Fixes rendering problems with CEI/Ensight. Reviewed-by: Bruce Cherniak <[email protected]>
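The ownership check described in this commit can be sketched roughly as follows. Only the `curr_pipe` name comes from the commit message; the `Context` and `Resource` types, the `UpdateResourceOwner` function, and the counter standing in for tile invalidation are all hypothetical simplifications.

```cpp
#include <cassert>

// Hypothetical context: the counter stands in for "invalidate cached tiles".
struct Context
{
    int tile_invalidations = 0;
};

// Hypothetical shared resource that remembers which context used it last.
struct Resource
{
    Context* curr_pipe = nullptr;
};

// Called while validating ctx. If the resource was last used by another
// context, take ownership and invalidate ctx's cached tiles so they are
// reloaded. Returns true when a context switch was detected.
inline bool UpdateResourceOwner(Resource& res, Context& ctx)
{
    if (res.curr_pipe == &ctx)
        return false;            // same context as before, nothing to do
    res.curr_pipe = &ctx;
    ++ctx.tile_invalidations;    // stand-in for the real invalidation
    return true;
}
```

The sketch assumes the previous context has already flushed and stored its tiles, as the commit message describes for glXMakeCurrent().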
* swr/rast: Handle instanceID offset / Instance Stride enable | Tim Rowley | 2017-09-25 | 1 | -7/+39
| | | | | | | | | | Supported in JitGatherVertices(); FetchJit::JitLoadVertices() may require similar changes, and we will need to address this if it is determined that this path is still in use. Handle Force Sequential Access in FetchJit::Create. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Remove code supporting legacy llvm (<3.9) | Tim Rowley | 2017-09-25 | 3 | -105/+15
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fix allocation of DS output data for USE_SIMD16_FRONTEND | Tim Rowley | 2017-09-25 | 1 | -10/+6
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Slightly more efficient blend jit | Tim Rowley | 2017-09-25 | 1 | -20/+10
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Properly sized null GS buffer | Tim Rowley | 2017-09-25 | 1 | -1/+1
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Move SWR_GS_CONTEXT from thread local storage to stack | Tim Rowley | 2017-09-25 | 1 | -12/+11
| | | | | | | Move structure, as the size is significantly reduced due to dynamic allocation of the GS buffers. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fetch compile state changes | Tim Rowley | 2017-09-25 | 2 | -1/+12
| | | | | | | Add ForceSequentialAccessEnable and InstanceIDOffsetEnable bools to FETCH_COMPILE_STATE. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: New GS state/context API | Tim Rowley | 2017-09-25 | 3 | -212/+253
| | | | | | | One piglit regression, which was a false pass: [email protected]@execution@geometry@dynamic_input_array_index Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Add support for R10G10B10_FLOAT_A2_UNORM pixel format | Tim Rowley | 2017-09-25 | 3 | -17/+28
| | | | Reviewed-by: Bruce Cherniak <[email protected]>
* scons: use python3-compatible generator | Eric Engestrom | 2017-09-25 | 1 | -4/+2
| | | | | | | These changes were generated using python's `2to3` tool. Suggested-by: Ilia Mirkin <[email protected]> Signed-off-by: Eric Engestrom <[email protected]>
* scons: use python3-compatible print() | Eric Engestrom | 2017-09-25 | 1 | -3/+3
| | | | | | | | | These changes were generated using python's `2to3` tool. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102852 Reported-by: Alex Granni <[email protected]> Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* swr/rast: remove llvm fence/atomics from generated files | Tim Rowley | 2017-09-22 | 1 | -0/+8
| | | | | | | | | | | We currently don't use these instructions, and since their API changed in llvm-5.0 having them in the autogen files broke the mesa release tarballs which ship with generated autogen files. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102847 CC: [email protected] Tested-by: Laurent Carlier <[email protected]> Reviewed-by: Bruce Cherniak <[email protected]>
* gallium: add PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE | Nicolai Hähnle | 2017-09-18 | 1 | -0/+1
| | | | | | | | | | | | | | | | | To be able to properly distinguish between GL_ANY_SAMPLES_PASSED and GL_ANY_SAMPLES_PASSED_CONSERVATIVE. This patch goes through all drivers, having them treat the two query types identically, except: 1. radeon incorrectly enabled conservative mode on PIPE_QUERY_OCCLUSION_PREDICATE. We now do it correctly, only on PIPE_QUERY_OCCLUSION_PREDICATE_CONSERVATIVE. 2. st/mesa uses the new query type. Fixes dEQP-GLES31.functional.fbo.no_attachments.* Reviewed-by: Marek Olšák <[email protected]>
* gallium: introduce PIPE_CAP_LOAD_CONSTBUF | Timothy Arceri | 2017-09-15 | 1 | -0/+1
| | | | Reviewed-by: Marek Olšák <[email protected]>
* swr: use ARRAY_SIZE macro | Eric Engestrom | 2017-09-14 | 1 | -4/+6
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Bruce Cherniak <[email protected]>
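For readers unfamiliar with the idiom, an element-count macro of this kind is typically used as shown below. The definition here is an assumed, conventional one written for illustration (Mesa provides its own in its utility headers), and `kSampleCounts` and `SumSampleCounts` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Assumed, conventional definition for illustration only.
#ifndef ARRAY_SIZE
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
#endif

// Hypothetical table; the loop bound follows the array automatically.
static const int kSampleCounts[] = {1, 2, 4, 8, 16};

inline int SumSampleCounts()
{
    int sum = 0;
    for (std::size_t i = 0; i < ARRAY_SIZE(kSampleCounts); ++i)
        sum += kSampleCounts[i];
    return sum;
}
```

The macro only works on true arrays; applied to a pointer it silently yields the wrong count, which is why it is best kept next to the array definition.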
* swr/rast: Fetch compile state changes | Tim Rowley | 2017-09-13 | 3 | -6/+15
| | | | | | | | | | Add InstanceStrideEnable field and rename InstanceDataStepRate to InstanceAdvancementState in INPUT_ELEMENT_DESC structure. Add stubs for handling InstanceStrideEnable in FetchJit::JitLoadVertices() and FetchJit::JitGatherVertices() and assert if they are triggered. Reviewed-by: Bruce Cherniak <[email protected]>