Commit message (Author, Age, Files, Lines)
* st/mesa: save currently bound vertex samplers and sampler views in st_context (Marek Olšák, 2019-12-09, 4 files, -3/+11)
| | | | | | for st_draw_feedback.c Reviewed-by: Dave Airlie <[email protected]>
* st/mesa: support UBOs for Selection/Feedback/RasterPos (Marek Olšák, 2019-12-09, 1 file, -2/+37)
| | | | Reviewed-by: Dave Airlie <[email protected]>
* gallivm: implement LOAD with CONSTBUF but don't enable it for llvmpipe (Marek Olšák, 2019-12-09, 1 file, -3/+36)
| | | | | | | This is already used in st_draw_feedback.c, because it uses shaders generated for drivers. Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: implement TEX_LZ and TXF_LZ opcodes (Marek Olšák, 2019-12-09, 2 files, -5/+11)
| | | | | | | gallivm receives these opcodes anyway because st_draw_feedback.c uses shaders that were assembled for drivers, not llvmpipe. Reviewed-by: Roland Scheidegger <[email protected]>
* drirc: set allow_higher_compat_version for Faster Than Light (Gurchetan Singh, 2019-12-09, 1 file, -1/+9)
| | | | | | | | | | | | With 781a78 ("mesa: enable ARB_direct_state_access in compat for GL3.1+"), it's possible to have DSA with GL3.1+. FTL creates a GL3.1 compat context, but fails the _mesa_has_geometry_shaders(..) check in frame_buffer_texture. Bump the compat version to pass the check. Reviewed-by: Marek Olšák <[email protected]>
* util/atomic: Fix p_atomic_add for unlocked and msvc paths (Roland Scheidegger, 2019-12-09, 1 file, -2/+2)
| | | | | | | | | | | Braces mismatch (flagged by CI, untested). Fixes: 385d13f26d2 "util/atomic: Add a _return variant of p_atomic_add" Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Dylan Baker <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
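A minimal sketch of how an add macro and its `_return` variant can be layered in an unlocked (single-threaded) path. The `my_atomic_*` names are hypothetical stand-ins, not Mesa's actual macros, and the sketch assumes the `_return` variant yields the updated value. Writing each macro as one fully parenthesized expression keeps its delimiters balanced by construction.

```c
#include <assert.h>

/* Hypothetical stand-ins for an unlocked path; not Mesa's real macros.
 * Each body is a single parenthesized expression, so braces and
 * parentheses cannot end up mismatched across the two variants. */
#define my_atomic_add_return(v, i) (*(v) = *(v) + (i))
#define my_atomic_add(v, i)        ((void) my_atomic_add_return((v), (i)))

/* Exercise both variants: the first discards the result, the second
 * returns the updated value (an assumption of this sketch). */
static int demo_add(int start, int delta)
{
   int value = start;
   my_atomic_add(&value, delta);
   return my_atomic_add_return(&value, delta);
}
```

Defining the void variant in terms of the `_return` variant means only one macro body has to be kept correct per code path.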
* freedreno: Track the set of UBOs to be uploaded in UBO analysis. (Eric Anholt, 2019-12-09, 3 files, -19/+25)
| | | | | | | | | | | We were iterating over the entire 32-entry array each time, when we can just use a bitset to know that we're only uploading from the first entry normally. Knocks ir3_emit_user_consts down from ~.5% of CPU to .1% on WebGL fishtank. Reviewed-by: Rob Clark <[email protected]>
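The idea of walking a bitset instead of the whole array can be sketched as follows. This is an illustration only: `MAX_UBOS`, `upload_enabled` and `demo_upload` are hypothetical names, and the real freedreno data structures are not shown in the log.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_UBOS 32  /* assumed from the "32-entry array" in the message */

/* Sum the sizes of the enabled slots, visiting only the set bits of
 * enabled_mask instead of all MAX_UBOS entries. */
static uint32_t upload_enabled(const uint32_t sizes[MAX_UBOS],
                               uint32_t enabled_mask)
{
   uint32_t total = 0;
   uint32_t mask = enabled_mask;

   while (mask) {
      unsigned slot = 0;
      while (!((mask >> slot) & 1u))   /* portable find-lowest-set-bit */
         slot++;
      assert(slot < MAX_UBOS);
      total += sizes[slot];
      mask &= mask - 1;                /* clear the lowest set bit */
   }
   return total;
}

/* Helper for trying it out: slot i has size 16 * (i + 1). */
static uint32_t demo_upload(uint32_t enabled_mask)
{
   uint32_t sizes[MAX_UBOS];
   for (unsigned i = 0; i < MAX_UBOS; i++)
      sizes[i] = 16u * (i + 1);
   return upload_enabled(sizes, enabled_mask);
}
```

When only the first entry is enabled, the outer loop runs once rather than 32 times, which is the saving the message describes.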
* freedreno: Stop forcing ALLOW_MAPPED_BUFFERS_DURING_EXEC off. (Eric Anholt, 2019-12-09, 1 file, -3/+0)
| | | | | | | | | The default is to not throw GL errors when drawing with mapped buffers, but we were forcing it on for unclear reasons. Internally we keep all our buffers mapped anyway, so it should be a no-op other than reducing CPU overhead (.23% in a perf report for WebGL fishtank) Reviewed-by: Rob Clark <[email protected]>
* freedreno/fdperf: use drmOpen() (Rob Clark, 2019-12-09, 2 files, -1/+3)
| | | | Signed-off-by: Rob Clark <[email protected]>
* gallium/util: Support POLYGON in u_stream_outputs_for_vertices (Alyssa Rosenzweig, 2019-12-09, 1 file, -1/+8)
| | | | | | | | | | | | u_decomposed_prims_for_vertices cannot support POLYGON, but POLYGON is trivial to support as a special case directly (since we have the number of vertices directly). Fixes aborts in Panfrost in apps using GL_POLYGON. Fixes: e881aa8c12c ("gallium/util: Add u_stream_outputs_for_vertices helper") Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
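A simplified sketch of the special case described above, under stated assumptions: the `enum prim` values and the function below are hypothetical and cover only three primitive types, not Gallium's full set. Following the message, a polygon is treated as a single primitive whose output count is simply its vertex count, while the other types are counted per decomposed triangle.

```c
#include <assert.h>

/* Hypothetical subset of primitive types; not Gallium's enum. */
enum prim { PRIM_TRIANGLES, PRIM_TRIANGLE_STRIP, PRIM_POLYGON };

/* Number of vertices written to stream output for nr input vertices. */
static unsigned stream_outputs_for_vertices(enum prim p, unsigned nr)
{
   if (nr < 3)
      return 0;               /* not enough vertices for any primitive */

   switch (p) {
   case PRIM_POLYGON:
      return nr;              /* special case: one output per vertex */
   case PRIM_TRIANGLES:
      return (nr / 3) * 3;    /* drop an incomplete trailing triangle */
   case PRIM_TRIANGLE_STRIP:
      return (nr - 2) * 3;    /* every strip triangle is written whole */
   }
   return 0;
}
```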
* intel: Add pci-ids for Jasper Lake (Anuj Phogat, 2019-12-09, 2 files, -0/+4)
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* intel: Add device info for 1x4x6 Jasper Lake (Anuj Phogat, 2019-12-09, 1 file, -4/+21)
| | | | | | | | Also removing the FIXME comments after matching the numbers with updated documentation. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* lima: expose tiled format modifier in query_dmabuf_modifiers() (Vasily Khoruzhick, 2019-12-09, 1 file, -0/+1)
| | | | | | Fixes: 8c12f4e5f24f ("lima: enable tiling") Reviewed-by: Qiang Yu <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima: handle DRM_FORMAT_MOD_INVALID in resource_from_handle() (Vasily Khoruzhick, 2019-12-09, 1 file, -0/+4)
| | | | | | | | | Assume that resource is tiled if we get DRM_FORMAT_MOD_INVALID in resource_from_handle() and we don't have RO. Fixes: 8c12f4e5f24f ("lima: enable tiling") Reviewed-by: Qiang Yu <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
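The fallback rule in this message can be sketched as a small predicate. Everything named here is an assumption for illustration: `MOD_TILED` is a made-up placeholder for the driver's tiled modifier, `MOD_INVALID` is a local copy of the value of `DRM_FORMAT_MOD_INVALID` so the sketch compiles without `drm_fourcc.h`, and "RO" is read as a render-only display path. What happens when RO is present is not stated in the log; the sketch simply does not assume tiling in that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins so the sketch builds without drm_fourcc.h. */
#define MOD_LINEAR  UINT64_C(0)
#define MOD_INVALID UINT64_C(0x00ffffffffffffff)
#define MOD_TILED   UINT64_C(0x1234)   /* hypothetical placeholder value */

/* Decide whether an imported buffer should be treated as tiled. */
static bool import_is_tiled(uint64_t modifier, bool has_renderonly)
{
   if (modifier == MOD_INVALID)
      return !has_renderonly;   /* no modifier given: assume tiled unless RO */
   return modifier == MOD_TILED;
}
```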
* turnip: add hw binning (Jonathan Marek, 2019-12-09, 4 files, -19/+372)
| | | | | | Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* radv: do not use VK_TRUE/VK_FALSE (Samuel Pitoiset, 2019-12-09, 1 file, -12/+12)
| | | | | | | For consistency. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* gallivm: add bitfield reverse and ufind_msb (Dave Airlie, 2019-12-09, 3 files, -0/+41)
| | | | | Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Krzysztof Raszkowski <[email protected]>
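For reference, the two operations named in this commit can be written in plain C as below. This is not the gallivm code, which emits LLVM IR; it is a scalar sketch that assumes the GLSL-style semantics of `bitfieldReverse` and unsigned `findMSB`, where an input of zero yields -1.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the bit order of a 32-bit value by swapping progressively
 * larger groups: single bits, pairs, nibbles, bytes, then halves. */
static uint32_t bitfield_reverse(uint32_t v)
{
   v = ((v >> 1) & 0x55555555u) | ((v & 0x55555555u) << 1);
   v = ((v >> 2) & 0x33333333u) | ((v & 0x33333333u) << 2);
   v = ((v >> 4) & 0x0f0f0f0fu) | ((v & 0x0f0f0f0fu) << 4);
   v = ((v >> 8) & 0x00ff00ffu) | ((v & 0x00ff00ffu) << 8);
   return (v >> 16) | (v << 16);
}

/* Index of the most significant set bit, or -1 when v is zero. */
static int ufind_msb(uint32_t v)
{
   int index = -1;
   while (v) {
      v >>= 1;
      index++;
   }
   return index;
}
```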
* gallium/scons: fix graw_gdi build (Roland Scheidegger, 2019-12-07, 1 file, -0/+2)
| | | | | | Fixes: 44a6b0107b37 (gallivm: add nir->llvm translation (v2)) Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* aco: propagate temporaries into expanded vectors (Daniel Schürmann, 2019-12-07, 1 file, -2/+7)
| Gives a very slight decrease in code size:
|
| Totals from affected shaders:
|   Code Size: 1708488 -> 1702768 (-0.33 %) bytes
|   Max Waves: 2858 -> 2855 (-0.10 %)
|
| Reviewed-by: Rhys Perry <[email protected]>
* aco: improve readfirstlane after uniform ssbo loads on GFX7 (Daniel Schürmann, 2019-12-07, 1 file, -3/+3)
| pipeline-db changes for GFX7: 80310 shaders in 40472 tests
|
| Totals:
|   SGPRS: 3655900 -> 3643916 (-0.33 %)
|   VGPRS: 2678324 -> 2686324 (0.30 %)
|   Spilled SGPRs: 1730 -> 1634 (-5.55 %)
|   Spilled VGPRs: 14 -> 21 (50.00 %)
|   Scratch size: 15540 -> 15536 (-0.03 %) dwords per thread
|   Code Size: 136106120 -> 135457616 (-0.48 %) bytes
|   LDS: 1259 -> 1259 (0.00 %) blocks
|   Max Waves: 601014 -> 600206 (-0.13 %)
|
| Totals from affected shaders:
|   SGPRS: 307832 -> 295848 (-3.89 %)
|   VGPRS: 267864 -> 275864 (2.99 %)
|   Spilled SGPRs: 770 -> 674 (-12.47 %)
|   Spilled VGPRs: 14 -> 21 (50.00 %)
|   Scratch size: 16 -> 12 (-25.00 %) dwords per thread
|   Code Size: 22007488 -> 21358984 (-2.95 %) bytes
|   LDS: 65 -> 65 (0.00 %) blocks
|   Max Waves: 28668 -> 27860 (-2.82 %)
|
| Reviewed-by: Rhys Perry <[email protected]>
* aco: use soffset for MUBUF instructions on SI/CI (Daniel Schürmann, 2019-12-07, 1 file, -15/+2)
| pipeline-db changes for GFX7: 80310 shaders in 40472 tests
|
| Totals:
|   SGPRS: 3655300 -> 3655900 (0.02 %)
|   VGPRS: 2677732 -> 2678324 (0.02 %)
|   Spilled SGPRs: 1730 -> 1730 (0.00 %)
|   Spilled VGPRs: 14 -> 14 (0.00 %)
|   Scratch size: 15540 -> 15540 (0.00 %) dwords per thread
|   Code Size: 136488364 -> 136106120 (-0.28 %) bytes
|   LDS: 1259 -> 1259 (0.00 %) blocks
|   Max Waves: 601039 -> 601014 (-0.00 %)
|
| Totals from affected shaders:
|   SGPRS: 316312 -> 316912 (0.19 %)
|   VGPRS: 273844 -> 274436 (0.22 %)
|   Spilled SGPRs: 770 -> 770 (0.00 %)
|   Spilled VGPRs: 14 -> 14 (0.00 %)
|   Scratch size: 16 -> 16 (0.00 %) dwords per thread
|   Code Size: 22724904 -> 22342660 (-1.68 %) bytes
|   LDS: 114 -> 114 (0.00 %) blocks
|   Max Waves: 30861 -> 30836 (-0.08 %)
|
| Reviewed-by: Rhys Perry <[email protected]>
* radv: Enable ACO on GFX7 (Sea Islands) (Daniel Schürmann, 2019-12-07, 1 file, -2/+3)
| | | | | | | | This patch also disables AMD_shader_ballot on GFX7 by default if ACO is used. Note that shader_ballot works correctly, but performance seems inferior. To enable shader_ballot use RADV_PERFTEST=shader_ballot. Reviewed-by: Rhys Perry <[email protected]>
* aco: return to loop_active mask at continue_or_break blocks (Daniel Schürmann, 2019-12-07, 1 file, -13/+4)
| | | | Reviewed-by: Rhys Perry <[email protected]>
* radv: disable Youngblood app profile if ACO is used (Daniel Schürmann, 2019-12-07, 1 file, -2/+3)
| | | | Reviewed-by: Rhys Perry <[email protected]>
* aco: implement exclusive scan for SI/CI (Daniel Schürmann, 2019-12-07, 1 file, -2/+35)
| | | | Reviewed-by: Rhys Perry <[email protected]>
* aco: implement inclusive_scan for SI/CI (Daniel Schürmann, 2019-12-07, 1 file, -5/+41)
| | | | Reviewed-by: Rhys Perry <[email protected]>
* aco: implement (clustered) reductions for SI/CI (Daniel Schürmann, 2019-12-07, 2 files, -39/+74)
| | | | Reviewed-by: Rhys Perry <[email protected]>
* aco: don't use a scalar temporary for reductions on GFX10 (Daniel Schürmann, 2019-12-07, 2 files, -3/+3)
| | | | | | This patch also adds the scalar temporary for scans on SI/CI Reviewed-by: Rhys Perry <[email protected]>
* aco: flush denorms after fmin/fmax on pre-GFX9 (Daniel Schürmann, 2019-12-07, 1 file, -15/+46)
| | | | Reviewed-by: Rhys Perry <[email protected]>
* radv: only flush scalar cache for SSBO writes with ACO on GFX8+ (Daniel Schürmann, 2019-12-07, 1 file, -1/+2)
| | | | Reviewed-by: Rhys Perry <[email protected]>
* aco: disable disassembly for SI/CI due to lack of support by LLVM (Daniel Schürmann, 2019-12-07, 1 file, -0/+4)
| | | | Reviewed-by: Rhys Perry <[email protected]>
* aco: implement 64bit ine/ieq for SI/CI (Daniel Schürmann, 2019-12-07, 1 file, -5/+7)
| | | | Reviewed-by: Rhys Perry <[email protected]>
* aco: implement 64bit i2b for SI/CI (Daniel Schürmann, 2019-12-07, 1 file, -2/+7)
| | | | Reviewed-by: Rhys Perry <[email protected]>
* aco: make 1/2*PI a literal constant on SI/CI (Daniel Schürmann, 2019-12-07, 4 files, -15/+19)
| | | | Reviewed-by: Rhys Perry <[email protected]>
* aco: implement 64bit VGPR shifts for SI/CI (Daniel Schürmann, 2019-12-07, 1 file, -7/+27)
| | | | Reviewed-by: Rhys Perry <[email protected]>
* aco: split read/writelane opcode into VOP2/VOP3 version for SI/CI (Daniel Schürmann, 2019-12-07, 9 files, -35/+72)
| | | | Reviewed-by: Rhys Perry <[email protected]>
* aco: fix disassembly of writelane instructions. (Daniel Schürmann, 2019-12-07, 1 file, -1/+7)
| | | | | | | ACO writes an unused 3rd operand for internal usage, which makes LLVM recognize it as an illegal instruction. Reviewed-by: Rhys Perry <[email protected]>
* aco: recognize SI/CI SMRD hazards (Daniel Schürmann, 2019-12-07, 1 file, -2/+27)
| | | | Reviewed-by: Rhys Perry <[email protected]>
* aco: implement quad swizzles for SI/CI (Daniel Schürmann, 2019-12-07, 1 file, -30/+75)
| | | | Reviewed-by: Rhys Perry <[email protected]>
* aco: move buffer_store data to VGPR if needed (Daniel Schürmann, 2019-12-07, 1 file, -1/+1)
| | | | Reviewed-by: Rhys Perry <[email protected]>
* aco: implement nir_op_isign on SI/CI (Daniel Schürmann, 2019-12-07, 1 file, -2/+7)
| | | | Reviewed-by: Rhys Perry <[email protected]>
* aco: only use scalar loads for readonly buffers on SI/CI (Daniel Schürmann, 2019-12-07, 1 file, -1/+1)
| | | | Reviewed-by: Rhys Perry <[email protected]>
* aco: implement nir_op_fquantize2f16 for SI/CI (Daniel Schürmann, 2019-12-07, 1 file, -7/+16)
| | | | Reviewed-by: Rhys Perry <[email protected]>
* aco: fix SMEM offsets for SI/CI (Daniel Schürmann, 2019-12-07, 1 file, -1/+2)
| | | | Reviewed-by: Rhys Perry <[email protected]>
* aco: SI/CI - fix sampler aniso (Daniel Schürmann, 2019-12-07, 1 file, -5/+20)
| | | | Reviewed-by: Rhys Perry <[email protected]>
* aco: handle gfx7 int8/10 clamping on exports (Dave Airlie, 2019-12-07, 1 file, -8/+37)
| | | | | | Co-authored-by: Daniel Schürmann <[email protected]> Reviewed-by: Rhys Perry <[email protected]>
* aco: Initial GFX7 Support (Daniel Schürmann, 2019-12-07, 4 files, -72/+95)
| | | | Reviewed-by: Rhys Perry <[email protected]>
* aco: refactor visit_store_fs_output() to use the Builder (Daniel Schürmann, 2019-12-07, 1 file, -49/+15)
| | | | Reviewed-by: Rhys Perry <[email protected]>
* anv: Re-emit all compute state on pipeline switch (Jason Ekstrand, 2019-12-07, 1 file, -0/+7)
| | | | | | | | | It's a very odd case to hit in the real world. However, there are some CTS tests which switch back and forth between dispatch and clear without changing the pipeline. Fixes: bc612536eb2f "anv: Emit a dummy MEDIA_VFE_STATE before switching..." Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: Re-capture all batch and state buffers (Jason Ekstrand, 2019-12-07, 1 file, -6/+3)
| | | | | | | | | When we moved from allocating BOs directly to using the BO cache, we lost the EXEC_OBJECT_CAPTURE flag on all our state buffers. Fixes: 3119b96bdf57 "anv: Allocate block pool BOs from the cache" Fixes: ee77938733cd "anv: Allocate batch and fence buffers from..." Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>