path: root/src/gallium/drivers/radeonsi
Commit message | Author | Age | Files | Lines
* radeonsi: don't check PIPE_BARRIER_MAPPED_BUFFER | Marek Olšák | 2016-10-04 | 1 | -4/+3
    Caches are always flushed at IB boundary.

    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: parse SURFACE_SYNC correctly on CIK-VI | Marek Olšák | 2016-10-04 | 1 | -9/+16
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: Fix primitive restart when index changes | James Legg | 2016-10-04 | 1 | -7/+7
    If primitive restart is enabled for two consecutive draws which use
    different primitive restart indices, then the first draw's primitive
    restart index was incorrectly used for the second draw.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98025
    Cc: 11.1 11.2 12.0 <[email protected]>
    Signed-off-by: Marek Olšák <[email protected]>
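The failure mode described in this commit is a stale cached value: the driver remembers what it last wrote and skips the write when it should not. A minimal sketch of tracking that does re-emit on change follows. All names here (`restart_state`, `emit_restart_index`, `update_primitive_restart`) are hypothetical and are not taken from the radeonsi source; the "register write" is only simulated with a counter.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tracked state; not the real radeonsi structures. */
struct restart_state {
    bool     enabled;       /* enable bit requested by the last draw */
    uint32_t index;         /* last emitted restart index */
    bool     index_valid;   /* false until an index has been emitted */
    unsigned index_writes;  /* counts simulated register writes */
};

/* Stand-in for writing the restart index into the command stream. */
static void emit_restart_index(struct restart_state *s, uint32_t index)
{
    s->index = index;
    s->index_valid = true;
    s->index_writes++;
}

/* Called once per draw. The index is compared on every draw for which
 * restart is enabled, not only when the enable bit toggles, so two
 * consecutive restart-enabled draws with different indices both get
 * the index they asked for. */
static void update_primitive_restart(struct restart_state *s,
                                     bool enable, uint32_t index)
{
    s->enabled = enable;
    if (enable && (!s->index_valid || s->index != index))
        emit_restart_index(s, index);
}
```

Two draws with the same index cost one write; a third draw with a different index triggers a second write instead of reusing the old value.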
* gallium/radeon: emit relocations for query fences | Nicolai Hähnle | 2016-09-30 | 1 | -1/+1
    This is only needed for r600 which doesn't have ARB_query_buffer_object
    and therefore wouldn't really need the fences, but let's be optimistic
    about filling in this feature gap eventually.

    Cc: Dieter Nützel <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: enable ARB_query_buffer_object (v2) | Nicolai Hähnle | 2016-09-29 | 1 | -7/+14
    v2: enable only when compute is available

    Reviewed-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add save_qbo_state | Nicolai Hähnle | 2016-09-29 | 1 | -0/+12
    Save compute shader state that will be used for the
    ARB_query_buffer_object implementation.

    Reviewed-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add si_get_shader_buffers/get_pipe_constant_buffers (v2) | Nicolai Hähnle | 2016-09-29 | 2 | -0/+51
    These functions extract the pipe state structure from the current
    descriptors, for state saving.

    v2: correctly dereference *buf (Bas)

    Reviewed-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: add r600_gfx_{write,wait}_fence | Nicolai Hähnle | 2016-09-29 | 1 | -38/+3
    For bottom-of-pipe fences inside the gfx command stream.

    Reviewed-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: add barrier_flags to r600_common_screen | Nicolai Hähnle | 2016-09-29 | 1 | -0/+5
    There are driver-specific context flags for barriers that are not
    covered by the Gallium barrier interfaces.

    The R600 settings of these flags may not be optimal, but we're not
    going to use them yet anyway.

    Reviewed-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/compute: Use the HSA abi for non-TGSI compute shaders v3 | Tom Stellard | 2016-09-16 | 1 | -17/+222
    This patch switches non-TGSI compute shaders over to using the HSA
    ABI described here:

    https://github.com/RadeonOpenCompute/ROCm-Docs/blob/master/AMDGPU-ABI.md

    The HSA ABI provides a much cleaner interface for compute shaders and
    allows us to share more code in the compiler with the HSA stack.

    The main changes in this patch are:

    - We now pass the scratch buffer resource into the shader via user
      sgprs rather than using relocations.

    - Grid/Block sizes are now passed to the shader via the dispatch
      packet rather than at the beginning of the kernel arguments.

    Typically for HSA, the CP firmware will create the dispatch packet and
    set up the user sgprs automatically. However, in Mesa we let the
    driver do this work. The main reason for this is that I haven't
    researched how to get the CP to do all these things, and I'm not sure
    if it is supported for all GPUs.

    v2:
      - Add comments explaining why we are setting certain bits of the
        scratch resource descriptor.

    v3:
      - Use amdgcn-mesa-mesa3d triple instead of amdgcn--mesa3d.

    Reviewed-by: Nicolai Hähnle <[email protected]>
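To illustrate the second change listed in this commit, here is a rough sketch of a driver filling in the size fields of a dispatch packet itself. The struct is a deliberately reduced, hypothetical subset (`dispatch_sizes`), not the full HSA packet layout. The field names follow the HSA naming convention, and the sketch assumes the HSA convention that the grid size is counted in work-items rather than in workgroups.

```c
#include <stdint.h>

/* Hypothetical, reduced view of a dispatch packet: only the size fields.
 * The real packet has more members and a fixed binary layout. */
struct dispatch_sizes {
    uint16_t workgroup_size_x, workgroup_size_y, workgroup_size_z;
    uint32_t grid_size_x, grid_size_y, grid_size_z;
};

/* block[] is the workgroup size in work-items, grid[] is the number of
 * workgroups per dimension (the usual compute-dispatch inputs).
 * Assumption: the packet wants the grid in work-items, so each
 * dimension is multiplied by the workgroup size. */
static struct dispatch_sizes
fill_dispatch_sizes(const uint32_t block[3], const uint32_t grid[3])
{
    struct dispatch_sizes d;

    d.workgroup_size_x = (uint16_t)block[0];
    d.workgroup_size_y = (uint16_t)block[1];
    d.workgroup_size_z = (uint16_t)block[2];

    d.grid_size_x = grid[0] * block[0];
    d.grid_size_y = grid[1] * block[1];
    d.grid_size_z = grid[2] * block[2];
    return d;
}
```

With this arrangement the shader reads its sizes from the packet, so nothing has to be prepended to the kernel arguments.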
* radeonsi/compute: Add some more debug printfs | Tom Stellard | 2016-09-16 | 1 | -0/+3
* radeonsi: reload PS inputs with direct indexing at each use (v2) | Marek Olšák | 2016-09-14 | 1 | -16/+11
    The LLVM compiler can CSE interp intrinsics thanks to
    LLVMReadNoneAttribute.

    26011 shaders in 14651 tests
    Totals:
    SGPRS: 1146340 -> 1132676 (-1.19 %)
    VGPRS: 727371 -> 711730 (-2.15 %)
    Spilled SGPRs: 2218 -> 2078 (-6.31 %)
    Spilled VGPRs: 369 -> 369 (0.00 %)
    Scratch VGPRs: 1344 -> 1344 (0.00 %) dwords per thread
    Code Size: 35841268 -> 36009732 (0.47 %) bytes
    LDS: 767 -> 767 (0.00 %) blocks
    Max Waves: 222559 -> 224779 (1.00 %)
    Wait states: 0 -> 0 (0.00 %)

    v2: don't call load_input for fragment shaders in emit_declaration

    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: get rid of constant buffer preloading | Marek Olšák | 2016-09-14 | 1 | -24/+14
    26011 shaders in 14651 tests
    Totals:
    SGPRS: 1152636 -> 1146340 (-0.55 %)
    VGPRS: 728198 -> 727371 (-0.11 %)
    Spilled SGPRs: 3776 -> 2218 (-41.26 %)
    Spilled VGPRs: 369 -> 369 (0.00 %)
    Scratch VGPRs: 1344 -> 1344 (0.00 %) dwords per thread
    Code Size: 35835152 -> 35841268 (0.02 %) bytes
    LDS: 767 -> 767 (0.00 %) blocks
    Max Waves: 222372 -> 222559 (0.08 %)
    Wait states: 0 -> 0 (0.00 %)

    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: get rid of img/buf/sampler descriptor preloading (v2) | Marek Olšák | 2016-09-14 | 1 | -132/+47
    26011 shaders in 14651 tests
    Totals:
    SGPRS: 1251920 -> 1152636 (-7.93 %)
    VGPRS: 728421 -> 728198 (-0.03 %)
    Spilled SGPRs: 16644 -> 3776 (-77.31 %)
    Spilled VGPRs: 369 -> 369 (0.00 %)
    Scratch VGPRs: 1344 -> 1344 (0.00 %) dwords per thread
    Code Size: 36001064 -> 35835152 (-0.46 %) bytes
    LDS: 767 -> 767 (0.00 %) blocks
    Max Waves: 222221 -> 222372 (0.07 %)
    Wait states: 0 -> 0 (0.00 %)

    v2: merge codepaths where possible

    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: rename get_sampler_desc -> load_sampler_desc | Marek Olšák | 2016-09-14 | 1 | -11/+11
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: cosmetic changes in si_shader.c | Marek Olšák | 2016-09-14 | 1 | -3/+5
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: load streamout buffer descriptors before use (v2) | Marek Olšák | 2016-09-14 | 1 | -33/+14
    v2: inline the code and remove the conditional that's a no-op now

    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix FP64 UBO loads with indirect uniform block indexing | Marek Olšák | 2016-09-13 | 1 | -2/+1
    No known tests.

    Cc: [email protected]
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: clean up CP DMA emit code | Marek Olšák | 2016-09-13 | 1 | -84/+60
    Unify the clear and copy paths, clean up the definitions.
    It looks more like a rework. It's a preparation for GDS support,
    which might or might not come.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: print the IB and buffer list in VM fault reports | Marek Olšák | 2016-09-13 | 1 | -1/+2
    This is a fallout from reworking the debug flags.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add sampler view BOs to the BO list last | Marek Olšák | 2016-09-13 | 1 | -7/+10
    If si_sampler_view_add_buffer ends up flushing, then the code in
    begin_new_cs would previously have added the buffer(s) for whatever
    was previously bound to that slot. Now it would add only the new
    buffer.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: export SampleMask from pixel shaders at full rate | Marek Olšák | 2016-09-13 | 3 | -16/+56
    Heaven and Valley write gl_SampleMask and not Z.
    Use 16_ABGR instead of 32_ABGR if Z isn't written.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* android: add support for libmesa_amdgpu_addrlib | Mauro Rossi | 2016-09-13 | 1 | -1/+3
    Android porting of the following commits:
    f1f1ba3 "radeonsi: move sid.h/r600d_common.h to a common place."
    69fca64 "amd/addrlib: move addrlib from amdgpu winsys to common code"

    This patch fixes android building errors

    Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: don't preload constants at the beginning of shaders | Marek Olšák | 2016-09-12 | 1 | -20/+11
    LLVM can CSE the loads, thus we can always re-load constants before
    each use. The decrease in SGPR spilling is huge.

    The best improvements are the dumbest ones.

    26011 shaders in 14651 tests
    Totals:
    SGPRS: 1453346 -> 1251920 (-13.86 %)
    VGPRS: 742576 -> 728421 (-1.91 %)
    Spilled SGPRs: 52298 -> 16644 (-68.17 %)
    Spilled VGPRs: 397 -> 369 (-7.05 %)
    Scratch VGPRs: 1372 -> 1344 (-2.04 %) dwords per thread
    Code Size: 36136488 -> 36001064 (-0.37 %) bytes
    LDS: 767 -> 767 (0.00 %) blocks
    Max Waves: 219315 -> 222221 (1.33 %)

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
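The percentages in these shader statistics are plain relative changes, `(after - before) / before * 100`. A small helper that reproduces them is sketched below; `percent_change` is a name made up for this example, and returning 0 for a zero baseline is an assumption rather than something the commit states.

```c
/* Relative change from `before` to `after`, in percent.
 * Assumption: a zero baseline is reported as 0 % instead of dividing
 * by zero. */
static double percent_change(double before, double after)
{
    if (before == 0.0)
        return 0.0;
    return (after - before) / before * 100.0;
}
```

For the SGPR totals in this commit, `percent_change(1453346, 1251920)` comes out at roughly -13.86, matching the reported figure.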
* radeonsi: flush TC L2 before using a compute indirect buffer | Marek Olšák | 2016-09-09 | 1 | -2/+10
    There is no known test for this.

    Cc: 12.0 <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix the VGT performance tweak for small instances | Marek Olšák | 2016-09-09 | 1 | -5/+6
    Based on the VGT spec. The Vulkan driver doesn't do it optimally and
    they plan to fix it.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove the cache_flush atom | Marek Olšák | 2016-09-09 | 7 | -12/+9
    Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: remove PIPE_BIND_TRANSFER_READ/WRITE | Marek Olšák | 2016-09-08 | 1 | -5/+0
    not used in any useful way

    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* Revert "radeonsi: enable SDMA on CIK" | Marek Olšák | 2016-09-08 | 1 | -0/+4
    This reverts commit 0241d8300f66ee2c6c2c55fe64ac88d76440c591.

    It doesn't work with mobile Bonaire. It looks like the programming of
    tiling parameters is wrong on some chips.
* radeonsi: skip redundant INDEX_TYPE writes | Marek Olšák | 2016-09-07 | 3 | -20/+32
    Ported from Vulkan.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add more unlikely() uses into si_draw_vbo | Marek Olšák | 2016-09-07 | 1 | -5/+5
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: skip draws with instance_count == 0 | Marek Olšák | 2016-09-07 | 1 | -3/+13
    loosely ported from Vulkan

    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move sid.h/r600d_common.h to a common place. | Dave Airlie | 2016-09-06 | 3 | -9060/+3
    Step one to merging radv would be to move some files around.

    This only adds the include path to r600/radeonsi, because later we
    want to avoid having to add it to the generic target paths.

    Acked-by: Marek Olšák <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: remove VPORT_ZMIN/ZMAX from init config states | Marek Olšák | 2016-09-05 | 1 | -6/+0
    It's part of the viewport state now.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: set VPORT_ZMIN/MAX registers correctly | Marek Olšák | 2016-09-05 | 3 | -1/+4
    Calculate depth ranges from viewport states and
    pipe_rasterizer_state::clip_halfz.

    The evergreend.h change is required to silence a warning.

    This fixes this recently updated piglit: arb_depth_clamp/depth-clamp-range

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
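As a rough illustration of deriving a depth range from the viewport transform and `clip_halfz`, the sketch below uses a stand-in struct (`viewport_z`) holding only the Z scale and translate, and a hypothetical helper `depth_range`. It is not the driver's code. It assumes the usual convention that clip-space Z spans [0, 1] when `clip_halfz` is set and [-1, 1] otherwise.

```c
#include <stdbool.h>

/* Stand-in for the Z components of a viewport transform. */
struct viewport_z {
    float scale;      /* Z scale of the viewport transform */
    float translate;  /* Z translate of the viewport transform */
};

/* Map the near and far clip planes through the viewport transform and
 * order the results, so that a reversed depth range still yields
 * zmin <= zmax. */
static void depth_range(const struct viewport_z *vp, bool clip_halfz,
                        float *zmin, float *zmax)
{
    /* Near plane is z = 0 with clip_halfz, z = -1 without it. */
    float near_z = clip_halfz ? vp->translate : vp->translate - vp->scale;
    /* Far plane is z = +1 in both cases. */
    float far_z = vp->translate + vp->scale;

    *zmin = near_z < far_z ? near_z : far_z;
    *zmax = near_z < far_z ? far_z : near_z;
}
```

With scale 0.5 and translate 0.5 (a [0, 1] depth range under [-1, 1] clipping) this yields zmin = 0 and zmax = 1, and the same result comes out for a reversed range with scale -0.5.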
* radeonsi: also do VS_PARTIAL_FLUSH before updating VGT ring pointers | Marek Olšák | 2016-09-05 | 1 | -0/+6
    ported from Vulkan

    Acked-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix variable naming in si_emit_cache_flush | Marek Olšák | 2016-09-05 | 1 | -31/+31
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't emit CS_PARTIAL_FLUSH if compute is not used | Marek Olšák | 2016-09-05 | 3 | -1/+5
    for less noise in the HUD

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add HUD queries for counting VS/PS/CS partial flushes | Marek Olšák | 2016-09-05 | 1 | -0/+8
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix a badly implemented GS bug workaround | Marek Olšák | 2016-09-05 | 1 | -8/+13
    Limit it to geometry shaders and Hawaii.

    Acked-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix texture format reinterpretation with DCC | Marek Olšák | 2016-09-05 | 3 | -1/+14
    DCC is limited in how texture formats can be reinterpreted using
    texture views. If we get a view format that is incompatible with the
    initial texture format with respect to DCC, disable DCC.

    There is a new piglit which tests all format combinations.
    What works and what doesn't was deduced by looking at the piglit
    failures.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix Gather4 with integer formats | Marek Olšák | 2016-09-05 | 1 | -3/+96
    The closed compiler does the same thing.

    This fixes: GL45-CTS.texture_gather.*-int-* (18 tests)

    Reviewed-by: Dave Airlie <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix a crash in imageSize for cubemap arrays | Marek Olšák | 2016-09-05 | 1 | -3/+1
    Sometimes it was f32, other times it was i32. Now it's always i32.

    This fixes:
    GL45-CTS.texture_cube_map_array.image_texture_size.texture_size_compute_sh

    Reviewed-by: Dave Airlie <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix gl_PatchVerticesIn for tessellation evaluation shader | Marek Olšák | 2016-09-05 | 1 | -1/+6
    This fixes:
    GL45-CTS.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_PatchVerticesIn

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix cubemaps viewed as 2D | Marek Olšák | 2016-09-05 | 1 | -0/+4
    This fixes: GL43-CTS.texture_view.view_sampling

    v2: fix a typo, merge both if statements

    Cc: [email protected]
    Reviewed-by: Dave Airlie <[email protected]> (v1)
    Reviewed-by: Bas Nieuwenhuizen <[email protected]> (v1)
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: always use the same function signature for llvm.SI.export | Marek Olšák | 2016-09-05 | 1 | -4/+4
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: derive buffer placement and flags only at initialization | Marek Olšák | 2016-09-05 | 1 | -3/+2
    Invalidated buffers don't have to go through it.

    Split r600_init_resource into r600_init_resource_fields and
    r600_alloc_resource.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set more sampler settings | Marek Olšák | 2016-09-05 | 2 | -2/+12
    Reviewed-by: Nicolai Hähnle <[email protected]>
* Introduce .editorconfig | Eric Engestrom | 2016-08-31 | 1 | -0/+2
    A few weeks ago, Jose Fonseca suggested [0] we use .editorconfig files
    to try and enforce the formatting of the code, to which Michel Dänzer
    suggested [1] we start by importing the existing .dir-locals.el
    settings. The first draft was discussed in the RFC [2].

    These .editorconfig are a first step, one that has the advantage of
    requiring little to no intervention from the devs once the settings
    files are in place, but the settings are very limited. This does have
    the advantage of applying while the code is being written.

    This doesn't replace the need for more comprehensive formatting tools
    such as clang-format & clang-tidy, but those reformat the code after
    the fact.

    [0] https://lists.freedesktop.org/archives/mesa-dev/2016-June/121545.html
    [1] https://lists.freedesktop.org/archives/mesa-dev/2016-June/121639.html
    [2] https://lists.freedesktop.org/archives/mesa-dev/2016-July/123431.html

    Acked-by: Nicolai Hähnle <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Jose Fonseca <[email protected]>
* radeonsi: add support for cull distances. (v1.1) | Dave Airlie | 2016-08-30 | 2 | -4/+5
    This should be all that is required for cull distances to work on
    radeonsi.

    v1.1: whitespace cleanup, add docs fix clipdist_mask usage.

    Reviewed-by: Marek Olšák <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>