path: root/src/gallium/drivers/radeonsi
Commit message | Author | Age | Files | Lines
* radeonsi: add support for Polaris (v2) | Sonny Jiang | 2016-03-24 | 2 | -0/+10
  v2: Polaris chips should be defined after Stoney
  Signed-off-by: Sonny Jiang <[email protected]> (v1)
  Reviewed-by: Michel Dänzer <[email protected]> (v1)
  Signed-off-by: Leo Liu <[email protected]> (v2 diff)
  Reviewed-by: Alex Deucher <[email protected]> (v2 diff)
* radeonsi: silence a coverity warning | Nicolai Hähnle | 2016-03-24 | 1 | -1/+1
  The following Coverity warning

      5378         tmpl.fetch_args = atomic_fetch_args;
      5379         tmpl.emit = atomic_emit;
      >>> CID 1357115: Uninitialized variables (UNINIT)
      >>> Using uninitialized value "tmpl". Field "tmpl.intr_name" is uninitialized.
      5380         bld_base->op_actions[TGSI_OPCODE_ATOMUADD] = tmpl;
      5381         bld_base->op_actions[TGSI_OPCODE_ATOMUADD].intr_name = "add";

  ... is a false positive, but what the hell. This change should "fix" it.
  Reviewed-by: Marek Olšák <[email protected]>
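The warning above is about copying a struct while one of its fields is still unset. The following is a minimal sketch of one common way to avoid that pattern, not the actual one-line change made in this commit (which this log does not show). `struct action_tmpl`, `dummy_fetch_args`, `dummy_emit`, and `init_actions` are hypothetical stand-ins invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the action template named in the warning. */
struct action_tmpl {
    void (*fetch_args)(void);
    void (*emit)(void);
    const char *intr_name;
};

static void dummy_fetch_args(void) {}
static void dummy_emit(void) {}

/* Fill n table entries from one template, then set each entry's name. */
static void init_actions(struct action_tmpl *table,
                         const char *const *names, size_t n)
{
    /* Zero-initialize so every field, including intr_name, has a defined
     * value before the struct is copied. */
    struct action_tmpl tmpl = {0};

    assert(table != NULL && names != NULL);
    tmpl.fetch_args = dummy_fetch_args;
    tmpl.emit = dummy_emit;

    for (size_t i = 0; i < n; i++) {
        table[i] = tmpl;               /* whole-struct copy is now well defined */
        table[i].intr_name = names[i]; /* per-entry override, as in the warning */
    }
}
```

Because the name is overwritten immediately after each copy, the original code was functionally fine; the initializer only removes the read of an indeterminate field that the analyzer flags.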
* radeonsi: fix out-of-bounds indexing of shader images | Nicolai Hähnle | 2016-03-23 | 1 | -1/+43
  Results are undefined but may not crash. Without this change, out-of-bounds indexing can lead to VM faults and GPU hangs.
  Constant buffers, samplers, and possibly others will eventually need similar treatment to support GL_ARB_robust_buffer_access_behavior.
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-and-Tested-by: Michel Dänzer <[email protected]>
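"Undefined but may not crash" can be met by forcing any index into the valid range before it is used. The sketch below shows that idea on the CPU side with a plain clamp. It is an assumption about the general technique, not a copy of the driver code, which emits the equivalent logic as LLVM IR. `bound_index` and `num_slots` are hypothetical names.

```c
#include <assert.h>

/* Clamp an untrusted index into [0, num_slots - 1].
 * An out-of-range index selects the last slot instead of reading past
 * the end of the descriptor array. */
static unsigned bound_index(unsigned index, unsigned num_slots)
{
    assert(num_slots > 0);
    return index < num_slots ? index : num_slots - 1;
}
```

With 16 image slots, `bound_index(3, 16)` leaves the index unchanged and `bound_index(100, 16)` selects slot 15. The shader then reads the wrong image, which is an allowed undefined result, but never an address outside the array.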
* radeonsi: cache flush/invalidation for missing PIPE_BARRIER_*_BUFFER bits (v2) | Nicolai Hähnle | 2016-03-23 | 1 | -2/+12
  This fixes arb_shader_image_load_store-host-mem-barrier.
  v2: flush TC L2 for index buffers on <= CIK (Marek)
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix 2D array MSAA failures since image support landed | Marek Olšák | 2016-03-23 | 1 | -1/+2
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Reviewed-and-Tested-by: Michel Dänzer <[email protected]>
* radeonsi: Set PIPE_SHADER_CAP_MAX_SHADER_IMAGES | Edward O'Callaghan | 2016-03-21 | 1 | -1/+2
  This enables ARB_shader_image_load_store and ARB_shader_image_size.
  Signed-off-by: Edward O'Callaghan <[email protected]>
  [allow the same number of images for all shader stages and require LLVM 3.9]
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: disable early Z if the fragment shader writes to memory | Nicolai Hähnle | 2016-03-21 | 1 | -2/+12
  Empirically, both the EXEC_ON_* flags and LATE_Z are necessary.
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: force the DCC enable bit off in image descriptors for writing (v2) | Nicolai Hähnle | 2016-03-21 | 1 | -8/+49
  This avoids a lockup at least on Tonga.
  v2: only force DCC off on VI+ (Marek)
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: implement MemoryBarrier (v2) | Nicolai Hähnle | 2016-03-21 | 1 | -0/+37
  v2: invalidate both constant and VMEM/TC L1 for constant buffers (Marek)
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: implement volatile memory access | Nicolai Hähnle | 2016-03-21 | 1 | -0/+4
  Prevent loads from being re-ordered or coalesced. Atomics don't need special handling by definition, and stores don't need special handling because LLVM is unable to detect dead image or buffer stores.
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: implement coherent memory access (v2) | Nicolai Hähnle | 2016-03-21 | 1 | -4/+13
  v2: set glc=1 for volatile also on buffers
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Lower TGSI_OPCODE_MEMBAR down to LLVM op | Nicolai Hähnle | 2016-03-21 | 1 | -0/+31
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Lower TGSI_OPCODE_ATOM* down to LLVM op | Nicolai Hähnle | 2016-03-21 | 1 | -8/+113
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Lower TGSI_OPCODE_STORE down to LLVM op | Nicolai Hähnle | 2016-03-21 | 1 | -3/+80
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Lower TGSI_OPCODE_LOAD down to LLVM op (v3) | Nicolai Hähnle | 2016-03-21 | 1 | -0/+139
  v2: new signature style for buffer intrinsics (offsets)
  v3: new signature style for llvm.amdgcn.buffer.load.format (overloaded return)
  Reviewed-by: Marek Olšák <[email protected]> (v2)
* radeonsi: extract the LLVM type name construction into its own function | Nicolai Hähnle | 2016-03-21 | 1 | -7/+19
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Lower TGSI_OPCODE_RESQ down to LLVM op | Nicolai Hähnle | 2016-03-21 | 1 | -0/+129
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: extract TXQ buffer size computation into its own function | Nicolai Hähnle | 2016-03-21 | 1 | -20/+35
  This will allow it to be reused for RESQ.
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: decompress shader images | Nicolai Hähnle | 2016-03-21 | 1 | -3/+33
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: update shader image descriptor for invalidated buffer | Nicolai Hähnle | 2016-03-21 | 1 | -1/+21
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: implement set_shader_images (v2) | Nicolai Hähnle | 2016-03-21 | 6 | -29/+254
  Whether DCC is disabled depends on the access flags with which the image is bound: image_load supports DCC, but store and atomic don't.
  v2: remove an unnecessary masking of images->desc.enabled_mask
  Reviewed-by: Marek Olšák <[email protected]>
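The rule stated in this commit message (loads may keep DCC, stores and atomics may not) reduces to a check on the write bit of the bind-time access flags. The sketch below illustrates only that decision. The flag names `IMG_ACCESS_READ` and `IMG_ACCESS_WRITE` and the helper `image_can_keep_dcc` are hypothetical, and treating atomics as write access is an assumption made for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical access flags for an image binding. */
enum {
    IMG_ACCESS_READ  = 1 << 0, /* image_load */
    IMG_ACCESS_WRITE = 1 << 1  /* image_store and atomics (assumed) */
};

/* Return true if an image bound with these flags can leave DCC enabled. */
static bool image_can_keep_dcc(unsigned access)
{
    /* Reject flag bits this sketch does not know about. */
    assert((access & ~(unsigned)(IMG_ACCESS_READ | IMG_ACCESS_WRITE)) == 0);
    return (access & IMG_ACCESS_WRITE) == 0;
}
```

A read-only binding returns true. Any binding that includes the write bit returns false, so the caller would have to disable or decompress DCC before the shader runs.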
* gallium/radeon: remove old CS tracing | Marek Olšák | 2016-03-20 | 2 | -5/+3
  Cons:
  - it was only integrated in r600g
  - it doesn't work with GPUVM
  - it records buffer contents at the end of IBs instead of at the beginning, so the replay isn't exact
  - it lacks an IB parser and user-friendliness
  A better solution is apitrace in combination with gallium/ddebug, which has a complete IB parser and can pinpoint hanging CP packets.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: process TGSI property NEXT_SHADER | Marek Olšák | 2016-03-19 | 2 | -3/+33
  This allows compiling the main shader part as ES or LS. If we get the correct hint, non-separable GLSL shaders no longer have to be compiled as VS first, followed by LS or ES compiled on demand.
  The result is that fewer shaders are compiled by piglit, but it doesn't improve piglit running time.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set DEPTH_BEFORE_SHADER based on FS_EARLY_DEPTH_STENCIL | Nicolai Hähnle | 2016-03-14 | 1 | -0/+3
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: avoid crash when a sampler state is bound for a buffer texture | Nicolai Hähnle | 2016-03-13 | 1 | -0/+1
  Sampler states don't really make sense with buffer textures, but they can be set anyway, so we need to be defensive here.
  This bug was lurking for a while and was finally noticed due to PBO uploads setting sampler states.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94284
  Cc: [email protected]
  Reviewed-by: Marek Olšák <[email protected]>
  Tested-by: Laurent Carlier <[email protected]>
  Tested-by: Shawn Starr <[email protected]>
* radeonsi: Lazily re-set sampler views after disabling DCC | Bas Nieuwenhuizen | 2016-03-11 | 1 | -3/+8
  Clear DCC flags if necessary when binding a new sampler view.
  v2: Do not reset DCC flags of bound sampler views.
  v3: Check that we have a real texture (Nicolai)
  Signed-off-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: update compressed_colortex_masks when a cmask is created or disabled | Nicolai Hähnle | 2016-03-10 | 3 | -2/+51
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move si_decompress_textures to si_blit.c | Nicolai Hähnle | 2016-03-10 | 3 | -23/+23
  Since it is all about calling into blitter functions, it makes more sense here. This change also reduces the size of the interfaces between .c files.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* gallium: add CAPs returning PCI device location | Marek Olšák | 2016-03-09 | 1 | -0/+8
  Reviewed-by: Brian Paul <[email protected]>
* radeonsi: set amdgpu metadata before exporting a texture | Marek Olšák | 2016-03-09 | 3 | -1/+67
  Reviewed-by: Michel Dänzer <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: extract the texture descriptor computation into its own function | Nicolai Hähnle | 2016-03-09 | 1 | -164/+186
  This will allow this code to be re-used for shader images.
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: extract the buffer descriptor computation into its own function | Nicolai Hähnle | 2016-03-09 | 1 | -25/+48
  This will allow it to be re-used for shader image descriptors.
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: remove resource field from si_sampler_view | Nicolai Hähnle | 2016-03-09 | 3 | -4/+2
  view->resource is redundant with view->base.texture, so get rid of it.
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: accept pipe_resource in si_sampler_view_add_buffer | Marek Olšák | 2016-03-09 | 1 | -11/+12
  and rename .._buffers -> .._buffer
  Based loosely on Nicolai's patch. This will make it easier to cherry-pick Nicolai's patches from his image support branch.
  Reviewed-by: Michel Dänzer <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: disable DCC on handle export if expecting write access | Marek Olšák | 2016-03-09 | 1 | -0/+12
  This should be okay except that sampler views and images are not re-set.
  Reviewed-by: Michel Dänzer <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add DCC decompression (v2) | Bas Nieuwenhuizen | 2016-03-09 | 4 | -9/+23
  This is currently not needed but will be necessary when we have features that do not work with DCC enabled, such as image stores and sharing non-scanout surfaces.
  v2: Marek: rebase, remove decompression from si_flush_resource (not needed)
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: allocate DCC in the same backing buffer as the texture | Marek Olšák | 2016-03-09 | 6 | -27/+15
  To allow sharing textures with DCC enabled.
  Reviewed-by: Michel Dänzer <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: disable CMASK on handle export if sharing doesn't allow it (v2) | Marek Olšák | 2016-03-09 | 1 | -1/+11
  v2: remove the list of all contexts
  Reviewed-by: Michel Dänzer <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: eliminate fast color clear before sharing | Marek Olšák | 2016-03-09 | 1 | -1/+1
  Reviewed-by: Michel Dänzer <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: Do colorformat endian swap for PIPE_USAGE_STAGING | Oded Gabbay | 2016-03-03 | 1 | -5/+1
  There is an old if statement (dated to 2011) that prevented doing endian swap for colorformat, in case the buffer is marked as PIPE_USAGE_STAGING.
  This is now wrong because st_ReadPixels() reads into a destination texture that is marked with PIPE_USAGE_STAGING. Therefore, even if the texture is rendered correctly to the monitor, when reading it back we get unswapped/wrong values.
  This patch makes the check_rgba() function in gl-1.0-readpixsanity piglit test pass in big-endian.
  Signed-off-by: Oded Gabbay <[email protected]>
  Cc: "11.1 11.2" <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: also dump shaders on a VM fault | Marek Olšák | 2016-03-01 | 1 | -2/+1
  Reviewed-by: Christian König <[email protected]>
* radeonsi: dump full shader disassemblies into ddebug logs | Marek Olšák | 2016-03-01 | 1 | -9/+9
  including prolog and epilog disassemblies
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: allow dumping shader disassemblies to a file | Marek Olšák | 2016-03-01 | 3 | -22/+29
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: use re-Z | Marek Olšák | 2016-03-01 | 3 | -6/+21
  This can increase perf for shaders that kill pixels (kill, alpha-test, alpha-to-coverage).
  v2: add comments
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: implement binary shaders & shader cache in memory (v2) | Marek Olšák | 2016-02-21 | 5 | -7/+259
  v2: handle _mesa_hash_table_insert failure
      other cosmetic changes
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move some struct si_shader members to new struct si_shader_info | Marek Olšák | 2016-02-21 | 3 | -68/+71
  This will be part of shader binaries.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use smaller types for some si_shader members | Marek Olšák | 2016-02-21 | 2 | -3/+8
  in order to decrease the shader size for a shader cache.
  v2: add & use SI_MAX_VS_OUTPUTS
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: enable compiling one variant per shader | Marek Olšák | 2016-02-21 | 1 | -1/+3
  Shader stats from VERDE:

  Default scheduler:
    Totals:
    SGPRS: 491272 -> 488672 (-0.53 %)
    VGPRS: 289980 -> 311093 (7.28 %)
    Code Size: 11091656 -> 11219948 (1.16 %) bytes
    LDS: 97 -> 97 (0.00 %) blocks
    Scratch: 1732608 -> 2246656 (29.67 %) bytes per wave
    Max Waves: 78063 -> 77352 (-0.91 %)
    Wait states: 0 -> 0 (0.00 %)

  Looking at some of the worst regressions, I get:
  - The VGPR increase seems to be caused by the fact that if PS has used less than 16 VGPRs, now it will always use 16 VGPRs and sometimes even 20. However, the wave count remains at 10 if VGPRs <= 24, so no harm there.
  - The scratch increase seems to be caused by SGPR spilling. The unnecessary SGPR spilling has been an ongoing issue with the compiler and it's completely fixable by rematerializing s_loads or reordering instructions.

  SI scheduler:
    Totals:
    SGPRS: 374848 -> 374576 (-0.07 %)
    VGPRS: 284456 -> 307515 (8.11 %)
    Code Size: 11433068 -> 11535452 (0.90 %) bytes
    LDS: 97 -> 97 (0.00 %) blocks
    Scratch: 509952 -> 522240 (2.41 %) bytes per wave
    Max Waves: 79456 -> 78217 (-1.56 %)
    Wait states: 0 -> 0 (0.00 %)

  VGPRs - same story as before. The SI scheduler doesn't spill SGPRs so much and generally spills way less than the default scheduler. (522240 spills vs 2246656 spills)
  Reviewed-by: Nicolai Hähnle <[email protected]>
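The percentages in the stats above are relative changes from the "before" total to the "after" total. A minimal sketch of that arithmetic follows, assuming the usual (after - before) / before formula. `percent_change` is a hypothetical helper and is not part of the tool that produced these numbers.

```c
#include <assert.h>

/* Relative change from `before` to `after`, in percent. */
static double percent_change(double before, double after)
{
    assert(before != 0.0); /* a zero baseline has no relative change */
    return (after - before) / before * 100.0;
}
```

Applied to the default-scheduler VGPR totals, `percent_change(289980, 311093)` gives about 7.28, and the scratch totals `percent_change(1732608, 2246656)` give about 29.67, matching the figures quoted in the commit message.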
* radeonsi: print full shader name before disassembly | Marek Olšák | 2016-02-21 | 1 | -1/+33
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: compile non-GS middle parts of shaders immediately if enabled | Marek Olšák | 2016-02-21 | 3 | -18/+87
  Still disabled.
  Only prologs & epilogs are compiled in draw calls, but each variant of those is compiled only once per process.
  VS is always compiled as hw VS.
  TES is always compiled as hw VS.
  LS and ES stages are always compiled on demand.
  Reviewed-by: Nicolai Hähnle <[email protected]>