path: root/src/gallium/drivers/radeonsi
Commit message (Author, Age, Files, Lines)
* radeonsi: allocate DCC in the same backing buffer as the texture (Marek Olšák, 2016-03-09, 6 files, -27/+15)
  To allow sharing textures with DCC enabled.
  Reviewed-by: Michel Dänzer <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: disable CMASK on handle export if sharing doesn't allow it (v2) (Marek Olšák, 2016-03-09, 1 file, -1/+11)
  v2: remove the list of all contexts
  Reviewed-by: Michel Dänzer <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: eliminate fast color clear before sharing (Marek Olšák, 2016-03-09, 1 file, -1/+1)
  Reviewed-by: Michel Dänzer <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: Do colorformat endian swap for PIPE_USAGE_STAGING (Oded Gabbay, 2016-03-03, 1 file, -5/+1)
  There is an old if statement (dated to 2011) that prevented doing endian swap for colorformat, in case the buffer is marked as PIPE_USAGE_STAGING.

  This is now wrong because st_ReadPixels() reads into a destination texture that is marked with PIPE_USAGE_STAGING. Therefore, even if the texture is rendered correctly to the monitor, when reading it back we get unswapped/wrong values.

  This patch makes the check_rgba() function in gl-1.0-readpixsanity piglit test pass in big-endian.

  Signed-off-by: Oded Gabbay <[email protected]>
  Cc: "11.1 11.2" <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
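A minimal sketch in C of why an unswapped readback yields wrong values on a big-endian host. This is not the driver's code (the driver selects an endian swap for the color format rather than swapping in software), and the helper names `swap32`, `load_le32` and `load_be32` are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: reverse the byte order of a 32-bit word. */
uint32_t swap32(uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Hypothetical helper: the value a little-endian CPU sees when it loads
 * four bytes from memory as one 32-bit word. */
uint32_t load_le32(const uint8_t b[4])
{
    return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Hypothetical helper: the value a big-endian CPU sees for the same bytes. */
uint32_t load_be32(const uint8_t b[4])
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}
```

The same four bytes in memory produce different 32-bit pixel values on the two kinds of host; applying `swap32` to one gives the other, which is the correction a readback path needs when the data was written without the swap.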
* radeonsi: also dump shaders on a VM fault (Marek Olšák, 2016-03-01, 1 file, -2/+1)
  Reviewed-by: Christian König <[email protected]>
* radeonsi: dump full shader disassemblies into ddebug logs (Marek Olšák, 2016-03-01, 1 file, -9/+9)
  including prolog and epilog disassemblies
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: allow dumping shader disassemblies to a file (Marek Olšák, 2016-03-01, 3 files, -22/+29)
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: use re-Z (Marek Olšák, 2016-03-01, 3 files, -6/+21)
  This can increase perf for shaders that kill pixels (kill, alpha-test, alpha-to-coverage).
  v2: add comments
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: implement binary shaders & shader cache in memory (v2) (Marek Olšák, 2016-02-21, 5 files, -7/+259)
  v2: handle _mesa_hash_table_insert failure
      other cosmetic changes
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move some struct si_shader members to new struct si_shader_info (Marek Olšák, 2016-02-21, 3 files, -68/+71)
  This will be part of shader binaries.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use smaller types for some si_shader members (Marek Olšák, 2016-02-21, 2 files, -3/+8)
  in order to decrease the shader size for a shader cache.
  v2: add & use SI_MAX_VS_OUTPUTS
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: enable compiling one variant per shader (Marek Olšák, 2016-02-21, 1 file, -1/+3)
  Shader stats from VERDE:

  Default scheduler:

  Totals:
  SGPRS: 491272 -> 488672 (-0.53 %)
  VGPRS: 289980 -> 311093 (7.28 %)
  Code Size: 11091656 -> 11219948 (1.16 %) bytes
  LDS: 97 -> 97 (0.00 %) blocks
  Scratch: 1732608 -> 2246656 (29.67 %) bytes per wave
  Max Waves: 78063 -> 77352 (-0.91 %)
  Wait states: 0 -> 0 (0.00 %)

  Looking at some of the worst regressions, I get:
  - The VGPR increase seems to be caused by the fact that if PS has used less than 16 VGPRs, now it will always use 16 VGPRs and sometimes even 20. However, the wave count remains at 10 if VGPRs <= 24, so no harm there.
  - The scratch increase seems to be caused by SGPR spilling. The unnecessary SGPR spilling has been an ongoing issue with the compiler and it's completely fixable by rematerializing s_loads or reordering instructions.

  SI scheduler:

  Totals:
  SGPRS: 374848 -> 374576 (-0.07 %)
  VGPRS: 284456 -> 307515 (8.11 %)
  Code Size: 11433068 -> 11535452 (0.90 %) bytes
  LDS: 97 -> 97 (0.00 %) blocks
  Scratch: 509952 -> 522240 (2.41 %) bytes per wave
  Max Waves: 79456 -> 78217 (-1.56 %)
  Wait states: 0 -> 0 (0.00 %)

  VGPRs - same story as before. The SI scheduler doesn't spill SGPRs so much and generally spills way less than the default scheduler. (522240 spills vs 2246656 spills)

  Reviewed-by: Nicolai Hähnle <[email protected]>
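The percentages in the stats above are consistent with a plain relative change, (after - before) / before. A minimal sketch in C, assuming that formula; the function name `percent_change` is hypothetical and this is not the script that produced the report.

```c
#include <stdint.h>

/* Relative change from 'before' to 'after', in percent.
 * Assumption: a 0 -> 0 row is reported as no change, matching the
 * "Wait states: 0 -> 0 (0.00 %)" rows in the stats. */
double percent_change(uint64_t before, uint64_t after)
{
    if (before == 0)
        return 0.0;
    return ((double)after - (double)before) / (double)before * 100.0;
}
```

For example, the default-scheduler scratch row works out to (2246656 - 1732608) / 1732608, which rounds to the 29.67 % quoted in the message.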
* radeonsi: print full shader name before disassembly (Marek Olšák, 2016-02-21, 1 file, -1/+33)
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: compile non-GS middle parts of shaders immediately if enabled (Marek Olšák, 2016-02-21, 3 files, -18/+87)
  Still disabled.

  Only prologs & epilogs are compiled in draw calls, but each variant of those is compiled only once per process.

  VS is always compiled as hw VS.
  TES is always compiled as hw VS.
  LS and ES stages are always compiled on demand.

  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: rework polygon stippling for PS prolog (Marek Olšák, 2016-02-21, 1 file, -39/+110)
  Don't use the pstipple module.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add PS prolog (Marek Olšák, 2016-02-21, 5 files, -2/+345)
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add PS epilog (Marek Olšák, 2016-02-21, 4 files, -2/+297)
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add TCS epilog (Marek Olšák, 2016-02-21, 4 files, -13/+155)
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add VS epilog (Marek Olšák, 2016-02-21, 4 files, -11/+171)
  It only exports the primitive ID. Also used by TES when it's compiled as VS.
  The VS input location of the primitive ID input is v2.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add VS prolog (Marek Olšák, 2016-02-21, 4 files, -1/+267)
  This is disabled with use_monolithic_shaders = true.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: first bits for non-monolithic shaders (Marek Olšák, 2016-02-21, 4 files, -14/+45)
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add code for dumping all shader parts together (v2) (Marek Olšák, 2016-02-21, 1 file, -12/+34)
  v2: unify some code into si_get_shader_binary_size
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add code for combining and uploading shaders from 3 shader parts (Marek Olšák, 2016-02-21, 2 files, -8/+36)
  Reviewed-by: Nicolai Hähnle <[email protected]>
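A minimal sketch in C of combining three shader parts into one binary, assuming the parts are simply laid out back to back (prolog, main part, epilog) before upload. The type `shader_part` and both functions are hypothetical stand-ins, not the driver's structures.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one compiled shader part.
 * A size of 0 means the part is absent. */
struct shader_part {
    const uint8_t *code;
    size_t size;
};

/* Total size in bytes of the combined binary. */
size_t combined_size(const struct shader_part *parts, size_t count)
{
    size_t total = 0;
    for (size_t i = 0; i < count; i++)
        total += parts[i].size;
    return total;
}

/* Copy the parts back to back into dst, in array order.
 * Returns the number of bytes written, or 0 if dst is too small. */
size_t combine_parts(uint8_t *dst, size_t dst_size,
                     const struct shader_part *parts, size_t count)
{
    size_t offset = 0;

    if (combined_size(parts, count) > dst_size)
        return 0;

    for (size_t i = 0; i < count; i++) {
        if (parts[i].size)
            memcpy(dst + offset, parts[i].code, parts[i].size);
        offset += parts[i].size;
    }
    return offset;
}
```

Computing the total size first lets the caller allocate the upload buffer once, which is the role a helper such as si_get_shader_binary_size plausibly plays in the driver.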
* radeonsi: fail compilation if non-GS non-CS shaders have rodata (Marek Olšák, 2016-02-21, 1 file, -0/+13)
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: separate 2 pieces of code from create_function (Marek Olšák, 2016-02-21, 1 file, -31/+51)
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add samplemask parameter to si_export_mrt_color (Marek Olšák, 2016-02-21, 1 file, -3/+7)
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add start_instance parameter to get_instance_index_for_fetch (Marek Olšák, 2016-02-21, 1 file, -4/+6)
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: separate out shader key bits for prologs & epilogs (Marek Olšák, 2016-02-21, 4 files, -100/+140)
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: compute how many input VGPRs fragment shaders have (Marek Olšák, 2016-02-21, 2 files, -0/+43)
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: compute how many input SGPRs and VGPRs shaders have (Marek Olšák, 2016-02-21, 2 files, -0/+34)
  Prologs (shader binaries inserted before the API shader binary) need to know this, so that they won't change the input registers unintentionally.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: add basic code for setting shader return values (Marek Olšák, 2016-02-21, 1 file, -3/+6)
  LLVMBuildInsertValue will be used on return_value.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeon/llvm: Set the target triple on the module (Tom Stellard, 2016-02-17, 1 file, -1/+1)
  Tested-by: Michel Dänzer <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* gallium: add PIPE_SHADER_CAP_MAX_SHADER_IMAGES (Ilia Mirkin, 2016-02-15, 1 file, -0/+1)
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* gallium: add PIPE_SHADER_CAP_SUPPORTED_IRS (Samuel Pitoiset, 2016-02-13, 1 file, -0/+6)
  This cap indicates the supported representations of programs. It should be a mask of pipe_shader_ir bits. It will allow to enable ARB_compute_shader if the underlying driver supports TGSI.

  Changes from v2:
  - improve description of PIPE_SHADER_CAP_SUPPORTED_IRS

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
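A minimal sketch in C of how a "mask of IR bits" cap can be queried, assuming each supported representation contributes the bit `1 << ir`. The enum `sketch_shader_ir`, its values, and the helper names are hypothetical stand-ins for illustration, not the Gallium definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for pipe_shader_ir. The enumerator names and values are
 * assumptions made for this sketch. */
enum sketch_shader_ir {
    SKETCH_IR_TGSI = 0,
    SKETCH_IR_NATIVE = 1,
};

/* Bit that a driver would set in the cap value for one representation. */
uint32_t ir_bit(enum sketch_shader_ir ir)
{
    return 1u << ir;
}

/* True if the cap value advertises the given representation. */
bool supports_ir(uint32_t supported_irs, enum sketch_shader_ir ir)
{
    return (supported_irs & ir_bit(ir)) != 0;
}

/* The decision described in the commit message: compute shaders can be
 * exposed only when TGSI is among the supported representations. */
bool can_enable_compute(uint32_t supported_irs)
{
    return supports_ir(supported_irs, SKETCH_IR_TGSI);
}
```

A bitmask lets one cap describe any combination of representations, so a state tracker can test for the one it needs without a separate cap per IR.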
* gallium: add a new interface for pipe_context::launch_grid() (Samuel Pitoiset, 2016-02-13, 1 file, -16/+15)
  This introduces pipe_grid_info which contains all information to describe a launch_grid call. This will be used to implement indirect compute in the same fashion as indirect draw.

  Changes from v2:
  - correctly initialize pipe_grid_info for nv50/nvc0

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
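A minimal sketch in C of why bundling launch parameters into one structure makes indirect compute easy to add: the grid size can come either from the structure itself or from a buffer named by it. The struct `sketch_grid_info` and its fields are assumptions made for illustration, not the real pipe_grid_info layout, and a CPU pointer stands in for what would be a GPU buffer.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a launch description. */
struct sketch_grid_info {
    uint32_t block[3];        /* threads per block in x, y, z */
    uint32_t grid[3];         /* blocks per grid; ignored when indirect */
    const uint32_t *indirect; /* if non-NULL, three grid sizes are read from here */
};

/* True if the grid size is supplied indirectly. */
bool grid_is_indirect(const struct sketch_grid_info *info)
{
    return info->indirect != NULL;
}

/* Total number of threads the launch would start. */
uint64_t total_threads(const struct sketch_grid_info *info)
{
    const uint32_t *grid = info->indirect ? info->indirect : info->grid;
    uint64_t n = 1;

    for (int i = 0; i < 3; i++)
        n *= (uint64_t)info->block[i] * grid[i];
    return n;
}
```

With a single structure argument, adding the indirect source does not change the signature of the launch entry point, which mirrors how indirect draws extend a draw description.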
* radeonsi: fix build with LLVM 3.6 (Marek Olšák, 2016-02-12, 1 file, -1/+1)
  Broken by this cleanup: 3dc1cb0cc7605a2f3128311f5a6052f740fc7b0d
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: drop support for LLVM 3.5 (Marek Olšák, 2016-02-11, 3 files, -78/+7)
  Reviewed-by: Nicolai Hähnle <[email protected]>
  v2: adjust the comment in the amdgpu winsys
* radeonsi: obtain commonly used LLVM types only once (Marek Olšák, 2016-02-11, 1 file, -215/+194)
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: cleanup shader codegen (Marek Olšák, 2016-02-11, 1 file, -425/+425)
  si_shader_ctx -> ctx
  type * ptr -> type *ptr
  si_shader_context *shader -> si_shader_context *ctx
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix a crash when binding a sampler buffer (Marek Olšák, 2016-02-11, 1 file, -1/+2)
  Buffers don't contain r600_texture.
  Broken by 7aedbbacae6d3ec3d06735fff2eb66: "radeonsi: put image, fmask, and sampler descriptors into one array"
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94091
* radeonsi: don't emit unnecessary NULL exports for unbound targets (v3) (Marek Olšák, 2016-02-10, 1 file, -26/+68)
  v2: remove semantic index == 0 checks
      add the else statement to remove shadowing of args
  v3: fix fbo-alphatest-nocolor regression
  Reviewed-by: Nicolai Hähnle <[email protected]> (v2)
* radeonsi: put image, fmask, and sampler descriptors into one array (Marek Olšák, 2016-02-10, 6 files, -116/+138)
  The texture slot is expanded to 16 dwords containing 2 descriptors. Those can be:
  - Image and fmask, or
  - Image and sampler state

  By carefully choosing the locations, we can put all three into one slot, with the fmask and sampler state being mutually exclusive.

  This improves shaders in 2 ways:
  - 2 user SGPRs are unused, shaders can use them as temporary registers now
  - each pair of descriptors is always on the same cache line

  v2: cosmetic changes: add back v8i32, don't load a sampler state & fmask at the same time

  Reviewed-by: Nicolai Hähnle <[email protected]>
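A minimal sketch in C of a 16-dword slot holding an image descriptor plus either an fmask or a sampler state. The sizes (8-dword image, 8-dword fmask, 4-dword sampler state) and the offsets 8 and 12 are assumptions chosen for illustration; the commit message only states the 16-dword slot size and the mutual exclusion. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    SLOT_DWORDS = 16,
    IMAGE_DWORDS = 8,    /* assumed size */
    FMASK_DWORDS = 8,    /* assumed size */
    SAMPLER_DWORDS = 4,  /* assumed size */
    FMASK_OFFSET = 8,    /* assumed location within the slot */
    SAMPLER_OFFSET = 12  /* assumed location; overlaps the fmask range */
};

/* One texture slot: 16 dwords = 64 bytes. */
struct tex_slot {
    uint32_t dw[SLOT_DWORDS];
};

/* The image descriptor always occupies the start of the slot. */
void slot_set_image(struct tex_slot *s, const uint32_t image[IMAGE_DWORDS])
{
    memcpy(&s->dw[0], image, IMAGE_DWORDS * sizeof(uint32_t));
}

/* The fmask descriptor fills the second half, replacing any sampler state. */
void slot_set_fmask(struct tex_slot *s, const uint32_t fmask[FMASK_DWORDS])
{
    memcpy(&s->dw[FMASK_OFFSET], fmask, FMASK_DWORDS * sizeof(uint32_t));
}

/* The sampler state lives inside the range the fmask would use. */
void slot_set_sampler(struct tex_slot *s, const uint32_t state[SAMPLER_DWORDS])
{
    memcpy(&s->dw[SAMPLER_OFFSET], state, SAMPLER_DWORDS * sizeof(uint32_t));
}

/* Byte offset of a slot within one contiguous descriptor array. */
size_t slot_byte_offset(unsigned index)
{
    return (size_t)index * sizeof(struct tex_slot);
}
```

Because every slot is 64 bytes, a shader needs only one base pointer and a slot index to find both descriptors of a pair, and on hardware with 64-byte cache lines and a suitably aligned array the pair shares a line.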
* radeonsi: enable denorms for 64-bit and 16-bit floats (Marek Olšák, 2016-02-09, 3 files, -6/+29)
  This fixes FP16 conversion instructions for VI, which has 16-bit floats, but not SI & CI, which can't disable denorms for those instructions.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: compile geometry shaders immediately (Marek Olšák, 2016-02-09, 1 file, -1/+2)
  they have only 1 variant
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: split out code for deleting si_shader (Marek Olšák, 2016-02-09, 1 file, -29/+36)
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move code writing tess factors into a separate function (Marek Olšák, 2016-02-09, 1 file, -9/+21)
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: make LLVM IR dumping less messy (Marek Olšák, 2016-02-09, 3 files, -9/+15)
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move a few r600_can_dump_shader calls to where they're needed (Marek Olšák, 2016-02-09, 1 file, -5/+5)
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove useless code that handles dx10_clamp_mode (Marek Olšák, 2016-02-09, 3 files, -14/+6)
  "enable-no-nans-fp-math" is a wrong string and there was a disagreement about fixing it.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: dump SPI_PS_INPUT values along with shader stats (Marek Olšák, 2016-02-09, 1 file, -0/+7)
  Reviewed-by: Nicolai Hähnle <[email protected]>