aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/radeonsi/si_pipe.h
Commit message (Collapse)AuthorAgeFilesLines
* radeonsi: implement set_shader_buffersNicolai Hähnle2016-04-121-0/+1
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: move scissor and viewport states into gallium/radeonMarek Olšák2016-04-121-23/+0
| | | | | | Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Grigori Goronzy <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: compute scissor from viewport in set_viewport_statesMarek Olšák2016-04-121-0/+8
| | | | | | | | | and clamp it right before emitting. This is a prerequisite for computing the guard band. Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Grigori Goronzy <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: move pipeline stat context flags to common codeMarek Olšák2016-04-121-3/+0
| | | | | Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: implement and rely on set_active_query_stateMarek Olšák2016-04-121-0/+4
| | | | | Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: expand the compressed color and depth texture masks to 64 bitsNicolai Hähnle2016-04-071-2/+2
| | | | | | | | | | | This is in preparation of raising the number of exposed sampler views to 32 bits, which will raise the total number of sampler views to 33 for the polygon stipple texture. That texture should never be compressed (and it's certainly not a depth texture), but this approach seems cleaner to me than special-casing the last slot in all affected code paths. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: implement set_shader_images (v2)Nicolai Hähnle2016-03-211-0/+7
| | | | | | | | | Whether DCC is disabled depends on the access flags with which the image is bound: image_load supports DCC, but store and atomic don't. v2: remove an unnecessary masking of images->desc.enabled_mask Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move si_decompress_textures to si_blit.cNicolai Hähnle2016-03-101-4/+1
| | | | | | | | | Since it is all about calling into blitter functions, it makes more sense here. This change also reduces the size of the interfaces between .c files. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: remove resource field from si_sampler_viewNicolai Hähnle2016-03-091-1/+0
| | | | | | | view->resource is redundant with view->base.texture, so get rid of it. Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add DCC decompression (v2)Bas Nieuwenhuizen2016-03-091-0/+1
| | | | | | | | | | | | This is currently not needed but will be necessary when we have features that do not work with DCC enabled, such as image stores and sharing non-scanout surfaces. v2: Marek: rebase, remove decompression from si_flush_resource (not needed) Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: allocate DCC in the same backing buffer as the textureMarek Olšák2016-03-091-1/+0
| | | | | | | To allow sharing textures with DCC enabled. Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: implement binary shaders & shader cache in memory (v2)Marek Olšák2016-02-211-0/+16
| | | | | | | v2: handle _mesa_hash_table_insert failure other cosmetic changes Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add PS prologMarek Olšák2016-02-211-0/+1
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add PS epilogMarek Olšák2016-02-211-0/+1
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add TCS epilogMarek Olšák2016-02-211-0/+1
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add VS epilogMarek Olšák2016-02-211-0/+1
| | | | | | | | | It only exports the primitive ID. Also used by TES when it's compiled as VS. The VS input location of the primitive ID input is v2. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add VS prologMarek Olšák2016-02-211-0/+3
| | | | | | This is disabled with use_monolithic_shaders = true. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: first bits for non-monolithic shadersMarek Olšák2016-02-211-0/+3
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: put image, fmask, and sampler descriptors into one arrayMarek Olšák2016-02-101-1/+0
| | | | | | | | | | | | | | | | | | | The texture slot is expanded to 16 dwords containing 2 descriptors. Those can be: - Image and fmask, or - Image and sampler state By carefully choosing the locations, we can put all three into one slot, with the fmask and sampler state being mutually exclusive. This improves shaders in 2 ways: - 2 user SGPRs are unused, shaders can use them as temporary registers now - each pair of descriptors is always on the same cache line v2: cosmetic changes: add back v8i32, don't load a sampler state & fmask at the same time Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: implement forcing per-sample_interpolation using the shader key onlyMarek Olšák2016-02-091-2/+0
| | | | | | | | | | | It was partly a state and partly emulated by shader code, but since we want to do this in a fragment shader prolog, we need to put it into the shader key, which will be used to generate the prolog. This also removes the spi_ps_input states and moves the registers to the PS state. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: rename cb_target_mask state to cb_render_stateMarek Olšák2016-02-021-1/+1
| | | | | | | | and rename a variable in the function. SX_PS_DOWNCONVERT will be emitted here. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use all SPI color formatsMarek Olšák2016-01-221-0/+4
| | | | | | | | | because not using SPI_SHADER_32_ABGR doubles fill rate. We should also get optimal performance if alpha isn't needed or blending isn't enabled. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use SPI_SHADER_COL_FORMAT fields instead of export_16bpcMarek Olšák2016-01-221-1/+1
| | | | | | | | | | | | | This does change the behavior slightly: If a shader writes COLOR[i] and that color buffer isn't bound, the shader will export MRT_NULL instead and discard the IR tree that calculates the output. The only exception is alpha-to-coverage, which requires an alpha export. v2: - update a comment about 16BPC - account for MRTZ when when fixing alpha-test/kill Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add RADEON_REPLACE_SHADERS debug optionNicolai Hähnle2015-12-291-0/+1
| | | | | | | | | | | This option allows replacing a single shader by a pre-compiled ELF object as generated by LLVM's llc, for example. This can be useful for debugging a deterministically occuring error in shaders (and has in fact helped find the causes of https://bugs.freedesktop.org/show_bug.cgi?id=93264). v2: drop the debug flag, use DEBUG_GET_ONCE_OPTION instead Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: implement fast stencil clearMarek Olšák2015-12-111-0/+2
| | | | Reviewed-by: Alex Deucher <[email protected]>
* radeonsi: implement AMD_performance_monitor for CIK+Nicolai Hähnle2015-11-251-0/+3
| | | | | | | | | | | | | | | | | | Expose most of the performance counter groups that are exposed by Catalyst. Ideally, the driver will work with GPUPerfStudio at some point, but we are not quite there yet. In any case, this is the reason for grouping multiple instances of hardware blocks in the way it is implemented. The counters can also be shown using the Gallium HUD. If one is interested to see how work is distributed across multiple shader engines, one can set the environment variable RADEON_PC_SEPARATE_SE=1 to obtain finer-grained performance counter groups. Part of the implementation is in radeon because an implementation for older hardware would largely follow along the same lines, but exposing a different set of blocks which are programmed slightly differently. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: calculate optimal GS ring sizes to fix GS hangs on TongaMarek Olšák2015-11-131-0/+1
| | | | | | | | | | | | | | I discovered that increasing the ESGS ring size fixes GS hangs on Tonga, so let's do it properly. There is now a separate init_config_gs_rings state that is not immutable, because GS rings are resized when needed. This also saves some memory. Most apps won't need more than 1MB per ring per shader engine. Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: prevent recursion in si_context_gfx_flushMarek Olšák2015-11-131-0/+1
| | | | | | The recursion can only occur if you modify need_cs_space to always flush. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: rename cache flushing flags once moreMarek Olšák2015-11-131-9/+6
| | | | | | | | | | | | | | | KCACHE, TC L1 and TC L2 are renamed to: - SMEM L1 - VMEM L1 - GLOBAL L2 You can easily tell what they are used for now. Shaders must deal with coherency issues between both L1s manually, e.g. by setting GLC=1 or by using s_dcache_*. BOTH_ICACHE_KCACHE was an unused definition. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: Enable DCC.Bas Nieuwenhuizen2015-10-241-0/+1
| | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: add another requirement for PARTIAL_ES_WAVEMarek Olšák2015-10-241-0/+2
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: allow unbinding pixel shaders and remove the dummy shaderMarek Olšák2015-10-241-3/+0
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: support thread-safe shaders shared by multiple contextsMarek Olšák2015-10-201-6/+15
| | | | | | | | | | | | The "current" shader pointer is moved from the CSO to the context, so that the CSO is mostly immutable. The only drawback is that the "current" pointer isn't saved when unbinding a shader and it must be looked up when the shader is bound again. This is also a prerequisite for multithreaded shader compilation. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: fix a GS hang on VIMarek Olšák2015-10-071-0/+1
| | | | | | | Broken by one of the cleanups: 0d46c3bc9d09b376d74f7399e1a2d1b0a923640b Not applicable to stable. Reviewed-by: Alex Deucher <[email protected]>
* radeonsi: implement the simple case of force_persample_interpMarek Olšák2015-10-031-0/+1
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: move SPI_PS_INPUT_ENA/ADDR registers to a separate stateMarek Olšák2015-10-031-0/+1
| | | | | | | This will be a derived state used for changing center->sample and centroid->sample at runtime. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: only do depth-only or stencil-only in-place decompressionMarek Olšák2015-10-031-1/+3
| | | | | | | instead of always doing both. Usually, only depth is needed, so stencil decompression is useless. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: dump buffer lists while debuggingMarek Olšák2015-10-031-0/+2
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add an option for debugging VM faultsMarek Olšák2015-10-031-0/+2
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: rework uploading border colorsMarek Olšák2015-09-011-3/+5
| | | | | | | | | | | | | | The border colors are uploaded only once when the state is created. This brings truly immutable sampler descriptors, because they don't have to be updated every time a sampler state is re-bound. It also moves the TA_BC_BASE_ADDR registers to init_config, removing one more state. The catch is there is now a limit: only 4096 border colors can be used by one context. I don't think that will be a problem. Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: reorder si_context variablesMarek Olšák2015-09-011-40/+45
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: don't send IB dword usage to si_need_cs_spaceMarek Olšák2015-09-011-1/+1
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: don't count IB space for states, just use an upper boundMarek Olšák2015-09-011-7/+0
| | | | | | | | Since we don't put any resource descriptors in IBs, the space used by draw calls is quite small. Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: convert SPI state to an atomMarek Olšák2015-09-011-0/+1
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: convert CB_TARGET_MASK setup to an atomMarek Olšák2015-09-011-0/+1
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: convert stencil ref state into an atomMarek Olšák2015-09-011-1/+1
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: convert blend color state into an atomMarek Olšák2015-09-011-0/+6
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: convert sample mask state into an atomMarek Olšák2015-09-011-0/+6
| | | | | Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: convert clip state into an atomMarek Olšák2015-09-011-0/+6
| | | | | | | Reducing calloc overhead. Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: avoid redundant CB and DB register updatesMarek Olšák2015-09-011-0/+2
| | | | | | | | | | The main idea is to avoid setting CB_COLORi_INFO = 0 for i>0 repeatedly when those colorbuffers aren't used. This is mainly for glamor. Same for DB. Z_INFO and STENCIL_INFO need to be cleared only once. Reviewed-by: Alex Deucher <[email protected]> Acked-by: Christian König <[email protected]>