path: root/src/gallium/drivers
Commit message | Author | Age | Files | Lines
* radeonsi: don't enable scratch just for SGPR spills | Marek Olšák | 2016-06-08 | 1 | -2/+17
    Diff from shader-db:
    Scratch: 3221504 -> 17408 (-99.46 %) bytes per wave
    v2: add "break;"
    Reviewed-by: Nicolai Hähnle <[email protected]>
* Revert "radeonsi: allow direct hw MSAA resolve for scanout surfaces" | Marek Olšák | 2016-06-08 | 1 | -0/+1
    This reverts commit ffd54d1936fcd07424265b780e1d049222a01e94.
    No, it doesn't work. The test case is "glxgears -samples 2".
* radeonsi: re-enable PBO ReadPixels acceleration | Marek Olšák | 2016-06-08 | 1 | -3/+6
    disabled by 4f1cccf570112f93265a4cace504eb763fa8f73e
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: allow MSAA resolving into a texture that has DCC enabled | Marek Olšák | 2016-06-08 | 2 | -4/+23
    Since DCC is enabled almost everywhere now, it's important not to disable this fast path.
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* gallium/radeon: move DCC clearing into a separate function | Marek Olšák | 2016-06-08 | 2 | -5/+19
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: allow direct hw MSAA resolve for scanout surfaces | Marek Olšák | 2016-06-08 | 1 | -1/+0
    No idea why this was disabled, but it works fine.
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: don't allocate DCC for the temporary MSAA resolve surface | Marek Olšák | 2016-06-08 | 3 | -2/+5
    Allocating it has no effect, but it adds overhead (useless DCC clear).
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: don't enable DCC in the sampler if first_level doesn't have it | Marek Olšák | 2016-06-08 | 3 | -7/+21
    If first_level > 0 and DCC is disabled for that level, let's skip DCC reads entirely.
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* winsys/amdgpu: enable DCC for mipmapped textures | Marek Olšák | 2016-06-08 | 2 | -3/+8
    Also add dcc_fast_clear_size for clearing only the necessary subset of DCC.
    For no AA, it's equal to the size of the whole DCC level.
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* gallium/radeon: don't disable DCC because of SDMA | Marek Olšák | 2016-06-08 | 1 | -20/+3
    We want to keep DCC enabled to save bandwidth. It was a bad idea to disable it here.
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: don't flag renderbuffer feedback loop if DCC has just been disabled | Marek Olšák | 2016-06-08 | 1 | -2/+4
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: add per-level dcc_enabled flags | Marek Olšák | 2016-06-08 | 5 | -8/+17
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: compute DCC register parameters in si_emit_framebuffer_state | Marek Olšák | 2016-06-08 | 4 | -14/+12
    This will get more complicated with mipmapped DCC or when DCC is enabled after allocation.
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* gallium/radeon: add an assertion checking the validity of PIPE_BIND_SCANOUT | Marek Olšák | 2016-06-08 | 1 | -3/+10
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* gallium/radeon: don't allocate DCC for non-renderable texture formats | Marek Olšák | 2016-06-08 | 2 | -0/+6
    R9G9B9E5 is the only uncompressed one hopefully.
    This fixes incorrect rendering not discovered (due to a lack of tests) until DCC mipmapping was enabled.
    Cc: 11.1 11.2 12.0 <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: enable WQM in PS prolog when needed | Nicolai Hähnle | 2016-06-07 | 2 | -0/+10
    WQM is needed when the PS prolog computes a VGPR that is consumed by a shader with (implicit or explicit) derivatives.
    Depends on http://reviews.llvm.org/D20839 / LLVM r272063 for this to be effective (otherwise it's just a no-op).
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95130
    Cc: 12.0 <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* swr: fix provoking vertex | Tim Rowley | 2016-06-07 | 7 | -12/+77
    Use rasterizer provoking vertex API.
    Fix rasterizer provoking vertex for tristrips and quad list/strips.
    v2: make provoking vertex tables static const
    Reviewed-by: Bruce Cherniak <[email protected]>
* gk104/ir: fix conditions for adding a texbar | Ilia Mirkin | 2016-06-07 | 1 | -4/+6
    Sometimes a register source can actually be double- or even quad-wide. We must make sure that the inserted texbars take that width into account.
    Based on an earlier patch by Samuel Pitoiset.
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
    Cc: "12.0 11.2" <[email protected]>
* radeonsi: keep track of dirty descriptor sets | Nicolai Hähnle | 2016-06-07 | 2 | -4/+36
    Reduces CPU load for draw calls that change none or few of the descriptors.
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
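
The idea behind the commit above (re-upload only the descriptor sets that actually changed) can be sketched with a bitmask. This is a minimal illustration, not the radeonsi code: `desc_ctx`, `desc_mark_dirty`, `desc_upload_dirty`, and `NUM_DESC_SETS` are hypothetical names invented for this example, and the "upload" is only counted, not performed.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_DESC_SETS 8u /* hypothetical number of descriptor sets */

struct desc_ctx {
    uint32_t dirty_mask; /* bit i set => set i must be re-uploaded */
    unsigned uploads;    /* total number of simulated uploads */
};

/* Flag one descriptor set as changed since the last draw. */
void desc_mark_dirty(struct desc_ctx *ctx, unsigned set)
{
    assert(set < NUM_DESC_SETS);
    ctx->dirty_mask |= 1u << set;
}

/* At draw time, touch only the dirty sets, then clear the mask.
 * Returns how many sets were (notionally) uploaded. */
unsigned desc_upload_dirty(struct desc_ctx *ctx)
{
    unsigned n = 0;

    for (unsigned i = 0; i < NUM_DESC_SETS; i++) {
        if (ctx->dirty_mask & (1u << i))
            n++; /* a real driver would upload set i here */
    }
    ctx->uploads += n;
    ctx->dirty_mask = 0;
    return n;
}
```

A draw call that changes no descriptors then costs one mask test per set instead of a full re-upload, which is the kind of CPU saving the commit message describes.
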
* radeonsi: move si_descriptors into a per-context array | Nicolai Hähnle | 2016-06-07 | 3 | -83/+166
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: pass shader stage to si_disable_shader_image | Nicolai Hähnle | 2016-06-07 | 1 | -4/+8
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: access descriptor sets via local variables | Nicolai Hähnle | 2016-06-07 | 1 | -31/+41
    This will simplify moving them to a per-context array.
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add si_set_rw_buffer to be used for internal descriptors | Nicolai Hähnle | 2016-06-07 | 3 | -14/+15
    So that callers outside of si_descriptors.c need to worry less about the details of descriptor handling.
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: pass shader stage to si_set_shader_image | Nicolai Hähnle | 2016-06-07 | 1 | -5/+5
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: pass shader stage to si_set_sampler_view | Nicolai Hähnle | 2016-06-07 | 1 | -4/+5
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move descriptor set begin_new_cs handling into a separate function | Nicolai Hähnle | 2016-06-07 | 1 | -21/+15
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move enabled_mask out of si_descriptors | Nicolai Hähnle | 2016-06-07 | 4 | -30/+34
    This mask is irrelevant for the generic descriptor set handling, and having it outside simplifies subsequent changes slightly.
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: add support for sharing textures with DCC between processes | Marek Olšák | 2016-06-07 | 3 | -4/+51
    v2: use a function for calculating WORD1 of bo metadata
    Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: don't discard DCC if an external user can write to it | Marek Olšák | 2016-06-07 | 3 | -12/+31
    We don't import textures with DCC now, but soon we will.
    v2: if we can't disable DCC for image writes, at least decompress DCC at bind time
    Reviewed-by: Nicolai Hähnle <[email protected]>
* i915: fix typo CAP. | Dave Airlie | 2016-06-07 | 1 | -1/+1
    Signed-off-by: Dave Airlie <[email protected]>
* nvc0: add support for VOTE tgsi opcodes | Ilia Mirkin | 2016-06-06 | 5 | -24/+77
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* gallium: add PIPE_CAP_TGSI_VOTE for when the VOTE ops are allowed | Ilia Mirkin | 2016-06-06 | 15 | -0/+15
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* nv50/ir: use round toward 0 when converting doubles to integers | Samuel Pitoiset | 2016-06-06 | 1 | -1/+3
    Like floats, we should use the round toward 0 mode instead of the nearest one (which is the default) for doubles to integers.
    This fixes all arb_gpu_shader_fp64 piglits which convert doubles to integers (16 tests).
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Cc: "11.2 12.0" <[email protected]>
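
To make the difference between the two rounding modes in the commit above concrete, here is a small C sketch. It is not taken from the nv50 code generator: `f64_to_i32_rtz` and `f64_to_i32_nearest` are hypothetical helper names, and the "nearest" variant rounds halves away from zero as a simplification (the hardware's nearest mode may break ties differently). The only fact relied on is that a C cast from `double` to an integer type truncates toward zero.

```c
#include <assert.h>
#include <stdint.h>

/* Round toward zero: the fractional part is discarded. */
int32_t f64_to_i32_rtz(double x)
{
    /* The cast is only defined when the truncated value fits in int32_t. */
    assert(x > -2147483648.0 && x < 2147483648.0);
    return (int32_t)x;
}

/* Round to nearest (halves away from zero, for illustration only). */
int32_t f64_to_i32_nearest(double x)
{
    double y = (x < 0.0) ? x - 0.5 : x + 0.5;

    assert(y > -2147483648.0 && y < 2147483648.0);
    return (int32_t)y;
}
```

For an input of 2.7 the first helper gives 2 and the second gives 3; for -2.7 they give -2 and -3. A test that expects truncation therefore fails whenever the conversion uses the nearest mode on a value with a fractional part of one half or more.
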
* gallium/radeon: don't re-set BO metadata after CMASK deallocation | Marek Olšák | 2016-06-06 | 1 | -1/+0
    CMASK has no effect on metadata, because it's not sharable.
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add a performance tweak for 4 SE parts | Marek Olšák | 2016-06-06 | 1 | -0/+11
    Ported from Vulkan.
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: simplify PRIMGROUP_SIZE computation for tessellation | Marek Olšák | 2016-06-06 | 1 | -9/+1
    Ported from Vulkan.
    v2: keep the comment
    Reviewed-by: Nicolai Hähnle <[email protected]>
* r600g: use hw MSAA resolve for non-trivial resolves | Marek Olšák | 2016-06-06 | 1 | -9/+53
    This improves MSAA resolve performance.
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use hw MSAA resolve for non-trivial resolves | Marek Olšák | 2016-06-06 | 1 | -10/+54
    This improves MSAA resolve performance.
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set descriptor dirty mask on shader buffer unbind | Nicolai Hähnle | 2016-06-06 | 1 | -0/+1
    Found randomly while skimming the code. This might have caused VM faults in robustness tests.
    Cc: 12.0 <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* svga: print shader linkage info when tgsi debug bit is on | Charmaine Lee | 2016-06-06 | 1 | -2/+5
    When TGSI debug flag is enabled, print the shader linkage info as well.
    Tested with mesa demos with SVGA_DEBUG=tgsi
    Reviewed-by: Brian Paul <[email protected]>
* nv50,nvc0: fix BGR10_A2UI vertex format | Ilia Mirkin | 2016-06-05 | 1 | -1/+1
    This is mostly academic as this is not reachable from GL, which only has the packed RGB10_A2UI vertex format.
    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: "12.0" <[email protected]>
* nvc0: do not clear surfaces bins in the validate function | Samuel Pitoiset | 2016-06-05 | 2 | -5/+2
    We should not call nouveau_bufctx_reset() inside a validate function. This only affects Fermi where images are aliased between 3D and CP.
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Cc: "12.0" <[email protected]>
* nvc0: re-validate images after launching a grid on Fermi | Samuel Pitoiset | 2016-06-05 | 1 | -0/+3
    Image invalidation is a bit weird on Fermi and there is already a hack which forces invalidating all images when launching a compute shader to help in fixing 3D<->CP interaction.
    However, we need to re-validate images for compute because nvc0_compute_invalidate_surfaces() will destroy the previous binding. This is not really good for performance purposes but this might be improved later.
    This fixes the following piglits:
    - spec/arb_compute_shader/execution/basic-uniform-access
    - spec/arb_compute_shader/execution/mutiple-texture-reading
    - spec/arb_compute_shader/execution/multiple-workgroups
    - spec/glsl-4.30/execution/built-in-functions/cs-* (207 tests)
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Ilia Mirkin <[email protected]>
    Cc: "12.0" <[email protected]>
* radeonsi: fix images with level > 0 | Marek Olšák | 2016-06-05 | 1 | -1/+1
    This should fix spec@arb_shader_image_load_store@level.
    Broken by:
    Commit: 95c5bbae66af3ca1f805d94f6fe8d8e4ba2c9c43
    radeonsi: set some image descriptor fields at bind time
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nvc0: reduce overhead from always marking images dirty | Ilia Mirkin | 2016-06-04 | 1 | -9/+36
    We would revalidate images when anything was touched at all. Which is unfortunate, since the state tracker does not use CSO's to reduce the workload. So instead implement a protocol to ensure that something has changed before revalidating all the images.
    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: "12.0" <[email protected]>
* nvc0: reduce overhead from always marking buffers dirty | Ilia Mirkin | 2016-06-04 | 1 | -6/+20
    We would revalidate buffers when anything was touched at all. Which is unfortunate, since the state tracker does not use CSO's to reduce the workload. So instead implement a protocol to ensure that something has changed before revalidating all the SSBOs.
    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: "12.0" <[email protected]>
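
The two nvc0 commits above describe checking whether a bind call actually changed anything before marking state dirty. A rough C sketch of such a check follows. It is an assumption about the general shape of the technique, not the nvc0 implementation: `binding`, `bind_state`, `bind_slots`, and `MAX_SLOTS` are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_SLOTS 8u /* hypothetical number of binding slots */

struct binding {
    const void *resource; /* bound buffer or image, NULL if unbound */
    unsigned offset;
    unsigned size;
};

struct bind_state {
    struct binding slots[MAX_SLOTS];
    bool dirty; /* true => slots must be revalidated before the next draw */
};

/* Store new bindings; flag the state dirty only if a slot really changed.
 * Returns true when at least one slot differs from what was bound before. */
bool bind_slots(struct bind_state *st, unsigned start, unsigned count,
                const struct binding *views)
{
    bool changed = false;

    assert(start + count <= MAX_SLOTS);
    for (unsigned i = 0; i < count; i++) {
        struct binding *dst = &st->slots[start + i];
        const struct binding *src = &views[i];

        if (dst->resource == src->resource &&
            dst->offset == src->offset &&
            dst->size == src->size)
            continue; /* identical rebind, nothing to do */

        *dst = *src;
        changed = true;
    }
    if (changed)
        st->dirty = true;
    return changed;
}
```

Rebinding the same resources every frame then leaves `dirty` untouched, so the revalidation step can be skipped.
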
* nvc0: fix memory barrier flag handling | Ilia Mirkin | 2016-06-04 | 1 | -9/+16
    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: "12.0" <[email protected]>
* nvc0: mark bound buffer range valid | Ilia Mirkin | 2016-06-04 | 3 | -0/+9
    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: "12.0" <[email protected]>
* gallium/radeon: don't use the DMA ring for pipelined buffer uploads | Marek Olšák | 2016-06-04 | 1 | -5/+4
    Submitting a DMA IB flushes the GFX IB and all GPU caches.
    Vedran Miletić said:
    "On Tonga 380X, this improves The Talos Principle from 8.3 fps to 28.3 fps (all graphics settings Ultra, 4xAA, 1080p resolution with downsampling from 1200p)."
    Some anonymous dude said:
    R9 390 results:
    Tomb Raider (normal settings): 80 -> 88 FPS
    Talos Principle (custom settings): 23 -> 56 FPS
    Metro Last Light Redux (default benchmark settings): 39 -> 40 FPS
    Reviewed-by: Alex Deucher <[email protected]>
    Tested-by: Vedran Miletić <[email protected]>
    Tested-by: Grazvydas Ignotas <[email protected]>
    Tested-by: Dieter Nützel <[email protected]>
* r600g: don't flush caches when binding shader resources | Marek Olšák | 2016-06-04 | 4 | -31/+26
    Reviewed-by: Alex Deucher <[email protected]>
    Tested-by: Grazvydas Ignotas <[email protected]>
    Tested-by: Dieter Nützel <[email protected]>