summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/radeonsi
Commit message (Collapse)AuthorAgeFilesLines
* gallium: make constant_buffer constRob Clark2016-06-202-5/+4
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: make shader_buffers constRob Clark2016-06-201-2/+2
| | | | | | | Be consistent with the rest of the "set_xyz" state interfaces. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use trapezoid distribution for tess on Fiji and PolarisNicolai Hähnle2016-06-202-8/+24
| | | | | | | This yields a small performance improvement in Unigine Heaven. Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/sid: add Fiji+ tesselation distribution modeNicolai Hähnle2016-06-201-3/+7
| | | | | Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: emit PA_SC_RASTER_CONFIG_1 only onceNicolai Hähnle2016-06-201-16/+17
| | | | | | | It is the same for all SEs. Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix calculation of valid RB mask per SENicolai Hähnle2016-06-201-4/+9
| | | | | | | | The old calculation treated too many RBs as disabled. Cc: 11.0 11.1 11.2 12.0 <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: raise SI_PM4_MAX_DWNicolai Hähnle2016-06-201-1/+1
| | | | | | | | | The old limit, introduced in commit afa752d3f03ac6697581ff5d324e8ac0512ef513, was exceeded by 4 SE configurations which hit si_write_harvested_raster_configs. Cc: 11.1 11.2 12.0 <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium: add PIPE_CAP_MAX_WINDOW_RECTANGLES to all driversIlia Mirkin2016-06-181-0/+1
| | | | | | | | This says how many window rectangles are supported by the implementation, although it may not exceed PIPE_MAX_WINDOW_RECTANGLES. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* radeonsi: fix undefined left-shift into sign bitNicolai Hähnle2016-06-151-1/+2
| | | | | Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: add driver queries for compute/dma call stats and spillsMarek Olšák2016-06-143-0/+9
| | | | | | also print the average count per frame Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't generate "ret void undef"Marek Olšák2016-06-141-6/+14
| | | | | | | Use LLVMBuildRetVoid in epilogs and the GS copy shader and si_llvm_build_ret otherwise. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: try to hit direct hw MSAA resolve by changing micro mode in clearMarek Olšák2016-06-141-1/+19
| | | | | | | | | We could also do MSAA resolve in a compute shader like Vulkan and remove these workarounds. v2: comment the magic numbers Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: clarify the MSAA resolve limitation with scanoutMarek Olšák2016-06-141-1/+5
| | | | | | this is the correct hw requirement Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: enable scratch coalescingMarek Olšák2016-06-131-2/+10
| | | | | | | | This makes one particular compute shader 8x faster. Latest LLVM git is required. Reviewed-by: Nicolai Hähnle <[email protected]>
* Android: move libdrm settings to top-level Android.common.mkRob Herring2016-06-131-1/+1
| | | | | | | | | | | | | | Fix warnings like these due to HAVE_LIBDRM being inconsistently defined: external/libdrm/include/drm/drm.h:839:30: warning: redefinition of typedef 'drm_clip_rect_t' is a C11 feature [-Wtypedef-redefinition] typedef struct drm_clip_rect drm_clip_rect_t; HAVE_LIBDRM needs to be set project wide to fix this. This change also harmlessly links libdrm with everything, but simplifies the makefiles a bit. Signed-off-by: Rob Herring <[email protected]> Acked-by: Emil Velikov <[email protected]>
* radeonsi: convert to 64-bitness checks instead of doubles.Dave Airlie2016-06-111-14/+14
| | | | | | | | This converts to testing for 64-bit types and renames some things in anticipation of 64-bit integer support. Reviewed-by: Nicolai Hähnle <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: Reinitialize all descriptors in CE preamble.Bas Nieuwenhuizen2016-06-103-3/+15
| | | | | | | | | | | | | | | | | | | | | | | | | This fixes a problem with the CE preamble and restoring only stuff in the preamble when needed. To illustrate suppose we have two graphics IB's 1 and 2, which are submitted in that order. Furthermore suppose IB 1 does not use CE ram, but IB 2 does, and we have a context switch at the start of IB 1, but not between IB 1 and IB 2. The old code put the CE RAM loads in the preamble of IB 2. As the preamble of IB 1 does not have the loads and the preamble of IB 2 does not get executed, the old values are not load into CE RAM. Fix this by always restoring the entire CE RAM. v2: - Just load all descriptor set buffers instead of load and store the entire CE RAM. - Leave the ce_ram_dirty tracking in place for the non-preamble case. v3: - Fixed parameter alignment. - Rebased to master (Nicolai's descriptor series). Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: improve the computation and comment of scratch_wavesMarek Olšák2016-06-081-4/+18
| | | | | | 2% isn't much. If you think the number should be decreased, please speak up. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: print the number of spilled VGPRsMarek Olšák2016-06-081-3/+6
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: remove dead code creating LLVMTargetMachineMarek Olšák2016-06-081-3/+1
| | | | | | | | This was for some old unsupported LLVM version. Only si_create_context creates the target machine now. r600g doesn't use this function. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't enable scratch just for SGPR spillsMarek Olšák2016-06-081-2/+17
| | | | | | | | | Diff from shader-db: Scratch: 3221504 -> 17408 (-99.46 %) bytes per wave v2: add "break;" Reviewed-by: Nicolai Hähnle <[email protected]>
* Revert "radeonsi: allow direct hw MSAA resolve for scanout surfaces"Marek Olšák2016-06-081-0/+1
| | | | | | This reverts commit ffd54d1936fcd07424265b780e1d049222a01e94. No, it doesn't work. The test case is "glxgears -samples 2".
* radeonsi: re-enable PBO ReadPixels accelerationMarek Olšák2016-06-081-3/+6
| | | | | | disabled by 4f1cccf570112f93265a4cace504eb763fa8f73e Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: allow MSAA resolving into a texture that has DCC enabledMarek Olšák2016-06-082-4/+23
| | | | | | | | Since DCC is enabled almost everywhere now, it's important not to disable this fast path. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: allow direct hw MSAA resolve for scanout surfacesMarek Olšák2016-06-081-1/+0
| | | | | | | No idea why this was disabled, but it works fine. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: don't allocate DCC for the temporary MSAA resolve surfaceMarek Olšák2016-06-081-1/+2
| | | | | | | Allocating it has no effect, but it adds overhead (useless DCC clear). Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: don't enable DCC in the sampler if first_level doesn't have itMarek Olšák2016-06-083-7/+21
| | | | | | | | If first_level > 0 and DCC is disabled for that level, let's skip DCC reads entirely. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: don't flag renderbuffer feedback loop if DCC has just been disabledMarek Olšák2016-06-081-2/+4
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: add per-level dcc_enabled flagsMarek Olšák2016-06-083-5/+13
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: compute DCC register parameters in si_emit_framebuffer_stateMarek Olšák2016-06-081-10/+12
| | | | | | | | This will get more complicated with mipmapped DCC or when DCC is enabled after allocation. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: enable WQM in PS prolog when neededNicolai Hähnle2016-06-072-0/+10
| | | | | | | | | | | | WQM is needed when the PS prolog computes a VGPR that is consumed by a shader with (implicit or explicit) derivatives. Depends on http://reviews.llvm.org/D20839 / LLVM r272063 for this to be effective (otherwise it's just a no-op). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95130 Cc: 12.0 <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: keep track of dirty descriptor setsNicolai Hähnle2016-06-072-4/+36
| | | | | | | Reduces CPU load for draw calls that change none or few of the descriptors. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move si_descriptors into a per-context arrayNicolai Hähnle2016-06-073-83/+166
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: pass shader stage to si_disable_shader_imageNicolai Hähnle2016-06-071-4/+8
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: access descriptor sets via local variablesNicolai Hähnle2016-06-071-31/+41
| | | | | | | This will simplify moving them to a per-context array. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add si_set_rw_buffer to be used for internal descriptorsNicolai Hähnle2016-06-073-14/+15
| | | | | | | | So that callers outside of si_descriptors.c need to worry less about the details of descriptor handling. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: pass shader stage to si_set_shader_imageNicolai Hähnle2016-06-071-5/+5
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: pass shader stage to si_set_sampler_viewNicolai Hähnle2016-06-071-4/+5
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move descriptor set begin_new_cs handling into a separate functionNicolai Hähnle2016-06-071-21/+15
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move enabled_mask out of si_descriptorsNicolai Hähnle2016-06-074-30/+34
| | | | | | | | This mask is irrelevant for the generic descriptor set handling, and having it outside simplifies subsequent changes slightly. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: add support for sharing textures with DCC between processesMarek Olšák2016-06-071-1/+34
| | | | | | v2: use a function for calculating WORD1 of bo metadata Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: don't discard DCC if an external user can write to itMarek Olšák2016-06-071-2/+8
| | | | | | | | | We don't import textures with DCC now, but soon we will. v2: if we can't disable DCC for image writes, at least decompress DCC at bind time Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: add PIPE_CAP_TGSI_VOTE for when the VOTE ops are allowedIlia Mirkin2016-06-061-0/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: add a performance tweak for 4 SE partsMarek Olšák2016-06-061-0/+11
| | | | | | Ported from Vulkan. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: simplify PRIMGROUP_SIZE computation for tessellationMarek Olšák2016-06-061-9/+1
| | | | | | | | Ported from Vulkan. v2: keep the comment Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use hw MSAA resolve for non-trivial resolvesMarek Olšák2016-06-061-10/+54
| | | | | | This improves MSAA resolve performance. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set descriptor dirty mask on shader buffer unbindNicolai Hähnle2016-06-061-0/+1
| | | | | | | | Found randomly while skimming the code. This might have caused VM faults in robustness tests. Cc: 12.0 <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix images with level > 0Marek Olšák2016-06-051-1/+1
| | | | | | | | | | This should fix spec@arb_shader_image_load_store@level. Broken by: Commit: 95c5bbae66af3ca1f805d94f6fe8d8e4ba2c9c43 radeonsi: set some image descriptor fields at bind time Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* r600g: fix CP DMA hazard with index buffer fetches (v3)Marek Olšák2016-06-041-1/+1
| | | | | | | | | v3: use PFP_SYNC_ME on EG-CM only when supported by the kernel, otherwise use MEM_WRITE + WAIT_REG_MEM to emulate that Reviewed-by: Alex Deucher <[email protected]> Tested-by: Grazvydas Ignotas <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* gallium/u_suballoc: allow different alignment for each allocationMarek Olšák2016-06-042-2/+2
| | | | | | | | | Just move the alignment parameter from u_suballocator_create to u_suballocator_alloc. Reviewed-by: Alex Deucher <[email protected]> Tested-by: Grazvydas Ignotas <[email protected]> Tested-by: Dieter Nützel <[email protected]>