path: root/src/gallium/drivers/radeonsi
Commit message | Author | Age | Files | Lines
* radeonsi: make si_is_format_supported static | Marek Olšák | 2016-06-25 | 3 | -11/+6
|   Reviewed-by: Alex Deucher <[email protected]>
|   Reviewed-by: Vedran Miletić <[email protected]>
|   Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: boolean -> bool, TRUE -> true, FALSE -> false | Marek Olšák | 2016-06-25 | 4 | -15/+15
|   Reviewed-by: Alex Deucher <[email protected]>
|   Reviewed-by: Vedran Miletić <[email protected]>
|   Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: use r600_resource_reference | Marek Olšák | 2016-06-25 | 3 | -8/+4
|   Reviewed-by: Alex Deucher <[email protected]>
|   Reviewed-by: Vedran Miletić <[email protected]>
|   Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: Implement POLYGON_OFFSET_UNITS_UNSCALED | Axel Davy | 2016-06-25 | 2 | -15/+19
|   Empirical tests show that the polygon offset behaviour is entirely
|   determined by the content of the PA_SU_POLY_OFFSET states, and not by
|   the depth buffer format bound.
|
|   PA_SU_POLY_OFFSET seems to directly set the parameters of the polygon
|   offset formula, and setting 0 for PA_SU_POLY_OFFSET_DB_FMT_CNTL (i.e.
|   setting the unorm depth bias behaviour with a scale of 2^0 = 1.0f)
|   gives the unscaled behaviour.
|
|   Signed-off-by: Axel Davy <[email protected]>
|   Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move PA_SU_POLY_OFFSET_DB_FMT_CNTL to poly offset states | Axel Davy | 2016-06-25 | 1 | -23/+8
|   Emit PA_SU_POLY_OFFSET_DB_FMT_CNTL with rasterizer poly_offset states.
|   This will be useful to implement PIPE_CAP_POLYGON_OFFSET_UNITS_UNSCALED.
|
|   Signed-off-by: Axel Davy <[email protected]>
|   Reviewed-by: Marek Olšák <[email protected]>
* gallium: Add a cap for offset_units_unscaled | Axel Davy | 2016-06-25 | 1 | -0/+1
|   D3D9 has a different behaviour for depth bias.
|
|   For OGL/D3D1X, the depth bias unit is the minimal resolvable value for
|   the depth buffer, which depends on the format (and has different
|   behaviour for float depth buffers).
|
|   For D3D9, the depth bias unit is 1.0f.
|
|   Signed-off-by: Axel Davy <[email protected]>
|   Reviewed-by: Marek Olšák <[email protected]>
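The difference between the two depth bias conventions described above can be sketched in a few lines of C. This is an illustration only, not Mesa code: the function name `constant_depth_bias` and its parameters are hypothetical, and the sketch assumes a fixed-point (unorm) depth buffer whose minimal resolvable value is 2^-depth_bits. Float depth buffers behave differently and are not modelled.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper, not part of Mesa: constant term of the depth bias.
 * Assumption: for a unorm depth buffer with `depth_bits` bits, the minimal
 * resolvable value is 2^-depth_bits. */
float constant_depth_bias(float units, unsigned depth_bits, bool units_unscaled)
{
    if (units_unscaled)
        return units;                       /* D3D9: one unit is 1.0f */

    /* OGL/D3D1X: one unit is the minimal resolvable depth value. */
    return ldexpf(units, -(int)depth_bits); /* units * 2^-depth_bits */
}
```

With `depth_bits = 0` the scaled path multiplies by 2^0 = 1.0f, which is the same trick the radeonsi commit describes for PA_SU_POLY_OFFSET_DB_FMT_CNTL.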
* radeonsi: fix fractional odd tessellation spacing for Polaris | Marek Olšák | 2016-06-24 | 4 | -1/+23
|   ported from Vulkan (and no source explains why this is needed)
|
|   Cc: 12.0 <[email protected]>
|   Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set some VGT context registers on SI-CI | Marek Olšák | 2016-06-24 | 1 | -0/+3
|   the kernel sets them, but other UMDs can change them
|
|   Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: optimize rendering to linear color buffers | Marek Olšák | 2016-06-24 | 2 | -1/+12
|   loosely ported from Vulkan
|
|   Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set almost optimal settings in SC_MODE_CNTL_1 | Marek Olšák | 2016-06-24 | 1 | -1/+10
|   ported from Vulkan
|
|   Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: let drivers specify SC_MODE_CNTL_1 fields | Marek Olšák | 2016-06-24 | 1 | -1/+5
|   radeonsi will set more fields
|
|   Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: disable complicated point clipping against user clip planes | Marek Olšák | 2016-06-24 | 1 | -1/+0
|   Nothing in the GL spec says that we should expand points to triangles.
|
|   Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix a compute shader hang with big threadgroups on SI & CI | Marek Olšák | 2016-06-24 | 1 | -0/+18
|   ported from Vulkan
|
|   Cc: 12.0 <[email protected]>
|   Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: drop the DRAW_PREAMBLE packet on Polaris | Nicolai Hähnle | 2016-06-24 | 1 | -1/+6
|   It will be removed from the firmware for Polaris.
|
|   Cc: 12.0 <[email protected]>
|   Reviewed-by: Alex Deucher <[email protected]>
|   Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: use DRAW_(INDEX_)INDIRECT_MULTI on Polaris | Nicolai Hähnle | 2016-06-24 | 1 | -10/+36
|   The non-MULTI variants will be removed in Polaris firmware.
|
|   Cc: 12.0 <[email protected]>
|   Reviewed-by: Alex Deucher <[email protected]>
|   Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: report a failure to parse dmesg instead of asserting | Nicolai Hähnle | 2016-06-24 | 1 | -1/+6
|   Reviewed-by: Marek Olšák <[email protected]>
* radeon: check VM faults from DMA flush | Nicolai Hähnle | 2016-06-24 | 3 | -4/+40
|   Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move gfx fence wait out of si_check_vm_faults | Nicolai Hähnle | 2016-06-24 | 2 | -6/+7
|   Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: extract IB and bo list saving into separate functions | Nicolai Hähnle | 2016-06-24 | 4 | -54/+23
|   Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: set LLVM denormal flags | Marek Olšák | 2016-06-24 | 1 | -2/+5
|   - make sure FP32 denormals will stay disabled in LLVM in the future
|     (the current default is disabled)
|   - tell LLVM that FP64 denormals are enabled
|
|   Reviewed-by: Nicolai Hähnle <[email protected]>
|   Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: add a debug flag for unsafe math LLVM optimizations | Marek Olšák | 2016-06-21 | 1 | -0/+16
|   Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use u_blitter for mipmap generation | Marek Olšák | 2016-06-21 | 2 | -1/+32
|   This reduces the time spent in glGenerateMipmap by half.
|
|   v2: don't decompress the levels to be overwritten
|
|   Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: make image_view const | Rob Clark | 2016-06-20 | 1 | -3/+3
|   Signed-off-by: Rob Clark <[email protected]>
|   Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: make constant_buffer const | Rob Clark | 2016-06-20 | 2 | -5/+4
|   Signed-off-by: Rob Clark <[email protected]>
|   Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: make shader_buffers const | Rob Clark | 2016-06-20 | 1 | -2/+2
|   Be consistent with the rest of the "set_xyz" state interfaces.
|
|   Signed-off-by: Rob Clark <[email protected]>
|   Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use trapezoid distribution for tess on Fiji and Polaris | Nicolai Hähnle | 2016-06-20 | 2 | -8/+24
|   This yields a small performance improvement in Unigine Heaven.
|
|   Reviewed-by: Alex Deucher <[email protected]>
|   Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/sid: add Fiji+ tesselation distribution mode | Nicolai Hähnle | 2016-06-20 | 1 | -3/+7
|   Reviewed-by: Alex Deucher <[email protected]>
|   Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: emit PA_SC_RASTER_CONFIG_1 only once | Nicolai Hähnle | 2016-06-20 | 1 | -16/+17
|   It is the same for all SEs.
|
|   Reviewed-by: Alex Deucher <[email protected]>
|   Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix calculation of valid RB mask per SE | Nicolai Hähnle | 2016-06-20 | 1 | -4/+9
|   The old calculation treated too many RBs as disabled.
|
|   Cc: 11.0 11.1 11.2 12.0 <[email protected]>
|   Reviewed-by: Alex Deucher <[email protected]>
|   Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: raise SI_PM4_MAX_DW | Nicolai Hähnle | 2016-06-20 | 1 | -1/+1
|   The old limit, introduced in commit
|   afa752d3f03ac6697581ff5d324e8ac0512ef513, was exceeded by 4 SE
|   configurations which hit si_write_harvested_raster_configs.
|
|   Cc: 11.1 11.2 12.0 <[email protected]>
|   Reviewed-by: Alex Deucher <[email protected]>
|   Reviewed-by: Marek Olšák <[email protected]>
* gallium: add PIPE_CAP_MAX_WINDOW_RECTANGLES to all drivers | Ilia Mirkin | 2016-06-18 | 1 | -0/+1
|   This says how many window rectangles are supported by the
|   implementation, although it may not exceed PIPE_MAX_WINDOW_RECTANGLES.
|
|   Signed-off-by: Ilia Mirkin <[email protected]>
|   Reviewed-by: Brian Paul <[email protected]>
* radeonsi: fix undefined left-shift into sign bit | Nicolai Hähnle | 2016-06-15 | 1 | -1/+2
|   Reviewed-by: Brian Paul <[email protected]>
|   Reviewed-by: Marek Olšák <[email protected]>
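The log does not show which expression this commit changed, so the following is only a generic sketch of the class of bug named in the subject. In C, shifting a 1 bit into the sign bit of a signed `int` (for example `1 << 31` with a 32-bit `int`) is undefined behaviour; doing the shift on an unsigned type is well defined. The helper name `bit_mask` is hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper, not the code touched by the commit:
 * return a 32-bit mask with only bit `i` set, for 0 <= i < 32.
 *
 * Writing `1 << i` would shift a signed int and is undefined for i == 31
 * when int is 32 bits wide. Converting the operand to uint32_t first
 * keeps the shift in unsigned arithmetic. */
uint32_t bit_mask(unsigned i)
{
    return (uint32_t)1 << i;
}
```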
* gallium/radeon: add driver queries for compute/dma call stats and spills | Marek Olšák | 2016-06-14 | 3 | -0/+9
|   also print the average count per frame
|
|   Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't generate "ret void undef" | Marek Olšák | 2016-06-14 | 1 | -6/+14
|   Use LLVMBuildRetVoid in epilogs and the GS copy shader and
|   si_llvm_build_ret otherwise.
|
|   Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: try to hit direct hw MSAA resolve by changing micro mode in clear | Marek Olšák | 2016-06-14 | 1 | -1/+19
|   We could also do MSAA resolve in a compute shader like Vulkan and
|   remove these workarounds.
|
|   v2: comment the magic numbers
|
|   Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: clarify the MSAA resolve limitation with scanout | Marek Olšák | 2016-06-14 | 1 | -1/+5
|   this is the correct hw requirement
|
|   Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: enable scratch coalescing | Marek Olšák | 2016-06-13 | 1 | -2/+10
|   This makes one particular compute shader 8x faster.
|   Latest LLVM git is required.
|
|   Reviewed-by: Nicolai Hähnle <[email protected]>
* Android: move libdrm settings to top-level Android.common.mk | Rob Herring | 2016-06-13 | 1 | -1/+1
|   Fix warnings like these due to HAVE_LIBDRM being inconsistently defined:
|
|   external/libdrm/include/drm/drm.h:839:30: warning: redefinition of typedef 'drm_clip_rect_t' is a C11 feature [-Wtypedef-redefinition]
|   typedef struct drm_clip_rect drm_clip_rect_t;
|
|   HAVE_LIBDRM needs to be set project wide to fix this. This change also
|   harmlessly links libdrm with everything, but simplifies the makefiles
|   a bit.
|
|   Signed-off-by: Rob Herring <[email protected]>
|   Acked-by: Emil Velikov <[email protected]>
* radeonsi: convert to 64-bitness checks instead of doubles. | Dave Airlie | 2016-06-11 | 1 | -14/+14
|   This converts to testing for 64-bit types and renames some things in
|   anticipation of 64-bit integer support.
|
|   Reviewed-by: Nicolai Hähnle <[email protected]>
|   Signed-off-by: Dave Airlie <[email protected]>
* radeonsi: Reinitialize all descriptors in CE preamble. | Bas Nieuwenhuizen | 2016-06-10 | 3 | -3/+15
|   This fixes a problem with the CE preamble and restoring only stuff in
|   the preamble when needed.
|
|   To illustrate, suppose we have two graphics IBs, 1 and 2, which are
|   submitted in that order. Furthermore, suppose IB 1 does not use CE RAM
|   but IB 2 does, and we have a context switch at the start of IB 1, but
|   not between IB 1 and IB 2.
|
|   The old code put the CE RAM loads in the preamble of IB 2. As the
|   preamble of IB 1 does not have the loads and the preamble of IB 2 does
|   not get executed, the old values are not loaded into CE RAM.
|
|   Fix this by always restoring the entire CE RAM.
|
|   v2: - Just load all descriptor set buffers instead of load and store
|         the entire CE RAM.
|       - Leave the ce_ram_dirty tracking in place for the non-preamble
|         case.
|   v3: - Fixed parameter alignment.
|       - Rebased to master (Nicolai's descriptor series).
|
|   Signed-off-by: Bas Nieuwenhuizen <[email protected]>
|   Reviewed-by: Nicolai Hähnle <[email protected]>
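The two-IB scenario in this commit message can be replayed with a small toy model. This is a sketch under simplifying assumptions, not driver code: `struct ib`, its fields, and `ce_ram_always_valid` are hypothetical names, a preamble is assumed to run only when a context switch precedes the IB, and a context switch is assumed to leave CE RAM contents invalid until something reloads them.

```c
#include <stdbool.h>

/* Toy description of one submitted IB (hypothetical, for illustration). */
struct ib {
    bool uses_ce_ram;        /* the IB reads descriptors from CE RAM */
    bool ctx_switch_before;  /* a context switch happens right before it */
};

/* Returns true if every IB that uses CE RAM finds valid contents.
 * always_restore == false models the old behaviour: only preambles of
 * IBs that use CE RAM contain the loads.
 * always_restore == true models the fix: every preamble reloads. */
bool ce_ram_always_valid(const struct ib *ibs, int n, bool always_restore)
{
    bool valid = false;  /* assume another context owned CE RAM before */

    for (int i = 0; i < n; i++) {
        if (ibs[i].ctx_switch_before) {
            /* The switch clobbers CE RAM, then this IB's preamble runs. */
            valid = always_restore || ibs[i].uses_ce_ram;
        }
        if (ibs[i].uses_ce_ram && !valid)
            return false;  /* IB would read stale descriptors */
    }
    return true;
}
```

For the sequence from the message, `{ {false, true}, {true, false} }`, the model reports a failure with the old behaviour and success once every preamble restores the descriptors.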
* radeonsi: improve the computation and comment of scratch_waves | Marek Olšák | 2016-06-08 | 1 | -4/+18
|   2% isn't much. If you think the number should be decreased, please
|   speak up.
|
|   Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: print the number of spilled VGPRs | Marek Olšák | 2016-06-08 | 1 | -3/+6
|   Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: remove dead code creating LLVMTargetMachine | Marek Olšák | 2016-06-08 | 1 | -3/+1
|   This was for some old unsupported LLVM version. Only si_create_context
|   creates the target machine now. r600g doesn't use this function.
|
|   Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't enable scratch just for SGPR spills | Marek Olšák | 2016-06-08 | 1 | -2/+17
|   Diff from shader-db:
|   Scratch: 3221504 -> 17408 (-99.46 %) bytes per wave
|
|   v2: add "break;"
|
|   Reviewed-by: Nicolai Hähnle <[email protected]>
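The percentage in the shader-db line can be checked by hand: (17408 - 3221504) / 3221504 is roughly -0.99460, i.e. -99.46 %. A minimal sketch of that arithmetic, with a hypothetical helper name that is not part of shader-db:

```c
/* Hypothetical helper: relative change from `before` to `after`,
 * expressed in percent. Negative values mean a reduction. */
double percent_change(double before, double after)
{
    return (after - before) / before * 100.0;
}
```

Calling `percent_change(3221504.0, 17408.0)` gives a value that rounds to -99.46, matching the figure quoted in the commit message.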
* Revert "radeonsi: allow direct hw MSAA resolve for scanout surfaces" | Marek Olšák | 2016-06-08 | 1 | -0/+1
|   This reverts commit ffd54d1936fcd07424265b780e1d049222a01e94.
|
|   No, it doesn't work. The test case is "glxgears -samples 2".
* radeonsi: re-enable PBO ReadPixels acceleration | Marek Olšák | 2016-06-08 | 1 | -3/+6
|   disabled by 4f1cccf570112f93265a4cace504eb763fa8f73e
|
|   Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: allow MSAA resolving into a texture that has DCC enabled | Marek Olšák | 2016-06-08 | 2 | -4/+23
|   Since DCC is enabled almost everywhere now, it's important not to
|   disable this fast path.
|
|   Reviewed-by: Nicolai Hähnle <[email protected]>
|   Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: allow direct hw MSAA resolve for scanout surfaces | Marek Olšák | 2016-06-08 | 1 | -1/+0
|   No idea why this was disabled, but it works fine.
|
|   Reviewed-by: Nicolai Hähnle <[email protected]>
|   Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: don't allocate DCC for the temporary MSAA resolve surface | Marek Olšák | 2016-06-08 | 1 | -1/+2
|   Allocating it has no effect, but it adds overhead (useless DCC clear).
|
|   Reviewed-by: Nicolai Hähnle <[email protected]>
|   Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: don't enable DCC in the sampler if first_level doesn't have it | Marek Olšák | 2016-06-08 | 3 | -7/+21
|   If first_level > 0 and DCC is disabled for that level, let's skip
|   DCC reads entirely.
|
|   Reviewed-by: Nicolai Hähnle <[email protected]>
|   Reviewed-by: Bas Nieuwenhuizen <[email protected]>