aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/radeonsi
Commit message (Collapse)AuthorAgeFilesLines
* radeonsi: remove si_resource_create_customMarek Olšák2016-10-265-20/+11
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: stop using PIPE_BIND_CUSTOMMarek Olšák2016-10-265-12/+9
| | | | | | it has no effect whatsoever Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: don't do (fmask.size && cmask.size)Marek Olšák2016-10-261-1/+1
| | | | | | fmask implies that cmask is present too. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: rename bo_size -> surf_size, bo_alignment -> surf_alignmentMarek Olšák2016-10-261-1/+1
| | | | | | these names were misleading. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: remove unnecessary fields from radeon_surf_levelMarek Olšák2016-10-261-4/+4
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: stop using some input fields from radeon_surfaceMarek Olšák2016-10-261-2/+2
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: use r600_gfx_write_event_eop everywhereMarek Olšák2016-10-261-9/+3
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: make r600_gfx_write_fence more genericMarek Olšák2016-10-261-1/+2
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: enable SDMA on Carrizo and all CIK chips againMarek Olšák2016-10-261-10/+0
| | | | | | | | SDMA might be fixed by: "winsys/amdgpu: fix radeon_surf::macro_tile_index for imported textures" Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: add PIPE_CAP_STREAM_OUTPUT_INTERLEAVE_BUFFERSIlia Mirkin2016-10-221-0/+1
| | | | | | | | | | | | | | This allows the driver to signal that it can't handle random interleaving of attributes across buffers. This is required for ARB_transform_feedback3, and it's initialized to whatever the previous value of PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME was except for nv50 where it is disabled. Note that the proprietary drivers never expose ARB_transform_feedback3 on any GT21x's (where nouveau previously did), and after some effort I was unable to get it to work. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix a regression in si_eliminate_const_outputNicolai Hähnle2016-10-211-4/+3
| | | | | | | | | | A constant value of float type is not necessarily a ConstantFP: it could also be a constant expression that for some reason hasn't been folded. This fixes a regression in GL45-CTS.arrays_of_arrays_gl.InteractionFunctionCalls2 that was introduced by commit 3ec9975555d1cc5365413ad9062f412904f944a3. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix build of si_eliminate_const_vs_outputs on LLVM <= 3.8Marek Olšák2016-10-201-3/+2
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix 64-bit loads from LDSNicolai Hähnle2016-10-201-1/+1
| | | | | | | | | Fixes spec/arb_tessellation_shader/execution/dvec[23]-vs-tcs-tes, among others. Cc: "12.0 13.0" <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: eliminate trivial constant VS outputsMarek Olšák2016-10-193-2/+186
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These constant value VS PARAM exports: - 0,0,0,0 - 0,0,0,1 - 1,1,1,0 - 1,1,1,1 can be loaded into PS inputs using the DEFAULT_VAL field, and the VS exports can be removed from the IR to save export & parameter memory. After LLVM optimizations, analyze the IR to see which exports are equal to the ones listed above (or undef) and remove them if they are. Targeted use cases: - All DX9 eON ports always clear 10 VS outputs to 0.0 even if most of them are unused by PS (such as Witcher 2 below). - VS output arrays with unused elements that the GLSL compiler can't eliminate (such as Batman below). The shader-db deltas are quite interesting: (not from upstream si-report.py, it won't be upstreamed) PERCENTAGE DELTAS Shaders PARAM exports (affected only) batman_arkham_origins 589 -67.17 % bioshock-infinite 1769 -0.47 % dirt-showdown 548 -2.68 % dota2 1747 -3.36 % f1-2015 776 -4.94 % left_4_dead_2 1762 -0.07 % metro_2033_redux 2670 -0.43 % portal 474 -0.22 % talos_principle 324 -3.63 % warsow 176 -2.20 % witcher2 1040 -73.78 % ---------------------------------------- All affected 991 -65.37 % ... 9681 -> 3353 ---------------------------------------- Total 26725 -10.82 % ... 58490 -> 52162 v2: treat Undef as both 0 and 1 Reviewed-by: Nicolai Hähnle <[email protected]> (v1) Tested-by: Edmondo Tommasina <[email protected]> (v1)
* radeonsi: remove cb0_is_integer handlingMarek Olšák2016-10-193-13/+3
| | | | | | st/mesa does this for us. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: rename prefixes from radeon to siMarek Olšák2016-10-184-157/+157
| | | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Acked-by: Edward O'Callaghan <[email protected]>
* radeonsi: merge radeon_llvm_context and si_shader_contextMarek Olšák2016-10-184-317/+290
| | | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Acked-by: Edward O'Callaghan <[email protected]>
* radeonsi: import all TGSI->LLVM code from gallium/radeonMarek Olšák2016-10-186-4/+1513
| | | | | | Acked-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Acked-by: Edward O'Callaghan <[email protected]>
* radeonsi: move LLVM ALU codegen into radeonsiMarek Olšák2016-10-184-6/+1054
| | | | | | Acked-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Acked-by: Edward O'Callaghan <[email protected]>
* radeonsi: unify the constant load pathsNicolai Hähnle2016-10-171-28/+11
| | | | | | Remove the split between direct and indirect. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix indirect loads of 64 bit constantsNicolai Hähnle2016-10-171-2/+2
| | | | | | | This fixes GL45-CTS.compute_shader.fp64-case3. Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: shorten "shader->selector" to "sel" in si_shader_createMarek Olšák2016-10-171-7/+8
| | | | | Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: clear DB_RENDER_OVERRIDEMarek Olšák2016-10-171-3/+1
| | | | | | | Vulkan doesn't set these fields even though it doesn't use HiS. HiS is disabled by programming DB_SRESULTS_COMPARE_STATEn to 0. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: adjust and clean up Z_ORDER and EXEC_ON_x settingsMarek Olšák2016-10-132-22/+32
| | | | | | | The table was copied from the Vulkan driver. The comment lines are as long as the table for cosmetic reasons. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: disable ReZMarek Olšák2016-10-131-7/+4
| | | | | | | | | This is a serious performance fix. Discovered by luck. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94354 Cc: 12.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: implement TC-compatible HTILEMarek Olšák2016-10-135-10/+68
| | | | | | | | | | | | | | | | | | | | | so that decompress blits aren't needed and depth texturing needs less memory bandwidth. Z16 and Z24 are promoted to Z32_FLOAT by the driver, because TC-compatible HTILE only supports Z32_FLOAT. This doubles memory footprint for Z16. The format promotion is not visible to state trackers. This is part of TC-compatible renderbuffer compression, which has 3 parts: DCC, HTILE, FMASK. Only TC-compatible FMASK compression is missing now. I don't see a measurable increase in performance though. (I tested Talos Principle and DiRT: Showdown, the latter is improved by 0.5%, which is almost noise, and it originally used layered Z16, so at least we know that Z16 promoted to Z32F isn't slower now) Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix regression in image atomicsNicolai Hähnle2016-10-131-1/+1
| | | | Caused by a bad rebase when pushing commit 76a940893.
* radeonsi: fix the coordinate overloading of llvm.amdgcn.image.atomic.cmpswap.*Nicolai Hähnle2016-10-131-2/+7
| | | | | | | Fixes GL45-CTS.shader_image_load_store.basic-allTargets-atomic* Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: enable ARB_enhanced_layouts and turn the cap onNicolai Hähnle2016-10-121-1/+1
| | | | | | | v2: mark llvmpipe & softpipe properly as well (Jason Wood) Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* gallium: add PIPE_CAP_TGSI_ARRAY_COMPONENTSNicolai Hähnle2016-10-121-0/+1
| | | | | | | | This is a screen cap because drivers are expected to support it either for all shader types or for none of them. Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: Use the new image load/store intrinsic signaturesTom Stellard2016-10-121-14/+45
| | | | | | This patch requires LLVM r284024 or newer. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: Add function for converting LLVM type to intrinsic stringTom Stellard2016-10-121-10/+32
| | | | | | The existing function only worked for integer types. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: Refactor image store/load intrinsic name creationTom Stellard2016-10-121-11/+18
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix R600_DEBUG=precompile for shader-dbMarek Olšák2016-10-121-0/+6
| | | | | | | radeonsi no longer supports pixel shaders without interpolation optimizations, which led to assertion failures in si_shader_ps when running shader-db. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use TC write-back instead of full cache invalidationMarek Olšák2016-10-123-13/+7
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: implement TC L2 write-back (flush) without cache invalidationMarek Olšák2016-10-122-28/+74
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't invalidate VMEM L1 for memory barriers for index buffersMarek Olšák2016-10-121-3/+4
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: emit TA_CS_BC_BASE_ADDR on SI only if the kernel allows itMarek Olšák2016-10-111-1/+6
| | | | | | | Reviewed-by: Edmondo Tommasina <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: make more use of si_have_tgsi_computeNicolai Hähnle2016-10-101-3/+1
| | | | | Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: support ARB_compute_variable_group_sizeNicolai Hähnle2016-10-103-16/+42
| | | | | | | | Not sure if it's possible to avoid programming the block size twice (once for the userdata and once for the dispatch). Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix texture border colors for compute shadersMarek Olšák2016-10-051-0/+12
| | | | | | | | There are VM faults without this. Cc: 12.0 <[email protected]> Acked-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix interpolateAt opcodes for .zw componentsMarek Olšák2016-10-051-1/+1
| | | | | | | | | | Not returning garbage in .zw seems pretty important. This fixes: GL45-CTS.shader_multisample_interpolation.render.interpolate_at_*_check.* Cc: 11.2 12.0 <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add assertions to validate interpolation flagsMarek Olšák2016-10-051-0/+34
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: interpolate colors after interpolation weight shufflingMarek Olšák2016-10-051-48/+48
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: implement set_device_reset_callbackNicolai Hähnle2016-10-051-0/+3
| | | | | | | | | Check for device reset on flush. It would be nicer if the kernel just reported this as an error on the submit ioctl (and similarly for fences), but this will do for now. Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: optionally run the LLVM IR verifier passNicolai Hähnle2016-10-041-7/+21
| | | | | | | | This is enabled automatically if shader printing is enabled, or separately by R600_DEBUG=checkir. Catch mal-formed IR before it crashes in a later pass. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: don't declare LDS in PS when ds_bpermute is usedMarek Olšák2016-10-043-4/+7
| | | | | | | | I guess this is not needed because dead code elimination removes the declaration. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: use DDX/DDY directly in si_llvm_emit_ddxy_interpMarek Olšák2016-10-041-49/+7
| | | | | | | We can finally do this, because the opcodes are scalar now. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: simplify si_llvm_emit_ddxyMarek Olšák2016-10-041-51/+29
| | | | | | | | si_llvm_emit_ddxy is called once per element, so we don't have to generate code for 4 elements at once. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: don't call build_gep0 in si_llvm_emit_ddxy on VIMarek Olšák2016-10-041-5/+9
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>