path: root/src/amd
Commit message | Author | Age | Files | Lines
* turnip: Deconflict vk_format_table regeneration | Bas Nieuwenhuizen | 2019-03-16 | 1 | -3/+3
  Avoids
    src/freedreno/vulkan/meson.build:42:0: ERROR: Tried to create target "vk_format_table.c", but a target of that name already exists.
  when building both radv and turnip.

  Fixes: 26380b3a9f8 "turnip: Add driver skeleton (v2)"
  Reviewed-by: Eric Engestrom <[email protected]>
  Reviewed-by: Kristian H. Kristensen <[email protected]>
* radv: always load 3 channels for formats that need to be shuffled | Samuel Pitoiset | 2019-03-15 | 1 | -9/+14
  This fixes a rendering issue with Hellblade and DXVK.

  Fixes: a66b186bebf ("radv: use typed buffer loads for vertex input fetches")
  Reported-by: Philip Rebohle <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: always initialize HTILE when the src layout is UNDEFINED | Samuel Pitoiset | 2019-03-14 | 1 | -2/+1
  HTILE should always be initialized when transitioning from VK_IMAGE_LAYOUT_UNDEFINED to other image layouts. Otherwise, if an app does a transition from UNDEFINED to GENERAL, the driver doesn't initialize HTILE and it tries to decompress the depth surface. For some reason, this results in VM faults.

  Cc: [email protected]
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107563
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: use the raw tbuffer version for 16-bit SSBO loads | Samuel Pitoiset | 2019-03-13 | 3 | -6/+3
  vindex is always 0.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add ac_build_{struct,raw}_tbuffer_load() helpers | Samuel Pitoiset | 2019-03-13 | 3 | -23/+75
  The struct version sets IDXEN=1, while the raw version sets IDXEN=0.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: use typed buffer loads for vertex input fetches | Samuel Pitoiset | 2019-03-13 | 4 | -53/+57
  This drastically reduces the number of SGPRs because the driver now uses descriptors per vertex binding, instead of per vertex attribute format.

  29077 shaders in 15096 tests
  Totals:
  SGPRS: 1354285 -> 1282109 (-5.33 %)
  VGPRS: 909896 -> 908800 (-0.12 %)
  Spilled SGPRs: 24840 -> 24811 (-0.12 %)
  Code Size: 49221144 -> 48986628 (-0.48 %) bytes
  Max Waves: 243930 -> 244229 (0.12 %)

  Totals from affected shaders:
  SGPRS: 390648 -> 318472 (-18.48 %)
  VGPRS: 288432 -> 287336 (-0.38 %)
  Spilled SGPRs: 94 -> 65 (-30.85 %)
  Code Size: 11548412 -> 11313896 (-2.03 %) bytes
  Max Waves: 86460 -> 86759 (0.35 %)

  This gives a really tiny boost.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
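The percentages in the shader statistics above are plain relative changes, (after - before) / before. As a minimal sketch, the SGPR figures can be re-derived as follows; the helper name `percent_change` is hypothetical and is not part of any Mesa or shader-db tooling.

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Figures copied from the commit message above.
total_sgprs = percent_change(1354285, 1282109)
affected_sgprs = percent_change(390648, 318472)

print(f"SGPRS (all shaders):      {total_sgprs:.2f} %")
print(f"SGPRS (affected shaders): {affected_sgprs:.2f} %")
```

The absolute saving is the same in both rows (72176 SGPRs); only the denominator changes, which is why the affected-shader percentage is so much larger.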
* radv: store more vertex attribute infos as pipeline keys | Samuel Pitoiset | 2019-03-13 | 3 | -0/+37
  They are required for using typed buffer loads.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: rework typed buffers loads for LLVM 7 | Samuel Pitoiset | 2019-03-13 | 3 | -57/+83
  Be more generic, this will be used by an upcoming series.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: set the maximum number of IBs per submit to 192 | Samuel Pitoiset | 2019-03-12 | 2 | -1/+8
  This fixes random SteamVR corruption, see https://github.com/ValveSoftware/SteamVR-for-Linux/issues/181

  Fixes: 4d30f2c6f42 ("radv/winsys: remove the max IBs per submit limit for the fallback path")
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: fix 16-bit ssbo stores | Rhys Perry | 2019-03-12 | 1 | -0/+2
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: fix pointSizeRange limits | Samuel Pitoiset | 2019-03-12 | 1 | -1/+1
  The values should match the ones that are emitted.

  This fixes new CTS dEQP-VK.rasterization.primitive_size.points.*.

  Fixes: f4e499ec791 ("radv: add initial non-conformant radv vulkan driver")
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir/xfb: adding varyings on nir_xfb_info and gather_info | Alejandro Piñeiro | 2019-03-08 | 1 | -1/+1
  In order to be used for OpenGL (right now for ARB_gl_spirv).

  This commit adds two new structures:

  * nir_xfb_varying_info: that identifies each individual varying. For each one, we need to know the type, buffer and xfb_offset
  * nir_xfb_buffer_info: as now for each buffer, in addition to the stride, we need to know how many varyings are assigned to it.

  For this patch, the only case where num_outputs != num_varyings is doubles, which for dvec3/4 could require more than one output. There are more cases (like aoa) that will be handled in following patches.

  v2: updated after new nir general XFB support introduced for "anv: Add support for VK_EXT_transform_feedback"
  v3: compute num_varyings beforehand for allocating, instead of relying on num_outputs as approximate value (Timothy Arceri)

  Reviewed-by: Timothy Arceri <[email protected]>
* Revert "radv: execute external subpass barriers after ending subpasses" | Samuel Pitoiset | 2019-03-08 | 1 | -2/+2
  This change is actually wrong because we have to sync before doing image layout transitions.

  This fixes rendering issues in Batman, Path of Exile and probably more titles.

  This reverts commit 76c17cfd8da017ebd19be33ba6cef888957a6758.

  Fixes: 76c17cfd8da ("radv: execute external subpass barriers after ending subpasses")
  Cc: 19.0 <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: enable lower_mul_2x32_64 | Samuel Pitoiset | 2019-03-06 | 1 | -0/+1
  Fixes: 58bcebd987b ("spirv: Allow [i/u]mulExtended to use new nir opcode")
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: set num_components on vulkan_resource_index intrinsic | Lionel Landwerlin | 2019-03-06 | 3 | -10/+20
  In 61e009d2c4e4df we changed the number of components in the vulkan_resource_index intrinsic and forgot to update Radv's code for it.

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Fixes: 61e009d2c4e4df ("spirv: Use the same types for resource indices as pointers")
  Reviewed-by: Samuel Pitoiset [email protected]
* nir: rename glsl_type_is_struct() -> glsl_type_is_struct_or_ifc() | Timothy Arceri | 2019-03-06 | 2 | -2/+2
  Replace done using:
    find ./src -type f -exec sed -i -- \
    's/glsl_type_is_struct(/glsl_type_is_struct_or_ifc(/g' {} \;

  Acked-by: Karol Herbst <[email protected]>
  Acked-by: Jason Ekstrand <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
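The substitution in the commit message above matches the function name together with its opening parenthesis, so longer identifiers that merely begin with the old name are left alone. A minimal sketch of the same literal replacement on a single string follows; `rename_calls` and the sample line are hypothetical illustrations, not code from the Mesa tree.

```python
OLD = "glsl_type_is_struct("
NEW = "glsl_type_is_struct_or_ifc("


def rename_calls(source: str) -> str:
    """Literal, global replacement, like sed's s/OLD/NEW/g on this pattern."""
    return source.replace(OLD, NEW)


# Hypothetical input line: one old-style call and one already-renamed call.
sample = "if (glsl_type_is_struct(a) || glsl_type_is_struct_or_ifc(b))"
renamed = rename_calls(sample)
print(renamed)
```

Because the already-renamed call is followed by `_or_ifc(` rather than `(`, it does not match the pattern, so running the replacement a second time changes nothing.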
* glsl: rename record_location_offset() -> struct_location_offset() | Timothy Arceri | 2019-03-06 | 1 | -2/+2
  Replace done using:
    find ./src -type f -exec sed -i -- \
    's/record_location_offset(/struct_location_offset(/g' {} \;

  Acked-by: Karol Herbst <[email protected]>
  Acked-by: Jason Ekstrand <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* radv: properly align the fence and EOP bug VA on GFX9 | Samuel Pitoiset | 2019-03-05 | 1 | -2/+5
  If alignment is 0, offsets returned by radv_cmd_buffer_upload_alloc() are always 0. These two virtual addresses were pointing at the same location.

  Cc: 18.3 19.0 <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
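To see why an alignment of 0 collapses every offset to 0, here is a minimal sketch. It assumes the upload allocator rounds offsets with the common power-of-two formula `(value + alignment - 1) & ~(alignment - 1)` on 32-bit unsigned values. The helper `align_u32` is a hypothetical stand-in, not the radv function itself.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit unsigned wrap-around


def align_u32(value: int, alignment: int) -> int:
    """Round `value` up to a multiple of `alignment` (a power of two)."""
    rounded = (value + alignment - 1) & MASK32
    mask = ~(alignment - 1) & MASK32
    return rounded & mask


offsets = (0, 4, 12, 100)

# With alignment 0 the mask ~(0 - 1) is 0, so every offset becomes 0.
zero_aligned = [align_u32(v, 0) for v in offsets]

# With a real alignment, distinct allocations keep distinct offsets.
eight_aligned = [align_u32(v, 8) for v in offsets]

print(zero_aligned, eight_aligned)
```

Under this assumption, two allocations made with alignment 0 both land at offset 0, which matches the symptom described above: two virtual addresses pointing at the same location.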
* radv: allocate enough space in cmdbuf when starting a subpass | Samuel Pitoiset | 2019-03-05 | 1 | -1/+1
  This fixes some CTS crashes with:
  dEQP-VK.renderpass2.suballocation.attachment_write_mask.attachment_count_8.start_index_*

  Ideally, we should check cmd_buffer->cs->max_dw because there is likely enough space (the internal clear draws allocate space), but keep it that way for consistency.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: use the platform defines in vk.xml instead of hard-coding them | Eric Engestrom | 2019-03-05 | 1 | -4/+7
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* radv: use 32_AR instead of 32_ABGR when alpha coverage is required | Samuel Pitoiset | 2019-03-04 | 1 | -1/+1
  This export format is faster. Seems to improve performance in Wreckfest.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Interpolate less aggressively. | Bas Nieuwenhuizen | 2019-02-26 | 1 | -9/+12
  Seems like dxvk used integer builtins without setting the flat interpolation decoration. I believe in the current spec the app is required to set these, but in the meantime to avoid breaking things in stable releases (and so close to release for 19.0), only expand the interpolation to float16 and struct (which cannot be builtins as our spirv parser lowers the builtin block).

  Fixes: f3247841040 "radv: Allow interpolation on non-float types."
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: don't copy buffer descriptors list for samplers | Samuel Pitoiset | 2019-02-26 | 1 | -1/+5
  Sampler descriptors don't have a buffer list.

  This fixes some crashes with new CTS dEQP-VK.binding_model.descriptor_copy.*.sampler_*.

  Cc: 18.3 19.0 <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix out-of-bounds access when copying descriptors BO list | Samuel Pitoiset | 2019-02-26 | 1 | -2/+0
  We shouldn't increment the buffer list pointers twice.

  This fixes some crashes with new CTS dEQP-VK.binding_model.descriptor_copy.*.

  Cc: 18.3 19.0 <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix clearing attachments in secondary command buffers | Samuel Pitoiset | 2019-02-25 | 1 | -10/+43
  If no framebuffer is bound, get the number of samples and the image format from the render pass.

  This fixes new CTS dEQP-VK.geometry.layered.*.secondary_cmd_buffer.

  Cc: 18.3 19.0 <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Allow interpolation on non-float types. | Bas Nieuwenhuizen | 2019-02-22 | 1 | -10/+9
  In particular structs containing floats and 16-bit floating point types.

  Fixes: 62024fa7750 "radv: enable VK_KHR_16bit_storage extension / 16bit storage features"
  Fixes: da295946361 "spirv: Only split blocks"
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109735
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Fix float16 interpolation set up. | Bas Nieuwenhuizen | 2019-02-22 | 6 | -16/+94
  float16 types can have non-flat interpolation so set up the HW correctly for that.

  Fixes: 62024fa7750 "radv: enable VK_KHR_16bit_storage extension / 16bit storage features"
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Disable depth clamping even without EXT_depth_range_unrestricted. | Bas Nieuwenhuizen | 2019-02-20 | 1 | -2/+1
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Implement VK_EXT_depth_clip_enable. | Bas Nieuwenhuizen | 2019-02-20 | 3 | -2/+16
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Handle clip+cull distances more generally as compact arrays. | Bas Nieuwenhuizen | 2019-02-20 | 4 | -99/+83
  Needed for https://gitlab.freedesktop.org/mesa/mesa/merge_requests/248 .

  That MR keeps the clip and cull arrays split. So we have to handle
  - compact arrays with location_frac != 0
  - VARYING_SLOT_CLIP_DIST1

  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Clean up a bunch of compiler warnings. | Bas Nieuwenhuizen | 2019-02-20 | 3 | -7/+0
  Random unused vars.

  Reviewed-by: Timothy Arceri <[email protected]>
* radv: Sync ETC2 whitelisted devices. | Bas Nieuwenhuizen | 2019-02-20 | 3 | -5/+11
  Fixes: 4bb6c49375e "radv: Allow ETC2 on RAVEN and VEGA10 instead of all GFX9."
  Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: Go back to using llvm.pow intrinsic for nir_op_fpow | Kenneth Graunke | 2019-02-19 | 1 | -0/+4
  ARB_vertex_program and ARB_fragment_program define 0^0 = 1 (while GLSL leaves it undefined). Performing fpow lowering in NIR would break this behavior, preventing us from using prog_to_nir.

  According to llvm/lib/Target/AMDGPU/SIInstructions.td, POW_common expands to <V_LOG_F32_e32, V_EXP_F32_e32, V_MUL_LEGACY_F32_e32>, which presumably does a zero-wins multiply.

  Lowering in NIR results in a non-legacy multiply, where:

    pow(0, 0) = 2^(log2(0) * 0) = 2^(-INF * 0) = 2^(-NaN) = -NaN

  which isn't the desired result.

  This reverts:
  - commit d6b75392067712908bdc372f1007e085439bf9f5 (ac/nir: remove emission of nir_op_fpow)
  - commit 22430224fec31591432d4a3e65c6f457ba1c1653 (radeonsi/nir: enable lowering of fpow)

  and prevents a regression in gl-1.0-spot-light with AMD_DEBUG=nir after enabling prog_to_nir in st/mesa later in this series.

  Reviewed-by: Timothy Arceri <[email protected]>
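The difference between the two multiplies can be reproduced on the CPU. This is a minimal sketch, assuming the legacy multiply behaves as the "zero-wins" multiply the commit message presumes (any zero operand forces a zero result). The function names are hypothetical and only model the arithmetic; they are not Mesa or LLVM APIs.

```python
import math


def mul_ieee(a: float, b: float) -> float:
    """Ordinary IEEE multiply: -inf * 0 is NaN."""
    return a * b


def mul_legacy(a: float, b: float) -> float:
    """Assumed zero-wins multiply: a zero operand always yields 0."""
    if a == 0.0 or b == 0.0:
        return 0.0
    return a * b


def pow_via_log_exp(x: float, y: float, mul) -> float:
    """pow(x, y) computed as 2^(log2(x) * y), for x >= 0 only."""
    log2_x = -math.inf if x == 0.0 else math.log2(x)
    return 2.0 ** mul(log2_x, y)


ieee_result = pow_via_log_exp(0.0, 0.0, mul_ieee)
legacy_result = pow_via_log_exp(0.0, 0.0, mul_legacy)

print("IEEE multiply:  ", ieee_result)
print("legacy multiply:", legacy_result)
```

With the ordinary multiply the exponent becomes NaN and so does the result; with the zero-wins multiply the exponent is 0 and the result is 1, which is the value ARB_vertex_program and ARB_fragment_program require for 0^0.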
* ac/nir: implement half-float nir_op_ldexp | Rhys Perry | 2019-02-19 | 1 | -1/+3
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: implement half-float nir_op_frsq | Rhys Perry | 2019-02-19 | 1 | -2/+1
  v2: don't use ac_get_onef()

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: implement half-float nir_op_frcp | Rhys Perry | 2019-02-19 | 1 | -2/+1
  v2: don't use ac_get_onef()

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: make ac_build_fdiv support 16-bit floats | Rhys Perry | 2019-02-19 | 1 | -1/+1
  v2: don't use ac_get_onef()

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: make ac_build_isign work on all bit sizes | Rhys Perry | 2019-02-19 | 1 | -23/+4
  v2: don't use ac_get_zero(), ac_get_one() and ac_int_of_size()

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: make ac_build_clamp work on all bit sizes | Rhys Perry | 2019-02-19 | 1 | -4/+9
  v2: don't use ac_get_zerof() and ac_get_onef()
  v3: rename "intr" to "name"

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: fix 64-bit nir_op_f2f16_rtz | Rhys Perry | 2019-02-19 | 1 | -0/+2
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: implement 8-bit nir_load_const_instr | Rhys Perry | 2019-02-19 | 1 | -0/+4
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: ensure export arguments are always float | Rhys Perry | 2019-02-19 | 1 | -5/+1
  So that the signature is correct and consistent, the inputs to an export intrinsic should always be 32-bit floats.

  This and the previous commit fix a large number of crashes from dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.input_output_int_* tests

  Fixes: b722b29f10d ('radv: add support for 16bit input/output')
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: bitcast 16-bit outputs to integers | Rhys Perry | 2019-02-19 | 1 | -2/+2
  16-bit outputs are stored as 16-bit floats in the outputs array, so they have to be bitcast.

  Fixes: b722b29f10d ('radv: add support for 16bit input/output')
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
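A bitcast reinterprets the stored bits without changing them, which is different from a numeric conversion. The following minimal sketch illustrates the distinction for a 16-bit float on the CPU; `f16_bits` is a hypothetical helper and does not reflect how the driver emits the cast in LLVM IR.

```python
import struct


def f16_bits(value: float) -> int:
    """Bitcast: reinterpret the IEEE half-float encoding as a 16-bit integer."""
    return struct.unpack("<H", struct.pack("<e", value))[0]


bitcast_one = f16_bits(1.0)   # raw encoding of half-float 1.0
converted_one = int(1.0)      # numeric conversion, for comparison

print(hex(bitcast_one), converted_one)
```

The bitcast of 1.0 yields the raw encoding 0x3C00, while the numeric conversion yields the integer 1; only the former preserves the bit pattern held in the outputs array.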
* radv: fix writing the alpha channel of MRT0 when alpha coverage is enabled | Samuel Pitoiset | 2019-02-18 | 1 | -7/+8
  This version is better and safer.

  Cc: 18.3 19.0 <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove unused variable in gather_push_constant_info() | Samuel Pitoiset | 2019-02-18 | 1 | -1/+0
  Trivial.

  Signed-off-by: Samuel Pitoiset <[email protected]>
* radv: write the alpha channel of MRT0 when alpha coverage is enabled | Samuel Pitoiset | 2019-02-18 | 1 | -0/+8
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109597
  Cc: 18.3 19.0 <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: use new LLVM 8 intrinsic when loading 16-bit values | Samuel Pitoiset | 2019-02-18 | 1 | -14/+27
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add ac_build_llvm8_tbuffer_load() helper | Samuel Pitoiset | 2019-02-18 | 2 | -0/+52
  It uses the new LLVM intrinsics.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix invalid element type when filling vertex input default values | Samuel Pitoiset | 2019-02-16 | 1 | -1/+3
  The elements added into a vector should have the same type as the first one, otherwise this hits an assertion in LLVM.

  Fixes: 4b3549c0846 ("radv: reduce the number of loaded channels for vertex input fetches")
  Reported-by: Philip Rebohle <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Use correct num formats to detect whether we should use 1.0 or 1. | Bas Nieuwenhuizen | 2019-02-15 | 1 | -1/+2
  Normalized and scaled formats also return floats.

  Fixes: 4b3549c0846 ("radv: reduce the number of loaded channels for vertex input fetches")
  Reviewed-by: Samuel Pitoiset <[email protected]>