summaryrefslogtreecommitdiffstats
path: root/src/amd
Commit message (Collapse)AuthorAgeFilesLines
* anv/radv: release memory allocated by glsl types during spirv_to_nirTapani Pälli2019-03-211-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | Fixes leaks for each glsl_type generated: ==32470== 384 bytes in 3 blocks are possibly lost in loss record 18 of 18 ==32470== at 0x483880B: malloc (vg_replace_malloc.c:309) ==32470== by 0x4C43F4A: ralloc_size (ralloc.c:119) ==32470== by 0x4C44014: rzalloc_size (ralloc.c:151) ==32470== by 0x4C44258: rzalloc_array_size (ralloc.c:215) ==32470== by 0x4D38957: glsl_type::glsl_type(glsl_struct_field const*, unsigned int, char const*) (glsl_types.cpp:114) ==32470== by 0x4D3BEED: glsl_type::get_struct_instance(glsl_struct_field const*, unsigned int, char const*) (glsl_types.cpp:1146) ==32470== by 0x4D42ECC: glsl_struct_type (nir_types.cpp:501) ==32470== by 0x4CDB5A1: vtn_handle_type (spirv_to_nir.c:1269) ==32470== by 0x4CE53DD: vtn_handle_variable_or_type_instruction (spirv_to_nir.c:4018) ==32470== by 0x4CD8CFF: vtn_foreach_instruction (spirv_to_nir.c:365) ==32470== by 0x4CE5E6B: spirv_to_nir (spirv_to_nir.c:4490) ==32470== by 0x497AF10: anv_shader_compile_to_nir (anv_pipeline.c:173) v2: move release call to vkDestroyInstance v3: apply fix also to radv driver Signed-off-by: Tapani Pälli <[email protected]> Cc: [email protected] Reviewed-by: Jason Ekstrand <[email protected]>
* anv,radv,turnip: Lower TG4 offsets with nir_lower_texJason Ekstrand2019-03-211-0/+1
| | | | | | v2: turn on for turnip as well (Karol Herbst) Reviewed-by: Karol Herbst <[email protected]>
* radv: Implement VK_EXT_pipeline_creation_feedback.Bas Nieuwenhuizen2019-03-205-9/+107
| | | | | | | | | Does what it says on the tin. The per stage time is only an approximation due to linking and the Vega merged stages. Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: use new LLVM 8 intrinsics in ac_build_buffer_store_dword()Samuel Pitoiset2019-03-201-40/+26
| | | | | | | New buffer intrinsics have a separate soffset parameter. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: use new LLVM 8 intrinsic when storing 16-bit valuesSamuel Pitoiset2019-03-203-21/+33
| | | | | | | vindex is always 0. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add ac_build_{struct,raw}_tbuffer_store() helpersSamuel Pitoiset2019-03-202-0/+156
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: use new LLVM 8 intrinsics in ac_build_buffer_load()Samuel Pitoiset2019-03-201-0/+8
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: use ac_build_buffer_store_dword() for SSBO store operationsSamuel Pitoiset2019-03-201-14/+9
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: use ac_build_buffer_load() for SSBO load operationsSamuel Pitoiset2019-03-201-29/+6
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: use new LLVM 8 intrinsics for SSBO atomic operationsSamuel Pitoiset2019-03-201-24/+42
| | | | | | | Use the raw version (ie. IDXEN=0) because vindex is unused. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: remove one useless check in visit_store_ssbo()Samuel Pitoiset2019-03-201-6/+3
| | | | | | | Trivial. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add ac_build_buffer_store_format() helperSamuel Pitoiset2019-03-203-21/+119
| | | | | | | Similar to ac_build_buffer_load_format(). Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: set attrib flags for SSBO and image store operationsSamuel Pitoiset2019-03-201-3/+6
| | | | | | | For consistency regarding other store operations. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: make use of ac_get_store_intr_attribs() where possibleSamuel Pitoiset2019-03-201-6/+2
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix binding transform feedback buffersSamuel Pitoiset2019-03-201-1/+1
| | | | | | | | | | | | The mask should be accumulated if two calls are used for binding two buffers at different indexes. Otherwise, the driver only accounts for the last one. Noticed while glancing at this code. Cc: 18.3 19.0 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: use llvm.amdgcn.fract intrinsic for nir_op_ffractSamuel Pitoiset2019-03-201-5/+4
| | | | | | | | | | | | | | | | | | | | | | Noticed with a Doom shader. 29077 shaders in 15096 tests Totals: SGPRS: 1282125 -> 1282133 (0.00 %) VGPRS: 908716 -> 908616 (-0.01 %) Spilled SGPRs: 24811 -> 24779 (-0.13 %) Code Size: 49048176 -> 48936488 (-0.23 %) bytes Max Waves: 244232 -> 244226 (-0.00 %) Totals from affected shaders: SGPRS: 229584 -> 229592 (0.00 %) VGPRS: 163268 -> 163168 (-0.06 %) Spilled SGPRs: 8682 -> 8650 (-0.37 %) Code Size: 12819572 -> 12707884 (-0.87 %) bytes Max Waves: 24398 -> 24392 (-0.02 %) Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Use correct image view comparison for fast clears.Bas Nieuwenhuizen2019-03-191-1/+1
| | | | | | | | The if is actually returning true on success, enabling fast clears, so we need to have the test succeed when the iview dimensions are right. Fixes: d5400a5ec2a "radv: provide a helper for comparing an image extents." Reviewed-by: Dave Airlie <[email protected]>
* anv,radv: Implement VK_KHR_surface_capability_protectedJason Ekstrand2019-03-181-0/+1
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* radv: Implement VK_EXT_host_query_reset.Bas Nieuwenhuizen2019-03-183-0/+29
| | | | | Reviewed-by: Samuel Pitoiset <[email protected]> Acked-by: Eric Engestrom <[email protected]>
* radv: Allow fast clears with concurrent queue mask for some layouts.Bas Nieuwenhuizen2019-03-181-6/+5
| | | | | | | | | | | | | | For VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL and VK_IMAGE_LAYOUT_DEPTH_STENCIL_ATTACHMENT_OPTIMAL we do not care about the queue mask because 1) using these is only allowed on the gfx queue 2) transitions for these are only allowed on the gfx queue. This enables some fast clears for Doom that uses VK_SHARING_MODE_CONCURRENT. Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir_to_llvm: add assert to emit_bcsel()Timothy Arceri2019-03-181-0/+2
| | | | | | | nir to llvm assumes we have already split vectors to scalars via nir_lower_alu_to_scalar(). Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* turnip: Deconflict vk_format_table regenerationBas Nieuwenhuizen2019-03-161-3/+3
| | | | | | | | | | | | Avoids src/freedreno/vulkan/meson.build:42:0: ERROR: Tried to create target "vk_format_table.c", but a target of that name already exists. when building both radv and turnip. Fixes: 26380b3a9f8 "turnip: Add driver skeleton (v2)" Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* radv: always load 3 channels for formats that need to be shuffledSamuel Pitoiset2019-03-151-9/+14
| | | | | | | | | This fixes a rendering issue with Hellblade and DXVK. Fixes: a66b186bebf ("radv: use typed buffer loads for vertex input fetches") Reported-by: Philip Rebohle <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: always initialize HTILE when the src layout is UNDEFINEDSamuel Pitoiset2019-03-141-2/+1
| | | | | | | | | | | | | HTILE should always be initialized when transitioning from VK_IMAGE_LAYOUT_UNDEFINED to other image layouts. Otherwise, if an app does a transition from UNDEFINED to GENERAL, the driver doesn't initialize HTILE and it tries to decompress the depth surface. For some reasons, this results in VM faults. Cc: [email protected] Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107563 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: use the raw tbuffer version for 16-bit SSBO loadsSamuel Pitoiset2019-03-133-6/+3
| | | | | | | vindex is always 0. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add ac_build_{struct,raw}_tbuffer_load() helpersSamuel Pitoiset2019-03-133-23/+75
| | | | | | | The struct version sets IDXEN=1, while the raw version sets IDXEN=0. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: use typed buffer loads for vertex input fetchesSamuel Pitoiset2019-03-134-53/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | This drastically reduces the number of SGPRs because the driver now uses descriptors per vertex binding, instead of per vertex attribute format. 29077 shaders in 15096 tests Totals: SGPRS: 1354285 -> 1282109 (-5.33 %) VGPRS: 909896 -> 908800 (-0.12 %) Spilled SGPRs: 24840 -> 24811 (-0.12 %) Code Size: 49221144 -> 48986628 (-0.48 %) bytes Max Waves: 243930 -> 244229 (0.12 %) Totals from affected shaders: SGPRS: 390648 -> 318472 (-18.48 %) VGPRS: 288432 -> 287336 (-0.38 %) Spilled SGPRs: 94 -> 65 (-30.85 %) Code Size: 11548412 -> 11313896 (-2.03 %) bytes Max Waves: 86460 -> 86759 (0.35 %) This gives a really tiny boost. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: store more vertex attribute infos as pipeline keysSamuel Pitoiset2019-03-133-0/+37
| | | | | | | They are required for using typed buffer loads. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: rework typed buffers loads for LLVM 7Samuel Pitoiset2019-03-133-57/+83
| | | | | | | Be more generic, this will be used by an upcoming series. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: set the maximum number of IBs per submit to 192Samuel Pitoiset2019-03-122-1/+8
| | | | | | | | | This fixes random SteamVR corruption, see https://github.com/ValveSoftware/SteamVR-for-Linux/issues/181 Fixes: 4d30f2c6f42 ("radv/winsys: remove the max IBs per submit limit for the fallback path") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: fix 16-bit ssbo storesRhys Perry2019-03-121-0/+2
| | | | | Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: fix pointSizeRange limitsSamuel Pitoiset2019-03-121-1/+1
| | | | | | | | | | The values should match the ones that are emitted. This fixes new CTS dEQP-VK.rasterization.primitive_size.points.*. Fixes: f4e499ec791 ("radv: add initial non-conformant radv vulkan driver") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir/xfb: adding varyings on nir_xfb_info and gather_infoAlejandro Piñeiro2019-03-081-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | In order to be used for OpenGL (right now for ARB_gl_spirv). This commit adds two new structures: * nir_xfb_varying_info: that identifies each individual varying. For each one, we need to know the type, buffer and xfb_offset * nir_xfb_buffer_info: as now for each buffer, in addition to the stride, we need to know how many varyings are assigned to it. For this patch, the only case where num_outputs != num_varyings is with the case of doubles, that for dvec3/4 could require more than one output. There are more cases though (like aoa), that will be handled on following patches. v2: updated after new nir general XFB support introduced for "anv: Add support for VK_EXT_transform_feedback" v3: compute num_varyings beforehand for allocating, instead of relying on num_outputs as approximate value (Timothy Arceri) Reviewed-by: Timothy Arceri <[email protected]>
* Revert "radv: execute external subpass barriers after ending subpasses"Samuel Pitoiset2019-03-081-2/+2
| | | | | | | | | | | | | | | This changes is actually wrong because we have to sync before doing image layout transitions. This fixes rendering issues in Batman, Path of Exile and probably more titles. This reverts commit 76c17cfd8da017ebd19be33ba6cef888957a6758. Fixes: 76c17cfd8da ("radv: execute external subpass barriers after ending subpasses") Cc: 19.0 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: enable lower_mul_2x32_64Samuel Pitoiset2019-03-061-0/+1
| | | | | | Fixes: 58bcebd987b ("spirv: Allow [i/u]mulExtended to use new nir opcode") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: set num_components on vulkan_resource_index intrinsicLionel Landwerlin2019-03-063-10/+20
| | | | | | | | | | In 61e009d2c4e4df we changed the number of components in the vulkan_resource_index intrinsic and forgot the update Radv's code for it. Signed-off-by: Lionel Landwerlin <[email protected]> Fixes: 61e009d2c4e4df ("spirv: Use the same types for resource indices as pointers") Reviewed-by: Samuel Pitoiset [email protected]
* nir: rename glsl_type_is_struct() -> glsl_type_is_struct_or_ifc()Timothy Arceri2019-03-062-2/+2
| | | | | | | | | | Replace done using: find ./src -type f -exec sed -i -- \ 's/glsl_type_is_struct(/glsl_type_is_struct_or_ifc(/g' {} \; Acked-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* glsl: rename record_location_offset() -> struct_location_offset()Timothy Arceri2019-03-061-2/+2
| | | | | | | | | | Replace done using: find ./src -type f -exec sed -i -- \ 's/record_location_offset(/struct_location_offset(/g' {} \; Acked-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* radv: properly align the fence and EOP bug VA on GFX9Samuel Pitoiset2019-03-051-2/+5
| | | | | | | | | | If alignement is 0, offets returned by radv_cmd_buffer_upload_alloc() are always 0. These two virtual addresses were pointing at the same location. Cc: 18.3 19.0 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: allocate enough space in cmdbuf when starting a subpassSamuel Pitoiset2019-03-051-1/+1
| | | | | | | | | | | | This fixes some CTS crashes with: dEQP-VK.renderpass2.suballocation.attachment_write_mask.attachment_count_8.start_index_* Ideally, we should check cmd_buffer->cs->max_dw because there is likely enough space (the internal clear draws allocate space), but keep that way for consistency. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: use the platform defines in vk.xml instead of hard-coding themEric Engestrom2019-03-051-4/+7
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* rav: use 32_AR instead of 32_ABGR when alpha coverage is requiredSamuel Pitoiset2019-03-041-1/+1
| | | | | | | | This export format is faster. Seems to improve performance in Wreckfest. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Interpolate less aggressively.Bas Nieuwenhuizen2019-02-261-9/+12
| | | | | | | | | | | | | | Seems like dxvk used integer builtins without setting the flat interpolation decoration. I believe in the current spec the app is required to set these, but in the meantime to avoid breaking things in stable releases (and so close to release for 19.0), only expand the interpolation to float16 and struct (which cannot be builtins as our spirv parser lowers the builtin block). Fixes: f3247841040 "radv: Allow interpolation on non-float types." Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: don't copy buffer descriptors list for samplersSamuel Pitoiset2019-02-261-1/+5
| | | | | | | | | | | Sampler descriptors don't have a buffer list. This fixes some crashes with new CTS dEQP-VK.binding_model.descriptor_copy.*.sampler_*. Cc: 18.3 19.0 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix out-of-bounds access when copying descriptors BO listSamuel Pitoiset2019-02-261-2/+0
| | | | | | | | | | | We shouldn't increment the buffer list pointers twice. This fixes some crashes with new CTS dEQP-VK.binding_model.descriptor_copy.*. Cc: 18.3 19.0 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix clearing attachments in secondary command buffersSamuel Pitoiset2019-02-251-10/+43
| | | | | | | | | | | If no framebuffer is bound, get the number of samples and the image format from the render pass. This fixes new CTS dEQP-VK.geometry.layered.*.secondary_cmd_buffer. Cc: 18.3 19.0 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Allow interpolation on non-float types.Bas Nieuwenhuizen2019-02-221-10/+9
| | | | | | | | | | In particular structs containing floats and 16-bit floating point types. Fixes: 62024fa7750 "radv: enable VK_KHR_16bit_storage extension / 16bit storage features" Fixes: da295946361 "spirv: Only split blocks" Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109735 Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Fix float16 interpolation set up.Bas Nieuwenhuizen2019-02-226-16/+94
| | | | | | | | float16 types can have non-flat interpolation so set up the HW correctly for that. Fixes: 62024fa7750 "radv: enable VK_KHR_16bit_storage extension / 16bit storage features" Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Disable depth clamping even without EXT_depth_range_unrestricted.Bas Nieuwenhuizen2019-02-201-2/+1
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Implement VK_EXT_depth_clip_enable.Bas Nieuwenhuizen2019-02-203-2/+16
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>