path: root/src/amd
Commit message (Author, Age, Files, Lines)
* radv: enable TC-compat HTILE for 16-bit depth surfaces on GFX8 (Samuel Pitoiset, 2018-03-23, 2 files, -18/+24)
| The hardware only supports 32-bit depth surfaces, but we can enable
| TC-compat HTILE for 16-bit depth surfaces if no Z planes are
| compressed.
|
| The main benefit is to reduce the number of depth decompression
| passes. Also, we don't need to implement DB->CB copies which is fine.
|
| This improves Serious Sam 2017 by +4%. Talos and F12017 are also
| affected but I don't see a performance difference.
|
| This also improves the shadowmapping Vulkan demo by 10-15% (FPS is
| now similar to AMDVLK).
|
| No CTS regressions on Polaris10.
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add radv_calc_decompress_on_z_planes() helper (Samuel Pitoiset, 2018-03-23, 1 file, -14/+37)
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add radv_image_is_tc_compat_htile() helper (Samuel Pitoiset, 2018-03-23, 1 file, -11/+45)
| Instead of that huge conditional that's going to be crazy.
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: Rename image intrinsics to image_var (Jason Ekstrand, 2018-03-23, 5 files, -47/+47)
| Generated with
|
|     git grep -l nir_intrinsic_image | xargs \
|         sed -i 's/nir_intrinsic_image/nir_intrinsic_image_var/g'
|
| and some manual fixing in nir_intrinsics.h
|
| Reviewed-by: Timothy Arceri <[email protected]>
* radv: autotools: add radv_extensions.h in the generated VULKAN list (Juan A. Suarez Romero, 2018-03-22, 1 file, -1/+2)
| Reviewed-by: Emil Velikov <[email protected]>
* anv/radv: autotools: include vulkan_*.h headers (Juan A. Suarez Romero, 2018-03-22, 1 file, -0/+4)
| Reviewed-by: Emil Velikov <[email protected]>
* radv: remove unused radv_pipeline::needs_data_cache variable (Samuel Pitoiset, 2018-03-22, 1 file, -1/+0)
| Signed-off-by: Samuel Pitoiset <[email protected]>
* ac/nir_to_llvm: add frexp support (Timothy Arceri, 2018-03-22, 1 file, -0/+11)
| Fixes CTS tests:
|
| KHR-GL40.gpu_shader_fp64.builtin.frexp_double
| KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec2
| KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec3
| KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec4
|
| And piglit test:
|
| tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-frexp-dvec4.shader_test
|
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
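For readers unfamiliar with the built-in under test: frexp splits a finite floating-point value into a significand with magnitude in [0.5, 1) and an integer power-of-two exponent. The sketch below only illustrates those semantics with the C standard library's `frexp` and `ldexp`; it is not the LLVM lowering this commit adds, and `split_and_rebuild` is a hypothetical helper name.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper (not part of Mesa): split a finite x with frexp()
 * and rebuild it with ldexp() to show that the two values are lossless. */
static double split_and_rebuild(double x, double *significand, int *exponent)
{
    /* frexp() returns s and stores e such that x == s * 2^e. */
    *significand = frexp(x, exponent);

    /* For finite non-zero x the significand magnitude lies in [0.5, 1). */
    assert(x == 0.0 ||
           (fabs(*significand) >= 0.5 && fabs(*significand) < 1.0));

    return ldexp(*significand, *exponent);
}
```

Calling it with 8.0 yields a significand of 0.5 and an exponent of 4, since 8 = 0.5 * 2^4.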
* ac/surface: compute tile swizzle for GFX9 (Marek Olšák, 2018-03-21, 2 files, -3/+88)
| Tested-by: Dieter Nützel <[email protected]>
* radv: add support for VK_EXT_depth_range_unrestricted (Samuel Pitoiset, 2018-03-20, 2 files, -0/+23)
| This extension removes the restrictions on minDepth/maxDepth,
| minDepthBounds/maxDepthBounds and VkClearDepthStencilValue::depth.
|
| The following CTS tests now pass:
|
| dEQP-VK.glsl.builtin_var.fragdepth.line_list_d32_sfloat_large_depth
| dEQP-VK.glsl.builtin_var.fragdepth.point_list_d32_sfloat_large_depth
| dEQP-VK.glsl.builtin_var.fragdepth.triangle_list_d32_sfloat_large_depth
| dEQP-VK.draw.inverted_depth_ranges.nodepthclamp_depth_range_unrestricted
| dEQP-VK.draw.inverted_depth_ranges.depthclamp_depth_range_unrestricted
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: only enable one channel when exporting prim id (Samuel Pitoiset, 2018-03-20, 1 file, -1/+1)
| It's a 32-bit integer like the layer.
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: don't lower indirects until after opts have run (Timothy Arceri, 2018-03-20, 1 file, -1/+8)
| Noticed while passing by. Not sure if it impacts anything, but
| likely to impact GFX9 more than anything else since we lower
| inputs, outputs and locals there.
|
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: don't export NULL layer. (Dave Airlie, 2018-03-19, 1 file, -1/+1)
| We have some cases where in subpass we want the layer but having it
| be 0 and loaded in the frag shader without the vertex shader
| exporting it is fine.
|
| So don't export the layer if we don't have a value to put in it.
|
| Fixes: d4c74aed7a8 (radv/multiview: mark layer_input if we have input attachments.)
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: lower constant initializers on output variables earlier (Dave Airlie, 2018-03-19, 1 file, -0/+5)
| If a shader only writes to an output via a constant initializer we
| need to lower it before we call nir_remove_dead_variables so that
| this pass sees the stores from the initializer and doesn't kill the
| output.
|
| Fixes test failures in new work-in-progress CTS tests:
| dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output.float
|
| This is ported from anv:
| 99b57daf4a anv/pipeline: lower constant initializers on output variables earlier
| from Iago Toral Quiroga <[email protected]>
|
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/query: handle multiview timestamp queries. (Dave Airlie, 2018-03-19, 1 file, -36/+43)
| For each view bit we need to emit a timestamp query.
|
| Fixes: dEQP-VK.multiview.queries*
| Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/query: handle multiview queries properly. (v3) (Dave Airlie, 2018-03-19, 1 file, -0/+19)
| For multiview we need to emit a number of sequential queries
| depending on the view mask.
|
| This avoids dEQP-VK.multiview.queries.15 waiting forever
| on the CPU for query results that are never coming.
|
| We only really want to emit one query, and the rest should be
| blank (amdvlk does the same), so we emit begin/end pairs for
| all the others except the first query.
|
| v2: fix tests
| v3: split out patch.
|
| Fixes: dEQP-VK.multiview.queries*
| Reviewed-by: Samuel Pitoiset <[email protected]>
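The bookkeeping described in this message can be sketched as follows. This is only an illustration of "one real query, blank begin/end pairs for the remaining views"; `struct query_plan`, `count_views` and `plan_multiview_query` are hypothetical names, not radv code, and treating an empty view mask as a single ordinary query is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of what would be emitted; not a radv structure. */
struct query_plan {
    unsigned first_slot;   /* query slot that receives the real query */
    unsigned real_queries; /* always 1 in this sketch */
    unsigned blank_pairs;  /* empty begin/end pairs for the other views */
};

/* Count the views enabled in a multiview mask. */
static unsigned count_views(uint32_t view_mask)
{
    unsigned n = 0;
    for (; view_mask; view_mask &= view_mask - 1)
        n++;
    return n;
}

/* One sequential slot is consumed per enabled view: the first slot gets
 * the real query, every following slot gets a blank begin/end pair. */
static void plan_multiview_query(uint32_t view_mask, unsigned first_slot,
                                 struct query_plan *plan)
{
    assert(plan != NULL);

    unsigned views = count_views(view_mask);
    if (views == 0)
        views = 1; /* assumption: no multiview means one ordinary query */

    plan->first_slot = first_slot;
    plan->real_queries = 1;
    plan->blank_pairs = views - 1;
}
```

With a view mask of 0x5 (two views) starting at slot 3, the plan holds one real query in slot 3 and one blank pair in slot 4, so a CPU-side wait on either slot can complete.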
* radv/query: split out begin/end query emission (Dave Airlie, 2018-03-19, 1 file, -41/+57)
| This just splits out the begin/end query hw emissions, it makes
| it easier to add multiview support for queries.
|
| Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/multiview: mark layer_input if we have input attachments. (Dave Airlie, 2018-03-19, 1 file, -1/+3)
| This fixes:
| dEQP-VK.multiview.input_attachments*
|
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: handle exporting view index to fragment shader. (v1.1) (Dave Airlie, 2018-03-19, 4 files, -2/+24)
| The fragment shader was trying to read this, but nothing was
| exporting it from the vertex shader.
|
| This handles it like the prim id export.
|
| Fixes:
| dEQP-VK.multiview.secondary_cmd_buffer.*
| dEQP-VK.multiview.index.fragment_shader.*
|
| v1.1: updated to use 0x1 (Samuel)
|
| Fixes: e3265c10c89 (radv: Implement multiview draws.)
| Reviewed-by: Samuel Pitoiset <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* radv: make vk_format_description structures static (Grazvydas Ignotas, 2018-03-17, 1 file, -1/+1)
| No need to bother the linker about them.
|
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix stale comment in generated vk_format_table.c (Grazvydas Ignotas, 2018-03-17, 1 file, -1/+1)
| It seems to be a leftover from u_format_table.py.
|
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: run nir_opt_move_load_ubo (Samuel Pitoiset, 2018-03-16, 1 file, -0/+1)
| Polaris10:
| SGPRS: 108560 -> 107856 (-0.65 %)
| VGPRS: 74576 -> 74520 (-0.08 %)
| Spilled SGPRs: 7375 -> 7113 (-3.55 %)
| Code Size: 4273464 -> 4274364 (0.02 %) bytes
| Max Waves: 9434 -> 9446 (0.13 %)
|
| Vega10:
| Totals from affected shaders:
| SGPRS: 108264 -> 107576 (-0.64 %)
| VGPRS: 69068 -> 69000 (-0.10 %)
| Spilled SGPRs: 7221 -> 6959 (-3.63 %)
| Code Size: 3800796 -> 3801496 (0.02 %) bytes
| Max Waves: 10687 -> 10709 (0.21 %)
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Timothy Arceri <[email protected]>
* radv: drop geometry stride user sgpr. (Dave Airlie, 2018-03-16, 3 files, -28/+19)
| This removes the other geometry specific user sgpr.
|
| Reviewed-by: Samuel Pitoiset <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* radv: get rid of geometry user sgpr for num entries. (Dave Airlie, 2018-03-16, 2 files, -16/+8)
| This drops one of the geometry specific user sgprs, we can
| work this out at compile time.
|
| Reviewed-by: Samuel Pitoiset <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* radv: migrate lds size calculations to shader gen. (Dave Airlie, 2018-03-16, 3 files, -25/+38)
| This moves the lds_size calcs into the shader so we have
| all the size stuff in one file.
|
| Reviewed-by: Samuel Pitoiset <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* radv: drop scanning the tess shader in the nir code. (Dave Airlie, 2018-03-16, 3 files, -42/+3)
| This drops the now unneeded scanning and results in favour
| of the ones in the info.
|
| Reviewed-by: Samuel Pitoiset <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* radv: use num_patches output from tcs shader. (Dave Airlie, 2018-03-16, 1 file, -28/+2)
| Instead of recalculating the value, use the shader calculated value.
|
| Reviewed-by: Samuel Pitoiset <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* radv/tess: remove last chunk of tess sgprs (Dave Airlie, 2018-03-16, 3 files, -53/+19)
| This removes the last TES-specific user sgpr.
|
| Reviewed-by: Samuel Pitoiset <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* radv: pass num_patches to tes from tcs (Dave Airlie, 2018-03-16, 3 files, -2/+9)
| TES needs num_patches to do some of the calculations.
|
| Reviewed-by: Samuel Pitoiset <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* radv: drop tess offchip layout for tcs. (Dave Airlie, 2018-03-16, 4 files, -38/+90)
| This removes the last TCS specific user sgpr.
|
| Reviewed-by: Samuel Pitoiset <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* radv: drop tcs_out_offsets (Dave Airlie, 2018-03-16, 2 files, -20/+29)
| Move all calculations to shader generation.
|
| Reviewed-by: Samuel Pitoiset <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* radv: drop tcs_out_layout (Dave Airlie, 2018-03-16, 2 files, -15/+15)
| Move all calculations to shader generation.
|
| Reviewed-by: Samuel Pitoiset <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* radv/tess: drop tcs_in_layout setting completely. (Dave Airlie, 2018-03-16, 3 files, -15/+24)
| Inline all calcs at shader creation.
|
| Reviewed-by: Samuel Pitoiset <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* radv: drop ls_out_layout const. (Dave Airlie, 2018-03-16, 3 files, -37/+4)
| We can precalculate input_vertex_size at compile time.
|
| Reviewed-by: Samuel Pitoiset <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* radv/shader_info: start gathering tess output info (v2) (Dave Airlie, 2018-03-16, 2 files, -2/+52)
| This gathers the ls outputs written by the vertex shader,
| and the tcs outputs, these are needed to calculate certain
| tcs parameters.
|
| These have to be separate for combined gfx9 shaders.
|
| This is a bit pessimistic compared to the nir pass,
| as we don't work out the individual slots for tcs outputs,
| but I actually think it should be fine to just mark the whole
| thing used here.
|
| v2: move to radv, handle clip dist (Samuel), handle compacts
| and patches properly.
|
| Reviewed-by: Samuel Pitoiset <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* radv: migrate unique index info shader info (v2) (Dave Airlie, 2018-03-16, 2 files, -22/+21)
| This just moves this function to an inline so the shader_info
| pass can use it.
|
| v2: use inline (Samuel)
|
| Reviewed-by: Samuel Pitoiset <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* radv: dump LLVM IR when a hang is detected (Samuel Pitoiset, 2018-03-15, 1 file, -0/+1)
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: record LLVM IR when debugging shaders (Samuel Pitoiset, 2018-03-15, 3 files, -0/+12)
| If AMD_shader_info or RADV_TRACE_FILE is used we might need to
| keep track of the LLVM IR.
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add dump_shader to the NIR compiler options (Samuel Pitoiset, 2018-03-15, 4 files, -22/+19)
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: pass the NIR compiler options to ac_compile_llvm_module() (Samuel Pitoiset, 2018-03-15, 1 file, -5/+7)
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: print some information when RADV_TRACE_FILE is set (Samuel Pitoiset, 2018-03-15, 3 files, -1/+9)
| Just to be sure all options are enabled when trying to generate
| a hang report.
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: only display options that are enabled (Samuel Pitoiset, 2018-03-15, 1 file, -12/+16)
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* spirv/radv: add AMD_gcn_shader capability, remove current extensions (Alejandro Piñeiro, 2018-03-15, 1 file, -3/+1)
| So now, during spirv_to_nir, it uses the capability instead of the
| extension. Note that what we are really doing here is treating
| SPV_AMD_gcn_shader like the other supported extensions.
|
| SPV_AMD_gcn_shader is not the first SPV extension supported. For
| example, the capability draw_parameters infers if the extension
| SPV_KHR_shader_draw_parameters is supported or not.
|
| This could be seen as counter-intuitive, and it might seem easier
| to define which extensions are supported and base our checks on
| that, but we need to take into account that some capabilities are
| optional from core, and others came from new extensions.
|
| Also this commit would make the implementation of
| ARB_spirv_extensions easier.
|
| v2: AMD_gcn_shader capability renamed to gcn_shader (Daniel Schürmann)
|
| Reviewed-by: Daniel Schürmann <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Fix CmdCopyImage between uncompressed and compressed images (Alex Smith, 2018-03-14, 1 file, -6/+17)
| From the spec:
|
|     "When copying between compressed and uncompressed formats the
|      extent members represent the texel dimensions of the source
|      image and not the destination."
|
| However, as per 7b890a36, we must still use the destination image
| type when clamping the extent so that we copy the correct number of
| layers for 2D to 3D copies.
|
| Fixes: 7b890a36 "radv: Fix vkCmdCopyImage for 2d slices into 3d Images"
| Cc: <[email protected]>
| Signed-off-by: Alex Smith <[email protected]>
| Reviewed-by: Dave Airlie <[email protected]>
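The spec sentence quoted in this message implies a unit conversion whenever the two formats have different block sizes. A minimal sketch of that conversion follows, assuming each source block maps to one destination block; `struct extent3d`, `struct block_dim` and `dst_extent_in_texels` are hypothetical names for illustration, and this is not the radv change itself (which also deals with clamping by image type).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical types for illustration; not Mesa or Vulkan definitions. */
struct extent3d { uint32_t width, height, depth; };
struct block_dim { uint32_t width, height; }; /* 1x1 for uncompressed */

static uint32_t div_round_up(uint32_t a, uint32_t b)
{
    return (a + b - 1) / b;
}

/* The copy extent is given in source texels. Convert it to source blocks
 * (rounding up partial blocks), then to destination texels. */
static struct extent3d dst_extent_in_texels(struct extent3d src_extent,
                                            struct block_dim src_block,
                                            struct block_dim dst_block)
{
    assert(src_block.width > 0 && src_block.height > 0);
    assert(dst_block.width > 0 && dst_block.height > 0);

    struct extent3d out = src_extent;
    out.width = div_round_up(src_extent.width, src_block.width) *
                dst_block.width;
    out.height = div_round_up(src_extent.height, src_block.height) *
                 dst_block.height;
    return out;
}
```

For a format with 4x4 blocks, an 8x8 texel source extent covers 2x2 texels of an uncompressed destination, and a 2x2 uncompressed source extent covers 8x8 texels of the compressed destination.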
* radv: fix vkGetDeviceQueue2() when create flags don't match (Samuel Pitoiset, 2018-03-14, 2 files, -2/+22)
| This fixes CTS:
| dEQP-VK.api.device_init.create_device_queue2_unmatched_flags
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Dave Airlie <[email protected]>
* radv: drop assert on bindingDescriptorCount > 0 (Dave Airlie, 2018-03-14, 1 file, -1/+0)
| The spec is pretty clear that this can be 0, and that it operates
| as a reserved binding.
|
| Fixes:
| dEQP-VK.binding_model.descriptor_update.empty_descriptor.uniform_buffer
|
| Reviewed-by: Samuel Pitoiset <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* radv: mark all tess output for an indirect access. (Dave Airlie, 2018-03-14, 1 file, -8/+13)
| If a shader does a tcs store with an indirect access, we were
| only marking the first spot as used.
|
| For indirect access we always now mark all slots used by the variable.
|
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464
| Fixes: 94f9591995 (radv/ac: add support for TCS/TES inputs/outputs.)
| Signed-off-by: Dave Airlie <[email protected]>
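The difference between the old and new marking can be sketched with a 64-bit usage mask. This is an illustration only: `mark_output_slots` and its parameters are hypothetical and do not correspond to the actual radv function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: record which output slots a store may touch.
 * first_slot/num_slots describe the variable, const_index is the slot
 * offset of a direct (constant-indexed) store. */
static uint64_t mark_output_slots(uint64_t mask, unsigned first_slot,
                                  unsigned num_slots, unsigned const_index,
                                  bool indirect)
{
    assert(num_slots > 0 && first_slot + num_slots <= 64);

    if (indirect) {
        /* The index is unknown at compile time, so any slot of the
         * variable may be written: mark all of them. */
        uint64_t all = num_slots == 64 ? ~UINT64_C(0)
                                       : (UINT64_C(1) << num_slots) - 1;
        return mask | (all << first_slot);
    }

    /* Direct access: only the addressed slot is written. */
    assert(const_index < num_slots);
    return mask | (UINT64_C(1) << (first_slot + const_index));
}
```

For a three-slot variable starting at slot 4, an indirect store marks bits 4 to 6 (0x70), whereas marking only the first slot would have produced 0x10.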
* ac/nir: pass the nir variable through tcs loading. (Dave Airlie, 2018-03-14, 3 files, -17/+11)
| I was going to have to add another parameter to this monster,
| so we should just pass the nir_variable in, I can't find any
| reason this would be a bad idea.
|
| This is needed for the next fix.
|
| Fixes: 94f9591995 (radv/ac: add support for TCS/TES inputs/outputs.)
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* radv: get correct offset into LDS for indexed vars. (Dave Airlie, 2018-03-14, 1 file, -1/+1)
| This seems more correct to me, since if we have an
| array of floats they'll be vec4 aligned, and if we do
| af[2], we want the const index to increase by 2 slots
| in the non compact case.
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464
| Fixes: 94f9591995 (radv/ac: add support for TCS/TES inputs/outputs.)
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* ac/nir: Use lower_vote_eq_to_ballot instead of ac_nir_lower_subgroups (Jason Ekstrand, 2018-03-13, 5 files, -99/+0)
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>