summaryrefslogtreecommitdiffstats
path: root/src/amd
Commit message (Collapse)AuthorAgeFilesLines
* radv: don't use the SPI barrier management bug workaroundSamuel Pitoiset2018-04-041-0/+5
| | | | | | | Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: mask out high VM address bits in registers where neededSamuel Pitoiset2018-04-043-19/+19
| | | | | | | Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: enable VK_EXT_shader_viewport_index_layerSamuel Pitoiset2018-04-032-0/+2
| | | | | | | | The driver already supports exporting the Layer and ViewportIndex built-ins from vertex or tessellation shaders. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: Fix include for LLVMAddPromoteMemoryToRegisterPassMike Lothian2018-04-021-0/+3
| | | | | | | | | Include llvm-c/Transforms/Utils.h with the newest LLVM 7 Signed-of-by: Mike Lothian <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Dieter Nützel <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: move FMASK shader logic to shared codeMarek Olšák2018-04-022-0/+59
| | | | | | We'll need it for FBFETCH in both TGSI and NIR paths. Tested-by: Dieter Nützel <[email protected]>
* ac/gpu_info: print GB_ADDR_CONFIGMarek Olšák2018-04-022-0/+51
|
* ac/gpu_info: reorder the fields and print them nicelyMarek Olšák2018-04-022-55/+76
|
* ac/gpu_info: rename has_virtual_memory -> r600_has_virtual_memoryMarek Olšák2018-04-022-2/+2
|
* ac/gpu_info: don't print irrelevant fieldsMarek Olšák2018-04-021-5/+0
|
* radv: set SAMPLE_RATE to the number of samples of the current fbSamuel Pitoiset2018-03-303-4/+16
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* util: Include bitscan.h directlyIan Romanick2018-03-291-0/+1
| | | | | | | | | | | | | | | Previously bitset.h would include u_math.h to get bitscan.h. u_math.h lives in src/gallium/auxiliary/util while both bitset.h and bitscan.h live in src/util. Having the one file directly include another file that lives in the same directory makes much more sense. As a side-effect, several files need to directly include standard header files that were previously indirectly included. v2: Fix build break in src/amd/common/ac_nir_to_llvm.c. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
* util: Move util_is_power_of_two to bitscan.h and rename to ↵Ian Romanick2018-03-293-5/+5
| | | | | | | | | | | util_is_power_of_two_or_zero The new name make the zero-input behavior more obvious. The next patch adds a new function with different zero-input behavior. Signed-off-by: Ian Romanick <[email protected]> Suggested-by: Matt Turner <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]>
* radv: fix scanning output_usage_mask with structsSamuel Pitoiset2018-03-291-4/+52
| | | | | | | | | | | | To fix a regression in: dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output.struct And the following regressions (Polaris only): dEQP-VK.glsl.indexing.varying_array.* Fixes: f3275ca01c ("ac/nir: only enable used channels when exporting parameters") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: enable VK_AMD_shader_trinary_minmax extensionDaniel Schürmann2018-03-292-0/+2
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: add support for trinary_minmax instructionsDaniel Schürmann2018-03-291-0/+54
| | | | | | | v2: Add missing break (Bas) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: Add workaround for GFX9 buffer views.Bas Nieuwenhuizen2018-03-295-7/+70
| | | | | | | | | | | | | | | | | | | | | On GFX9 whether the buffer size is interpreted as elements or bytes depends on whether IDXEN is enabled in the instruction. If the index is a constant zero, LLVM optimizes IDXEN to 0. Now the size in elements is interpreted in bytes which of course results in out of bounds accesses. The correct fix is most likely to disable the LLVM optimization, but we need something to work with LLVM <= 6.0. radeonsi does the max between stride and element count on the CPU but that results in the size intrinsics returning the wrong size for the buffer. This would cause CTS errors for radv. v2: Also include the store changes. Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/surface: set AddrSurfInfoIn.format = ADDR_FMT_8 for stencil, add assertionsMarek Olšák2018-03-281-0/+8
| | | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105738 Tested-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: enable VK_EXT_sampler_filter_minmaxSamuel Pitoiset2018-03-281-0/+1
| | | | | | | Only enable for CIK+ because it's buggy on SI. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add support for VK_EXT_sampler_filter_minmaxSamuel Pitoiset2018-03-282-1/+70
| | | | | | | The driver only supports the required formats for now. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: rename VEGA10 device nameSamuel Pitoiset2018-03-281-1/+1
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add support for Vega12Samuel Pitoiset2018-03-283-1/+6
| | | | | | | Based on RadeonSI. Untested. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: add support for Vega12Marek Olšák2018-03-283-7/+28
| | | | Reviewed-by: Alex Deucher <[email protected]>
* amd/addrlib: update to the latest version for Vega12Marek Olšák2018-03-2817-148/+439
| | | | Reviewed-by: Alex Deucher <[email protected]>
* ac/radeonsi: pass bindless bool to load_sampler_desc()Timothy Arceri2018-03-283-4/+13
| | | | | | | | We also fix the base_index for bindless by using the driver location. Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac/nir_to_llvm: fix component packing for double outputsTimothy Arceri2018-03-281-1/+3
| | | | | | | | | | We need to wait until after the writemask is widened before we adjust it for component packing. Together with the previous patch this fixes a number of arb_enhanced_layouts component layout piglit tests. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: don't reallocate on DMABUF export if local BOs are disabledMarek Olšák2018-03-262-0/+3
|
* radv: enable TC-compat HTILE for 16-bit depth surfaces on GFX8Samuel Pitoiset2018-03-232-18/+24
| | | | | | | | | | | | | | | | | | | | | The hardware only supports 32-bit depth surfaces, but we can enable TC-compat HTILE for 16-bit depth surfaces if no Z planes are compressed. The main benefit is to reduce the number of depth decompression passes. Also, we don't need to implement DB->CB copies which is fine. This improves Serious Sam 2017 by +4%. Talos and F12017 are also affected but I don't see a performance difference. This also improves the shadowmapping Vulkan demo by 10-15% (FPS is now similar to AMDVLK). No CTS regressions on Polaris10. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add radv_calc_decompress_on_z_planes() helperSamuel Pitoiset2018-03-231-14/+37
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add radv_image_is_tc_compat_htile() helperSamuel Pitoiset2018-03-231-11/+45
| | | | | | | Instead of that huge conditional that's going to be crazy. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: Rename image intrinsics to image_varJason Ekstrand2018-03-235-47/+47
| | | | | | | | | | | Generated with git grep -l nir_intrinsic_image | xargs \ sed -i 's/nir_intrinsic_image/nir_intrinsic_image_var/g' and some manual fixing in nir_intrinsics.h Reviewed-by: Timothy Arceri <[email protected]>
* radv: autotools: add radv_extensions.h in the generated VULKAN listJuan A. Suarez Romero2018-03-221-1/+2
| | | | Reviewed-by: Emil Velikov <[email protected]>
* anv/radv: autotools: include vulkan_*.h headersJuan A. Suarez Romero2018-03-221-0/+4
| | | | Reviewed-by: Emil Velikov <[email protected]>
* radv: remove unused radv_pipeline::needs_data_cache variableSamuel Pitoiset2018-03-221-1/+0
| | | | Signed-off-by: Samuel Pitoiset <[email protected]>
* ac/nir_to_llvm: add frexp supportTimothy Arceri2018-03-221-0/+11
| | | | | | | | | | | | | | Fixes CTS tests: KHR-GL40.gpu_shader_fp64.builtin.frexp_double KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec2 KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec3 KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec4 And piglit test: tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-frexp-dvec4.shader_test Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac/surface: compute tile swizzle for GFX9Marek Olšák2018-03-212-3/+88
| | | | Tested-by: Dieter Nützel <[email protected]>
* radv: add support for VK_EXT_depth_range_unrestrictedSamuel Pitoiset2018-03-202-0/+23
| | | | | | | | | | | | | | | | This extension removes the restrictions on minDepth/maxDepth, minDepthBounds/maxDepthBounds and VkClearDepthStencilValue::depth. The following CTS tests now pass: dEQP-VK.glsl.builtin_var.fragdepth.line_list_d32_sfloat_large_depth dEQP-VK.glsl.builtin_var.fragdepth.point_list_d32_sfloat_large_depth dEQP-VK.glsl.builtin_var.fragdepth.triangle_list_d32_sfloat_large_depth dEQP-VK.draw.inverted_depth_ranges.nodepthclamp_depth_range_unrestricted dEQP-VK.draw.inverted_depth_ranges.depthclamp_depth_range_unrestricted Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: only enable one channel when exporting prim idSamuel Pitoiset2018-03-201-1/+1
| | | | | | | It's a 32-bit integer like the layer. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: don't lower indirects until after opts have runTimothy Arceri2018-03-201-1/+8
| | | | | | | | Noticed while passing by. Not sure if it impacts anything, but likely to impact GFX9 more than anything else since we lower inputs, outputs and locals there. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: don't export NULL layer.Dave Airlie2018-03-191-1/+1
| | | | | | | | | | | | We have some cases where in subpass we want the layer but having it be 0 and loaded in the frag shader without the vertex shader exporting it is fine. So don't export the layer if we don't have a value to put in it. Fixes: d4c74aed7a8 (radv/multiview: mark layer_input if we have input attachments.) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: lower constant initializers on output variables earlierDave Airlie2018-03-191-0/+5
| | | | | | | | | | | | | | | | | If a shader only writes to an output via a constant initializer we need to lower it before we call nir_remove_dead_variables so that this pass sees the stores from the initializer and doesn't kill the output. Fixes test failures in new work-in-progress CTS tests: dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output.float This is ported from anv: 99b57daf4a anv/pipeline: lower constant initializers on output variables earlier from Iago Toral Quiroga <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/query: handle multiview timestamp queries.Dave Airlie2018-03-191-36/+43
| | | | | | | | For each view bit we need to emit a timestamp query. Fixes: dEQP-VK.multiview.queries* Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/query: handle multiview queries properly. (v3)Dave Airlie2018-03-191-0/+19
| | | | | | | | | | | | | | | | | | | For multiview we need to emit a number of sequential queries depending on the view mask. This avoids dEQP-VK.multiview.queries.15 waiting forever on the CPU for query results that are never coming. We only really want to emit one query, and the rest should be blank (amdvlk does the same), so we emit begin/end pairs for all the others except the first query. v2: fix tests v3: split out patch. Fixes: dEQP-VK.multiview.queries* Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/query: split out begin/end query emissionDave Airlie2018-03-191-41/+57
| | | | | | | This just splits out the begin/end query hw emissions, it makes it easier to add multiview support for queries. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/multiview: mark layer_input if we have input attachments.Dave Airlie2018-03-191-1/+3
| | | | | | | | This fixes: dEQP-VK.multiview.input_attachments* Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: handle exporting view index to fragment shader. (v1.1)Dave Airlie2018-03-194-2/+24
| | | | | | | | | | | | | | | | The fragment shader was trying to read this, but nothing was exporting it from the vertex shader. This handles it like the prim id export. Fixes: dEQP-VK.multiview.secondary_cmd_buffer.* dEQP-VK.multiview.index.fragment_shader.* v1.1: updated to use 0x1 (Samuel) Fixes: e3265c10c89 (radv: Implement multiview draws.) Reviewed-by: Samuel Pitoiset <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: make vk_format_description structures staticGrazvydas Ignotas2018-03-171-1/+1
| | | | | | No need to bother the linker about them. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix stale comment in generated vk_format_table.cGrazvydas Ignotas2018-03-171-1/+1
| | | | | | It seems to be a leftover from u_format_table.py. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: run nir_opt_move_load_uboSamuel Pitoiset2018-03-161-0/+1
| | | | | | | | | | | | | | | | | | | | Polaris10: SGPRS: 108560 -> 107856 (-0.65 %) VGPRS: 74576 -> 74520 (-0.08 %) Spilled SGPRs: 7375 -> 7113 (-3.55 %) Code Size: 4273464 -> 4274364 (0.02 %) bytes Max Waves: 9434 -> 9446 (0.13 %) Vega10: Totals from affected shaders: SGPRS: 108264 -> 107576 (-0.64 %) VGPRS: 69068 -> 69000 (-0.10 %) Spilled SGPRs: 7221 -> 6959 (-3.63 %) Code Size: 3800796 -> 3801496 (0.02 %) bytes Max Waves: 10687 -> 10709 (0.21 %) Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* radv: drop geometry stride user sgpr.Dave Airlie2018-03-163-28/+19
| | | | | | | This removes the other geometry specific user sgpr. Reviewed-by: Samuel Pitoiset <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: get rid of geometry user sgpr for num entries.Dave Airlie2018-03-162-16/+8
| | | | | | | | This drops one of the geometry specific user sgprs, we can work this out at compile time. Reviewed-by: Samuel Pitoiset <[email protected]> Signed-off-by: Dave Airlie <[email protected]>