Commit message | Author | Age | Files | Lines
* radv: fix setting VGT_REUSE_OFF for TES on GFX10 | Samuel Pitoiset | 2019-07-09 | 1 | -2/+7
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* radv: fix computing the number of ES VGPRS for TES on GFX10 | Samuel Pitoiset | 2019-07-09 | 1 | -1/+2
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* radv: set max workgroup size to 128 for TES as NGG on GFX10 | Samuel Pitoiset | 2019-07-09 | 1 | -1/+1
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* radv: fix allocating USER SGPRs on GFX10 | Samuel Pitoiset | 2019-07-09 | 1 | -7/+8
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
* v3d: Early return with handle 0 when getting a bo on the simulator | Alejandro Piñeiro | 2019-07-09 | 1 | -0/+3
  Until now we just looked up entries in the bo hash table without worrying whether the handle was NULL, because we expected to get NULL back in that case. It seems that the hash table now asserts on some reserved pointers, including NULL. This commit returns early when the handle is 0.
  This change fixes several crashes in vk-gl-cts GLES tests when using the v3d simulator, such as: KHR-GLES3.core.internalformat.copy_tex_image.*
  Reviewed-by: Eric Anholt <[email protected]>
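The early return described above can be sketched as follows. This is a minimal Python model, not the v3d simulator code: `StrictTable`, `get_bo`, and the choice of 0 as the only reserved key are hypothetical names and assumptions made for illustration.

```python
class StrictTable:
    """Hypothetical stand-in for a hash table that asserts on reserved keys."""

    RESERVED = (0,)  # assumption: key 0 plays the role of the NULL pointer

    def __init__(self):
        self._entries = {}

    def insert(self, key, value):
        assert key not in self.RESERVED, "reserved key"
        self._entries[key] = value

    def search(self, key):
        assert key not in self.RESERVED, "reserved key"
        return self._entries.get(key)


def get_bo(table, handle):
    """Look up a buffer object by handle; handle 0 never maps to one."""
    if handle == 0:
        # Early return, so the table never sees the reserved key.
        return None
    return table.search(handle)


table = StrictTable()
table.insert(7, "bo-7")
```

With this guard, `get_bo(table, 0)` yields `None` instead of tripping the assertion inside the table, which mirrors the behavior the callers relied on before the table became strict.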
* vulkan/overlay: use a single macro to lookup objects | Lionel Landwerlin | 2019-07-09 | 1 | -37/+54
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
* vulkan/overlay: add queue present timing measurement | Lionel Landwerlin | 2019-07-09 | 2 | -1/+15
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
* radv/gfx10: Enable tess. | Bas Nieuwenhuizen | 2019-07-09 | 1 | -1/+1
  Reviewed-by: Dave Airlie <[email protected]>
* radv/gfx10: Add pipeline state support for tess. | Bas Nieuwenhuizen | 2019-07-09 | 2 | -10/+45
  Reviewed-by: Dave Airlie <[email protected]>
* radv/gfx10: Only set HW edge flags with gs & tess disabled. | Bas Nieuwenhuizen | 2019-07-09 | 1 | -1/+2
  Reviewed-by: Dave Airlie <[email protected]>
* radv/gfx10: Add tess eval ngg shader support. | Bas Nieuwenhuizen | 2019-07-09 | 1 | -8/+17
  Reviewed-by: Dave Airlie <[email protected]>
* radv: Use correct gs_out with tessellation. | Bas Nieuwenhuizen | 2019-07-09 | 1 | -0/+3
  We should use the primitives output by the TES in that case. There is always a separate TES if there is no GS.
  Reviewed-by: Dave Airlie <[email protected]>
* radv/gfx10: Use correct count of max_offchip_buffers. | Bas Nieuwenhuizen | 2019-07-09 | 1 | -1/+4
  Reviewed-by: Dave Airlie <[email protected]>
* radv/gfx10: Load global pointers in correct userdata registers for hs/gs. | Bas Nieuwenhuizen | 2019-07-09 | 1 | -2/+2
  Fixes: cfaad5e3cad "radv/gfx10: implement radv_emit_global_shader_pointers()"
  Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: update function name in comment | Timothy Arceri | 2019-07-09 | 1 | -1/+1
  This was missed in 2361558eb71d
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* r600: remove query/apply_opaque_metadata callbacks | Timothy Arceri | 2019-07-09 | 2 | -17/+0
  These seem to have been radeonsi-specific callbacks that are no longer needed now that these drivers no longer share this code path. These callbacks were removed from radeonsi in c0d44fe0e91c.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* vulkan/overlay: fix crash on freeing NULL command buffer | Lionel Landwerlin | 2019-07-08 | 1 | -0/+4
  It is legal to call vkFreeCommandBuffers() on NULL command buffers. This fix requires eb41ce1b012f24 ("util/hash_table: Properly handle the NULL key in hash_table_u64").
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Fixes: 4438188f492e1f ("vulkan/overlay: record stats in command buffers and accumulate on exec/submit")
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* vulkan: bump headers & registry to 1.1.114 | Lionel Landwerlin | 2019-07-09 | 2 | -6/+102
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
* radv: only use specialised 3D meta paths on GFX9. | Dave Airlie | 2019-07-09 | 2 | -16/+16
  GFX10 appears to act like GFX8 here.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* mesa: Set minimum possible GLSL version | Ian Romanick | 2019-07-08 | 1 | -0/+11
  Set the absolute minimum possible GLSL version. API_OPENGL_CORE can mean an OpenGL 3.0 forward-compatible context, so that implies a minimum possible version of 1.30. Otherwise, the minimum possible version is 1.20. Since Mesa unconditionally advertises GL_ARB_shading_language_100 and GL_ARB_shader_objects, every driver has GLSL 1.20, even if it doesn't advertise any extensions to enable any shader stages (e.g., GL_ARB_vertex_shader).
  Converts about 2,500 piglit tests from crash to skip on NV18.
  Reviewed-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109524
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110955
  Cc: [email protected]
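The version floor described above can be sketched as a small function. This is an illustrative Python model, not the Mesa code: the function names are hypothetical, the API values are plain strings, and treating the floor as the default for a driver that reports nothing is an assumption about how the value is used.

```python
API_OPENGL_CORE = "API_OPENGL_CORE"
API_OPENGL_COMPAT = "API_OPENGL_COMPAT"


def minimum_glsl_version(api):
    """Return the lowest GLSL version a context of this API can have."""
    if api == API_OPENGL_CORE:
        # May be an OpenGL 3.0 forward-compatible context: GLSL 1.30.
        return 130
    # GL_ARB_shading_language_100 is always advertised: GLSL 1.20.
    return 120


def glsl_version(api, driver_reported=None):
    """Assumption: a driver that reports no version falls back to the floor."""
    if driver_reported is None:
        return minimum_glsl_version(api)
    return driver_reported
```

A driver without any shader stages would then see `glsl_version(API_OPENGL_COMPAT)` evaluate to 120 rather than an unset value.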
* anv: Set maxComputeSharedMemorySize to 64k | Caio Marcelo de Oliveira Filho | 2019-07-08 | 1 | -1/+1
  This value is supported since gen7. See also 8514c75a26e "i965: Set compute shader shared memory max to 64k".
  Reviewed-by: Jason Ekstrand <[email protected]>
* intel/vec4: Delete vec4_visitor::emit_lrp | Ian Romanick | 2019-07-08 | 3 | -32/+5
  Effectively unused since dd7135d55d5 ("intel/compiler: Use the flrp lowering pass for all stages on Gen4 and Gen5"). I had intended to remove this code as part of that series, but I forgot.
  Reviewed-by: Matt Turner <[email protected]>
* nir: Allow nir_ssa_alu_instr_src_components to operate on non-SSA destinations | Ian Romanick | 2019-07-08 | 1 | -6/+3
  Existing users only operate on instructions with SSA destinations. Some later patches add new direct calls and indirect calls (via existing NIR functions) on instructions after going out of SSA. At the very least, these calls are added by:
    intel/vec4: Try to emit a VF source in try_immediate_source
    intel/vec4: Try to emit a single load for multiple 3-src instruction operands
  The first commit adds direct calls, and the second adds calls via nir_alu_srcs_equal and nir_alu_srcs_negative_equal.
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* nir: Handle swizzle in nir_alu_srcs_negative_equal | Ian Romanick | 2019-07-08 | 3 | -4/+110
  When I added this function, I was not sure if swizzles of immediate values were a thing that occurred in NIR. The only existing user of these functions is the partial redundancy elimination for compares. Since comparison instructions are inherently scalar, this does not occur.
  However, a couple later patches, "nir/algebraic: Recognize open-coded flrp(-1, 1, a) and flrp(1, -1, a)" combined with "intel/vec4: Try to emit a single load for multiple 3-src instruction operands", collaborate to create a few thousand instances.
  No shader-db changes on any Intel platform.
  v2: Handle the swizzle in nir_alu_srcs_negative_equal and leave nir_const_value_negative_equal unchanged. Suggested by Jason.
  v3: Correctly handle write masks. Add note (and assertion) that the caller is responsible for various compatibility checks. The single existing caller only calls this for combinations of scalar fadd and float comparison instructions, so all of the requirements are met. A later patch (intel/vec4: Try to emit a single load for multiple 3-src instruction operands) will call this for sources of the same instruction, so all of the requirements are met.
  v4: Add unit test for nir_opt_comparison_pre that is fixed by this commit.
  Reviewed-by: Matt Turner <[email protected]>
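To illustrate why swizzles and write masks matter for a "negative equal" comparison, here is a rough Python sketch over constant vectors. It is not the NIR implementation: the function name, the list-based swizzles, and the bit-mask encoding of the write mask are all assumptions made for the example.

```python
def srcs_negative_equal(values1, swizzle1, values2, swizzle2, write_mask):
    """True if, for every written channel, source 1 is the negation of source 2.

    values*:    constant vectors (lists of numbers)
    swizzle*:   per-channel index into the matching vector
    write_mask: bit i set means channel i is written and must be checked
    """
    for channel in range(4):
        if not (write_mask >> channel) & 1:
            continue  # unwritten channels cannot affect the result
        a = values1[swizzle1[channel]]
        b = values2[swizzle2[channel]]
        if a != -b:
            return False
    return True


identity = [0, 1, 2, 3]
a = [1.0, -2.0, 0.0, 0.0]
b = [2.0, -1.0, 0.0, 0.0]

# Comparing raw components fails, but swapping x and y on the second
# source makes every written channel match.
raw = srcs_negative_equal(a, identity, b, identity, 0b0011)
swizzled = srcs_negative_equal(a, identity, b, [1, 0, 2, 3], 0b0011)
```

Here `raw` is false and `swizzled` is true, so a comparison that ignored the swizzle would give the wrong answer for the second case.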
* nir: nir_const_value_negative_equal compares one value at a time | Ian Romanick | 2019-07-08 | 3 | -92/+24
  Reviewed-by: Jason Ekstrand <[email protected]>
  Suggested-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* nir: Port some const_value_negative_equal tests to alu_src_negative_equal | Ian Romanick | 2019-07-08 | 1 | -0/+82
  The next commit will make the existing tests irrelevant.
  Reviewed-by: Matt Turner <[email protected]>
  Acked-by: Jason Ekstrand <[email protected]>
* nir: Pass fully qualified type to nir_const_value_negative_equal | Ian Romanick | 2019-07-08 | 3 | -217/+169
  Reviewed-by: Jason Ekstrand <[email protected]>
  Suggested-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* nir: Use nir_src_bit_size instead of alu1->dest.dest.ssa.bit_size | Ian Romanick | 2019-07-08 | 2 | -1/+218
  This is important because, for example nir_op_fne has dest.dest.ssa.bit_size == 1, but the source operands can be 16-, 32-, or 64-bits.
  Fixing this helps partial redundancy elimination for compares in a few more shaders.
  v2: Add unit tests for nir_opt_comparison_pre that are fixed by this commit.
  All Intel platforms had similar results.
    total instructions in shared programs: 17179408 -> 17179081 (<.01%)
    instructions in affected programs: 43958 -> 43631 (-0.74%)
    helped: 118
    HURT: 2
    helped stats (abs) min: 1 max: 5 x̄: 2.87 x̃: 2
    helped stats (rel) min: 0.06% max: 4.12% x̄: 1.19% x̃: 0.81%
    HURT stats (abs) min: 6 max: 6 x̄: 6.00 x̃: 6
    HURT stats (rel) min: 5.83% max: 6.06% x̄: 5.94% x̃: 5.94%
    95% mean confidence interval for instructions value: -3.08 -2.37
    95% mean confidence interval for instructions %-change: -1.30% -0.85%
    Instructions are helped.
    total cycles in shared programs: 360959066 -> 360942386 (<.01%)
    cycles in affected programs: 774274 -> 757594 (-2.15%)
    helped: 111
    HURT: 4
    helped stats (abs) min: 1 max: 1591 x̄: 169.49 x̃: 36
    helped stats (rel) min: <.01% max: 24.43% x̄: 8.86% x̃: 2.24%
    HURT stats (abs) min: 1 max: 2068 x̄: 533.25 x̃: 32
    HURT stats (rel) min: 0.02% max: 5.10% x̄: 3.06% x̃: 3.56%
    95% mean confidence interval for cycles value: -200.61 -89.47
    95% mean confidence interval for cycles %-change: -10.32% -6.58%
    Cycles are helped.
  Reviewed-by: Jason Ekstrand <[email protected]> [v1]
  Suggested-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Fixes: be1cc3552bc ("nir: Add nir_const_value_negative_equal")
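The relative changes quoted for the affected programs follow directly from the absolute counts in the message. A small Python check, where `percent_change` is a hypothetical helper and not part of shader-db:

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Absolute counts copied from the commit message above.
instructions = percent_change(43958, 43631)
cycles = percent_change(774274, 757594)

summary = f"instructions: {instructions:.2f}%, cycles: {cycles:.2f}%"
```

Rounded to two decimals, these reproduce the -0.74% and -2.15% figures reported for instructions and cycles in affected programs.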
* intel/vec4: Reswizzle VF immediates too | Ian Romanick | 2019-07-08 | 1 | -1/+23
  Previously, an instruction like
    mul(8) vgrf29.xy:F, vgrf25.yxxx:F, [-1F, 1F, 0F, 0F]
  would get rewritten as
    mul(8) vgrf0.yz:F, vgrf25.yyxx:F, [-1F, 1F, 0F, 0F]
  The latter does not produce the correct result. The VF immediate in the second should be either [-1F, -1F, 1F, 1F] or [0F, -1F, 1F, 0F]. This commit produces the former.
  Fixes: 1ee1d8ab468 ("i965/vec4: Reswizzle sources when necessary.")
  Reviewed-by: Matt Turner <[email protected]>
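The example in this message can be replayed with a toy model. The sketch below is Python, not the vec4 backend: `reswizzle` and `mul` are hypothetical helpers, the register contents are arbitrary, and the component map `[0, 0, 1, 1]` is an assumption inferred from the `yxxx` to `yyxx` rewrite shown above.

```python
COMPONENTS = "xyzw"


def reswizzle(vector, component_map):
    """Pick vector[component_map[i]] for each channel i."""
    return [vector[i] for i in component_map]


def mul(src, swizzle, imm, channels):
    """Per-channel multiply of a swizzled source by an immediate vector."""
    return [src[COMPONENTS.index(swizzle[c])] * imm[c] for c in channels]


src = [3.0, 5.0, 7.0, 9.0]       # arbitrary contents standing in for vgrf25
imm = [-1.0, 1.0, 0.0, 0.0]      # the VF immediate from the message
component_map = [0, 0, 1, 1]     # assumption: inferred from yxxx -> yyxx

# Original instruction writes channels x and y.
before = mul(src, "yxxx", imm, channels=[0, 1])

# After the rewrite the destination is .yz and the source swizzle is yyxx.
new_swizzle = "".join(reswizzle("yxxx", component_map))
stale = mul(src, new_swizzle, imm, channels=[1, 2])
fixed = mul(src, new_swizzle, reswizzle(imm, component_map), channels=[1, 2])

assert new_swizzle == "yyxx"
assert stale != before   # immediate left alone: wrong values
assert fixed == before   # immediate reswizzled: same values as before
```

Applying the same component map to the immediate yields `[-1.0, -1.0, 1.0, 1.0]`, which matches the first of the two valid immediates named in the message.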
* nir: Add unit tests for nir_opt_comparison_pre | Ian Romanick | 2019-07-08 | 4 | -1/+334
  Each test has a comment with the expected before and after NIR. The tests don't actually check this. The tests only check whether or not the optimization pass reported progress. I couldn't think of a robust, future-proof way to check the before and after code.
  Reviewed-by: Matt Turner <[email protected]>
* anv: disable repacking for compression for applicable gen | Dongwon Kim | 2019-07-08 | 1 | -0/+18
  Set bit 15 (Disable Repacking for Compression) of the CACHE_MODE_0 register if the gen attribute 'disable_ccs_repack' is set.
  Signed-off-by: Dongwon Kim <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* iris: disable repacking for compression for applicable gen | Dongwon Kim | 2019-07-08 | 1 | -0/+11
  Set bit 15 (Disable Repacking for Compression) of the CACHE_MODE_0 register if the gen attribute 'disable_ccs_repack' is set.
  Signed-off-by: Dongwon Kim <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* i965: disable repacking for compression for applicable gen | Dongwon Kim | 2019-07-08 | 2 | -0/+10
  Set bit 15 (Disable Repacking for Compression) of the CACHE_MODE_0 register if the gen attribute 'disable_ccs_repack' is set.
  Signed-off-by: Dongwon Kim <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* intel: add disable_ccs_repack to gen_device_info | Dongwon Kim | 2019-07-08 | 2 | -0/+4
  Add a new attribute, 'disable_ccs_repack', to gen_device_info. It indicates whether repacking of components in certain pixel formats before compression needs to be disabled to stay compatible with the decompression capability of the display controller (gen11+).
  Signed-off-by: Dongwon Kim <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
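The pattern shared by the anv, iris, and i965 commits above, a device attribute gating one register bit, can be sketched like this. It is a simplified Python model under stated assumptions: `DeviceInfo` and `cache_mode_0_value` are hypothetical names, and real register programming details (such as how the write is emitted to the hardware) are not modeled.

```python
from dataclasses import dataclass

# Bit 15 is "Disable Repacking for Compression", per the commit messages.
DISABLE_REPACKING_FOR_COMPRESSION = 1 << 15


@dataclass
class DeviceInfo:
    """Hypothetical stand-in for the device description."""
    disable_ccs_repack: bool = False


def cache_mode_0_value(current, devinfo):
    """Return the CACHE_MODE_0 value with bit 15 set when the device asks for it."""
    if devinfo.disable_ccs_repack:
        return current | DISABLE_REPACKING_FOR_COMPRESSION
    return current


with_repack_disabled = cache_mode_0_value(0x0001, DeviceInfo(disable_ccs_repack=True))
unchanged = cache_mode_0_value(0x0001, DeviceInfo())
```

Keeping the decision in a per-device attribute lets each driver apply the same check without repeating the hardware generation test.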
* intel/genxml: correct bit fields in CACHE_MODE_0 reg for gen11 | Dongwon Kim | 2019-07-08 | 1 | -16/+14
  Correct the bit field information for the CACHE_MODE_0 register in the current gen11.xml.
  Signed-off-by: Dongwon Kim <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
* nir: print ptr_stride for deref_casts | Caio Marcelo de Oliveira Filho | 2019-07-08 | 1 | -0/+4
  Reviewed-by: Dave Airlie <[email protected]>
* anv: Advertise VK_EXT_shader_demote_to_helper_invocation | Caio Marcelo de Oliveira Filho | 2019-07-08 | 4 | -0/+9
  Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: Implement SPV_EXT_demote_to_helper_invocation | Caio Marcelo de Oliveira Filho | 2019-07-08 | 2 | -0/+27
  Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: Update the headers from latest Khronos master | Caio Marcelo de Oliveira Filho | 2019-07-08 | 2 | -164/+258
  This corresponds to 29c11140baaf9f7fdaa39a583672c556bf1795a1 in https://github.com/KhronosGroup/SPIRV-Headers.
  Acked-by: Jason Ekstrand <[email protected]>
* intel/fs: Implement "demote to helper invocation" | Caio Marcelo de Oliveira Filho | 2019-07-08 | 1 | -1/+23
  The "demote" intrinsic works like "discard" but doesn't change the control flow, allowing derivative operations to work. These are the semantics of D3D discard.
  The "is_helper_invocation" intrinsic returns true for helper invocations: both the ones that started as helpers and the ones that were demoted. This is needed to avoid changing the behavior of gl_HelperInvocation, which is an input (and so is not expected to change during shader execution).
  v2: Emit the discard jump and comment why it is safe. (Jason)
      Rework the is_helper_invocation() that was stomping f0.1. (Jason)
  Reviewed-by: Jason Ekstrand <[email protected]>
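The difference between the two intrinsics can be pictured with a toy model of two neighboring fragment-shader lanes. This Python sketch is only an illustration of the semantics described above, not the Intel backend: the `Lane` class, the `ddx` helper, and the choice to return `None` for a derivative that involves a terminated lane are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Lane:
    value: float
    started_as_helper: bool = False  # models the gl_HelperInvocation input
    demoted: bool = False
    terminated: bool = False


def discard(lane):
    """Toy model: the lane stops executing."""
    lane.terminated = True


def demote(lane):
    """Toy model: the lane keeps executing, but only as a helper."""
    lane.demoted = True


def is_helper_invocation(lane):
    """True for lanes that started as helpers and for demoted lanes."""
    return lane.started_as_helper or lane.demoted


def ddx(left, right):
    """Horizontal difference; None when a neighbor no longer executes."""
    if left.terminated or right.terminated:
        return None
    return right.value - left.value


left, right = Lane(1.0), Lane(4.0)
demote(right)
after_demote = ddx(left, right)

other_left, other_right = Lane(1.0), Lane(4.0)
discard(other_right)
after_discard = ddx(other_left, other_right)
```

In this model the demoted lane still contributes to the derivative and reports itself through `is_helper_invocation`, while its `started_as_helper` input stays unchanged.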
* nir: Add demote and is_helper_invocation intrinsics | Caio Marcelo de Oliveira Filho | 2019-07-08 | 2 | -0/+11
  From SPV_EXT_demote_to_helper_invocation. Demote will be implemented as a variant of discard, so mark uses_discard if it is used.
  v2: Add CAN_ELIMINATE flag to the new intrinsic. (Jason)
  Reviewed-by: Jason Ekstrand <[email protected]>
* radv: do not emit VGT_FLUSH on GFX10 | Samuel Pitoiset | 2019-07-08 | 1 | -2/+5
  We don't need it.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: Remove now-unused interp_deref handling | Connor Abbott | 2019-07-08 | 1 | -149/+0
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: Use NIR barycentric intrinsics | Connor Abbott | 2019-07-08 | 4 | -69/+143
  This is simpler than radv, since the driver_location is already assigned for us.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: Delete unreachable code | Connor Abbott | 2019-07-08 | 1 | -11/+0
  We always get gl_FragCoord as a system value, not a varying, so this is never hit. We already set PIXEL_CENTER_INTEGER elsewhere.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* compiler: Add color system value | Connor Abbott | 2019-07-08 | 4 | -0/+18
  This is nice to have with radeonsi, where color varyings are handled specially to avoid recompiles.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radv: Use NIR barycentric intrinsics | Connor Abbott | 2019-07-08 | 3 | -191/+156
  We have to add a few lowering passes to deal with things that used to be dealt with inline when creating inputs. We also move the code that fills out the radv_shader_variant_info struct for linking purposes to radv_shader.c, as it's no longer tied to the NIR->LLVM lowering.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: Implement barycentric intrinsics | Connor Abbott | 2019-07-08 | 1 | -0/+198
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* intel/nir: Extract add_const_offset_to_base | Connor Abbott | 2019-07-08 | 3 | -74/+100
  Pretty much every driver using nir_lower_io_to_temporaries followed by nir_lower_io is going to want this. In particular, radv and radeonsi in the next commits.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
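The commit message does not spell out what the extracted helper does. Going only by its name, a plausible reading is that it folds a compile-time-constant offset of an I/O intrinsic into the intrinsic's base index. The Python sketch below illustrates that reading; `IoIntrinsic` and its fields are hypothetical, and the whole transformation is an assumption rather than a description of the Mesa pass.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IoIntrinsic:
    """Hypothetical model of a load/store with a base index and an offset."""
    base: int
    const_offset: Optional[int]  # None when the offset is not a constant


def add_const_offset_to_base(intrin):
    """Fold a constant offset into the base; return True on progress."""
    if intrin.const_offset is None or intrin.const_offset == 0:
        return False  # nothing to fold
    intrin.base += intrin.const_offset
    intrin.const_offset = 0
    return True


load = IoIntrinsic(base=4, const_offset=2)
progress = add_const_offset_to_base(load)

indirect = IoIntrinsic(base=4, const_offset=None)
no_progress = add_const_offset_to_base(indirect)
```

After folding, `load` addresses slot 6 with a zero offset, which is the kind of normalized form a backend can consume without re-deriving the constant.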
* nir/lower_io_to_temporaries: Handle interpolation intrinsics | Connor Abbott | 2019-07-08 | 1 | -0/+166
  These weren't properly supported. This does pretty much the same thing that the radv code did.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>