aboutsummaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
...
* i965/gen10: Remove unnecessary workaround.Rafael Antognolli2019-07-291-16/+0
| | | | | | | | | | | | | | | | | | | | In fact, the description of the workaround states that the mask field doesn't work correctly on gen10, and we need to set it to 0xffff even we we only want to update a single field: "The mask bits are not implemented properly on 3DSTATE_3D_MODE. Driver must always program bits 31:16 of DW1 a value of 0xFFFF. This means if it is only updating 1 field, it must update all the fields to the correct value." So unless we want to change any of the fields of 3DSTATE_3D_MODE, there's not need to emit. Additionally, it seems this workaround is not required on gen11. And last but not least, this workaround is not implemented on iris or anv, and it doesn't seem to be missed there. So let's just remove the whole thing. Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Fix SO offset to be 32-bit in DrawTransformFeedback handlingKenneth Graunke2019-07-291-1/+1
| | | | | | | | | | We accidentally started copying a full 64-bit value rather than copying a 32-bit offset and zeroing the top 32-bits. This caused us to compute bogus vertex counts which could lead to GPU hangs in some cases. Thanks to Clayton Craft for catching the regressions! Fixes: 0e24d10ff5c ("iris: Use gen_mi_builder to handle CS ALU operations.")
* intel: Use a system value for gl_FragCoordJason Ekstrand2019-07-2911-52/+19
| | | | | | | | | | | | It's kind-of an anomaly that the Intel drivers are still treating gl_FragCoord as an input. It also makes zero sense because we have to special-case it in the back-end. Because ANV is the only user of nir_lower_wpos_center, we go ahead and just update it to look for nir_intrinsic_load_frag_coord as part of this patch. Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Treat gl_FragCoord as a varying even when it's a system valueJason Ekstrand2019-07-291-1/+3
| | | | | | | This fixes glsl-fcoord-invariant-pass.shader_test on drivers that set GLSLFragCoordIsSysVal which includes radeonsi among others. Reviewed-by: Kenneth Graunke <[email protected]>
* mesa/spirv: Set frag_coord_is_sysval to GLSLFragCoordIsSysValJason Ekstrand2019-07-291-0/+1
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* intel/fs: Remove calculate_urb_setup from fs_visitorJason Ekstrand2019-07-292-14/+8
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* freedreno/a6xx: fix MSAA resolve hangsRob Clark2019-07-291-11/+4
| | | | | | | | | Seems like RB_BLIT_SCISSOR needs to be aligned to (minimum?) tile size. Fixes intermittent GPU hangs triggered by some of the three.js samples on https://threejs.org/ Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix for array/reg store vs meta instructionsRob Clark2019-07-291-1/+4
| | | | | | | | | | | | | | | fishgl.com has a shader which does roughly: foo = texture(...); if (bar) foo = texture(...); after lowering phi webs to regs we end up w/ a vec4 reg (array). But since it was not an indirect access, we try to skip the extra mov. This results that the per-component fanout (split) meta instructions store directly to the reg (array). Which doesn't work out in RA. Signed-off-by: Rob Clark <[email protected]>
* meson: bump required version to 0.46Eric Engestrom2019-07-291-1/+1
| | | | | | | | | | | 0.45 has a few annoying bugs (like the one in !358 [1]), and 0.46 is well over a year old by now, so let's move to it. [1] https://gitlab.freedesktop.org/mesa/mesa/merge_requests/358 Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* radeon/vcn/vp9: add Arcturus VP9 supportLeo Liu2019-07-291-3/+3
| | | | | | | | Arcturus CHIP enum is less than Navi10, since it's still gfx9, but its VCN version belongs to VCN2.x Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeon/vcn: add Arcturus decode supportLeo Liu2019-07-291-1/+11
| | | | | | | different internal registers offset from previous HW Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* amd: add support for ArcturusMarek Olšák2019-07-294-0/+11
| | | | Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: add AMD_DEBUG=nogfx for testingMarek Olšák2019-07-292-0/+5
| | | | Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: add support for compute-only chipsMarek Olšák2019-07-296-6/+22
| | | | Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* gallium/auxiliary/vl: add compute shaders for deint yuvSonny Jiang2019-07-294-31/+403
| | | | | | Signed-off-by: Sonny Jiang <[email protected]> Reviewed-by: Signed-off-by: James Zhu <[email protected]> Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* gallium/auxiliary/vl: don't call gfx functions on compute-only chipsSonny Jiang2019-07-291-75/+83
| | | | | | Signed-off-by: Sonny Jiang <[email protected]> Reviewed-by: Signed-off-by: James Zhu <[email protected]> Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* gallium/auxiliary/vl: add PIPE_CAP_GRAPHICS check for vl compositorJames Zhu2019-07-292-64/+65
| | | | | | | | Init graphic shader Only when PIPE_CAP_GRAPHICS is true. Signed-off-by: James Zhu <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* gallium: create multimedia contexts as compute-only if graphics is unsupportedMarek Olšák2019-07-299-12/+21
| | | | | Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* gallium: add PIPE_CAP_GRAPHICSMarek Olšák2019-07-293-0/+4
| | | | | Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radv: implement VK_EXT_index_type_uint8Samuel Pitoiset2019-07-293-6/+61
| | | | | | | Natively supported on VI+. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* anv: implement VK_EXT_index_type_uint8Lionel Landwerlin2019-07-294-22/+66
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* vulkan: Bump headers to 1.1.117Lionel Landwerlin2019-07-292-24/+250
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* include/vulkan: bump vk_android_native_bufferLionel Landwerlin2019-07-291-15/+54
| | | | | | | Taken off https://android.googlesource.com/platform/frameworks/native/+/refs/tags/android-9.0.0_r45/vulkan/include/vulkan/vk_android_native_buffer.h Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/mi: only resolve to a temp register if source isn't in memoryEric Engestrom2019-07-291-1/+1
| | | | | | | | | aka. fix a s/||/&&/ typo Fixes: 74063ee61aadd1371a9b ("intel/mi: Add a new gen_mi_store_if() helper.") Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* gitlab-ci: Enable freedreno shader-db runs.Eric Anholt2019-07-291-3/+5
| | | | | | | | Now that helgrind is less upset and I've completed many successful full shader-db runs, we should be able to enable freedreno shader-db runs for Mesa checkins on the tiny public shader-db. Reviewed-by: Rob Clark <[email protected]>
* nir: Fix helgrind complaints about data race in trivial_swizzle init.Eric Anholt2019-07-291-3/+3
| | | | | | | | | | Even if the data race wasn't real (I'm not great at reasoning about this), helgrind is a nice enough tool that keeping noise out of it is probably worthwhile. Besides, typing out the numbers keeps the data in the read-only data section instead of emitting code to initialize it every time. Reviewed-by: Iago Toral Quiroga <[email protected]>
* freedreno: Fix data race on making the shader's id.Eric Anholt2019-07-291-1/+2
| | | | | | | The value is only used for IR3_DBG_DISASM, but it cleans up the helgrind output. Reviewed-by: Rob Clark <[email protected]>
* freedreno: Take a lock around shader variant creation.Eric Anholt2019-07-292-0/+7
| | | | | | | | | Shaders are shared across contexts in gallium (part of making it so that you get shader compile at link time, for shader-db and to reduce compiles at draw time). So, we need to protect from variant creation for a shader from multiple threads at the same time. Reviewed-by: Rob Clark <[email protected]>
* freedreno: Fix data races with allocating/freeing struct ir3.Eric Anholt2019-07-291-1/+1
| | | | | | | | | | | | | | There is a single ir3_compiler in the screen, and each context may be compiling ir3 shaders, which call ir3_create. ralloc doesn't do any locking on its own, so eventually you can end up racing to break ralloc's linked lists. We really don't want struct ir3 to live as long as the compiler (maybe struct ir3_shader's lifetime, if anything), so you'd better be freeing it anyway. Fixes: 8fe20762433d ("freedreno/ir3: convert over to ralloc") Reviewed-by: Rob Clark <[email protected]>
* freedreno: Fix helgrind complaint on shader-db key setup.Eric Anholt2019-07-291-2/+1
| | | | | | | If the variable's going to be static, we shouldn't be memsetting it from every thread and instead just have it in the data section. Reviewed-by: Rob Clark <[email protected]>
* radv: Take variable descriptor counts into account for buffer entries.Bas Nieuwenhuizen2019-07-291-1/+10
| | | | | | Fixes: b5e04e9217b "radv: Support allocating variable size descriptor sets." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111019 Reviewed-by: Samuel Pitoiset <[email protected]>
* anv: Don't claim support for 24 and 48-bit formats on IVBJason Ekstrand2019-07-291-0/+8
| | | | Cc: [email protected]
* isl/formats: R8G8B8_UNORM_SRGB isn't supported on HSWJason Ekstrand2019-07-291-1/+5
| | | | | | | | | On Haswell, the format works but it doesn't properly do an sRGB decode. It appears to act identically to R8G8B8_UNORM. Only Vulkan uses this format so this only affects Vulkan on HSW. Cc: [email protected] Reviewed-by: Eric Engestrom <[email protected]>
* pan/midgard: Fix alpha test w.r.t new indexingAlyssa Rosenzweig2019-07-291-1/+2
| | | | | | Fixes: 9beb3391b55 ("pan/midgard: Tag SSA/reg") Signed-off-by: Alyssa Rosenzweig <[email protected]>
* softpipe: Don't draw when rasterizer_discard is setGert Wollny2019-07-291-0/+3
| | | | | | | | | | | | | Fixes: dEQP-GLES3.functional.rasterizer_discard.basic.write_depth_points dEQP-GLES3.functional.rasterizer_discard.basic.write_stencil_points dEQP-GLES3.functional.rasterizer_discard.fbo.write_depth_points dEQP-GLES3.functional.rasterizer_discard.fbo.write_stencil_points dEQP-GLES3.functional.rasterizer_discard.scissor.write_depth_points dEQP-GLES3.functional.rasterizer_discard.scissor.write_stencil_points Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* softpipe: Fix cube arrays layer selectionGert Wollny2019-07-291-8/+8
| | | | | | | | | | | To select the correct layer the z-coordinate must be rounded before it is multiplied by six. Fixes a number of tests out of dEQP-GLES31.functional.texture.filtering.cube_array.formats.* Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* vulkan/wsi/wayland: implement acquire timeoutLionel Landwerlin2019-07-291-25/+51
| | | | | | | | | | | | | | | v2: Eric's nits v3: Reuse timespec utils (Daniel) Deal with ppoll being interrupted by a signal (Daniel) v4: Remove unnecessary time check v5: Deal with EAGAIN from wl_display_prepare_read_queue() (Daniel) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> (v2) Reviewed-by: Daniel Stone <[email protected]>
* util: add a timespec helperLionel Landwerlin2019-07-295-0/+673
| | | | | | | | Copied from Weston, upon Daniel's suggestion Signed-off-by: Lionel Landwerlin <[email protected]> Suggested-by: Daniel Stone <[email protected]> Reviewed-by: Daniel Stone <[email protected]>
* intel: replace large stack buffer with heap allocationEric Engestrom2019-07-293-31/+37
| | | | | | | | | For now, this keeps the "100 bytes" allocation; we can try to figure out the correct size as a follow up. Suggested-by: Lionel Landwerlin <[email protected]> Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* radv/gfx10: do not use the fast depth or stencil clear bytes pathSamuel Pitoiset2019-07-291-2/+3
| | | | | | | | | It causes issues on GFX10. This fixes rendering issues with vkmark and Wreckfest at least. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]
* ac: do not crash when the buffer data format is invalidSamuel Pitoiset2019-07-291-0/+1
| | | | | | | | | | This might happen when a pipeline doesn't define the vertex input state, so the buffer data format is 0 (aka INVALID). This fixes crashes when compiling some shaders on GFX10. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: fix txf_ms with an offsetRhys Perry2019-07-291-2/+2
| | | | | | | | | | Seems to fix some hair artifacts in Max Payne 3: https://github.com/daniel-schuermann/mesa/issues/76 Signed-off-by: Rhys Perry <[email protected]> Fixes: f4e499ec791 ('radv: add initial non-conformant radv vulkan driver') Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Delete unused local variables in optimization loopConnor Abbott2019-07-291-0/+2
| | | | | | | | | | | | | | | | Totals from affected shaders: SGPRS: 376 -> 376 (0.00 %) VGPRS: 620 -> 560 (-9.68 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 292 -> 292 (0.00 %) dwords per thread Code Size: 20024 -> 20144 (0.60 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 25 -> 25 (0.00 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir/find_array_copies: Handle wildcards and overlapping copiesConnor Abbott2019-07-293-185/+405
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit rewrites opt_find_array_copies to be able to handle an array copy sequence with other intervening operations in between. In particular, this handles the case where we OpLoad an array of structs and then OpStore it, which generates code like: foo[0].a = bar[0].a foo[0].b = bar[0].b foo[1].a = bar[1].a foo[1].b = bar[1].b ... that wasn't recognized by the previous pass. In order to correctly handle copying arrays of arrays, and in particular to correctly handle copies involving wildcards, we need to use a tree structure similar to lower_vars_to_ssa so that we can walk all the partial array copies invalidated by a particular write, including ones where one of the common indices is a wildcard. I actually think that when factoring in the needed hashing/comparing code, a hash table based approach wouldn't be a lot smaller anyways. All of the changes come from tessellation control shaders in Strange Brigade, where we're able to remove the DXVK-inserted copy at the beginning of the shader. These are the result for radv: Totals from affected shaders: SGPRS: 4576 -> 4576 (0.00 %) VGPRS: 13784 -> 5560 (-59.66 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 8696 -> 6876 (-20.93 %) dwords per thread Code Size: 329940 -> 263268 (-20.21 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 330 -> 898 (172.12 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Print array deref indices as decimalConnor Abbott2019-07-291-1/+1
| | | | | | | We print the size as decimal too, and using hex without a leading "0x" was very confusing. Reviewed-by: Jason Ekstrand <[email protected]>
* lima/gpir/sched: Handle more special ops in can_use_complex()Connor Abbott2019-07-281-5/+24
| | | | | | | | | | | | | We were missing handling for a few other ops that rearrange their sources somehow in codegen, namely complex2 and select. This should fix [email protected]@execution@built-in-functions@vs-asin-vec3 and possibly other random regressions from the new scheduler which were supposed to be fixed in the commit right after. Fixes: 54434fe6706 ("lima/gpir: Rework the scheduler") Signed-off-by: Connor Abbott <[email protected]> Acked-by: Qiang Yu <[email protected]>
* lima/gp: Clean up lima_program_optimize_vs_nir() a littleConnor Abbott2019-07-281-1/+1
| | | | | | | | | | Remove an unnecessary nir_lower_regs_to_ssa as that should be done by the state tracker, and add a missing DCE pass after running copy propagation in order to remove the dead copies. This shouldn't fix anything but the second part will reduce shader sizes. Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* lima/gpir/sched: Don't try to spill when something else has succeededConnor Abbott2019-07-281-7/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In try_node(), we assume that the node we pick can still be scheduled successfully after speculatively trying all the other nodes. Normally we always undo every node after speculating it, so that when we finally schedule best_node the scheduler state is exactly the same and it succeeds. However, we also try to spill nodes, which can change the state and in a corner case that can make scheduling best_node fail. In particular, the following sequence of events happened with piglit shaders@glsl-vs-if-nested: a partially-ready node N was spilled and a register store node S, which is a use of N, was created and then later the other uses of N were scheduled, so that S is now ready and N is partially ready. First we try to schedule S and succeed, then we try to schedule another node M, which fails, so we try to spill the remaining uses of N. This succeeds, but scheduling M still fails so that best_node is still S. However since one of the uses of N is one cycle ago, and therefore we inserted a read dependent on S one cycle ago when spilling N, S can no longer be scheduled as read-after-write latency is three cycles. While we could ad-hoc try to catch cases like this, or (the best option but very complicated) treat the spill as speculative and roll it back if we decide not to schedule the node, a simpler solution is to just give up on spilling if we've already successfully speculatively scheduled another node. We'd give up a few cases where we discover that by spilling even harder we could schedule a more desirable node, but that seems like it would be pretty rare in practice. With this we guarantee that nothing has been touched after best_node was successfully scheduled. We also cut down on pointless spilling, since if we already scheduled a node it's unlikely that spilling harder will let us schedule an even better node, and hence any spilling at this point is probably useless. While we're here, clean up the code around spilling by flattening the two if's and getting rid of the second unnecessary check for INT_MIN. Fixes: 54434fe6706 ("lima/gpir: Rework the scheduler") Acked-by: Qiang Yu <[email protected]> Signed-off-by: Connor Abbott <[email protected]>
* nv50/ir: don't consider the main compute function as taking argumentsIlia Mirkin2019-07-271-1/+1
| | | | | | | | | | | | With OpenCL, kernels can take arguments and return values (?). However in practice, there is no more TGSI compute implementation, and even if there were, it would probably have named functions and no explicit main. This improves RA considerably for compute shaders, since temps are not kept around as return values. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Karol Herbst <[email protected]>
* nv50/ir: handle insn not being there for definition of CVT argIlia Mirkin2019-07-271-2/+3
| | | | | | | | | This can happen if it's e.g. a uniform or a function argument. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111217 Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Karol Herbst <[email protected]> Cc: [email protected]