Commit message | Author | Age | Files | Lines
* freedreno/ir3: fix for array/reg store vs meta instructions | Rob Clark | 2019-07-29 | 1 | -1/+4
| | | | | | | | | | | | | | | fishgl.com has a shader which does roughly: foo = texture(...); if (bar) foo = texture(...); After lowering phi webs to regs, we end up with a vec4 reg (array). But since it is not an indirect access, we try to skip the extra mov. As a result, the per-component fanout (split) meta instructions store directly to the reg (array), which doesn't work out in RA. Signed-off-by: Rob Clark <[email protected]>
* meson: bump required version to 0.46 | Eric Engestrom | 2019-07-29 | 1 | -1/+1
| | | | | | | | | | | 0.45 has a few annoying bugs (like the one in !358 [1]), and 0.46 is well over a year old by now, so let's move to it. [1] https://gitlab.freedesktop.org/mesa/mesa/merge_requests/358 Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* radeon/vcn/vp9: add Arcturus VP9 support | Leo Liu | 2019-07-29 | 1 | -3/+3
| | | | | | | | Arcturus CHIP enum is less than Navi10, since it's still gfx9, but its VCN version belongs to VCN2.x Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeon/vcn: add Arcturus decode support | Leo Liu | 2019-07-29 | 1 | -1/+11
| | | | | | | Arcturus has different internal register offsets from previous HW. Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* amd: add support for Arcturus | Marek Olšák | 2019-07-29 | 4 | -0/+11
| | | | Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: add AMD_DEBUG=nogfx for testing | Marek Olšák | 2019-07-29 | 2 | -0/+5
| | | | Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: add support for compute-only chips | Marek Olšák | 2019-07-29 | 6 | -6/+22
| | | | Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* gallium/auxiliary/vl: add compute shaders for deint yuv | Sonny Jiang | 2019-07-29 | 4 | -31/+403
| | | | | | Signed-off-by: Sonny Jiang <[email protected]> Reviewed-by: Signed-off-by: James Zhu <[email protected]> Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* gallium/auxiliary/vl: don't call gfx functions on compute-only chips | Sonny Jiang | 2019-07-29 | 1 | -75/+83
| | | | | | Signed-off-by: Sonny Jiang <[email protected]> Reviewed-by: Signed-off-by: James Zhu <[email protected]> Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* gallium/auxiliary/vl: add PIPE_CAP_GRAPHICS check for vl compositor | James Zhu | 2019-07-29 | 2 | -64/+65
| | | | | | | | Init the graphics shader only when PIPE_CAP_GRAPHICS is true. Signed-off-by: James Zhu <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* gallium: create multimedia contexts as compute-only if graphics is unsupported | Marek Olšák | 2019-07-29 | 9 | -12/+21
| | | | | Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* gallium: add PIPE_CAP_GRAPHICS | Marek Olšák | 2019-07-29 | 3 | -0/+4
| | | | | Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radv: implement VK_EXT_index_type_uint8 | Samuel Pitoiset | 2019-07-29 | 3 | -6/+61
| | | | | | | Natively supported on VI+. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* anv: implement VK_EXT_index_type_uint8 | Lionel Landwerlin | 2019-07-29 | 4 | -22/+66
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* vulkan: Bump headers to 1.1.117 | Lionel Landwerlin | 2019-07-29 | 2 | -24/+250
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* include/vulkan: bump vk_android_native_buffer | Lionel Landwerlin | 2019-07-29 | 1 | -15/+54
| | | | | | | Taken from https://android.googlesource.com/platform/frameworks/native/+/refs/tags/android-9.0.0_r45/vulkan/include/vulkan/vk_android_native_buffer.h Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/mi: only resolve to a temp register if source isn't in memory | Eric Engestrom | 2019-07-29 | 1 | -1/+1
| | | | | | | | | In other words, fix a s/||/&&/ typo. Fixes: 74063ee61aadd1371a9b ("intel/mi: Add a new gen_mi_store_if() helper.") Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
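The effect of an `||` versus `&&` mix-up in a "source isn't in memory" test can be sketched as follows. The enum and function names are hypothetical and are not the driver's actual types; the exact shape of the original condition is also an assumption. The point is only that "not A or not B" holds for every input when A and B are distinct, so the guard never filters anything.

```c
#include <stdbool.h>

/* Hypothetical value kinds for this sketch; the real gen_mi types may differ. */
enum value_kind {
   VALUE_KIND_IMM,
   VALUE_KIND_REG,
   VALUE_KIND_MEM32,
   VALUE_KIND_MEM64,
};

/* Buggy form: a value cannot be both MEM32 and MEM64, so at least one of
 * the two inequalities always holds and the result is true for any kind. */
bool
needs_temp_buggy(enum value_kind kind)
{
   return kind != VALUE_KIND_MEM32 || kind != VALUE_KIND_MEM64;
}

/* Fixed form: true only when the source is not in memory at all. */
bool
needs_temp_fixed(enum value_kind kind)
{
   return kind != VALUE_KIND_MEM32 && kind != VALUE_KIND_MEM64;
}
```

With the buggy form, a memory source would still be resolved to a temporary register; with the fixed form, only immediates and registers are.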
* gitlab-ci: Enable freedreno shader-db runs. | Eric Anholt | 2019-07-29 | 1 | -3/+5
| | | | | | | | Now that helgrind is less upset and I've completed many successful full shader-db runs, we should be able to enable freedreno shader-db runs for Mesa checkins on the tiny public shader-db. Reviewed-by: Rob Clark <[email protected]>
* nir: Fix helgrind complaints about data race in trivial_swizzle init. | Eric Anholt | 2019-07-29 | 1 | -3/+3
| | | | | | | | | | Even if the data race wasn't real (I'm not great at reasoning about this), helgrind is a nice enough tool that keeping noise out of it is probably worthwhile. Besides, typing out the numbers keeps the data in the read-only data section instead of emitting code to initialize it every time. Reviewed-by: Iago Toral Quiroga <[email protected]>
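A minimal sketch of the "type out the numbers" approach, assuming a 16-entry identity table. The macro name, the table width, and the accessor are assumptions made for illustration, not the NIR source.

```c
#include <stdint.h>

#define SKETCH_MAX_VEC_COMPONENTS 16  /* assumed width for this sketch */

/* Spelled-out initializer: the array is a compile-time constant, so the
 * compiler can place it in read-only data and no thread ever writes to it.
 * The alternative, a non-const static array filled by a loop on each call,
 * would have every calling thread storing to the same memory. */
static const uint8_t trivial_swizzle[SKETCH_MAX_VEC_COMPONENTS] = {
   0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
};

/* Returns the identity swizzle; callers only ever read from it. */
const uint8_t *
get_trivial_swizzle(void)
{
   return trivial_swizzle;
}
```

Because nothing writes to the table after load time, a race detector has nothing to report, whether or not the original stores were a real problem.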
* freedreno: Fix data race on making the shader's id. | Eric Anholt | 2019-07-29 | 1 | -1/+2
| | | | | | | The value is only used for IR3_DBG_DISASM, but it cleans up the helgrind output. Reviewed-by: Rob Clark <[email protected]>
* freedreno: Take a lock around shader variant creation. | Eric Anholt | 2019-07-29 | 2 | -0/+7
| | | | | | | | | Shaders are shared across contexts in gallium (part of making it so that you get shader compile at link time, for shader-db and to reduce compiles at draw time). So, we need to protect against multiple threads creating a variant for the same shader at the same time. Reviewed-by: Rob Clark <[email protected]>
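The locking pattern described here can be sketched as a lookup-or-create helper that holds a mutex across the whole sequence. The structs and function names are hypothetical, simplified stand-ins and do not reflect the freedreno data structures.

```c
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for a shader and its variants. */
struct variant {
   uint32_t key;
   struct variant *next;
};

struct shader {
   pthread_mutex_t lock;      /* guards the variants list */
   struct variant *variants;
};

void
shader_init(struct shader *sh)
{
   pthread_mutex_init(&sh->lock, NULL);
   sh->variants = NULL;
}

/* Look up the variant for `key`, creating it on first use. The whole
 * lookup-or-create sequence runs under the lock, so two threads asking
 * for the same key cannot both insert a new entry. Returns NULL if the
 * allocation fails. */
struct variant *
shader_get_variant(struct shader *sh, uint32_t key)
{
   struct variant *v;

   pthread_mutex_lock(&sh->lock);
   for (v = sh->variants; v; v = v->next) {
      if (v->key == key)
         break;
   }
   if (!v) {
      v = malloc(sizeof(*v));
      if (v) {
         v->key = key;
         v->next = sh->variants;
         sh->variants = v;
      }
   }
   pthread_mutex_unlock(&sh->lock);

   return v;
}
```

Holding the lock only around the insertion would not be enough: two threads could both miss in the lookup and then both insert, which is why the sketch covers the search as well.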
* freedreno: Fix data races with allocating/freeing struct ir3. | Eric Anholt | 2019-07-29 | 1 | -1/+1
| | | | | | | | | | | | | | There is a single ir3_compiler in the screen, and each context may be compiling ir3 shaders, which call ir3_create. ralloc doesn't do any locking on its own, so eventually you can end up racing to break ralloc's linked lists. We really don't want struct ir3 to live as long as the compiler (maybe struct ir3_shader's lifetime, if anything), so you'd better be freeing it anyway. Fixes: 8fe20762433d ("freedreno/ir3: convert over to ralloc") Reviewed-by: Rob Clark <[email protected]>
* freedreno: Fix helgrind complaint on shader-db key setup. | Eric Anholt | 2019-07-29 | 1 | -2/+1
| | | | | | | If the variable's going to be static, we shouldn't be memsetting it from every thread and instead just have it in the data section. Reviewed-by: Rob Clark <[email protected]>
* radv: Take variable descriptor counts into account for buffer entries. | Bas Nieuwenhuizen | 2019-07-29 | 1 | -1/+10
| | | | | | Fixes: b5e04e9217b "radv: Support allocating variable size descriptor sets." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111019 Reviewed-by: Samuel Pitoiset <[email protected]>
* anv: Don't claim support for 24 and 48-bit formats on IVB | Jason Ekstrand | 2019-07-29 | 1 | -0/+8
| | | | Cc: [email protected]
* isl/formats: R8G8B8_UNORM_SRGB isn't supported on HSW | Jason Ekstrand | 2019-07-29 | 1 | -1/+5
| | | | | | | | | On Haswell, the format works but it doesn't properly do an sRGB decode. It appears to act identically to R8G8B8_UNORM. Only Vulkan uses this format so this only affects Vulkan on HSW. Cc: [email protected] Reviewed-by: Eric Engestrom <[email protected]>
* pan/midgard: Fix alpha test w.r.t new indexing | Alyssa Rosenzweig | 2019-07-29 | 1 | -1/+2
| | | | | | Fixes: 9beb3391b55 ("pan/midgard: Tag SSA/reg") Signed-off-by: Alyssa Rosenzweig <[email protected]>
* softpipe: Don't draw when rasterizer_discard is set | Gert Wollny | 2019-07-29 | 1 | -0/+3
| Fixes:
|   dEQP-GLES3.functional.rasterizer_discard.basic.write_depth_points
|   dEQP-GLES3.functional.rasterizer_discard.basic.write_stencil_points
|   dEQP-GLES3.functional.rasterizer_discard.fbo.write_depth_points
|   dEQP-GLES3.functional.rasterizer_discard.fbo.write_stencil_points
|   dEQP-GLES3.functional.rasterizer_discard.scissor.write_depth_points
|   dEQP-GLES3.functional.rasterizer_discard.scissor.write_stencil_points
|
| Signed-off-by: Gert Wollny <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* softpipe: Fix cube arrays layer selection | Gert Wollny | 2019-07-29 | 1 | -8/+8
| | | | | | | | | | | To select the correct layer the z-coordinate must be rounded before it is multiplied by six. Fixes a number of tests out of dEQP-GLES31.functional.texture.filtering.cube_array.formats.* Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
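Why the order of rounding and scaling matters can be shown with a small sketch. Both functions are hypothetical illustrations and are not the softpipe code; they assume a non-negative coordinate and leave out the clamping a real sampler would need.

```c
/* Both helpers assume z >= 0, so adding 0.5 and truncating rounds to the
 * nearest integer. */

/* Round the array coordinate first, then scale by six: the result is
 * always a multiple of six, i.e. the first face of some cube. */
int
cube_array_first_face(float z)
{
   int cube = (int)(z + 0.5f);
   return 6 * cube;
}

/* Scale first, then round: the result can land on a face in the middle
 * of a cube instead of on a cube boundary. */
int
cube_array_first_face_scaled_first(float z)
{
   return (int)(6.0f * z + 0.5f);
}
```

For `z = 1.4f` the first helper selects cube 1 and returns layer 6, while the second returns layer 8, which is not the start of any cube.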
* vulkan/wsi/wayland: implement acquire timeout | Lionel Landwerlin | 2019-07-29 | 1 | -25/+51
| v2: Eric's nits
| v3: Reuse timespec utils (Daniel)
|     Deal with ppoll being interrupted by a signal (Daniel)
| v4: Remove unnecessary time check
| v5: Deal with EAGAIN from wl_display_prepare_read_queue() (Daniel)
|
| Signed-off-by: Lionel Landwerlin <[email protected]>
| Reviewed-by: Eric Engestrom <[email protected]> (v2)
| Reviewed-by: Daniel Stone <[email protected]>
* util: add a timespec helper | Lionel Landwerlin | 2019-07-29 | 5 | -0/+673
| | | | | | | | Copied from Weston, upon Daniel's suggestion Signed-off-by: Lionel Landwerlin <[email protected]> Suggested-by: Daniel Stone <[email protected]> Reviewed-by: Daniel Stone <[email protected]>
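The kind of arithmetic such a helper provides, for example computing a deadline or the time remaining before a timeout, can be sketched as below. The function names `ts_add_nsec` and `ts_sub` are hypothetical and chosen for this example; they are not claimed to match the helper's actual API. Both assume normalized inputs, with `tv_nsec` in the range [0, 1e9).

```c
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* r = a + nsec, with tv_nsec normalized back into [0, NSEC_PER_SEC).
 * nsec may be negative. */
void
ts_add_nsec(struct timespec *r, const struct timespec *a, int64_t nsec)
{
   r->tv_sec = a->tv_sec + (time_t)(nsec / NSEC_PER_SEC);
   r->tv_nsec = a->tv_nsec + (long)(nsec % NSEC_PER_SEC);

   if (r->tv_nsec >= NSEC_PER_SEC) {
      r->tv_sec++;
      r->tv_nsec -= NSEC_PER_SEC;
   } else if (r->tv_nsec < 0) {
      r->tv_sec--;
      r->tv_nsec += NSEC_PER_SEC;
   }
}

/* r = a - b, borrowing one second when the nanosecond part underflows. */
void
ts_sub(struct timespec *r, const struct timespec *a, const struct timespec *b)
{
   r->tv_sec = a->tv_sec - b->tv_sec;
   r->tv_nsec = a->tv_nsec - b->tv_nsec;

   if (r->tv_nsec < 0) {
      r->tv_sec--;
      r->tv_nsec += NSEC_PER_SEC;
   }
}
```

A timeout loop could compute a deadline once with `ts_add_nsec` and, each time a wait is interrupted, call `ts_sub(&remaining, &deadline, &now)` to get the time still left to wait.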
* intel: replace large stack buffer with heap allocation | Eric Engestrom | 2019-07-29 | 3 | -31/+37
| | | | | | | | | For now, this keeps the "100 bytes" allocation; we can try to figure out the correct size as a follow up. Suggested-by: Lionel Landwerlin <[email protected]> Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* radv/gfx10: do not use the fast depth or stencil clear bytes path | Samuel Pitoiset | 2019-07-29 | 1 | -2/+3
| | | | | | | | | It causes issues on GFX10. This fixes rendering issues with vkmark and Wreckfest at least. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: do not crash when the buffer data format is invalid | Samuel Pitoiset | 2019-07-29 | 1 | -0/+1
| | | | | | | | | | This might happen when a pipeline doesn't define the vertex input state, so the buffer data format is 0 (aka INVALID). This fixes crashes when compiling some shaders on GFX10. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: fix txf_ms with an offset | Rhys Perry | 2019-07-29 | 1 | -2/+2
| | | | | | | | | | Seems to fix some hair artifacts in Max Payne 3: https://github.com/daniel-schuermann/mesa/issues/76 Signed-off-by: Rhys Perry <[email protected]> Fixes: f4e499ec791 ('radv: add initial non-conformant radv vulkan driver') Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Delete unused local variables in optimization loop | Connor Abbott | 2019-07-29 | 1 | -0/+2
| Totals from affected shaders:
| SGPRS: 376 -> 376 (0.00 %)
| VGPRS: 620 -> 560 (-9.68 %)
| Spilled SGPRs: 0 -> 0 (0.00 %)
| Spilled VGPRs: 0 -> 0 (0.00 %)
| Private memory VGPRs: 0 -> 0 (0.00 %)
| Scratch size: 292 -> 292 (0.00 %) dwords per thread
| Code Size: 20024 -> 20144 (0.60 %) bytes
| LDS: 0 -> 0 (0.00 %) blocks
| Max Waves: 25 -> 25 (0.00 %)
| Wait states: 0 -> 0 (0.00 %)
|
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir/find_array_copies: Handle wildcards and overlapping copies | Connor Abbott | 2019-07-29 | 3 | -185/+405
| This commit rewrites opt_find_array_copies to be able to handle an array copy sequence with other intervening operations in between. In particular, this handles the case where we OpLoad an array of structs and then OpStore it, which generates code like:
|
|     foo[0].a = bar[0].a
|     foo[0].b = bar[0].b
|     foo[1].a = bar[1].a
|     foo[1].b = bar[1].b
|     ...
|
| that wasn't recognized by the previous pass.
|
| In order to correctly handle copying arrays of arrays, and in particular to correctly handle copies involving wildcards, we need to use a tree structure similar to lower_vars_to_ssa so that we can walk all the partial array copies invalidated by a particular write, including ones where one of the common indices is a wildcard. I actually think that when factoring in the needed hashing/comparing code, a hash table based approach wouldn't be a lot smaller anyways.
|
| All of the changes come from tessellation control shaders in Strange Brigade, where we're able to remove the DXVK-inserted copy at the beginning of the shader. These are the results for radv:
|
| Totals from affected shaders:
| SGPRS: 4576 -> 4576 (0.00 %)
| VGPRS: 13784 -> 5560 (-59.66 %)
| Spilled SGPRs: 0 -> 0 (0.00 %)
| Spilled VGPRs: 0 -> 0 (0.00 %)
| Private memory VGPRs: 0 -> 0 (0.00 %)
| Scratch size: 8696 -> 6876 (-20.93 %) dwords per thread
| Code Size: 329940 -> 263268 (-20.21 %) bytes
| LDS: 0 -> 0 (0.00 %) blocks
| Max Waves: 330 -> 898 (172.12 %)
| Wait states: 0 -> 0 (0.00 %)
|
| Reviewed-by: Jason Ekstrand <[email protected]>
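The core idea, recognizing that a run of per-element, per-field copies adds up to one whole-array copy, can be sketched in a heavily simplified form. This is not the NIR pass: it ignores intervening writes, nested arrays, and wildcards, and uses a flat bitmask instead of the tree structure the commit describes. All names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One scalar copy of the form dst[dst_index].field = src[src_index].field. */
struct elem_copy {
   unsigned dst_index;
   unsigned src_index;
   unsigned field;
};

/* Returns true if the copies together cover every (index, field) slot of
 * an array of `array_len` structs with `num_fields` fields each, always
 * copying between matching indices. Limited to 64 slots for this sketch. */
bool
is_whole_array_copy(const struct elem_copy *copies, size_t count,
                    unsigned array_len, unsigned num_fields)
{
   uint64_t seen = 0;

   if (array_len == 0 || num_fields == 0 ||
       array_len > 64 || num_fields > 64 || array_len * num_fields > 64)
      return false;

   unsigned slots = array_len * num_fields;

   for (size_t i = 0; i < count; i++) {
      const struct elem_copy *c = &copies[i];

      if (c->dst_index != c->src_index ||
          c->dst_index >= array_len || c->field >= num_fields)
         return false;

      /* Mark this (index, field) slot as copied. */
      seen |= UINT64_C(1) << (c->dst_index * num_fields + c->field);
   }

   uint64_t all = slots == 64 ? UINT64_MAX : (UINT64_C(1) << slots) - 1;
   return seen == all;
}
```

Once every slot is known to be covered, a pass in this style could replace the individual copies with a single wildcard copy along the lines of `foo[*] = bar[*]`.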
* nir: Print array deref indices as decimal | Connor Abbott | 2019-07-29 | 1 | -1/+1
| | | | | | | We print the size as decimal too, and using hex without a leading "0x" was very confusing. Reviewed-by: Jason Ekstrand <[email protected]>
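The ambiguity is easy to reproduce with plain `snprintf`. The two formatting helpers below are hypothetical and written only for this illustration; they are not the NIR printing code.

```c
#include <stdio.h>

/* Formats an array deref as "name[index]" with the index in decimal. */
int
format_deref_decimal(char *buf, size_t size, const char *name, unsigned index)
{
   return snprintf(buf, size, "%s[%u]", name, index);
}

/* Same, but in hex without a "0x" prefix: index 16 prints as "10", which
 * reads like decimal ten next to a decimal array size. */
int
format_deref_bare_hex(char *buf, size_t size, const char *name, unsigned index)
{
   return snprintf(buf, size, "%s[%x]", name, index);
}
```

For an index of 16, the decimal helper writes `foo[16]` and the bare-hex helper writes `foo[10]`.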
* lima/gpir/sched: Handle more special ops in can_use_complex() | Connor Abbott | 2019-07-28 | 1 | -5/+24
| | | | | | | | | | | | | We were missing handling for a few other ops that rearrange their sources somehow in codegen, namely complex2 and select. This should fix [email protected]@execution@built-in-functions@vs-asin-vec3 and possibly other random regressions from the new scheduler which were supposed to be fixed in the commit right after. Fixes: 54434fe6706 ("lima/gpir: Rework the scheduler") Signed-off-by: Connor Abbott <[email protected]> Acked-by: Qiang Yu <[email protected]>
* lima/gp: Clean up lima_program_optimize_vs_nir() a little | Connor Abbott | 2019-07-28 | 1 | -1/+1
| | | | | | | | | | Remove an unnecessary nir_lower_regs_to_ssa as that should be done by the state tracker, and add a missing DCE pass after running copy propagation in order to remove the dead copies. This shouldn't fix anything but the second part will reduce shader sizes. Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* lima/gpir/sched: Don't try to spill when something else has succeeded | Connor Abbott | 2019-07-28 | 1 | -7/+4
| In try_node(), we assume that the node we pick can still be scheduled successfully after speculatively trying all the other nodes. Normally we always undo every node after speculating it, so that when we finally schedule best_node the scheduler state is exactly the same and it succeeds. However, we also try to spill nodes, which can change the state and in a corner case that can make scheduling best_node fail.
|
| In particular, the following sequence of events happened with piglit shaders@glsl-vs-if-nested: a partially-ready node N was spilled and a register store node S, which is a use of N, was created and then later the other uses of N were scheduled, so that S is now ready and N is partially ready. First we try to schedule S and succeed, then we try to schedule another node M, which fails, so we try to spill the remaining uses of N. This succeeds, but scheduling M still fails so that best_node is still S. However, since one of the uses of N is one cycle ago, and therefore we inserted a read dependent on S one cycle ago when spilling N, S can no longer be scheduled as read-after-write latency is three cycles.
|
| While we could ad-hoc try to catch cases like this, or (the best option but very complicated) treat the spill as speculative and roll it back if we decide not to schedule the node, a simpler solution is to just give up on spilling if we've already successfully speculatively scheduled another node. We'd give up a few cases where we discover that by spilling even harder we could schedule a more desirable node, but that seems like it would be pretty rare in practice. With this we guarantee that nothing has been touched after best_node was successfully scheduled. We also cut down on pointless spilling, since if we already scheduled a node it's unlikely that spilling harder will let us schedule an even better node, and hence any spilling at this point is probably useless.
|
| While we're here, clean up the code around spilling by flattening the two if's and getting rid of the second unnecessary check for INT_MIN.
|
| Fixes: 54434fe6706 ("lima/gpir: Rework the scheduler")
| Acked-by: Qiang Yu <[email protected]>
| Signed-off-by: Connor Abbott <[email protected]>
* nv50/ir: don't consider the main compute function as taking arguments | Ilia Mirkin | 2019-07-27 | 1 | -1/+1
| | | | | | | | | | | | With OpenCL, kernels can take arguments and return values (?). However in practice, there is no more TGSI compute implementation, and even if there were, it would probably have named functions and no explicit main. This improves RA considerably for compute shaders, since temps are not kept around as return values. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Karol Herbst <[email protected]>
* nv50/ir: handle insn not being there for definition of CVT arg | Ilia Mirkin | 2019-07-27 | 1 | -2/+3
| | | | | | | | | This can happen if it's e.g. a uniform or a function argument. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111217 Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Karol Herbst <[email protected]> Cc: [email protected]
* nouveau: flip DEBUG -> !NDEBUG | Ilia Mirkin | 2019-07-27 | 10 | -49/+15
| | | | | | | | | | | | | The meson conversion chose to change the meaning of DEBUG from "used for debugging" to "used for expensive things for debugging", primarily for nir_validate. Flip things over so that we get nice things with optimizations enabled. While we're at it, also kill off nouveau_statebuf.h which is unused (and has a mention of DEBUG which is how I found it). Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Karol Herbst <[email protected]>
* nvc0: allow a non-user buffer to be bound at position 0 | Ilia Mirkin | 2019-07-27 | 1 | -18/+27
| | | | | | | | | | | Previously the code only handled it for positions 1 and up (as would be for UBO's in GL). It's not a lot of trouble to handle this, and vl or vdpau want this. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213 Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Karol Herbst <[email protected]> Cc: [email protected]
* nv50,nvc0: update sampler/view bind functions to accept NULL array | Ilia Mirkin | 2019-07-27 | 2 | -14/+18
| | | | | | | | | Apparently vl (or vdpau) wants to pass that in now. Handle it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213 Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Karol Herbst <[email protected]> Cc: [email protected]
* gallium/vl: fix compute tgsi shaders to not process undefined components | Ilia Mirkin | 2019-07-27 | 1 | -52/+52
| | | | | | | | | | | | | This caused nouveau's function handling logic to think that the MAIN function was due to receive external parameters, and cascaded some failures after that. Instead avoid having the undefined components in the first place. Fixes: f6ac0b5d71 (gallium/auxiliary/vl: Add compute shader to support video compositor render) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111213 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111217 Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* pan/midgard: Introduce invert field | Alyssa Rosenzweig | 2019-07-26 | 5 | -11/+90
| | | | | | | | | | | This will enable us to fuse inverts in various ways. Marginal hurt: total instructions in shared programs: 3610 -> 3611 (0.03%) instructions in affected programs: 67 -> 68 (1.49%) helped: 0 HURT: 1 Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Tag SSA/reg | Alyssa Rosenzweig | 2019-07-26 | 7 | -18/+28
| | | | | | | | Rather than putting registers after SSA in the MIR indexing, put them side-by-side, shifted 1, using the bottom bit as the SSA/reg select. This will allow us to generate SSA temps in the compiler. Signed-off-by: Alyssa Rosenzweig <[email protected]>
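The tagging scheme described above can be sketched as follows. Which value of the bottom bit means "register" is an assumption made for this example, and the helper names are hypothetical rather than the compiler's own.

```c
#include <stdbool.h>

#define IS_REG 1u  /* assumed: bottom bit set = register, clear = SSA value */

/* SSA index i maps to the even slot 2*i. */
unsigned
tag_ssa(unsigned ssa_index)
{
   return (ssa_index << 1) | 0u;
}

/* Register index i maps to the odd slot 2*i + 1. */
unsigned
tag_reg(unsigned reg_index)
{
   return (reg_index << 1) | IS_REG;
}

/* The bottom bit selects between the two namespaces. */
bool
is_reg(unsigned tagged)
{
   return (tagged & IS_REG) != 0;
}

/* Dropping the tag bit recovers the original index. */
unsigned
untag(unsigned tagged)
{
   return tagged >> 1;
}
```

Because SSA values and registers interleave rather than registers starting after the last SSA index, a new SSA temporary can be allocated at any time without colliding with a register index.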
* radeon/vcn: enable rate control for hevc encoding | Boyuan Zhang | 2019-07-26 | 1 | -1/+7
| | | | | | | | | | | | | Set cu_qp_delta_enable_flag on when rate control is enabled, and set it off when rate control is disabled (e.g. constant qp). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110673 Cc: [email protected] V2: fix typo and add bugzilla info Signed-off-by: Boyuan Zhang <[email protected]> Acked-by: Leo Liu <[email protected]>