Commit message | Author | Age | Files | Lines
* mesa/x86: improve SSE-checks for MSVC | Erik Faye-Lund | 2019-09-02 | 1 | -2/+2
    This enables some more SSE optimizations on MSVC builds.

    Signed-off-by: Erik Faye-Lund <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* util: do not assume MSVC implies SSE | Erik Faye-Lund | 2019-09-02 | 1 | -4/+3
    This is not true for MSVC on ARM.

    Signed-off-by: Erik Faye-Lund <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* util: fix SSE-version needed for double opcodes | Erik Faye-Lund | 2019-09-02 | 1 | -1/+1
    This code generates CVTSD2SI, which requires SSE2. So let's fix the required SSE-version.

    Signed-off-by: Erik Faye-Lund <[email protected]>
    Fixes: 5de29ae (util: try to use SSE instructions with MSVC and 32-bit gcc)
    Reviewed-by: Matt Turner <[email protected]>
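The SSE commits above can be illustrated with a minimal sketch. The names EXAMPLE_HAVE_SSE2 and example_iround_double are hypothetical and are not Mesa's; the predefined macros (__SSE2__, _M_IX86_FP, _M_X64, _M_ARM64EC) and the intrinsics (_mm_load_sd, _mm_cvtsd_si32) are real. The point is that seeing MSVC is not enough to assume SSE, and that a double-to-int conversion needs SSE2 rather than SSE1.

```c
#include <math.h>

/* Hypothetical feature macro; Mesa's real checks are named differently.
 * _MSC_VER alone proves nothing, because MSVC also targets ARM. */
#if defined(__SSE2__) || \
    (defined(_M_X64) && !defined(_M_ARM64EC)) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#define EXAMPLE_HAVE_SSE2 1
#include <emmintrin.h>
#else
#define EXAMPLE_HAVE_SSE2 0
#endif

/* Round a double to int using the current rounding mode
 * (round-to-nearest-even unless the program changed it). */
static int example_iround_double(double d)
{
#if EXAMPLE_HAVE_SSE2
   /* Typically compiles to CVTSD2SI, which is an SSE2 instruction. */
   return _mm_cvtsd_si32(_mm_load_sd(&d));
#else
   /* Portable fallback with the same rounding behaviour. */
   return (int)lrint(d);
#endif
}
```

Both paths round ties to even under the default rounding mode, so the fallback keeps results consistent on targets without SSE2.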
* mesa/main: remove unused include | Erik Faye-Lund | 2019-09-02 | 1 | -1/+0
    This has been unused since 183db3a6455 ("glsl: move half<->float convertion to util"), Oct 10 2015. Let's drop needlessly including it.

    Signed-off-by: Erik Faye-Lund <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* nir: do not assume that the result of fexp2(a) is always an integral | Samuel Pitoiset | 2019-09-02 | 1 | -0/+1
    It's only correct when 'a' is an integer greater than or equal to 0.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111493
    Fixes: 5544b2cbbd2 ("nir/algebraic: Use value range analysis to eliminate useless unary ops")
    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
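A small sketch of why the range analysis was wrong: 2^a is a whole number for integer a >= 0, but not for negative integer a. The helper names are hypothetical and unrelated to the NIR code; ldexp is the standard C function that scales by a power of two.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: true when x has no fractional part.
 * Only valid for values that fit in a long long. */
static bool example_is_integral(double x)
{
   return x == (double)(long long)x;
}

/* 2^a for an integer exponent a; exact for the small exponents used here. */
static double example_exp2_int(int a)
{
   return ldexp(1.0, a);
}
```

For example, example_exp2_int(3) is 8.0 and integral, while example_exp2_int(-1) is 0.5 and is not.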
* egl: fix platform selection | Lionel Landwerlin | 2019-09-02 | 1 | -2/+7
    Add missing "device" platform

    v2: Add the missing platform (Eric)

    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reported-by: Jean Hertel <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111529
    Fixes: d6edccee8d ("egl: add EGL_platform_device support")
    Reviewed-by: Eric Engestrom <[email protected]>
* iris: Lessen texture cache hack flush for blits/copies on Icelake. | Kenneth Graunke | 2019-08-31 | 1 | -16/+34
    Lionel found actual documentation for this at long last. Apparently it actually is a sampler cache limitation that was mostly fixed on Icelake. Unfortunately, it seems there are still issues with ASTC and non-ASTC sampler views.

    Still, we can lessen the flush condition from "format mismatch" to "ASTC mismatch", which eliminates most of the flushing here. We also update the documentation to refer to the workaround name.
* util: Define strchrnul on macOS. | Vinson Lee | 2019-08-31 | 1 | -1/+1
    strchrnul is not available on macOS.

    pipe_loader.c:141:14: error: implicit declaration of function 'strchrnul' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
          next = strchrnul(library_paths, ':');
                 ^

    Signed-off-by: Vinson Lee <[email protected]>
    Acked-by: Eric Engestrom <[email protected]>
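For readers unfamiliar with strchrnul: it behaves like strchr, except that it returns a pointer to the terminating NUL instead of NULL when the character is absent. A fallback can be sketched as follows; the name example_strchrnul is hypothetical and this is not necessarily how Mesa defines it.

```c
#include <string.h>

/* Hypothetical fallback for platforms that lack strchrnul. */
static char *example_strchrnul(const char *s, int c)
{
   char *p = strchr(s, c);

   /* Not found: point at the terminating NUL rather than returning NULL. */
   return p ? p : (char *)s + strlen(s);
}
```

This makes loops over ':'-separated path lists simpler, because the returned pointer is always valid to dereference.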
* gallium/auxiliary/indices: consistently apply start only to input | Erik Faye-Lund | 2019-08-31 | 1 | -10/+10
    The majority of these only apply the start argument to the input, but a few of them also apply it to the output array. util_primconvert, the only user that passes a non-zero start argument, does not expect it to be applied to the output; if it is, the code writes outside of allocated memory, leading to VRAM corruption.

    The reason this doesn't seem to have been noticed before is that no driver currently uses util_primconvert to convert a primitive type to itself, which is the case where this was broken. But for Zink, this will no longer be true, because we need to eliminate the use of 8-bit index buffers.

    Signed-off-by: Erik Faye-Lund <[email protected]>
    Fixes: 28f3f8d413f ("gallium/auxiliary/indices: add start param")
    Reviewed-by: Rob Clark <[email protected]>
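The intended convention can be sketched as below: start offsets the reads from the input index buffer only, while the output is always written from element 0. The function is a hypothetical illustration of widening 8-bit indices, not one of the generated translate functions in gallium/auxiliary/indices.

```c
#include <stdint.h>

/* Hypothetical sketch: widen 'count' 8-bit indices to 16 bits.
 * 'out' must have room for exactly 'count' elements. */
static void example_translate_ubyte2ushort(const uint8_t *in, unsigned start,
                                           unsigned count, uint16_t *out)
{
   for (unsigned i = 0; i < count; i++) {
      /* start applies to the input only; writing out[start + i] would
       * run past the end of a buffer sized for 'count' elements. */
      out[i] = in[start + i];
   }
}
```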
* travis: Fail build if any command in if statement fails. | Vinson Lee | 2019-08-31 | 1 | -4/+4
    Travis is checking the exit code of the entire if statement.

    Fixes: 64ffc289be89 ("travis: add MacOS Scons build")
    Signed-off-by: Vinson Lee <[email protected]>
    Acked-by: Eric Engestrom <[email protected]>
    Reviewed-by: Jose Fonseca <[email protected]>
* swr: Fix build with llvm-9.0 again. | Vinson Lee | 2019-08-31 | 3 | -0/+28
    Commit 6f7306c029a7 ("swr/rast: Refactor memory API between rasterizer core and swr") unintentionally removed changes for llvm-9.0.

    Fixes: 6f7306c029a7 ("swr/rast: Refactor memory API between rasterizer core and swr")
    Fixes: 5dd9ad157005 ("swr/rasterizer: Better implementation of scatter")
    Signed-off-by: Vinson Lee <[email protected]>
    Reviewed-by: Jan Zielinski <[email protected]>
* pan/midgard: Use shared psiz clamp pass | Alyssa Rosenzweig | 2019-08-30 | 4 | -82/+1
    We already had a perfectly cromulent pass for this, but one landed in common NIR code so let's switch and lighten our tree.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Remove mir_opt_post_move_eliminate | Alyssa Rosenzweig | 2019-08-30 | 2 | -49/+0
    This optimization depended on RA running before scheduling. It therefore no longer applies and is now unused.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Schedule before RA | Alyssa Rosenzweig | 2019-08-30 | 1 | -27/+29
    This is a tradeoff. Scheduling before RA means we don't do RA on what-will-become pipeline registers. Importantly, it means the scheduler is able to reorder instructions, as registers have not been decided yet.

    Unfortunately, it also complicates register spilling, since the spills themselves won't get bundled optimally and we can only spill twice per ALU bundle (only one spill per bundle allowed here). It also prevents us from eliminating dead moves introduced by register allocation, as they are not dead before RA.

    The shader-db regressions are from poor spilling choices introduced by the new bundling requirements. These could be solved by the combination of a post-scheduler (to combine adjacent spills into bundles) with a VLIW-aware spill cost calculation.

    Nevertheless, the change is small enough that I feel it's worth it to eat a tiny shader-db regression for the sake of flexibility.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Handle fragment writeout in RA | Alyssa Rosenzweig | 2019-08-30 | 6 | -24/+49
    Rather than using a pile of hacks and awkward constructs in MIR to ensure the writeout parameter gets written into r0, let's add a dedicated shadow register class for writeout (interfering with work register r0) so we can express the writeout condition succinctly and directly.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Do not propagate swizzles into writeout | Alyssa Rosenzweig | 2019-08-30 | 1 | -3/+5
    There's no slot for it; you'll end up writing into the void and clobbering stuff. Don't do it.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Fix misc. RA issues | Alyssa Rosenzweig | 2019-08-30 | 1 | -10/+15
    When running the register allocator after scheduling, the MIR looks a little different, so we need to extend the RA to handle a few of these extra cases correctly.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Print MIR by the bundle | Alyssa Rosenzweig | 2019-08-30 | 1 | -2/+11
    After scheduling, we still have valid MIR, but we have additional bundling annotations which we would like to keep for debugging, so print these.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Print branches in MIR | Alyssa Rosenzweig | 2019-08-30 | 1 | -1/+8
    Rather than a vague "br.??" line, annotate the branch with its target type (useful for disambiguating discards) and whether it was inverted.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Remove texture_index | Alyssa Rosenzweig | 2019-08-30 | 2 | -6/+0
    This is dead code.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Cleanup fragment writeout branch | Alyssa Rosenzweig | 2019-08-30 | 1 | -2/+3
    I'm not sure if this is strictly necessary but it makes debugging easier and minimizes the diff with the experimental scheduler.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add scheduling barriers | Alyssa Rosenzweig | 2019-08-30 | 1 | -38/+42
    Scheduling occurs on a per-block basis, strongly assuming that a given block contains at most a single branch. This does not always map to the source NIR control flow, particularly when discard intrinsics are involved. The solution is to allow scheduling barriers, which will terminate a block early in code generation and open a new block.

    To facilitate this, we need to move some post-block processing to a new pass, rather than relying hackily on the current_block pointer.

    This allows us to clean up some logic analyzing branches in other parts of the driver as well, now that the MIR is much more well-formed.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Track shader quadword count while scheduling | Alyssa Rosenzweig | 2019-08-30 | 3 | -7/+7
    This allows multiblock blend shaders to compute constant colour offsets correctly.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Allow NULL argument in mir_has_arg | Alyssa Rosenzweig | 2019-08-30 | 1 | -0/+3
    It's sometimes convenient to call this with no instruction specified. By definition, a missing instruction cannot reference any argument, so let's check for NULL and short-circuit to false.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
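The NULL short-circuit can be sketched as follows. The struct layout, the source count, and the names are hypothetical stand-ins; the real midgard_instruction and mir_has_arg differ in detail.

```c
#include <stdbool.h>
#include <stddef.h>

#define EXAMPLE_NUM_SRCS 3

/* Hypothetical, simplified instruction: just a list of source indices. */
struct example_instr {
   unsigned src[EXAMPLE_NUM_SRCS];
};

/* Returns true if 'ins' reads 'arg'. A missing instruction reads nothing. */
static bool example_has_arg(const struct example_instr *ins, unsigned arg)
{
   if (ins == NULL)
      return false;

   for (size_t i = 0; i < EXAMPLE_NUM_SRCS; i++) {
      if (ins->src[i] == arg)
         return true;
   }

   return false;
}
```

Callers that may or may not have an instruction in a slot can then query it without a separate NULL check at every call site.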
* pan/midgard: Improve mir_mask_of_read_components | Alyssa Rosenzweig | 2019-08-30 | 1 | -2/+15
    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Extend mir_special_index to writeout | Alyssa Rosenzweig | 2019-08-30 | 1 | -1/+2
    The branch has the writeout specified in its source list, making this special even if it's not explicitly part of r0.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: csel_swizzle with mir get swizzle | Alyssa Rosenzweig | 2019-08-30 | 1 | -0/+3
    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add mir_insert_instruction*scheduled helpers | Alyssa Rosenzweig | 2019-08-30 | 2 | -0/+91
    In order to run register allocation after scheduling, it is sometimes necessary to be able to insert instructions into an already-scheduled program. This is suboptimal, since it forces us to do a worst-case scheduling, but it is nevertheless required for correct handling of spills/fills.

    Let's add helpers to insert instructions as standalone bundles for use in spilling code. These helpers are minimal -- they *only* work on load/store ops or moves. They should not be used for anything but register spilling; any other instructions should be added prior to the schedule.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Track csel swizzle | Alyssa Rosenzweig | 2019-08-30 | 2 | -4/+17
    While it doesn't matter with an unconditional move to the conditional register (r31), when we try to elide that move we'll need to track the swizzle explicitly, and there is no slot for that yet since ALU ops are normally binary.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Ensure fragment writeout is in the final block | Alyssa Rosenzweig | 2019-08-30 | 2 | -9/+6
    This ensures the block has exactly one branch, which makes scheduling happy.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Document Midgard scheduling requirements | Alyssa Rosenzweig | 2019-08-30 | 1 | -0/+29
    Oh boy. Midgard scheduling is crazy... These are all just the requirements, not even the algorithm yet.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Include condition in branch->src[0] | Alyssa Rosenzweig | 2019-08-30 | 1 | -0/+5
    This will allow us to reference the condition while scheduling.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add post-schedule iteration helpers | Alyssa Rosenzweig | 2019-08-30 | 1 | -0/+11
    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Fix corner case in RA | Alyssa Rosenzweig | 2019-08-30 | 1 | -1/+1
    It doesn't really matter but... meh.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add OP_IS_CSEL_V helper | Alyssa Rosenzweig | 2019-08-30 | 1 | -2/+6
    ...to distinguish from scalar csel.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Expose mir_get/set_swizzle | Alyssa Rosenzweig | 2019-08-30 | 2 | -2/+4
    The scheduler would like to use these.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Extract instruction sizing helper | Alyssa Rosenzweig | 2019-08-30 | 1 | -15/+19
    The scheduler shouldn't need to worry about this.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Factor out mir_is_scalar | Alyssa Rosenzweig | 2019-08-30 | 1 | -33/+42
    This helper doesn't need to be in the giant loop.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Count shader-db stats by bundled instructions | Alyssa Rosenzweig | 2019-08-30 | 1 | -4/+3
    This does not affect shaders in any way. Rather, it makes the shader-db instruction count recorded in the compiler accurate with the in-order scheduler, matching up with what we calculate from pandecode.

    Though shaders are the same, instruction counts cannot be compared across this commit for this reason.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* freedreno/ir3: Link directly to Sethi-Ullman paper | Alyssa Rosenzweig | 2019-08-30 | 1 | -1/+1
    Allow a direct link to the PDF itself from the authors themselves, rather than a paywall splash page.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
    Acked-by: Rob Clark <[email protected]>
* Revert "glx: Unset the direct_support bit for GLX_EXT_import_context" | Adam Jackson | 2019-08-30 | 1 | -1/+1
    The GLX extension strings are independent of any context, so abusing the direct_support bit to control this extension's visibility is wrong.

    This reverts commit 079d0717fc896bc8086b037d0ed22642274986c7.

    Reported-by: Michel Dänzer <[email protected]>
    Reviewed-by: Michel Dänzer <[email protected]>
* panfrost: Add transient BOs to job batches | Boris Brezillon | 2019-08-30 | 2 | -1/+2
    Memory allocated through panfrost_allocate_transient() is likely to come from the transient pool. Let's add the BO backing the allocated memory region to the job batch so the kernel can retain this BO while jobs are executed.

    In practice that has never been a problem because the transient pool is never shrunk, and even if it was, we still control the lifetime of the job, so there's no reason for this BO to be freed before the GPU is done executing the batch. But it still makes sense to add the BO for debugging purposes.

    Signed-off-by: Boris Brezillon <[email protected]>
    Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: protect access to shared bo cache and transient pool | Rohan Garg | 2019-08-30 | 5 | -5/+23
    Both the BO cache and the transient pool are shared across contexts. Protect access to these with mutexes.

    Signed-off-by: Rohan Garg <[email protected]>
    Reviewed-by: Boris Brezillon <[email protected]>
    Reviewed-by: Alyssa Rosenzweig <[email protected]>
    Signed-off-by: Boris Brezillon <[email protected]>
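The locking pattern described above can be sketched with POSIX mutexes. All struct and function names here are hypothetical, and the actual panfrost code may use Mesa's own threading wrappers; the sketch only shows that every access to the screen-wide cache happens with the lock held.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, simplified buffer object and screen. */
struct example_bo {
   struct example_bo *next;
   size_t size;
};

struct example_screen {
   pthread_mutex_t cache_lock;   /* guards 'cache' */
   struct example_bo *cache;     /* shared by every context on this screen */
};

static void example_screen_init(struct example_screen *s)
{
   pthread_mutex_init(&s->cache_lock, NULL);
   s->cache = NULL;
}

/* Return a BO to the shared cache. */
static void example_cache_put(struct example_screen *s, struct example_bo *bo)
{
   pthread_mutex_lock(&s->cache_lock);
   bo->next = s->cache;
   s->cache = bo;
   pthread_mutex_unlock(&s->cache_lock);
}

/* Remove and return the first cached BO of at least 'size' bytes, or NULL. */
static struct example_bo *example_cache_fetch(struct example_screen *s,
                                              size_t size)
{
   struct example_bo *found = NULL;

   pthread_mutex_lock(&s->cache_lock);
   for (struct example_bo **link = &s->cache; *link; link = &(*link)->next) {
      if ((*link)->size >= size) {
         found = *link;
         *link = found->next;   /* unlink while still holding the lock */
         found->next = NULL;
         break;
      }
   }
   pthread_mutex_unlock(&s->cache_lock);

   return found;
}
```

Unlinking under the lock means two contexts can never be handed the same cached BO.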
* panfrost: Jobs must be per context, not per screen | Rohan Garg | 2019-08-30 | 5 | -17/+14
    Jobs _must_ only be shared across the same context; having the last_job tracked in a screen causes use-after-free issues and memory corruption.

    Signed-off-by: Rohan Garg <[email protected]>
    Reviewed-by: Boris Brezillon <[email protected]>
    Reviewed-by: Alyssa Rosenzweig <[email protected]>
    Signed-off-by: Boris Brezillon <[email protected]>
* st/mesa: Allow zero as [level|layer]_override | Lepton Wu | 2019-08-30 | 4 | -17/+20
    This fixes two dEQP tests for virgl:

    dEQP-EGL.functional.image.create.gles2_cubemap_positive_x_rgba_texture
    dEQP-EGL.functional.image.render_multiple_contexts.gles2_cubemap_positive_x_rgba8_texture

    Signed-off-by: Lepton Wu <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* freedreno/a3xx: fix sysmem <-> gmem tiles transfer | Khaled Emara | 2019-08-30 | 2 | -2/+3
    Tiling mode was missing from fd3_emit_gmem_restore_tex(). emit_gmem2mem_surf() used LINEAR exclusively.

    Reviewed-by: Rob Clark <[email protected]>
* freedreno/a3xx: fix texture tiling parameters | Khaled Emara | 2019-08-30 | 1 | -10/+21
      - Fix 2D/2DArray/3D tiling parameters: there is a bottom threshold for width and height.
      - Re-enable tiling for Cubemap, after setting the right parameters.

    Reviewed-by: Rob Clark <[email protected]>
* gitlab-ci: Use new needs: keyword | Michel Dänzer | 2019-08-30 | 1 | -0/+3
    This way, the test jobs can start running before all build+test jobs have finished, once the meson-main job has.

    Idea suggested by Daniel Stone on IRC. See https://docs.gitlab.com/ce/ci/directed_acyclic_graph/ and https://docs.gitlab.com/ce/ci/yaml/README.html#needs for details.

    v2:
      - Improve commit log (Daniel Stone, Eric Engestrom)

    Reviewed-by: Eric Engestrom <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* gitlab-ci: Move up meson-main job definition | Michel Dänzer | 2019-08-30 | 1 | -29/+29
    In order to increase the chance of it running early.

    Reviewed-by: Eric Engestrom <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* broadcom/v3d: Allow importing linear BOs with arbitrary offset/stride. | Dave Stevenson | 2019-08-30 | 1 | -8/+23
    Equivalent of 0c1dd9dee "broadcom/vc4: Allow importing linear BOs with arbitrary offset/stride." for v3d.

    Allows YUV buffers with a single buffer and plane offsets to be passed in.

    Signed-off-by: Dave Stevenson <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>