Commit message | Author | Age | Files | Lines
* nir: add nir_intrinsic_elect to divergence analysis | Daniel Schürmann | 2020-05-13 | 1 | -0/+1
| | | | | Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
* nir: Make "divergent" a property of an SSA value | Jason Ekstrand | 2020-05-13 | 5 | -95/+122
| | | | | | | v2: fix usage in ACO (by Daniel Schürmann) Reviewed-by: Rhys Perry <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4062>
* gallium: remove more "state tracker" occurences | Marek Olšák | 2020-05-13 | 6 | -13/+11
| | | | | | Trivial. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4902>
* gallium: rename PIPE_RESOURCE_FLAG_ST_PRIV to FRONTEND_PRIV | Marek Olšák | 2020-05-13 | 2 | -3/+3
| | | | | | | Acked-by: Eric Anholt <[email protected]> Acked-by: Alyssa Rosenzweig <[email protected]> Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4902>
* gallium: change comments to remove 'state tracker' | Marek Olšák | 2020-05-13 | 97 | -193/+188
| | | | | | | Acked-by: Eric Anholt <[email protected]> Acked-by: Alyssa Rosenzweig <[email protected]> Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4902>
* gallium: rename 'state tracker' to 'frontend' | Marek Olšák | 2020-05-13 | 444 | -174/+360
| | | | | | | Acked-by: Eric Anholt <[email protected]> Acked-by: Alyssa Rosenzweig <[email protected]> Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4902>
* tu: Implement fallback linear staging blit for CopyImage | Connor Abbott | 2020-05-13 | 1 | -24/+173
| | | | | | | | | | | Also, rewrite the format decision code so that we correctly decide when the linear fallback is needed, even if UBWC is disabled. As part of that, I also moved around some of the code to handle compressed formats to make sure that copying compressed formats with a linear staging blit works (this is now possible since we started allowing tiled compressed textures). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>
* tu: Add noubwc debug flag to disable UBWC | Connor Abbott | 2020-05-13 | 3 | -1/+4
| | | | Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>
* tu: Add a "scratch bo" allocation mechanism | Connor Abbott | 2020-05-13 | 2 | -0/+74
| | | | | | | | | This is simpler than a full-blown memory reuse mechanism, but is good enough to make sure that repeatedly doing a copy that requires the linear staging buffer workaround won't use excessive memory or be slowed down due to repeated allocations. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>
* aco: improve phi affinities with p_split_vector | Rhys Perry | 2020-05-13 | 1 | -0/+19
| Totals from 5860 (4.59% of 127638) affected shaders:
| VGPRs: 460212 -> 460216 (+0.00%)
| CodeSize: 65554356 -> 65464816 (-0.14%)
| Instrs: 12655972 -> 12633578 (-0.18%)
| Copies: 1309994 -> 1292163 (-1.36%)
|
| Signed-off-by: Rhys Perry <[email protected]>
| Reviewed-by: Daniel Schürmann <[email protected]>
| Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4990>
* aco: consider affinities when creating v_mac_f32 | Rhys Perry | 2020-05-13 | 1 | -2/+8
| Totals from 8487 (6.65% of 127638) affected shaders:
| CodeSize: 62061988 -> 62058020 (-0.01%); split: -0.01%, +0.01%
| Instrs: 11910757 -> 11885409 (-0.21%); split: -0.21%, +0.00%
| Copies: 1065244 -> 1040945 (-2.28%); split: -2.30%, +0.02%
| Branches: 349665 -> 348914 (-0.21%)
|
| Signed-off-by: Rhys Perry <[email protected]>
| Reviewed-by: Daniel Schürmann <[email protected]>
| Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4990>
* aco: mark phi definitions as last-seen phi operands | Rhys Perry | 2020-05-13 | 1 | -14/+14
| Totals from 14340 (11.23% of 127638) affected shaders:
| SGPRs: 1251648 -> 1251512 (-0.01%)
| VGPRs: 994556 -> 994104 (-0.05%); split: -0.06%, +0.01%
| CodeSize: 122894528 -> 121099604 (-1.46%); split: -1.49%, +0.03%
| MaxWaves: 106039 -> 106103 (+0.06%); split: +0.06%, -0.00%
| Instrs: 23860066 -> 23414317 (-1.87%); split: -1.90%, +0.03%
| Copies: 2448228 -> 2049305 (-16.29%); split: -16.37%, +0.07%
| Branches: 789381 -> 757921 (-3.99%); split: -4.62%, +0.64%
|
| Signed-off-by: Rhys Perry <[email protected]>
| Reviewed-by: Daniel Schürmann <[email protected]>
| Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4990>
* aco: fix consecutively written vgprs from vmem instructions | Rhys Perry | 2020-05-13 | 1 | -10/+26
| If one VMEM instruction uses a sampler and the other doesn't, we can't do this optimization.
|
| Totals from 47 (0.04% of 127638) affected shaders:
| CodeSize: 271744 -> 271656 (-0.03%); split: -0.04%, +0.01%
| Instrs: 52783 -> 52761 (-0.04%); split: -0.05%, +0.01%
| Cycles: 5547040 -> 5546952 (-0.00%); split: -0.00%, +0.00%
| VMEM: 10022 -> 9887 (-1.35%)
|
| Signed-off-by: Rhys Perry <[email protected]>
| Reviewed-by: Daniel Schürmann <[email protected]>
| Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4949>
* aco: simplify consecutive ordered vmem/lds writes optimization | Rhys Perry | 2020-05-13 | 1 | -10/+2
| | | | | | | | This was unnecessary and messed with statistics Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4949>
* gitlab-ci: correct tracie behavior with replay errors | Andres Gomez | 2020-05-13 | 2 | -2/+4
| [dump_trace_images] Info: Dumping trace /tmp/tracie.test.ap5pshYcsg/traces-db/trace1/magenta.testtrace... ERROR
| [dump_trace_images] Debug: === Failure log start ===
| invalid literal for int() with base 16: 'in'
| [dump_trace_images] Debug: === Failure log end ===
| [check_image] Trace /tmp/tracie.test.ap5pshYcsg/traces-db/trace1/magenta.testtrace couldn't be replayed. See above logs for more information.
| Traceback (most recent call last):
|   File "/tmp/tracie.test.ap5pshYcsg/tracie.py", line 176, in <module>
|     main()
|   File "/tmp/tracie.test.ap5pshYcsg/tracie.py", line 164, in main
|     ok, result = gitlab_check_trace(project_url, commit_id, args.device_name, trace, expectation)
| TypeError: cannot unpack non-iterable bool object
|
| Fixes: efbbf8bb81e ("tracie: Print results in a machine readable format")
| Signed-off-by: Andres Gomez <[email protected]>
| Reviewed-by: Rohan Garg <[email protected]>
| Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4839>
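The `TypeError` in the log above is what Python raises when a call site unpacks two values from a function that returned a bare `bool` on one of its paths. The sketch below reproduces that failure mode and shows one way to avoid it. The names `check_trace_buggy`, `check_trace_fixed`, `unpack` and the result dictionary are hypothetical stand-ins, not the actual tracie code, and the real fix may differ.

```python
def check_trace_buggy(replay_ok):
    # Hypothetical stand-in: the error path returns a bare bool,
    # while the success path returns a (bool, dict) pair.
    if not replay_ok:
        return False
    return True, {"images": []}


def check_trace_fixed(replay_ok):
    # Every path returns a pair, so callers can always unpack it.
    if not replay_ok:
        return False, {}
    return True, {"images": []}


def unpack(check, replay_ok):
    # Mirrors the shape of the call site in the traceback: `ok, result = ...`.
    try:
        ok, result = check(replay_ok)
    except TypeError as err:
        return "TypeError: " + str(err)
    return ok, result
```

Calling `unpack(check_trace_buggy, False)` hits the `TypeError` branch, while `unpack(check_trace_fixed, False)` returns a pair with a false status that the caller can report normally.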
* gitlab-ci: create always the "results" directory with tracie | Andres Gomez | 2020-05-13 | 2 | -0/+2
| | | | | | | | | | Otherwise, we will fail when the traces description file doesn't contain any checksum for the specified device. Fixes: efbbf8bb81e ("tracie: Print results in a machine readable format") Signed-off-by: Andres Gomez <[email protected]> Reviewed-by: Rohan Garg <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4839>
* radv: add a LLVM version string workaround for SotTR and ACO | Samuel Pitoiset | 2020-05-13 | 3 | -3/+36
| | | | | | | | | | | | | When the LLVM version is too old or missing, SotTR applies shader workarounds and that reduces performance by 2-5% with ACO. SotTR workarounds are applied with LLVM 8 and older, so reporting LLVM 9.0.1 should be fine. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Edmondo Tommasina <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4984>
* turnip: use the common code for generating extensions and dispatch tables | Samuel Pitoiset | 2020-05-13 | 2 | -204/+12
| | | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Lionel Landwerlin <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4987>
* anv: use the common code for generating extensions and dispatch tables | Samuel Pitoiset | 2020-05-13 | 2 | -335/+14
| | | | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Lionel Landwerlin <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4987>
* radv: use the common code for generating extensions and dispatch tables | Samuel Pitoiset | 2020-05-13 | 2 | -348/+13
| | | | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Lionel Landwerlin <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4987>
* vulkan: import common code for generating extensions | Samuel Pitoiset | 2020-05-13 | 2 | -0/+370
| | | | | | | | | | | ANV and RADV have similar Python code for generating extensions and dispatch tables. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Lionel Landwerlin <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4987>
* radv: implement VK_EXT_private_data | Samuel Pitoiset | 2020-05-13 | 3 | -1/+53
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4886>
* radv: use the base object struct types | Samuel Pitoiset | 2020-05-13 | 12 | -7/+138
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4886>
* radv: use the common base object type for VkDevice | Samuel Pitoiset | 2020-05-13 | 12 | -90/+87
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4886>
* etnaviv: Disable seamless cube map on GC880 | Marek Vasut | 2020-05-13 | 3 | -3/+10
| The GC880 on iMX6DL indicates in its minorFeatures2 register that it does support SEAMLESS_CUBE_MAP. However, when the TE.SAMPLER_CONFIG1 VIVS_TE_SAMPLER_CONFIG1_SEAMLESS_CUBE_MAP bit is set on GC880 on iMX6DL, the result is a corrupted image. In particular, the following ~112 dEQPs are affected and fail:
|   dEQP-GLES2.functional.texture.filtering.cube.*
|
| This only happens on MX6DL GC880. MX6Q GC2000 and STM32MP1 GC400 (GCnano) do not report the minorFeatures2 SEAMLESS_CUBE_MAP bit and ignore the TE_SAMPLER_CONFIG1 VIVS_TE_SAMPLER_CONFIG1_SEAMLESS_CUBE_MAP bit (note that ss->seamless_cube_map is unconditionally set by mesa at times, even when PIPE_CAP_SEAMLESS_CUBE_MAP_PER_TEXTURE returns 0), so there is no visible problem and there are no failing dEQP tests on the GC2000 and GCnano.
|
| This might imply that the minorFeatures2 SEAMLESS_CUBE_MAP has some different meaning on GC880 or that SEAMLESS_CUBE_MAP behaves differently on the GC880.
|
| This patch does not set the SEAMLESS_CUBE_MAP bit on hardware which does not indicate support for seamless cube map, nor on GC880, which results in a reduction in failed dEQPs: 635 to 186 on GC880, 274 to 270 on GC2000 and no change on GC400 (GCnano).
|
| Fixes: 8dd26fa2f06 ("etnaviv: support GL_ARB_seamless_cubemap_per_texture")
| Reviewed-by: Christian Gmeiner <[email protected]>
| Signed-off-by: Marek Vasut <[email protected]>
| Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4865>
* freedreno/a6xx: fix max-scissor opt | Rob Clark | 2020-05-13 | 2 | -13/+10
| | | | | | | | | | On a6xx we need a 0,0 based scissor in the binning pass, but can use the blit-scissor to avoid restore/resolve of untouched pixels, and use the conditional execution if the IB to bin to skip bins with no geometry (due to the scissor). Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5021>
* freedreno/ir3/sched: try to avoid syncs | Rob Clark | 2020-05-13 | 1 | -13/+99
| | | | | | | | | | Similar to what we do in postsched. It is useful for pre-RA sched to be a bit aware of things that would cause syncs. In particular for the tex fetches, since the vecN src/dst tends to limit postsched's ability to re-order them. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
* freedreno/ir3/sched: avoid scheduling outputs | Rob Clark | 2020-05-13 | 3 | -22/+101
| | | | | | | | | | | | If an instruction's only use is as an output, and it increases register pressure, then try to avoid scheduling it until there are no other options. A semi-common pattern is `fragcolN.a = 1.0`, this pushes all these immed loads to the end of the shader. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
* freedreno/ir3/postsched: try to avoid (sy) syncs | Rob Clark | 2020-05-13 | 1 | -2/+19
| | | | | | | | Similar to avoidance of `(ss)` syncs, it turns out to be helpful to avoid `(sy)` syncs as well. This helps us turn a tex, (sy)alu, tex, (sy)alu sequence into tex, tex, (sy)alu, alu, which is a big win in gfxbench gl_fill2. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
* freedreno/ir3/postsched: reset sfu_delay on sync | Rob Clark | 2020-05-13 | 2 | -4/+33
| | | | | | | | | Once we schedule an instruction that will require an `(ss)` sync flag, there is no need to delay any further instructions that consume an SFU result (until the next SFU instruction is scheduled). Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
* freedreno/ir3: limit # of tex prefetch by shader size | Rob Clark | 2020-05-13 | 3 | -1/+40
| | | | | | | | | | | | It seems for short frag shaders, too much prefetch can be detrimental. I think what we *really* want to do is decide after pre-RA sched, when we also know about nop's and what the actual ir3 instruction count is. But that will require re-working how prefetch lowering works. For now this is a super crude heuristic to attempt to approximate a good solution. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
* freedreno/ir3: fix indirect cb0 load_ubo lowering | Rob Clark | 2020-05-12 | 1 | -2/+2
| We can no longer assume that `state->ranges[0]` is block 0. It *often* is, but when we encounter a "real" ubo that we lower to `load_uniform` before a block 0 `load_ubo`, block 0 could end up as another entry in the table. As a result, the second pass, after gathering ubo ranges, does not find a valid range, and a `load_ubo` for a thing that is not actually a ubo makes its way into the ir3 frontend. We then grab what we think is a ubo address out of some unrelated const register and try to dereference that, which, as you can imagine, fails in amusing ways.
|
| Fixes: fc850080ee3 ("ir3: Rewrite UBO push analysis to support bindless")
| Signed-off-by: Rob Clark <[email protected]>
| Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>
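The pitfall described above, assuming that the first table entry belongs to block 0, can be illustrated with a small model. This is only a sketch in Python under stated assumptions: the real code is C inside ir3, and `UboRange`, its fields, and `find_block_range` are hypothetical names chosen for illustration rather than the driver's data structures or the actual fix.

```python
from typing import NamedTuple, Optional, Sequence


class UboRange(NamedTuple):
    # Hypothetical model of one entry in the range table.
    block: int  # UBO block index this range was gathered for
    start: int  # first byte covered by the range
    end: int    # one past the last byte covered


def find_block_range(ranges: Sequence[UboRange], block: int) -> Optional[UboRange]:
    # Search by block index instead of assuming a fixed table position.
    for entry in ranges:
        if entry.block == block:
            return entry
    return None


# A "real" UBO (block 2) was seen first, so block 0 is not at index 0.
table = [UboRange(block=2, start=0, end=64), UboRange(block=0, start=0, end=16)]
```

With this table, `table[0]` describes block 2, whereas `find_block_range(table, 0)` returns the block 0 entry and returns `None` for a block that was never gathered.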
* freedreno/ir3: don't allow negative const_offset | Rob Clark | 2020-05-12 | 1 | -3/+14
| | | | | Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>
* panfrost: Run dEQP-GLES3.functional.shaders.derivate.* on CI | Alyssa Rosenzweig | 2020-05-12 | 2 | -264/+0
| | | | | | | | Should be stable now, and should pass except for MSAA tests (multisampling is still a todo overall). Signed-off-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
* pan/mdg: Fix derivative swizzle | Alyssa Rosenzweig | 2020-05-12 | 1 | -4/+2
| | | | | Signed-off-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
* pan/mdg: Set types for derivatives | Alyssa Rosenzweig | 2020-05-12 | 1 | -0/+2
| | | | | | | Closes #2900 Signed-off-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
* pan/mdg: Remove texture_op_count | Alyssa Rosenzweig | 2020-05-12 | 4 | -15/+0
| | | | | | | | Was used as a crude approximation of the terminate flag, which we now can do properly. Signed-off-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
* pan/mdg: Use analysis to set .cont/.last flags | Alyssa Rosenzweig | 2020-05-12 | 1 | -10/+2
| Corresponds roughly to what we analyze. Note that "terminate AND execute" is a contradiction (rather: it's equivalent to just terminating), hence why there are only three possibilities for the states of the flags:
|
|   .cont = continue, don't execute
|   .last = don't continue, don't execute
|   .cont.last = continue and execute
|
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
| Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
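The three flag states listed above can be written as a small decision function. The sketch below only restates the mapping from the commit message in Python; `helper_flags` and its two boolean parameters are hypothetical names, and the actual compiler is written in C and may encode this differently.

```python
def helper_flags(keep_running: bool, execute: bool) -> str:
    """Return the flag suffix for the given helper-invocation state."""
    if not keep_running:
        # "terminate AND execute" is equivalent to just terminating,
        # so the execute bit is ignored on this path.
        return ".last"
    if execute:
        # continue and execute
        return ".cont.last"
    # continue, don't execute
    return ".cont"
```

Both `helper_flags(False, True)` and `helper_flags(False, False)` give `.last`, which is why four input combinations collapse to three flag states.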
* pan/mdg: Use the helper invo analyze passes | Alyssa Rosenzweig | 2020-05-12 | 1 | -0/+5
| | | | | Signed-off-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
* pan/mdg: Analyze helper execution requirements | Alyssa Rosenzweig | 2020-05-12 | 5 | -8/+99
| | | | | Signed-off-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
* pan/mdg: Analyze helper invocation termination | Alyssa Rosenzweig | 2020-05-12 | 5 | -0/+112
| | | | | Signed-off-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
* pan/mdg: Explain helper invocations dataflow theory | Alyssa Rosenzweig | 2020-05-12 | 1 | -0/+63
| | | | | Signed-off-by: Alyssa Rosenzweig <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5014>
* intel/compiler: fix alignment assert in nir_emit_intrinsic | Arcady Goldmints-Orlov | 2020-05-12 | 1 | -1/+1
| | | | | | | | Fixes: c643979228 (intel/fs: Choose memory message type based on bit size) Fixes: dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_i8vec2 Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5000>
* freedreno: Skip taking the lock for resource usage if it's already flagged. | Eric Anholt | 2020-05-12 | 1 | -0/+5
| | | | | | | Improves nohw drawoverhead 8-ubos update throughput by 13.493% +/- 0.391444% (n=15). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5011>
* freedreno: Move the resource_read early out to an inline. | Eric Anholt | 2020-05-12 | 3 | -15/+20
| | | | | | | | | Looking at perf, the drawoverhead test case was now spending 13% CPU (89% in that function) on stack management. nohw drawoverhead throughput 1.03902% +/- 0.380257% (n=13). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4996>
* freedreno: Add an early out for preparing to read a resource. | Eric Anholt | 2020-05-12 | 1 | -0/+7
| | | | | | nohw drawoverhead 8 UBOs test throughput 1.06093% +/- 0.363376% (n=10). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4996>
* freedreno: Split the fd_batch_resource_used by read vs write. | Eric Anholt | 2020-05-12 | 5 | -47/+70
| | | | | | | | | | This is for an optimization I plan in a following commit. I found I had to add likely()s to avoid a perf regression from branch prediction. On the drawoverhead 8 UBOs test, the HW can't quite keep up with the CPU, but if I set nohw then this change is 1.32023% +/- 0.373053% (n=10). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4996>
* freedreno: Add a nohw flag to skip submitting to the kernel. | Eric Anholt | 2020-05-12 | 3 | -0/+5
| | | | | | | | For some CPU-side-only optimizations, it can be nice to disable rendering so that we can see what the impact is even on cases where the GPU can't quite keep up. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4996>
* turnip: Execute ir3_nir_lower_gs pass again | Brian Ho | 2020-05-12 | 2 | -2/+8
| | | | | | | | | | | | | | | This commit fixes a GS regression introduced in !4562 where ir3's GS lowering pass was moved from common code (ir3_nir) to freedreno-specific code (ir3_shader). For GS support in turnip, we need to add the GS lowering pass back in, this time in tu_shader. As for the nir_gather_info change, the GS lowering pass has always introduced a discard_if intrinsic into the GS. Previously, we simply ran nir_shader_gather_info before GS lowering, but now since we lower the GS before we need to remove the assertion that only a FS can use the discard_if intrinsic. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4892>
* freedreno/gmem: rework gmem layout algo | Rob Clark | 2020-05-12 | 1 | -29/+48
| | | | | | | And try a bit harder to find an optimal layout. Improves on a sub-optimal layout we arrive at in the 4 MRT pass in manhattan, picking up a bit more than 3%. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4976>