path: root/src/freedreno
Commit message | Author | Age | Files | Lines
* freedreno/ir3/cp: report progress | Rob Clark | 2020-05-19 | 2 | -7/+15
  Later when we do this pass iteratively, we can drop some of the internal iteration and just rely on this pass getting run until there is no more progress.

  Signed-off-by: Rob Clark
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
* freedreno/cf: report progress | Rob Clark | 2020-05-19 | 2 | -10/+15
  Signed-off-by: Rob Clark
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
* freedreno/ir3/dce: report progress | Rob Clark | 2020-05-19 | 2 | -3/+6
  Eventually we'll pull the iteration out of the pass itself, but the first step is to just report progress.

  Signed-off-by: Rob Clark
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
* freedreno/ir3: juggle around ir3_debug_print() | Rob Clark | 2020-05-19 | 2 | -11/+14
  In a later patch, this will get folded into an IR3_PASS() macro, at least for most passes. But to do that, it is better to standardize on printing the ir3 after the pass.

  Signed-off-by: Rob Clark
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
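
The "report progress" commits above describe passes that return whether they changed anything, so that a driver loop can re-run them until nothing changes, with the IR printed after each pass. A minimal sketch of that pattern follows. Everything in it (`toy_ir`, `toy_cp`, `toy_dce`, `TOY_PASS`, `toy_optimize`) is hypothetical and only illustrates the idea; it is not the ir3 code or the real IR3_PASS() macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for a shader IR; both fields are hypothetical. */
struct toy_ir {
    int copies;      /* mov instructions that copy propagation can fold */
    int dead_instrs; /* instructions whose results are unused */
};

/* Toy copy propagation: folds one copy per call, leaving the mov dead. */
static bool toy_cp(struct toy_ir *ir)
{
    if (ir->copies == 0)
        return false;
    ir->copies--;
    ir->dead_instrs++;
    return true;
}

/* Toy dead-code elimination: removes every dead instruction. */
static bool toy_dce(struct toy_ir *ir)
{
    if (ir->dead_instrs == 0)
        return false;
    ir->dead_instrs = 0;
    return true;
}

/* Run a pass, accumulate its progress, and print the IR after it ran. */
#define TOY_PASS(progress, ir, pass)                          \
    do {                                                      \
        if (pass(ir)) {                                       \
            (progress) = true;                                \
            printf("after %s: copies=%d dead=%d\n", #pass,    \
                   (ir)->copies, (ir)->dead_instrs);          \
        }                                                     \
    } while (0)

/* Repeat the pass list until a full round makes no progress.
 * Returns the number of rounds executed. */
static int toy_optimize(struct toy_ir *ir)
{
    int rounds = 0;
    bool progress;

    assert(ir != NULL);
    do {
        progress = false;
        TOY_PASS(progress, ir, toy_cp);
        TOY_PASS(progress, ir, toy_dce);
        rounds++;
    } while (progress);
    return rounds;
}
```

With the iteration in the driver loop, an individual pass no longer needs its own internal "repeat until stable" loop; the final round is the one in which no pass reports a change.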
* freedreno/ir3: remove Sethi-Ullman numbering pass | Rob Clark | 2020-05-19 | 5 | -125/+1
  We haven't used this for a while.

  Signed-off-by: Rob Clark
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>
* freedreno/drm: handle ancient kernels | Rob Clark | 2020-05-18 | 2 | -1/+4
  Older kernels did not support `MSM_INFO_GET_IOVA`. But this is only required for (a) clover (i.e. `fd_set_global_binding()`) and (b) drm paths that are limited to newer kernels. So move the location of the assert to fix new userspace on old kernels.

  Fixes: c9e8df61dc8 ("freedreno: Initialize the bo's iova at creation time.")
  Signed-off-by: Rob Clark
  Tested-by: Ilia Mirkin
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5081>
* freedreno/drm: don't pass thru 'DUMP' flag on older kernels | Rob Clark | 2020-05-18 | 1 | -1/+2
  "softpin" mode was introduced in the same kernel as the 'DUMP' flag. So if we are using the legacy non-softpin path, clear the dump flag.

  OTOH the 'DUMP' flag isn't quite so needed on older kernels, since we would get all cmdstream, even SDS stateobjs, dumped regardless, as they would have cmd table entries.

  Fixes: b2c23b1e48f ("freedreno: Mark all ringbuffer BOs as to be dumped on crash.")
  Signed-off-by: Rob Clark
  Tested-by: Ilia Mirkin
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5081>
* freedreno/fdperf: add dependency on generated headers | Rob Clark | 2020-05-18 | 1 | -1/+1
  To fix an issue reported here: https://bugs.chromium.org/p/chromium/issues/detail?id=1083815

  Signed-off-by: Rob Clark
  Reviewed-by: Eric Engestrom
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5088>
* freedreno/a3xx: parameterize ubo optimization | Ilia Mirkin | 2020-05-17 | 3 | -11/+27
  A3xx apparently has higher alignment requirements than later gens for indirect const uploads. It also has fewer of them. Add compiler parameters for both settings, and set accordingly for a3xx and a4xx+.

  This fixes all the ubo test failures caused by this optimization.

  Signed-off-by: Ilia Mirkin
  Reviewed-by: Rob Clark
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5077>
* freedreno/ir3: avoid applying (sat) on bary.f | Ilia Mirkin | 2020-05-17 | 1 | -0/+5
  This causes failures on a3xx, resulting in the nonsensical dEQP failures on packUnorm2x16. The same test uses ldlv on a4xx+, so just disallow (sat) on bary.f on all generations.

  Signed-off-by: Ilia Mirkin
  Reviewed-by: Rob Clark
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5074>
* freedreno/a3xx: there's no r8i/ui rb format, only rg8i/rg8ui | Ilia Mirkin | 2020-05-17 | 1 | -2/+2
  This fixes a number of dEQP tests:

    dEQP-GLES3.functional.fbo.blit.conversion.r8*
    dEQP-GLES3.texture.specification.basic_teximage2d.r8*

  and others. The reason why this enum showed up in traces for R8 is that it was an "upgraded" texture to R8G8.

  Signed-off-by: Ilia Mirkin
  Reviewed-by: Rob Clark
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5073>
* freedreno/uuid: Generate meaningful device and driver UUID | Eduardo Lima Mitev | 2020-05-14 | 4 | -7/+58
  Device UUID becomes SHA1('freedreno' + gpu_id).
  Driver UUID becomes SHA1(mesa-version + git-head-sha1).

  v2: Don't use build_id for driver UUID since it generates different values for vulkan and gl shared objects. (Kristian)

  Reviewed-by: Kristian H. Kristensen
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4847>
* freedreno: Centralize UUID generation into new files freedreno_uuid.c/h | Eduardo Lima Mitev | 2020-05-14 | 7 | -15/+119
  The new files are created under a 'common' folder under 'src/freedreno', where shared functionality between GL and Vulkan drivers (that is not registers, layout or compiler) will be placed.

  Reviewed-by: Rob Clark
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4847>
* tu: Advertise COLOR_ATTACHMENT_BLEND_BIT for blendable formats | Connor Abbott | 2020-05-14 | 2 | -0/+13
  Whoops. After fixing dual-source blending, dEQP-VK.pipeline.blend.* all go from skipped to pass, and this fixes a bunch of dEQP-VK.api.info.format_properties.* tests where blending is required.

  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5039>
* tu: Implement dual-src blending | Connor Abbott | 2020-05-14 | 1 | -4/+50
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5039>
* tu: Move RENDER_COMPONENTS setting to pipeline state | Connor Abbott | 2020-05-14 | 4 | -10/+8
  This needs to be pipeline state because it can change when dual-source blending is active.

  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5039>
* ir3: Fixup dual-source blending slot | Connor Abbott | 2020-05-14 | 1 | -0/+1
  The hardware expects the dual sources for MRT0 to be placed where MRT0 and MRT1 would normally go, whereas GLSL has an extra "index" parameter that indicates which source it is. Remap it when handling FS outputs.

  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5039>
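
The remapping described above can be pictured as folding the GLSL "index" into the output slot number. The sketch below is an assumption about the shape of that mapping, not the actual one-line change; `toy_fs_output` and `toy_output_slot` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a fragment-shader colour output as GLSL
 * sees it: a location plus a dual-source blending index (0 or 1). */
struct toy_fs_output {
    unsigned location; /* colour attachment number */
    unsigned index;    /* 0 = first blend source, 1 = second blend source */
};

/* Map (location, index) to the slot the hardware reads. With dual-source
 * blending only location 0 is used, so index 1 lands where MRT1 would
 * normally go. */
static unsigned toy_output_slot(const struct toy_fs_output *out)
{
    assert(out != NULL);
    assert(out->index <= 1);
    return out->location + out->index;
}
```

For example, an output declared at location 0 with index 1 maps to slot 1 in this sketch, while ordinary outputs (index 0) keep their location as the slot.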
* freedreno/a6xx: Document dual-src blending enable bits | Connor Abbott | 2020-05-14 | 1 | -0/+4
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5039>
* freedreno: Avoid duplicate BO relocs in FD_RINGBUFFER_OBJECTs. | Eric Anholt | 2020-05-14 | 1 | -3/+17
  For the piglit drawoverhead case, 5/18 of the objects' relocs were duplicated. We can dedupe them at object create time (since objects are long-lived) and avoid repeated relocation work at emit time.

  nohw drawoverhead program statechange throughput 2.34082% +/- 0.645832% (n=10).

  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5020>
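
Deduplicating at create time means each buffer object is recorded at most once per state object, so emit time walks a shorter list. A minimal sketch of that idea follows, assuming a small fixed-size table and a linear search; `toy_object`, `toy_reloc` and `toy_object_add_reloc` are hypothetical and do not mirror the driver's data structures.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_RELOCS 16

/* Hypothetical reloc entry: which BO is referenced and how (read/write). */
struct toy_reloc {
    unsigned bo_handle;
    unsigned flags;
};

/* Hypothetical long-lived state object with its reloc table. */
struct toy_object {
    struct toy_reloc relocs[TOY_MAX_RELOCS];
    unsigned nr_relocs;
};

/* Record a BO reference at object-build time. If the BO is already in the
 * table, only merge the access flags instead of appending a duplicate. */
static void toy_object_add_reloc(struct toy_object *obj, unsigned bo_handle,
                                 unsigned flags)
{
    unsigned i;

    assert(obj != NULL);
    for (i = 0; i < obj->nr_relocs; i++) {
        if (obj->relocs[i].bo_handle == bo_handle) {
            obj->relocs[i].flags |= flags;
            return;
        }
    }
    assert(obj->nr_relocs < TOY_MAX_RELOCS);
    obj->relocs[obj->nr_relocs].bo_handle = bo_handle;
    obj->relocs[obj->nr_relocs].flags = flags;
    obj->nr_relocs++;
}
```

The linear search costs a little at build time, which is acceptable in this sketch because the object is built once and emitted many times.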
* freedreno: Fix resource layout dump loop. | Eric Anholt | 2020-05-14 | 1 | -1/+1
  Apparently I've never dumped a fully populated slices array, so the 0-init always terminated the loop.

  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5020>
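
The bug class described here is a loop that relies only on a zero-initialized sentinel entry, which runs off the end once every slot is populated. A sketch of the bounded form follows; the actual one-line fix is not shown in this log, so `toy_slice`, `TOY_MAX_LEVELS` and `toy_count_levels` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LEVELS 4

/* Hypothetical per-miplevel record; size0 == 0 marks an unused slot. */
struct toy_slice {
    unsigned size0;
};

/* Count populated levels. The explicit bound on `level` is what keeps the
 * loop inside the array when no zero-initialized slot is left to stop it. */
static unsigned toy_count_levels(const struct toy_slice *slices)
{
    unsigned level;

    assert(slices != NULL);
    for (level = 0; level < TOY_MAX_LEVELS && slices[level].size0; level++)
        ;
    return level;
}
```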
* freedreno/ir3: use lower_wrmasks pass | Rob Clark | 2020-05-13 | 4 | -35/+33
  Signed-off-by: Rob Clark
  Reviewed-by: Kristian H. Kristensen
  Reviewed-by: Eric Anholt
* nir: add helper to copy const_index[] | Rob Clark | 2020-05-13 | 1 | -2/+1
  It seems less brittle to not assume they are in the same order for src and dst instructions.

  Signed-off-by: Rob Clark
  Reviewed-by: Kristian H. Kristensen
  Reviewed-by: Eric Anholt
* freedreno/ir3: use const_index accessors | Rob Clark | 2020-05-13 | 2 | -2/+2
  Cleans up a couple spots that were still open-coding this.

  Signed-off-by: Rob Clark
  Reviewed-by: Kristian H. Kristensen
  Reviewed-by: Eric Anholt
* freedreno/ir3: Drop wrmask for ir3 local and global store intrinsics | Kristian H. Kristensen | 2020-05-13 | 2 | -43/+33
  These intrinsics are supposed to map to the underlying hardware instructions, which don't have wrmask. We use them when we lower store_output in the geometry pipeline and since store_output gets lowered to temps, we always see full wrmasks there.
* freedreno/a6xx: Use LDC for UBO loads. | Eric Anholt | 2020-05-14 | 4 | -16/+25
  It saves addressing math, but may cause multiple loads to be done and bcseled due to NIR not giving us good address alignment information currently. I don't have any workloads I know of using non-const-uploaded UBOs, so I don't have perf numbers for it.

  This makes us match the GLES blob's behavior, and turnip (other than being bindful).

  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
* freedreno: Trim num_ubos to just the ones we haven't lowered to constbuf. | Eric Anholt | 2020-05-14 | 2 | -22/+29
  With the upcoming LDC usage in the GL driver, we don't want to be uploading descriptors for every UBO when they aren't actually in use. Trimming NIR's num_ubos will avoid that, and cleans up num_ubo handling elsewhere right now.

  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
* freedreno/ir3: Move i/o offset lowering after analyze_ubo_ranges. | Eric Anholt | 2020-05-14 | 2 | -14/+12
  I found that when moving more UBOs to load_ubo_ir3, analyze_ubo_ranges would move things back in a broken way. We can just run this pass later and drop the _ir3 path.

  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
* freedreno/ir3: Leave the cursor alone during ir3_nir_try_propagate_bit_shift. | Eric Anholt | 2020-05-14 | 1 | -4/+2
  Otherwise, we might end up inserting the nir_intrinsic_load_ubo_ir3() after the non-offset src's definition, leading to nir_validate() failures.

  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
* freedreno/ir3: Clean up a silly nir_src_for_ssa(src.ssa). | Eric Anholt | 2020-05-14 | 1 | -1/+1
  Just copy the src through.

  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>
* freedreno/a6xx: Fix UBWC mipmapping height alignment. | Eric Anholt | 2020-05-13 | 2 | -6/+137
  After fixing the power of two sizing, pitches worked, but 1-pixel high and unaligned height miplevels were off.

  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
* freedreno/a6xx: Fix UBWC mipmap sizing. | Eric Anholt | 2020-05-13 | 2 | -14/+95
  The HW requires a log2 width/height of the level 0 meta_* size in the descriptors, making it pretty clear that UBWC mipmapping is all power-of-two sized.

  Fixes a bunch of failures in the upcoming UBWC layout unit tests.

  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
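
If the descriptor only carries a log2 of the level 0 metadata size, every miplevel size has to follow from a power-of-two base. The arithmetic can be sketched as below, assuming the level 0 size is rounded up to a power of two and halved per level with a floor of one block. This is an illustration of that reasoning only; `toy_ceil_log2` and `toy_meta_level_size` are hypothetical and omit the driver's real alignment rules.

```c
#include <assert.h>

/* Smallest n such that (1 << n) >= v, for 1 <= v <= 2^31. */
static unsigned toy_ceil_log2(unsigned v)
{
    unsigned log2 = 0;

    assert(v > 0 && v <= (1u << 31));
    while ((1u << log2) < v)
        log2++;
    return log2;
}

/* Size (in meta blocks) of one dimension at a given miplevel, derived
 * purely from the log2 of the level 0 size. */
static unsigned toy_meta_level_size(unsigned level0_blocks, unsigned level)
{
    unsigned pot = 1u << toy_ceil_log2(level0_blocks);
    unsigned size = level < 32 ? pot >> level : 0;

    return size ? size : 1; /* never smaller than one block */
}
```

For a level 0 size of 5 blocks this sketch yields 8, 4, 2, 1, 1 for levels 0 through 4, rather than the 5, 2, 1, 1, 1 that plain minification of the unpadded size would give.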
* freedreno/a6xx: Fix UBWC blockheight for RG8. | Eric Anholt | 2020-05-13 | 1 | -1/+4
  Using texturator on a P3A at 1024x1024, RG8 has log2w/h of 6x7 instead of R16I/UI's 6x8. I verified the other blockw/h values, except for cpp=1 (R8/R8I/R8UI didn't use UBWC) and 32 (would need a bigger type).

  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
* freedreno: Pull the tile_alignment lookup for a layout to a helper. | Eric Anholt | 2020-05-13 | 1 | -20/+25
  The r8g8 case UBWC alignment will be changing in the next commit, so fdl6_get_ubwc_blockwidth needs to start paying attention to r8g8 too.

  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
* freedreno/a6xx: Add a testcase for UBWC buffer sharing. | Eric Anholt | 2020-05-13 | 1 | -4/+22
  These offsets are hand-computed referencing msm_media_info.h, and match our driver's current behavior.

  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
* freedreno/a6xx: Improve layout testcase logging for UBWC fails. | Eric Anholt | 2020-05-13 | 1 | -2/+2
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
* freedreno/a4xx+: Increase max texture size to 16384. | Eric Anholt | 2020-05-13 | 1 | -1/+1
  Noticed when poking around with texture layouts and found that my big texture layout from the blob buffer overflowed. Values come from http://vulkan.gpuinfo.org for Adreno 418, 512, 630.

  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4931>
* tu: Implement fallback linear staging blit for CopyImage | Connor Abbott | 2020-05-13 | 1 | -24/+173
  Also, rewrite the format decision code so that we correctly decide when the linear fallback is needed, even if UBWC is disabled. As part of that, I also moved around some of the code to handle compressed formats to make sure that copying compressed formats with a linear staging blit works (this is now possible since we started allowing tiled compressed textures).

  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>
* tu: Add noubwc debug flag to disable UBWC | Connor Abbott | 2020-05-13 | 3 | -1/+4
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>
* tu: Add a "scratch bo" allocation mechanism | Connor Abbott | 2020-05-13 | 2 | -0/+74
  This is simpler than a full-blown memory reuse mechanism, but is good enough to make sure that repeatedly doing a copy that requires the linear staging buffer workaround won't use excessive memory or be slowed down due to repeated allocations.

  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>
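
One simple way to get this behaviour is a small table of buffers bucketed by power-of-two size, allocated lazily and kept for the lifetime of the device. The sketch below assumes that design and uses `malloc` in place of real BO allocation; `toy_scratch_cache`, `toy_get_scratch` and the bucket constants are hypothetical and are not turnip's API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TOY_MIN_SCRATCH_LOG2 12 /* smallest bucket: 4 KiB (assumed) */
#define TOY_NUM_SCRATCH 20      /* buckets from 4 KiB up to 2 GiB */

/* Hypothetical per-device cache: one lazily allocated buffer per bucket. */
struct toy_scratch_cache {
    void *bo[TOY_NUM_SCRATCH];
    unsigned allocations; /* how many real allocations were made */
};

/* Return a scratch buffer of at least `size` bytes, reusing an existing
 * bucket whenever one that is large enough has already been allocated. */
static void *toy_get_scratch(struct toy_scratch_cache *cache, size_t size)
{
    unsigned i = 0, j;

    assert(cache != NULL);
    while (i < TOY_NUM_SCRATCH &&
           ((size_t)1 << (TOY_MIN_SCRATCH_LOG2 + i)) < size)
        i++;
    assert(i < TOY_NUM_SCRATCH); /* request too large for this sketch */

    /* Any already-allocated bucket of this size or larger will do. */
    for (j = i; j < TOY_NUM_SCRATCH; j++) {
        if (cache->bo[j])
            return cache->bo[j];
    }

    cache->bo[i] = malloc((size_t)1 << (TOY_MIN_SCRATCH_LOG2 + i));
    if (cache->bo[i])
        cache->allocations++;
    return cache->bo[i];
}

/* Release everything when the device is destroyed. */
static void toy_scratch_cache_fini(struct toy_scratch_cache *cache)
{
    unsigned i;

    for (i = 0; i < TOY_NUM_SCRATCH; i++) {
        free(cache->bo[i]);
        cache->bo[i] = NULL;
    }
}
```

Repeated copies of similar size hit the same bucket, so memory use is bounded by the number of buckets rather than the number of copies. The trade-off in this sketch is that buffers are never returned before teardown.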
* turnip: use the common code for generating extensions and dispatch tables | Samuel Pitoiset | 2020-05-13 | 2 | -204/+12
  Signed-off-by: Samuel Pitoiset
  Acked-by: Lionel Landwerlin
  Reviewed-by: Eric Engestrom
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4987>
* freedreno/ir3/sched: try to avoid syncs | Rob Clark | 2020-05-13 | 1 | -13/+99
  Similar to what we do in postsched. It is useful for pre-RA sched to be a bit aware of things that would cause syncs. In particular for the tex fetches, since the vecN src/dst tends to limit postsched's ability to re-order them.

  Signed-off-by: Rob Clark
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
* freedreno/ir3/sched: avoid scheduling outputs | Rob Clark | 2020-05-13 | 3 | -22/+101
  If an instruction's only use is as an output, and it increases register pressure, then try to avoid scheduling it until there are no other options.

  A semi-common pattern is `fragcolN.a = 1.0`, this pushes all these immed loads to the end of the shader.

  Signed-off-by: Rob Clark
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
* freedreno/ir3/postsched: try to avoid (sy) syncs | Rob Clark | 2020-05-13 | 1 | -2/+19
  Similar to avoidance of `(ss)` syncs, it turns out to be helpful to avoid `(sy)` syncs as well. This helps us turn a tex, (sy)alu, tex, (sy)alu sequence into tex, tex, (sy)alu, alu, which is a big win in gfxbench gl_fill2.

  Signed-off-by: Rob Clark
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
* freedreno/ir3/postsched: reset sfu_delay on sync | Rob Clark | 2020-05-13 | 2 | -4/+33
  Once we schedule an instruction that will require an `(ss)` sync flag, there is no need to delay any further instructions that consume an SFU result (until the next SFU instruction is scheduled).

  Signed-off-by: Rob Clark
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
* freedreno/ir3: limit # of tex prefetch by shader size | Rob Clark | 2020-05-13 | 3 | -1/+40
  It seems for short frag shaders, too much prefetch can be detrimental.

  I think what we *really* want to do is decide after pre-RA sched, when we also know about nop's and what the actual ir3 instruction count is. But that will require re-working how prefetch lowering works. For now this is a super crude heuristic to attempt to approximate a good solution.

  Signed-off-by: Rob Clark
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>
* freedreno/ir3: fix indirect cb0 load_ubo lowering | Rob Clark | 2020-05-12 | 1 | -2/+2
  We can no longer assume that `state->ranges[0]` is block 0. It *often* is, but when we encounter a "real" ubo that we lower to `load_uniform` before a block 0 `load_ubo`, it could end up in another entry in the table.

  This results in the second pass, after gathering ubo ranges, not finding a valid range, which results in a `load_ubo` for a thing that is not actually a ubo making its way into the ir3 frontend. That in turn results in grabbing what we think is a ubo address out of some unrelated const register, and trying to dereference that. Which, as you can imagine, fails in amusing ways.

  Fixes: fc850080ee3 ("ir3: Rewrite UBO push analysis to support bindless")
  Signed-off-by: Rob Clark
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>
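
The underlying point is that a table filled in order of first use has to be searched by key, not indexed by an assumed position. A minimal sketch follows; `toy_ubo_range` and `toy_find_range` are hypothetical and do not reproduce the ir3 state layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical range-table entry: which UBO block it covers and the
 * byte range that was selected for upload. */
struct toy_ubo_range {
    int block;
    unsigned start, end;
};

/* Look an entry up by block number. Entries are stored in the order the
 * blocks were first encountered, so block 0 is not necessarily entry 0. */
static const struct toy_ubo_range *
toy_find_range(const struct toy_ubo_range *ranges, unsigned count, int block)
{
    unsigned i;

    assert(ranges != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (ranges[i].block == block)
            return &ranges[i];
    }
    return NULL; /* caller must handle "no range for this block" */
}
```

In the failure described above, a "real" UBO (say block 2) is seen first and takes entry 0; a lookup for block 0 then has to land on entry 1 rather than on `ranges[0]`.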
* freedreno/ir3: don't allow negative const_offset | Rob Clark | 2020-05-12 | 1 | -3/+14
  Signed-off-by: Rob Clark
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>
* turnip: Execute ir3_nir_lower_gs pass again | Brian Ho | 2020-05-12 | 1 | -0/+3
  This commit fixes a GS regression introduced in !4562 where ir3's GS lowering pass was moved from common code (ir3_nir) to freedreno-specific code (ir3_shader). For GS support in turnip, we need to add the GS lowering pass back in, this time in tu_shader.

  As for the nir_gather_info change, the GS lowering pass has always introduced a discard_if intrinsic into the GS. Previously, we simply ran nir_shader_gather_info before GS lowering, but now, since we lower the GS before, we need to remove the assertion that only a FS can use the discard_if intrinsic.

  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4892>
* turnip: enable tiling for compressed formats | Jonathan Marek | 2020-05-12 | 1 | -2/+5
  Now that layout code supports this, we can enable it.

  Signed-off-by: Jonathan Marek
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5009>
* turnip: update "fetchsize" value to match fdl6_layout changes | Jonathan Marek | 2020-05-12 | 1 | -4/+1
  It seems this is actually a "minimum pitch" value. For example TFETCH6_2_BYTE means a minimum pitch of 128 bytes for mipmap levels.

  This fixes breakage with compressed formats. For example this test:

    dEQP-VK.pipeline.sampler.view_type.2d.format.eac_r11_snorm_block.mipmap.linear.lod.equal_min_3_max_3

  Fixes: a34b3fa198a4f ("freedreno/fdl: Align after dividing by block size")
  Signed-off-by: Jonathan Marek
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5009>