path: root/src/freedreno
Commit message | Author | Age | Files | Lines
* freedreno/ir3: Define the bindful uniform/nonuniform desc modes for cat6 a6xx. | Eric Anholt | 2020-05-04 | 4 | -23/+38
    These come from the disasm tests, and fix our disasm of blob's uniform/nonuniform cat6 operands. We also now include human-readable names for all the modes we know about (though bindless gets distinguished by its .baseN, like Connor's original disasm).
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4857>
* freedreno/ir3: Sync some new changes from envytools. | Eric Anholt | 2020-05-04 | 11 | -49/+107
    With this I also brought in a few new control flow instruction disasm tests that I'd made back when I wrote the disasm test, but which were too far from correct to include until now.
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4857>
* freedreno/ir3: Add some more tests of cat6 disasm. | Eric Anholt | 2020-05-04 | 1 | -0/+24
    I put these together from traces I had while trying to do LDC for GL.
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4857>
* freedreno/ir3: Set up outputs for multi-slot varyings. | Eric Anholt | 2020-05-01 | 1 | -20/+25
    Necessary to avoid compiler assertion failures in:
    dEQP-GLES31.functional.program_interface_query.program_output.type.interface_blocks.out.named_block_explicit_location.struct.mat3x2
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
* freedreno/ir3: Stop initializing regid of so->outputs during setup. | Eric Anholt | 2020-05-01 | 1 | -1/+0
    It's unused and overwritten by ir3_compile_shader_nir().
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
* freedreno/ir3: Improve shader key normalization. | Eric Anholt | 2020-05-01 | 2 | -56/+75
    We can remove a bunch of conditional code at key comparison time by computing a bitmask of used key bits at ir3_shader creation time. This also gives us a nice place to put additional key simplification to reduce how many variants we create (like skipping rastflat if we don't read colors in the FS, or skipping vclamp_color if we don't write colors).
    It does mean walking the whole key to AND it, but the key is just 28 bytes so far so that seems pretty fine.
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
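The masking idea in the commit above can be sketched as follows. This is a minimal illustration in Python, not the ir3 code: the function names, the byte-granular mask, and the field offsets are all hypothetical, and only the 28-byte key size comes from the commit message.

```python
KEY_SIZE = 28  # key size quoted in the commit message

def used_key_mask(key_size, used_ranges):
    """Build a mask with 0xFF over each (offset, length) byte range the shader uses.

    Hypothetical helper: the real driver tracks used *bits*; bytes keep the sketch short.
    """
    mask = bytearray(key_size)
    for offset, length in used_ranges:
        for i in range(offset, offset + length):
            mask[i] = 0xFF
    return bytes(mask)

def normalize_key(key, mask):
    """AND the whole key with the mask so unused fields become zero."""
    return bytes(k & m for k, m in zip(key, mask))

def keys_equal(a, b, mask):
    """Compare two keys after normalization; no per-field conditionals needed."""
    return normalize_key(a, mask) == normalize_key(b, mask)

# Assumed layout for illustration only: byte 0 = rasterflat, byte 1 = vclamp_color.
# A shader that never reads colors in the FS leaves byte 0 out of its mask.
mask = used_key_mask(KEY_SIZE, [(1, 1)])
key_a = bytes([1, 1] + [0] * 26)
key_b = bytes([0, 1] + [0] * 26)
same_variant = keys_equal(key_a, key_b, mask)
```

With this assumed layout, two keys that differ only in the unused field normalize to the same bytes, so they would map to the same variant.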
* freedreno: Emit debug messages when doing draw-time recompiles of shaders. | Eric Anholt | 2020-05-01 | 1 | -0/+8
    Right now that's "always" unless you have shaderdb set.
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
* freedreno/ir3: Remove unused half precision shader key flag. | Eric Anholt | 2020-05-01 | 1 | -6/+0
    The code using it was removed in 4af86bd0b933 ("freedreno/ir3: remove half-precision output")
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
* freedreno: Fix assertion failures on GS/tess shaders with shader-db enabled. | Eric Anholt | 2020-05-01 | 1 | -0/+15
    We weren't filling in the tess mode of the key, or setting has_gs on GS shaders, resulting in assertion failures when NIR intrinsics didn't get lowered. We have to make a guess at prim mode for TCS, but it should be better to have some shader-db coverage than none, and it will avoid these failures happening when we start precompiling shaders.
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
* freedreno/ir3: Skip tess epilogue if the program is missing stores. | Eric Anholt | 2020-05-01 | 1 | -0/+3
    Some of the negative API tests make shaders for tess stages that don't do all the stores they need to. Once we start precompiling (or doing shader-db of tess), we need to at least not segfault when generating them.
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
* freedreno/ir3: Fix register allocation assertion failures. | Eric Anholt | 2020-05-01 | 2 | -13/+30
    We were failing to tell the allocator about the restriction that scalar texture instructions (allocated as scalar regs) couldn't be allocated such that the start of the full unwritemasked vector started before r0. There was a patch in select_reg_callback on a6xx that tried to work around that, but you could still end up backed into a corner you shouldn't be because we didn't tell the RA what it needed.
    Fixes compiler assertion failures on a300-a400's blit_z shader, used for Z32F gmem blits.
    Looks like as a result we get tighter register allocation but more nops:
    instructions in affected programs: 757945 -> 760356 (0.32%)
    nops in affected programs: 317983 -> 320468 (0.78%)
    non-nops in affected programs: 27525 -> 27451 (-0.27%)
    mov in affected programs: 3098 -> 3023 (-2.42%)
    dwords in affected programs: 109664 -> 110656 (0.90%)
    last-baryf in affected programs: 112701 -> 112847 (0.13%)
    full in affected programs: 4326 -> 4011 (-7.28%)
    sstall in affected programs: 120550 -> 120836 (0.24%)
    (ss) in affected programs: 13939 -> 13918 (-0.15%)
    (sy) in affected programs: 3006 -> 2786 (-7.32%)
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
* freedreno/ir3: Drop hack to clean up split vars | Kristian H. Kristensen | 2020-05-01 | 1 | -24/+0
    When the GS lowering was working on store_output intrinsics, we had to clean up the split vars to avoid getting confused. Now that we shadow the output vars instead, there's no confusion and we can drop this hack.
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
* freedreno/ir3: Lower GS builtins before lowering IO | Kristian H. Kristensen | 2020-05-01 | 4 | -61/+64
    We mostly got away with replacing a store_output with a store_var, but for complex types like structs, that doesn't work. Once the IO has been lowered from vars to intrinsics, we've lost the deref chains and can't properly shadow the outputs.
    This commit moves the GS lowering up so we do it before the output variables get lowered to store_output. This way the pass works much like nir_lower_io_to_temporaries() and cleanly shadows the outputs.
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
* freedreno/ir3: Add ir3_nir_lower_to_explicit_input() pass | Kristian H. Kristensen | 2020-05-01 | 3 | -46/+65
    This pass lowers per-vertex input intrinsics to load_shared_ir3. This was open coded in the TCS and GS lowering passes before - this way we can share it. Furthermore, we'll need to run the rest of the GS lowering earlier (before lowering IO) so we need to split off this part that operates on the IO intrinsics first.
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
* freedreno/ir3: Rename ir3_nir_lower_to_explicit_io | Kristian H. Kristensen | 2020-05-01 | 3 | -6/+6
    We rename it to ir3_nir_lower_to_explicit_output, since it only handles output and we'll add a lowering pass for input next.
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
* freedreno/ir3: Pass stream output info to ir3_shader_from_nir | Kristian H. Kristensen | 2020-05-01 | 2 | -2/+6
    We need shader->stream_output filled out when we lay out the push constants in ir3_setup_const_state(). Otherwise const_state->offsets.tfbo ends up as ~0, which doesn't work.
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
* freedreno/ir3: Fix the a3xx TF outputs stores. | Eric Anholt | 2020-05-01 | 1 | -1/+1
    We were trying to deref the vector-collected outputs[] array before it's been set up, but we want the per-component outputs anyway.
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
* freedreno/ir3: Set up the block predecessors for a3xx TF | Eric Anholt | 2020-05-01 | 2 | -2/+8
    Fixes a segfault in ir3_legalize.
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>
* freedreno/ir3: Leave bools as 1-bit, storing them in full regs. | Eric Anholt | 2020-04-30 | 2 | -119/+111
    If we use NIR's 1-bit bool representation, we get exactly the bool behavior the hardware provides: CMPS produces true or false, AND/OR/XOR work as intended without extra absnegs, and we can pass those half values directly to other CMPS. We emit an absneg for b2b1 ("turn a memory load into a 1-bit NIR boolean"), but we would have done so for the ir3_n2b() on the use of that value anyway.
    The most awkward bit is that inot(a@1) is now a sub(1, a), but we can encode the 1 as an immediate so it's fine.
    No significant changes to GL_TIME_ELAPSED on my set of traces (n=21).
    instructions in affected programs: 1570638 -> 1548702 (-1.40%)
    nops in affected programs: 624053 -> 611381 (-2.03%)
    non-nops in affected programs: 959061 -> 949797 (-0.97%)
    mov in affected programs: 5258 -> 5252 (-0.11%)
    cov in affected programs: 15099 -> 15902 (5.32%)
    dwords in affected programs: 469600 -> 452768 (-3.58%)
    last-baryf in affected programs: 162211 -> 154726 (-4.61%)
    full in affected programs: 4881 -> 4797 (-1.72%)
    sstall in affected programs: 173953 -> 174545 (0.34%)
    (ss) in affected programs: 10922 -> 10934 (0.11%)
    (sy) in affected programs: 728 -> 745 (2.34%)
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4518>
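The remark that inot(a@1) becomes sub(1, a) can be checked with a small sketch. This is Python used purely for illustration, and it assumes the booleans are held as the integers 0 and 1, which is one reading of the commit message; the function names are hypothetical and nothing here is ir3 code.

```python
def inot_as_sub(a):
    """Logical NOT of a boolean stored as 0 or 1, computed as 1 - a."""
    if a not in (0, 1):
        raise ValueError("expected a 0/1 boolean")
    return 1 - a

def bitwise_not_32(a):
    """Plain 32-bit bitwise NOT, shown for contrast."""
    return ~a & 0xFFFFFFFF

# AND/OR/XOR on 0/1 values stay within {0, 1}, so they need no fixup.
pairs = [(a, b) for a in (0, 1) for b in (0, 1)]
logic_ok = all((a & b) in (0, 1) and (a | b) in (0, 1) and (a ^ b) in (0, 1)
               for a, b in pairs)
```

Under that assumption a plain bitwise NOT would leave the 0/1 domain (it maps 0 to 0xFFFFFFFF), whereas the subtraction maps 0 to 1 and 1 to 0.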
* freedreno/ir3: Drop redundant IR3_REG_HALF setup in ALU ops. | Eric Anholt | 2020-04-30 | 1 | -6/+0
    It's set by ir3_put_dst() immediately after.
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4518>
* freedreno: sync registers with envytools | Rob Clark | 2020-04-30 | 2 | -10/+22
    Pull in the `SP_xS_BRANCH_COND` regs to keep the mesa and envytools copies from getting out of sync.
    Signed-off-by: Rob Clark <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>
* freedreno/a6xx: add OUT_PKT() | Rob Clark | 2020-04-30 | 2 | -4/+21
    Similar to OUT_REG(), this has the benefits of:
    1. No more messing up pkt size
    2. Detects errors of mixing up the order of dwords in the packet
    3. Optimizes to more efficient code
    Signed-off-by: Rob Clark <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>
* freedreno/drm: inline the things | Rob Clark | 2020-04-30 | 3 | -65/+62
    The existing structure dates back to when this code was part of libdrm, and we wanted some of this not to be exposed as ABI between libdrm and mesa. Now that this is no longer a constraint, inline things.
    Signed-off-by: Rob Clark <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>
* freedreno/drm: drop atomic refcnts | Rob Clark | 2020-04-30 | 1 | -2/+2
    Since we dropped the async flush_queue, we no longer need the refcnts to be atomic.
    Signed-off-by: Rob Clark <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4813>
* freedreno/ir3: Initialize the unused dwords of the immediates consts. | Eric Anholt | 2020-04-30 | 1 | -0/+3
    Avoids having spurious differences (and weird values to look at!) in traces from uninitialized memory.
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4824>
* turnip: fix wrong substream size in parse_multisample_and_color_blend [tag: 20.1-branchpoint] | Jonathan Marek | 2020-04-29 | 1 | -1/+1
    Missed updating this when adding tu6_emit_sample_locations
    Fixes: a92d2e11095 ("turnip: implement VK_EXT_sample_locations")
    Signed-off-by: Jonathan Marek <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4795>
* freedreno/a6xx+tu: rename VSC_DATA/VSC_DATA2 | Rob Clark | 2020-04-28 | 5 | -78/+68
    These are the draw-stream and primitive-stream, so let's give them more descriptive names.
    Signed-off-by: Rob Clark <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4750>
* freedreno/ir3/ra: only assign array base in first pass | Rob Clark | 2020-04-28 | 1 | -1/+2
    In particular, we specifically don't want to let the base change between passes, as it could end up conflicting with registers assigned in the first pass.
    Mostly-closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2838
    Signed-off-by: Rob Clark <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4780>
* freedreno/ir3/ra: split out helper for array assignment | Rob Clark | 2020-04-28 | 1 | -48/+58
    Signed-off-by: Rob Clark <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4780>
* freedreno/ir3/ra: use ir3_debug_print helper | Rob Clark | 2020-04-28 | 1 | -8/+2
    Signed-off-by: Rob Clark <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4780>
* freedreno/ir3/ra: remove unused variable | Rob Clark | 2020-04-28 | 1 | -2/+0
    Signed-off-by: Rob Clark <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4780>
* freedreno/computer: add script to test widening/narrowing | Rob Clark | 2020-04-28 | 1 | -0/+297
    Just something I hacked together to help figure out which instructions can fold in a widening/narrowing conversion.
    Signed-off-by: Rob Clark <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4780>
* freedreno/ir3: Add support for disasm of cat2 float32 immediates. | Eric Anholt | 2020-04-27 | 5 | -47/+86
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4736>
* freedreno/ir3: Refactor out print_reg_src(). | Eric Anholt | 2020-04-27 | 1 | -10/+6
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4736>
* freedreno/ir3: Convert remaining disasm src prints to reginfo. | Eric Anholt | 2020-04-27 | 1 | -60/+92
    More lines of code, but they're much more intelligible.
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4736>
* freedreno/ir3: Add a unit test for our disassembler. | Eric Anholt | 2020-04-27 | 2 | -0/+141
    Makes sure that we can maintain consistent output from our disassembly as we refactor. I've only included stuff that matches qcom's disasm so far.
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4736>
* freedreno/ir3: Print a space after nop counts, like qcom's disasm. | Eric Anholt | 2020-04-27 | 1 | -1/+1
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4736>
* freedreno/ir3: Fix the disasm of half-float STG dests. | Eric Anholt | 2020-04-27 | 1 | -1/+1
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4736>
* freedreno/ir3: run nir_lower_pack | Jonathan Marek | 2020-04-27 | 1 | -0/+1
    This lowers pack_32_2x16/unpack_32_2x16 into the scalar versions of those instructions.
    Signed-off-by: Jonathan Marek <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4738>
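As a rough illustration of what this lowering amounts to, the sketch below models the vector and split ("scalar") forms on plain integers. It is written in Python for brevity and is not NIR or ir3 code; the function names mirror the NIR opcodes, and the bit layout (first component in the low 16 bits) is stated here as my understanding of those opcodes rather than taken from the commit.

```python
def pack_32_2x16_split(lo, hi):
    """Scalar form: combine two 16-bit values into one 32-bit value."""
    return (lo & 0xFFFF) | ((hi & 0xFFFF) << 16)

def unpack_32_2x16_split_x(v):
    """Scalar form: extract the low 16 bits."""
    return v & 0xFFFF

def unpack_32_2x16_split_y(v):
    """Scalar form: extract the high 16 bits."""
    return (v >> 16) & 0xFFFF

def pack_32_2x16(vec2):
    """Vector form, expressed through the split op as the lowering would do."""
    x, y = vec2
    return pack_32_2x16_split(x, y)

def unpack_32_2x16(v):
    """Vector form, expressed through the two split ops."""
    return (unpack_32_2x16_split_x(v), unpack_32_2x16_split_y(v))

packed = pack_32_2x16((0x1234, 0xABCD))
roundtrip = unpack_32_2x16(packed)
```

The point of the sketch is only that the vector ops carry no information beyond what the per-component ops already express, which is why a backend that handles scalars can drop them.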
* nir: add pack_32_2x16_split/unpack_32_2x16_split lowering | Jonathan Marek | 2020-04-27 | 1 | -4/+2
    The new option replaces the two other _split lowering options, since there's no need for separate options.
    Signed-off-by: Jonathan Marek <[email protected]>
    Reviewed-by: Alyssa Rosenzweig <[email protected]>
    Reviewed-by: Boris Brezillon <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4738>
* ir3: Use shared mediump output lowering | Alyssa Rosenzweig | 2020-04-27 | 1 | -49/+1
    Signed-off-by: Alyssa Rosenzweig <[email protected]>
    Reviewed-by: Kristian H. Kristensen <[email protected]>
    Acked-by: Rob Clark <[email protected]>
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4716>
* tu: Don't invert point coords | Connor Abbott | 2020-04-25 | 1 | -1/+2
    We shouldn't need to invert them, and the Vulkan blob doesn't either.
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4733>
* ir3: Remove VARYING_SLOT_PNTC remapping hack | Connor Abbott | 2020-04-25 | 1 | -12/+0
    The st now does this for us.
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4732>
* tu: Implement PrimID passthrough | Connor Abbott | 2020-04-25 | 2 | -5/+14
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4704>
* ir3: Skip missing VS outputs in VS out map when linking | Connor Abbott | 2020-04-25 | 2 | -22/+55
    The hardware is capable of automatically filling in certain values in the VPC without writing them from the last geometry stage, like gl_PointCoord or gl_PrimitiveID when there is no GS. However, we *do* have to enable these outputs (i.e. set the VPC_VAR_DISABLE bit to 0) as VPC_VAR_DISABLE is really about FS inputs rather than VS outputs. To do this, we move the computation of the enable bits to ir3_link_add(), which is also a nice refactor anyway.
    In addition we detect the PrimID case specifically so that the driver can program the location.
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4704>
* freedreno/a6xx: Document PrimID passthrough registers | Connor Abbott | 2020-04-25 | 2 | -2/+15
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4704>
* freedreno/ir3: Print @tex write mask using 0x%x | Kristian H. Kristensen | 2020-04-25 | 1 | -1/+1
    That way we can parse it again with the assembler.
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4741>
* freedreno/ir3: Reset lex line number when we start parsing | Kristian H. Kristensen | 2020-04-25 | 1 | -0/+2
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4741>
* freedreno/ir3: Parse, but ignore @in, @out and @tex headers | Kristian H. Kristensen | 2020-04-25 | 2 | -0/+21
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4741>
* freedreno/ir3: Move ir3 assembler to backend compiler | Kristian H. Kristensen | 2020-04-25 | 6 | -24/+21
    For easier reuse.
    Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4741>