path: root/src/panfrost
Commit message (Author, Age, Files, Lines)
* pan/midgard: Pack load/store masks (Alyssa Rosenzweig, 2019-11-11, 1 file, -2/+30)
| | | | | | | | | While most load/store operations work on 32-bit/vec4 intrinsically, some do not and have special type-size-dependent semantics for the mask. We need to convert into this native format. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Tomeu Vizoso <[email protected]>
* pan/midgard: Implement nir_intrinsic_load_output_u8_as_fp16_pan (Alyssa Rosenzweig, 2019-11-11, 1 file, -0/+20)
| | | | | | | | We can use the native Midgard ops for this, depending what chip we're on. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Tomeu Vizoso <[email protected]>
* pan/midgard: Identify ld_color_buffer_u8_as_fp16* (Alyssa Rosenzweig, 2019-11-11, 2 files, -2/+7)
| | | | | | | | | | There are two versions of this opcode, depending what version of the ISA you're using. I'm not sure if there's a semantic difference; I think there might be some slight subtleties but it's too early to know at this stage. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Tomeu Vizoso <[email protected]>
* pan/midgard: Switch base for vertex texturing on T720 (Alyssa Rosenzweig, 2019-11-08, 1 file, -11/+16)
| | | | | | | | There aren't texture pipeline registers anymore; instead, space is shared with work and ldst registers for output and input respectively. We need to shift the base registers to represent this correctly. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Pass shader stage to disassembler (Alyssa Rosenzweig, 2019-11-08, 4 files, -4/+7)
| | | | | | | Vertex texturing behaves differently from fragment texturing on some GPUs. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Disassemble half-steps correctly (Alyssa Rosenzweig, 2019-11-08, 1 file, -3/+15)
| | | | | | | The meaning of some bits shifts; we need to account for this to print swizzles sanely. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Fix printing of half-registers in texture ops (Alyssa Rosenzweig, 2019-11-08, 1 file, -35/+32)
| | | | | | | We were using old style half-registers; let's update that to be consistent, preparing us for more disassembler changes in this area. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Pipe the GPU ID into compiler and disassembler (Tomeu Vizoso, 2019-11-07, 7 files, -26/+27)
| | | | Signed-off-by: Tomeu Vizoso <[email protected]>
* panfrost: Print the right zero field (Tomeu Vizoso, 2019-11-06, 1 file, -1/+1)
| | | | | | | | | Copy paste error. Signed-off-by: Tomeu Vizoso <[email protected]> Reported-by: Ilia Mirkin <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]> Reviewed-by: Daniel Stone <[email protected]>
* panfrost: Decode blend shaders for SFBD (Tomeu Vizoso, 2019-11-06, 1 file, -22/+29)
| | | | | | | Also set MALI_HAS_BLEND_SHADER as needed. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Rework format encoding on SFBD (Tomeu Vizoso, 2019-11-06, 2 files, -47/+109)
| | | | | | Signed-off-by: Alyssa Rosenzweig <[email protected]> Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add checksum fields to SFBD descriptor (Tomeu Vizoso, 2019-11-06, 2 files, -1/+10)
| | | | | | | During tests on T720, these fields were discovered. Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Extend default_phys_reg to !32-bit (Alyssa Rosenzweig, 2019-11-04, 1 file, -5/+5)
| | | | | | We can pass through a size. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Extend swizzle packing for vec4/16-bit (Alyssa Rosenzweig, 2019-11-04, 1 file, -3/+24)
| | | | | | | | We would like to pack not just xyzw swizzles but also efgh swizzles. This should work for vec4/16-bit. More work will be needed to pack swizzles for vec8/16-bit and even more work for 8-bit, of course. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Extend offset_swizzle to non-32-bit (Alyssa Rosenzweig, 2019-11-04, 1 file, -3/+4)
| | | | | | We take a size parameter; use it. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: offset_swizzle doesn't need dstsize (Alyssa Rosenzweig, 2019-11-04, 1 file, -9/+9)
| | | | | | This argument should be omitted. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add bizarre corner case (Alyssa Rosenzweig, 2019-11-04, 1 file, -1/+8)
| | | | | | Someone really needs to look into this. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Compute bundle interference (Alyssa Rosenzweig, 2019-11-04, 1 file, -0/+57)
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Fix quadword_count handling (Alyssa Rosenzweig, 2019-11-04, 3 files, -4/+8)
| | | | | | Spilling can mess with this considerably. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Validate tags when branching (Alyssa Rosenzweig, 2019-11-04, 1 file, -6/+32)
| | | | | | | | | | | | | | | | | | Midgard prefetches instructions based on tag (ALU, LD/ST, texture * size). To do so, the shader descriptor specifies the tag of the first instruction, all instructions specify the tag of the next linear instruction, and all branches explicitly specify the tag of the branch target. If you mess this up, you get an INSTR_TYPE_MISMATCH, which unambiguously refers to this problem, but it's still annoying to try to work out all the branch targets in your head to debug. Instead, let's track the tags of various blocks over time, so we can automatically validate tags of branch targets, to make INSTR_TYPE_MISMATCH issues immediately obvious in a disassembly. Signed-off-by: Alyssa Rosenzweig <[email protected]>
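The tag validation described in this message can be sketched roughly as follows. This is only an illustration: the `sketch_` names, the tag enum values, and the fixed-size block table are assumptions made for the example and are not taken from the Mesa disassembler. The idea is to remember the tag seen at the start of each block and compare it with the tag a branch claims for its target.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative tag values; the real hardware encoding differs. */
enum sketch_tag {
        SKETCH_TAG_UNKNOWN = 0, /* block not disassembled yet */
        SKETCH_TAG_ALU,
        SKETCH_TAG_LDST,
        SKETCH_TAG_TEXTURE,
};

#define SKETCH_MAX_BLOCKS 64

/* Hypothetical tracker: one remembered tag per block index */
struct sketch_tag_tracker {
        enum sketch_tag seen[SKETCH_MAX_BLOCKS];
};

/* Record the tag observed at the start of a block */
static void
sketch_record_tag(struct sketch_tag_tracker *t, unsigned block,
                  enum sketch_tag tag)
{
        assert(block < SKETCH_MAX_BLOCKS);
        t->seen[block] = tag;
}

/* A branch is flagged only if the target's tag is known and differs
 * from what the branch claims; unseen targets cannot be judged yet. */
static bool
sketch_branch_tag_ok(const struct sketch_tag_tracker *t, unsigned target,
                     enum sketch_tag claimed)
{
        assert(target < SKETCH_MAX_BLOCKS);
        enum sketch_tag actual = t->seen[target];
        return actual == SKETCH_TAG_UNKNOWN || actual == claimed;
}
```

A disassembler built this way would print a warning next to any branch for which `sketch_branch_tag_ok` returns false, which is where an INSTR_TYPE_MISMATCH would originate.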
* panfrost: MALI_DEPTH_TEST is actually MALI_DEPTH_WRITEMASK (Boris Brezillon, 2019-11-04, 2 files, -3/+3)
| | | | | | | | | MALI_DEPTH_TEST should only be set when depth->writemask is true, not when the depth test is enabled. Let's rename the flag and patch panfrost_bind_depth_stencil_state() to do the right thing. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Eliminate blank_alu_src (Alyssa Rosenzweig, 2019-11-01, 6 files, -36/+22)
| | | | | | We don't need it in practice, so this is some more cleanup. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Refactor swizzles (Alyssa Rosenzweig, 2019-11-01, 13 files, -385/+258)
| | | | | | | | Rather than having hw-specific swizzles encoded directly in the instructions, have a unified swizzle array so we can manipulate swizzles generically. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add a dummy source for loads (Alyssa Rosenzweig, 2019-11-01, 3 files, -29/+11)
| We want symmetry between loads and stores, so we add a dummy source. So we get, e.g.
|
|     st_int4 _, val, arg_1, arg_2
|     ld_int4 dest, _, arg_1, arg_2
|
| Semantically, this dummy source represents the data itself, as if the load is simply a move. That means it has a swizzle that acts as a source.
|
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Remove OP_IS_STORE_VARY (Alyssa Rosenzweig, 2019-11-01, 1 file, -7/+0)
| | | | | | Unused. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* android: Add panfrost support to build scripts (Robert Foss, 2019-10-31, 7 files, -1/+258)
| | | | | | | | | | | | Currently the Android build system doesn't expose the panfrost driver. This patch enables the panfrost driver to be built for the Android platform. Signed-off-by: Robert Foss <[email protected]> Reviewed-By: Rohan Garg <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Remove unused definitions in mali-job.h (Alyssa Rosenzweig, 2019-10-29, 1 file, -9/+1)
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Cleanup _shader_upper -> shader (Alyssa Rosenzweig, 2019-10-29, 2 files, -13/+10)
| | | | | | I don't believe this is actually a tagged pointer; warn if it is. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Express allocated registers as offsets (Alyssa Rosenzweig, 2019-10-25, 1 file, -104/+62)
| | | | | | | | | | Rather than supplying a mask/swizzle to compose with the original, just supply the offset of the allocated register so we can directly offset the mask/swizzle, without resorting to composition. This is simpler, cleaner, and will generalize to non-32-bit. Signed-off-by: Alyssa Rosenzweig <[email protected]>
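As a rough illustration of "directly offsetting" a mask and swizzle, here is a minimal sketch for the vec4/32-bit case only. The `sketch_` helpers are hypothetical, the offset is assumed to be expressed in components, and clamping to the last component is a choice made for this example; the real register allocator code may differ on all three points.

```c
#include <assert.h>

#define SKETCH_COMPONENTS 4

/* Shift a write mask up by the component offset of the allocated
 * register, dropping bits that fall outside the vec4. */
static unsigned
sketch_offset_mask(unsigned mask, unsigned offset)
{
        assert(offset < SKETCH_COMPONENTS);
        return (mask << offset) & ((1u << SKETCH_COMPONENTS) - 1);
}

/* Add the same offset to every swizzle selector, clamping to the last
 * component so the result stays a valid selector. */
static void
sketch_offset_swizzle(unsigned swizzle[SKETCH_COMPONENTS], unsigned offset)
{
        for (unsigned c = 0; c < SKETCH_COMPONENTS; ++c) {
                unsigned shifted = swizzle[c] + offset;
                swizzle[c] = shifted < SKETCH_COMPONENTS ?
                             shifted : SKETCH_COMPONENTS - 1;
        }
}
```

Compared with composing a second mask/swizzle onto the original, a plain offset involves only a shift and an add, which is what lets the approach carry over to other component sizes.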
* pan/midgard: Expose more typesize manipulation routines (Alyssa Rosenzweig, 2019-10-25, 2 files, -2/+4)
| | | | | | These internal mir.c routines will help the RA. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add mir_set_bytemask helper (Alyssa Rosenzweig, 2019-10-25, 2 files, -0/+7)
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* nir/lower_idiv: add new llvm-based path (Rhys Perry, 2019-10-21, 1 file, -1/+1)
| v2: make variable names snake_case
| v2: minor cleanups in emit_udiv()
| v2: fix Panfrost build failure
| v3: use an enum instead of a boolean flag in nir_lower_idiv()'s signature
| v4: remove nir_op_urcp
| v5: drop nv50 path
| v5: rebase
| v6: add back nv50 path
| v6: add comment for nir_lower_idiv_path enum
| v7: rename _nv50/_llvm to _fast/_precise
| v8: fix etnaviv build failure
|
| Signed-off-by: Rhys Perry <[email protected]>
| Reviewed-by: Daniel Schürmann <[email protected]>
* pan/midgard: Implement SIMD-aware dead code elimination (Alyssa Rosenzweig, 2019-10-20, 1 file, -8/+57)
| We would like to eliminate not just entire dead instructions, but also dead components, which increases scheduler flexibility (since some vector instructions can become scalar after eliminating dead components). This also will allow better RA in the future.
|
| Results are meh.
|
| total instructions in shared programs: 3453 -> 3451 (-0.06%)
| instructions in affected programs: 60 -> 58 (-3.33%)
| helped: 2
| HURT: 0
|
| total bundles in shared programs: 1826 -> 1824 (-0.11%)
| bundles in affected programs: 33 -> 31 (-6.06%)
| helped: 2
| HURT: 0
|
| total quadwords in shared programs: 3144 -> 3144 (0.00%)
| quadwords in affected programs: 0 -> 0
| helped: 0
| HURT: 0
|
| total registers in shared programs: 321 -> 321 (0.00%)
| registers in affected programs: 45 -> 45 (0.00%)
| helped: 11
| HURT: 11
| helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
| helped stats (rel) min: 16.67% max: 50.00% x̄: 39.70% x̃: 50.00%
| HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
| HURT stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
| 95% mean confidence interval for registers value: -0.45 0.45
| 95% mean confidence interval for registers %-change: -1.87% 62.18%
| Inconclusive result (value mean confidence interval includes 0).
|
| total threads in shared programs: 445 -> 447 (0.45%)
| threads in affected programs: 2 -> 4 (100.00%)
| helped: 1
| HURT: 0
|
| Signed-off-by: Alyssa Rosenzweig <[email protected]>
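Eliminating dead components, rather than only whole dead instructions, might look like the sketch below. The `sketch_ins` structure and its fields are invented for illustration and do not mirror the compiler's actual instruction type; the liveness mask is assumed to come from a separate analysis.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical instruction: only the fields this sketch needs */
struct sketch_ins {
        uint16_t write_mask; /* components the instruction writes */
        bool dead;           /* set when nothing it writes is read */
};

/* Trim written components that are not live after the instruction.
 * Returns true if the instruction changed, so a caller could rerun
 * the pass until it makes no further progress. */
static bool
sketch_dce_components(struct sketch_ins *ins, uint16_t live_after)
{
        uint16_t trimmed = ins->write_mask & live_after;

        if (trimmed == ins->write_mask)
                return false;

        ins->write_mask = trimmed;
        ins->dead = (trimmed == 0);
        return true;
}
```

A vec4 write whose mask shrinks to a single component is the case the message calls out: it becomes a scalar operation and gives the scheduler more freedom.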
* pan/midgard: Create dependency graph bytewise (Alyssa Rosenzweig, 2019-10-20, 1 file, -12/+12)
| | | | | | | This allows for vec16 dependencies in the scheduler, not that we have any yet (thankfully). Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Handle nontrivial masks in texture RA (Alyssa Rosenzweig, 2019-10-20, 1 file, -1/+1)
| | | | | | The texture instruction has a mask we need to take into account. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Implement per-byte liveness tracking (Alyssa Rosenzweig, 2019-10-20, 1 file, -3/+2)
| | | | | | | Now that we have a notion of byte masks, liveness tracking can be updated to reflect this extra granularity without loss of correctness. Signed-off-by: Alyssa Rosenzweig <[email protected]>
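One step of a backward, per-byte liveness update could be sketched as below. It assumes 16-byte registers tracked with one `uint16_t` per node and a single source per instruction; the `sketch_` types are hypothetical and much simpler than the compiler's own data structures.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_NODES 8

/* Hypothetical summary of one instruction: which bytes of which node
 * it writes, and which bytes of which node it reads. */
struct sketch_access {
        unsigned dest;
        uint16_t dest_bytes;
        unsigned src;
        uint16_t src_bytes;
};

/* Walking backwards over an instruction: bytes it writes stop being
 * live above it, then bytes it reads become live. The kill must come
 * first so that reading and writing the same node works out. */
static void
sketch_liveness_step(uint16_t live[SKETCH_NODES],
                     const struct sketch_access *a)
{
        assert(a->dest < SKETCH_NODES && a->src < SKETCH_NODES);
        live[a->dest] &= (uint16_t) ~a->dest_bytes;
        live[a->src] |= a->src_bytes;
}
```

With per-component masks a partial write would have to be treated as touching the whole component; tracking bytes keeps the untouched bytes live exactly.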
* pan/midgard: Simplify mir_bytemask_of_read_components (Alyssa Rosenzweig, 2019-10-20, 1 file, -18/+4)
| | | | | | There are easy ways to iterate sources! Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Report byte masks for read components (Alyssa Rosenzweig, 2019-10-20, 6 files, -31/+31)
| | | | | | | | | | | Read component masks don't have a particular type associated, since the type of the ALU operation may not match the type of the operands in question. So let's generate byte masks instead, and update the rest of the compiler to use byte masks when analyzing reads. Preparation for mixed types. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add helpers for manipulating byte masks (Alyssa Rosenzweig, 2019-10-20, 2 files, -0/+177)
| | | | | | | | | | | | | | | | | | | There are essentially two formats of masks in play beginning with this commit: masks per-channel and masks per-byte. The former make sense within a given fixed-size instruction; the latter are typesize-independent. It turns out you need the latter to meaningfully manipulate instructions containing multiple sizes (which is quite possible with ALU operations). Similarly, we have mir_srcsize. We calculate the size of the source by analyzing the size of the instruction itself and stepping down if there is a half-modifier. Finally, we have mir_round_bytemask_down, for when we want to take a byte mask and "round it down" to a given component size, so that we can use it as a component mask. Signed-off-by: Alyssa Rosenzweig <[email protected]>
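The two mask formats and the "round down" operation can be illustrated with a small sketch. The function names here are hypothetical stand-ins (the commit's own helpers take different arguments), and component size is passed in bits as an assumption of this example; a register is taken to be 16 bytes wide.

```c
#include <assert.h>
#include <stdint.h>

/* Expand a per-component mask into a per-byte mask for components of
 * `bits` bits (8, 16, 32 or 64). */
static uint16_t
sketch_bytemask_from_components(unsigned comp_mask, unsigned bits)
{
        assert(bits == 8 || bits == 16 || bits == 32 || bits == 64);
        unsigned bytes = bits / 8;
        unsigned count = 16 / bytes;
        uint16_t out = 0;

        for (unsigned c = 0; c < count; ++c) {
                if (comp_mask & (1u << c))
                        out |= (uint16_t) (((1u << bytes) - 1) << (c * bytes));
        }

        return out;
}

/* Round a byte mask down to whole components: a component survives
 * only if every one of its bytes is set. */
static uint16_t
sketch_round_bytemask_down(uint16_t bytemask, unsigned bits)
{
        assert(bits == 8 || bits == 16 || bits == 32 || bits == 64);
        unsigned bytes = bits / 8;
        unsigned count = 16 / bytes;
        uint16_t out = 0;

        for (unsigned c = 0; c < count; ++c) {
                uint16_t lane = (uint16_t) (((1u << bytes) - 1) << (c * bytes));
                if ((bytemask & lane) == lane)
                        out |= lane;
        }

        return out;
}
```

For instance, the byte mask 0x00F7 rounds down to 0x00F0 at 32 bits, because the low component is missing a byte, but to 0x00F3 at 16 bits, where the low half-word is fully covered.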
* pan/midgard: Implement OP_IS_STORE with table (Alyssa Rosenzweig, 2019-10-20, 2 files, -13/+2)
| | | | | | ..rather than open-coding. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Tableize load/store ops (Alyssa Rosenzweig, 2019-10-20, 5 files, -70/+102)
| | | | | | | | | | This will allow us to encode properties about the load/store ops like we do for ALU ops. We now include properties about whether we have a store, and if there are special cases on the load/store op. We also tag each instruction by its natural size... this is probably not totally right, but it's a start. Signed-off-by: Alyssa Rosenzweig <[email protected]>
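A table of per-opcode properties, with a store query that reads the table instead of open-coding a list of opcodes, might be laid out as in this sketch. The enum, the flag names, and the table contents are placeholders for illustration (only the `ld_int4`/`st_int4` mnemonics appear elsewhere in this log), not the table the commit adds.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical property flags */
#define SKETCH_LDST_STORE   (1u << 0)
#define SKETCH_LDST_SPECIAL (1u << 1)

/* Placeholder opcode list; the real ISA has many more entries */
enum sketch_ldst_op {
        SKETCH_OP_LD_INT4,
        SKETCH_OP_ST_INT4,
        SKETCH_OP_COUNT,
};

struct sketch_ldst_info {
        const char *name;
        unsigned props;
};

/* One row per opcode, indexed by the opcode itself */
static const struct sketch_ldst_info sketch_ldst_table[SKETCH_OP_COUNT] = {
        [SKETCH_OP_LD_INT4] = { "ld_int4", 0 },
        [SKETCH_OP_ST_INT4] = { "st_int4", SKETCH_LDST_STORE },
};

/* Property query: a table lookup rather than a chain of comparisons */
static bool
sketch_op_is_store(enum sketch_ldst_op op)
{
        assert((unsigned) op < SKETCH_OP_COUNT);
        return (sketch_ldst_table[op].props & SKETCH_LDST_STORE) != 0;
}
```

Adding a new property then means adding a flag and filling in rows, without touching every query site.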
* pan/midgard: Factor out mir_get_alu_src (Alyssa Rosenzweig, 2019-10-20, 1 file, -6/+8)
| | | | | | | This helper is used in a bunch of places ... might as well make that common. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard/disasm: Fix printing 8-bit/16-bit masks (Alyssa Rosenzweig, 2019-10-20, 1 file, -49/+30)
| | | | | | | | | The trick is realizing that even with a destination override, the masks are encoded in the same mode as the instruction itself, rather than stepping down. The override means that the smaller type is used, but the mask is parsed as if it were the higher type. Overriding down is printed by blindly doing this. Overriding up can be thought of as printing in the upper size, but shifting the alphabet to use the upper half, i.e. shifting xyzw to become abcd. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Identify 64-bit atomic opcodes (Alyssa Rosenzweig, 2019-10-20, 2 files, -0/+20)
| | | | | | They are symmetric to their 32-bit counterparts, just shifted. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Debug mir_insert_instruction_after_scheduled (Alyssa Rosenzweig, 2019-10-20, 1 file, -2/+6)
| | | | | | | | Add some comments explaining what's going on in a more natural flow in order to solve the actual bug. Signed-off-by: Alyssa Rosenzweig <[email protected]> Fixes: 2d914ebe818 ("pan/midgard: Fix memory corruption in register spilling")
* panfrost: do not report alpha-test as supported (Erik Faye-Lund, 2019-10-17, 1 file, -11/+0)
| | | | | | | This triggers lowering in the state-tracker, which makes things a bit simpler. Reviewed-by: Marek Olšák <[email protected]>
* pan/midgard: Do not repeatedly spill same value (Alyssa Rosenzweig, 2019-10-16, 1 file, -2/+14)
| | | | | | | It doesn't make sense. You already spilled it once, and it didn't help. Don't try again, or you'll end up in a loop. Signed-off-by: Alyssa Rosenzweig <[email protected]>
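The loop-avoidance idea in this message, never picking the same value as a spill candidate twice, can be sketched as follows. The `sketch_` names and the plain "benefit" array are assumptions made for the example; the actual compiler relies on the shared register allocator to choose spill nodes.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_NODES 64

/* Hypothetical state kept across register allocation attempts */
struct sketch_spill_state {
        bool spilled[SKETCH_MAX_NODES];
};

/* Pick the not-yet-spilled node with the highest spill benefit and mark
 * it as spilled. Returns -1 when every node has already been tried, so
 * the caller can give up instead of looping forever. */
static int
sketch_choose_spill(struct sketch_spill_state *s, const unsigned *benefit,
                    unsigned count)
{
        assert(count <= SKETCH_MAX_NODES);
        int best = -1;

        for (unsigned i = 0; i < count; ++i) {
                if (s->spilled[i])
                        continue;

                if (best < 0 || benefit[i] > benefit[best])
                        best = (int) i;
        }

        if (best >= 0)
                s->spilled[best] = true;

        return best;
}
```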
* pan/midgard: Fix memory corruption in register spilling (Alyssa Rosenzweig, 2019-10-16, 1 file, -2/+2)
| | | | | | | Essentially an off-by-one error ... bit of an edge case, but seems to occur in some glamor shaders. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Use 16-bit liveness masks (Alyssa Rosenzweig, 2019-10-16, 3 files, -15/+14)
| | | | | | We'll want liveness per-byte, so we need to accommodate up to 16 bytes. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Fix mir_mask_of_read_components with dot products (Alyssa Rosenzweig, 2019-10-15, 1 file, -5/+5)
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>