path: root/src/freedreno/ir3
Commit message | Author | Age | Files | Lines
...
* freedreno/ir3: convert over to rallocRob Clark2020-06-193-26/+8
| | | | | | | | | | | The `ir3_shader` is the root mem ctx, with `ir3_shader_variant` hanging off that, and various variant specific allocations hanging off the variant. This lets us delete a bunch of cleanup code. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5508>
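The ownership scheme described above can be illustrated with a toy hierarchical allocator. This is only a sketch: `toy_ctx`, `toy_alloc()` and `toy_free()` are hypothetical names made up for this example and are not the ralloc API. It shows why freeing the root (the shader) makes per-variant cleanup code unnecessary.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical toy memory context; NOT the real ralloc API. */
struct toy_ctx {
    struct toy_ctx *first_child;
    struct toy_ctx *next_sibling;
};

static int toy_live_count; /* number of contexts currently allocated */

/* Allocate a context owned by `parent`, or a root context if parent is NULL. */
static struct toy_ctx *toy_alloc(struct toy_ctx *parent)
{
    struct toy_ctx *ctx = calloc(1, sizeof(*ctx));
    if (!ctx)
        return NULL;
    if (parent) {
        ctx->next_sibling = parent->first_child;
        parent->first_child = ctx;
    }
    toy_live_count++;
    return ctx;
}

/* Free a context and, recursively, everything hanging off it.
 * The sketch only supports freeing a root: a freed child is not
 * unlinked from its parent's list. */
static void toy_free(struct toy_ctx *ctx)
{
    struct toy_ctx *child = ctx->first_child;
    while (child) {
        struct toy_ctx *next = child->next_sibling;
        toy_free(child);
        child = next;
    }
    free(ctx);
    toy_live_count--;
}
```

Allocating a "shader" root, a "variant" under it, and a few allocations under the variant, then calling `toy_free()` on the root releases all of them in one call.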
* freedreno/ir3: pass variant to ir3_create()Rob Clark2020-06-194-6/+8
| | | | | | | Prep to convert over to ralloc. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5508>
* ir3: Split out variant-specific lowering and optimizationsConnor Abbott2020-06-194-95/+106
| | | | | | | | | | | | | | | | | | | | It seems a lot of the lowerings being run the second time were unnecessary. In addition, when const_state is moved to the variant, then it will become impossible to know ahead of time whether a variant needs additional optimizing, which means that ir3_key_lowers_nir() needs to go away. The new approach should have the same effect, since it skips running lowerings that are unnecessary and then skips the opt loop if no optimizations made progress, but it will work better when we move ir3_nir_analyze_ubo_ranges() to be after variant creation. The one maybe controversial thing I did is to make nir_opt_algebraic_late() always happen during variant lowering. I wanted to avoid code duplication, and it seems to me that we should push the _late variants as far back as possible so that later opt_algebraic runs don't miss out on optimization opportunities. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5508>
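The "skip the opt loop unless something made progress" idea can be sketched in isolation. The pass names below (`toy_lower_abs`, `toy_opt_halve`, `toy_finalize_variant`) are hypothetical stand-ins operating on a plain integer; they are not NIR passes and only illustrate the control flow.

```c
#include <stdbool.h>

/* Hypothetical "lowering": only does work for negative inputs.
 * (Negating INT_MIN is not handled in this sketch.) */
static bool toy_lower_abs(int *v)
{
    if (*v >= 0)
        return false;
    *v = -*v;
    return true;
}

/* Hypothetical "optimization": halve even, non-zero values. */
static bool toy_opt_halve(int *v)
{
    if (*v == 0 || *v % 2 != 0)
        return false;
    *v /= 2;
    return true;
}

/* Run the variant-specific lowering, then run the opt loop only if
 * the lowering changed something. Returns true if the opt loop ran. */
static bool toy_finalize_variant(int *v)
{
    bool progress = false;

    progress |= toy_lower_abs(v);
    if (!progress)
        return false; /* nothing was lowered, so skip the opt loop */

    do {
        progress = toy_opt_halve(v);
    } while (progress);

    return true;
}
```

With this shape there is no need to predict ahead of time whether a variant needs more optimization; the progress flag decides at run time.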
* freedreno/ir3: constify shader keyRob Clark2020-06-192-5/+5
| | | | | Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5508>
* freedreno/ir3: drop shader->num_ubosRob Clark2020-06-192-12/+1
| | | | | | | | | | | | | | The only difference between this and `const_state->num_ubos` was that the latter is counting # of ubos loaded via `ldg` (based on UBO addrs in push-consts). But turns out there isn't really any reason to care. Instead just add an early return in the one code-path that cares about the number of `ldg` UBOs. This gets rid of one more thing we need to move from `ir3_shader` to `ir3_shader_variant`. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5508>
* freedreno/ir3: move ubo_state into const_stateRob Clark2020-06-194-22/+23
| | | | | | | | | As with const_state, this will also need to move into the variant. To simplify that, just move it into the const_state itself, since after all it is related. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5508>
* freedreno/ir3: add accessor for const_stateRob Clark2020-06-196-11/+20
| | | | | | | | | | | | | | We are going to want to move this back to the variant, and come up with a different strategy for binning/nonbinning to share the same constant layout, in order to implement shader-cache support. (Since then we can have a mix of dynamically compiled variants and cache hits, so there is no good place to serialize the const-state.) To reduce the churn as we re-arrange things, move direct access to the const-state to a helper fxn. This patch is the boring churny part. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5508>
* freedreno/ir3: refactor out helper to compile shader from asmRob Clark2020-06-195-29/+121
| | | | | | | | | Deduplicate a bit of hand-building of ir3_shader/_variant from computerator and delay test. This also removes the need for external things to depend on generated ir3_parser header. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5508>
* freedreno/ir3: update obsolete commentRob Clark2020-06-181-4/+10
| | | | | Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5458>
* freedreno/ir3: make mergedregs a property of the variantRob Clark2020-06-186-13/+35
| | | | | | | | | | | | Rather than assuming a6xx+ means mergedregs. We can actually (mostly?) do splitregs on a6xx as well. And GS/DS/HS currently require it, which might be papering over a bug, or might be something to do with how chaining shaders works. At any rate, we should at least be consistent, and not have the compiler thinking we are doing mergedregs when we are actually doing splitregs. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5458>
* freedreno/ir3: re-work assembler APIRob Clark2020-06-184-18/+22
| | | | | | | | Just pass thru the variant, since it has everything we need. And will be needed in the next patch. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5458>
* freedreno/ir3: pass variant to postschedRob Clark2020-06-183-3/+6
| | | | | | | Prep for the next patch. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5458>
* freedreno/ir3: decouple regset from gpu genRob Clark2020-06-185-5/+8
| | | | | | | | Allow different regset's to coexist, so we can make mergedregs vs split reg file a variant property. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5458>
* freedreno/ir3: move mergedreg state out of regRob Clark2020-06-183-20/+30
| | | | | | | It is only needed in one place, let's move it there. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5458>
* freedreno/ir3: convert regmask_t to structRob Clark2020-06-181-11/+15
| | | | | | | | Prep to make merged/split register file mode a property of the regmask, rather than the ir3_register. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5458>
* ir3: Don't calculate num_samp ourselvesConnor Abbott2020-06-171-9/+5
| | | | | | | In addition to duplicating what core NIR does better, this was wrong for Vulkan, where it should be 0 as there are no non-bindless samplers. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5519>
* ir3: Pass reserved_user_consts to ir3_shader_from_nir()Connor Abbott2020-06-172-2/+3
| | | | | | | | | | ir3_shader_from_nir() calls ir3_optimize_nir(), which currently sets up the const state. However, we need to know the number of user consts reserved by the driver before setting up the const state, which means that this information needs to be passed into ir3_shader_from_nir() somehow rather than being set in the shader. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5500>
* freedreno/ir3: add post-scheduler cp passRob Clark2020-06-164-0/+222
| | | | | | | | | | A pass to eliminate extra mov's from an array. We need to do this after scheduling so we know that there are not any potentially conflicting array writes between the original `mov` and its use(s). Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2124 Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>
* freedreno/ir3/cp: extract valid_flagsRob Clark2020-06-163-174/+178
| | | | | | | We'll also need this in the postsched-cp pass. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>
* freedreno/ir3: delay test support for vectorish instructionsRob Clark2020-06-162-5/+68
| | | | | Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>
* freedreno/ir3: add helpers to move instructionsRob Clark2020-06-164-6/+25
| | | | | | | | | A bit cleaner than open coding the list manipulation. Plus I want to use it in the next patch, rather than adding more open coded list futzing. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>
* freedreno/ir3/delay: calculate delay properly for (rptN)'d instructionsRob Clark2020-06-161-1/+23
| | | | | | | | | When a sequence of same instruction is encoded with repeat flag, destination registers are written on successive cycles. Teach the delay calculation about this. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>
* freedreno/ir3: add test for delay slot calculationRob Clark2020-06-162-0/+178
| | | | | Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>
* freedreno/ir3/print: print (r) flagRob Clark2020-06-161-0/+3
| | | | | Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>
* freedreno/ir3/legalize: don't allow (nopN) if (rptN)Rob Clark2020-06-161-1/+2
| | | | | | | | | These two encodings are mutually exclusive. If the instruction is a vector(ish) `(rptN)` instruction, then we can't fold a `(nopN)` post-delay into it. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>
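A minimal sketch of the rule above, assuming a hypothetical `toy_instr` struct with `repeat` and `nop` fields (not the real `ir3_instruction` layout). Any hardware limit on the size of `(nopN)` is ignored here.

```c
#include <stdbool.h>

/* Hypothetical instruction model for this sketch only. */
struct toy_instr {
    unsigned repeat; /* N of a (rptN) flag, 0 if not repeated */
    unsigned nop;    /* N of a (nopN) post-delay folded into this instruction */
};

/* Try to fold `delay` nop cycles into the previous instruction.
 * Returns false if the caller has to emit separate nop instructions. */
static bool toy_try_fold_nop(struct toy_instr *prev, unsigned delay)
{
    if (prev->repeat != 0)
        return false; /* (rptN) and (nopN) cannot be combined */

    prev->nop += delay;
    return true;
}
```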
* freedreno/ir3/cp: properly handle already-folded RELATIVRob Clark2020-06-161-3/+5
| | | | | | | | | | | | | | | In the `try_swap_mad_two_srcs()` case, valid_flags() gets called both for the src that we want to try to fold, and for the other src that we are trying to swap to make that possible. It can happen in the 2nd case that a RELATIV src has already been folded. Since `ssa()` returns non-null in both the `IR3_REG_SSA` and `IR3_REG_ARRAY` cases (in the latter case, it is the dependent array access that the current instruction cannot be moved ahead of), we need to explicitly check that the src reg we are looking at is still an SSA src. Reported-by: Jonathan Marek <[email protected]> Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>
* freedreno/ir3/validate: also check instr->addressRob Clark2020-06-161-0/+9
| | | | | | | | Verify that instructions which have a relative src and/or dest, have `instr->address`. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>
* freedreno/sched: reset delay counters at start of blockRob Clark2020-06-162-0/+4
| | | | | Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>
* freedreno/ir3: don't rely on intr->num_componentsRob Clark2020-06-164-22/+20
| | | | | | | | | | | | It is better to use `nir_intrinsic_dest_components()` which also handles the case of intrinsics with a fixed number of dest components. Somehow this starts showing up with a nir_serialize round-trip with shader-cache. But we really shouldn't have been relying on `intr->num_components` directly. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5371>
* freedreno/ir3: move the libdrm dependency out of shared codeRob Clark2020-06-152-19/+8
| | | | | | | | | | | | | The only reason for this dependency was the fd_bo used for the uploaded shader. But this isn't used by turnip. Now that we've unified the cleanup path from gallium, it isn't hard to pull the fd_bo upload/free parts into ir3_gallium. This cleanup has the added benefit that the shader disk-cache will not have to deal with it. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5476>
* freedreno/ir3: fix ir3_nir_move_varying_inputsJonathan Marek2020-06-141-10/+5
| | | | | | | | | | | | | | ir3_nir_move_varying_inputs is broken when there is a load input outside of the first block which depends on the result of a previous load input. This simplification/rework avoids the problem, and should also be faster. Fixes this dEQP-VK test: dEQP-VK.pipeline.multisample_interpolation.offset_interpolate_at_pixel_center.128_128_1.samples_2 Signed-off-by: Jonathan Marek <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5465>
* freedreno/ir3: limit pre-fetched tex destRob Clark2020-06-113-4/+60
| | | | | | | | | | | | | | | | Teach RA to setup additional interference to prevent textures fetched before the FS starts from ending up in a register that is too high to encode. Fixes mis-rendering in multiple playcanv.as webgl apps. Note that the regression was not actually 733bee57eb8's fault, but that was the commit that exposed the problem. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3108 Fixes: 733bee57eb8 ("glsl: lower samplers with highp coordinates correctly") Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5431>
* freedreno/ir3: remove RA "q-values" optimizationRob Clark2020-06-111-54/+3
| | | | | | | | | | This is mainly the "piglit optimization" (ie, since piglit launches a separate process for each test). It was never wired up for a6xx, and makes register class setup unnecessarily complicated. Remove it to simplify the next patch. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5431>
* freedreno/ir3: respect tex prefetch limitsRob Clark2020-06-112-21/+51
| | | | | | | | | Refactor the limit checking in the bindless case a bit, and add tex/samp limit checking for the non-bindless case, to ensure we do not try to prefetch textures which cannot be encoded in the # of bits available. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5431>
* freedreno/ir3: add debug code to print conflicting half-regsRob Clark2020-06-111-0/+7
| | | | | | | | I keep re-typing this from time to time when debugging various things. Which is dumb. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5431>
* freedreno/ir3: Handle cases where we decide not to lower UBO 0 loads.Eric Anholt2020-06-051-39/+39
| | | | | | | | We advertise 4096 vec4s of GL uniform storage, but the HW can only store 512 vec4s in the const buffer. Closes: #3049 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>
* freedreno/ir3: Drop the max_const on a6xx to 512.Eric Anholt2020-06-051-1/+4
| | | | | | | | | | | | | | | | | | The GLES blob on the p3a limits constlen to 512 between VS and FS across a6xx gpu ids (615, 630, 640, and 650). Experimentally, exceeding that limit in any one stage results in rendering corruption or GPU hangs (though my most detailed testing had a loop limit in a uniform, so that may be the cause of the hang). Clamp the limit we use inside of a shader so we don't exceed it within a stage. This commit doesn't resolve limiting inter-stage. Experimentally, I've found that I can push up to a total of ~768 vec4s between VS and FS on a630, with or without uniform updates between each draw. We'll need to do some shader key-based limiting of constlen at draw time to respect that limit, but that's left for future work, and this commit is enough for the google earth case that initiated this work. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>
* freedreno/ir3: Account for driver params in UBO max const upload.Eric Anholt2020-06-053-7/+26
| | | | | | | The const state setup needs to be able to push its driver params, so account for them in the analyze_ubo_ranges. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>
* freedreno/ir3: Stop shifting UBO 1 down to be UBO 0.Eric Anholt2020-06-051-18/+9
| | | | | | | | | | | It turns out the GL uniforms file is larger than the hardware constant file, so we need to limit how many UBOs we lower to constbuf loads. To do actual UBO loads, we'll need to be able to upload UBO 0's pointer or descriptor. No difference on nohw 1 UBO update drawoverhead case (n=35). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>
* freedreno/ir3: Drop unnecessary alignment of pushed UBO size.Eric Anholt2020-06-051-1/+1
| | | | | | | The analysis pass gives us vec4-aligned size, and all of our other constbuf allocations here are in vec4 units, so we can just divide by 16. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>
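The unit conversion above is plain arithmetic: a vec4 is four 32-bit components, i.e. 16 bytes, so a size that is already vec4-aligned needs no round-up. A small sketch, with the hypothetical helper name `toy_bytes_to_vec4()`:

```c
#include <assert.h>
#include <stdint.h>

/* One vec4 is four 32-bit components, i.e. 16 bytes. */
#define TOY_VEC4_BYTES 16u

/* Convert a byte size that is already vec4-aligned into vec4 units. */
static uint32_t toy_bytes_to_vec4(uint32_t size_bytes)
{
    /* The caller guarantees alignment, so no round-up is needed. */
    assert(size_bytes % TOY_VEC4_BYTES == 0);
    return size_bytes / TOY_VEC4_BYTES;
}
```

For example, 8192 bytes of pushed UBO data correspond to 512 vec4s.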
* freedreno/ir3: Stop pushing immediates once we've filled the constbuf.Eric Anholt2020-06-051-1/+8
| | | | | | | If we filled the constbuf up with UBOs, we may need to avoid generating more immediate push constants. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>
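A sketch of the "stop once the constbuf is full" check, under stated assumptions: `toy_const_file` and `toy_push_immediate()` are hypothetical, the 512 vec4 capacity is taken from the a6xx commits in this log, and packing four scalar immediates per vec4 slot is an assumption of the sketch.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_CONST_VEC4 512u /* capacity in vec4 slots */
#define TOY_MAX_IMMEDIATES 8u   /* small fixed pool, for this sketch only */

struct toy_const_file {
    uint32_t used_vec4; /* slots already taken, e.g. by pushed UBO ranges */
    uint32_t immediates[TOY_MAX_IMMEDIATES];
    uint32_t num_immediates;
};

/* Try to place an immediate in the const file. Returns false when the
 * value has to stay out of the const file because there is no room. */
static bool toy_push_immediate(struct toy_const_file *cf, uint32_t value)
{
    /* Assumption: four scalar immediates share one vec4 slot. */
    bool needs_new_slot = (cf->num_immediates % 4) == 0;

    if (cf->num_immediates == TOY_MAX_IMMEDIATES)
        return false;
    if (needs_new_slot && cf->used_vec4 >= TOY_MAX_CONST_VEC4)
        return false; /* constbuf already full */

    if (needs_new_slot)
        cf->used_vec4++;
    cf->immediates[cf->num_immediates++] = value;
    return true;
}
```

When the UBO ranges have already consumed every slot, the first call fails and the caller has to keep the value in a register instead.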
* freedreno/ir3: Refactor ir3_cp's lower_immed().Eric Anholt2020-06-051-20/+24
| | | | | | There was duplicated handling in the callers that we can just move inside. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>
* freedreno/ir3: split kill from no_earlyzRob Clark2020-06-042-2/+9
| | | | | | | | | Unlike other conditions which prevent early-discard of fragments, kill does not prevent early LRZ test. Split `has_kill` from `no_earlyz` so we can take advantage of this. Signed-off-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5298>
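The split can be pictured as two independent flags feeding two different decisions. The `has_kill` and `no_earlyz` field names come from the commit message; the struct and the two helper functions are hypothetical, and how the driver actually consumes the flags is an assumption of this sketch.

```c
#include <stdbool.h>

/* Hypothetical per-shader info for this sketch only. */
struct toy_fs_info {
    bool has_kill;  /* shader may discard fragments */
    bool no_earlyz; /* some other condition forbids early depth handling */
};

/* Early discard of fragments is blocked by kill as well as by the
 * other conditions folded into no_earlyz. */
static bool toy_early_z_ok(const struct toy_fs_info *fs)
{
    return !fs->no_earlyz && !fs->has_kill;
}

/* Early LRZ test is only blocked by no_earlyz, not by kill. */
static bool toy_early_lrz_ok(const struct toy_fs_info *fs)
{
    return !fs->no_earlyz;
}
```

A shader that only uses kill therefore loses early discard but can still use the early LRZ test.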
* nir: add callback to nir_remove_dead_variables()Timothy Arceri2020-06-031-1/+1
| | | | | | | | | | | | This allows us to do API specific checks before removing variable without filling nir_remove_dead_variables() with API specific code. In the following patches we will use this to support the removal of dead uniforms in GLSL. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4797>
* meson: use gnu_symbol_visibility argumentDylan Baker2020-06-011-2/+2
| | | | | | | | | | This uses a meson builtin to handle -fvisibility=hidden. This is nice because we don't need to track which languages are used, if C++ is suddenly added meson just does the right thing. Acked-by: Matt Turner <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4740>
* freedreno/ir3: Avoid {0} initializer for struct reginfoKristian H. Kristensen2020-05-261-3/+4
| | | | | | | | | | | First element is not a scalar. Just initialize the struct like we do elsewhere. src/freedreno/ir3/disasm-a3xx.c:958:33: warning: suggest braces around initialization of subobject [-Wmissing-braces] Reviewed-by: Rob Clark <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5174>
* freedreno/ir3: Use RESINFO for a6xx image size queries.Eric Anholt2020-05-266-9/+43
| | | | | | | | | | | | | | | | | The closed GL driver uses resinfo on images with the writeonly flag (using the texture-path's getsize only for readonly images). The closed vulkan driver seems to use resinfo regardless. Using resinfo doesn't need any fixups after the instruction. It also avoids one of the needs for the TEX_CONST state for the image, which is awkward to set up in the GL driver. The new handler goes into ir3_a6xx to be next to the other current image code, but the a4xx version is left in place because it wants a bunch of sampler helpers. Fixes assertion failure in dEQP-VK.image.image_size.buffer.readonly_32. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3501>
* freedreno/ir3: Move handle_bindless_cat6 to compiler_nir and reuse.Eric Anholt2020-05-263-22/+19
| | | | | | | There was an open coded version for ldc, and now we can drop that. I needed to do it for resinfo as well. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3501>
* freedreno/ir3: Refactor out IBO source references.Eric Anholt2020-05-264-57/+37
| | | | | | | | All the users of the unsigned result just wanted an ir3_instruction to reference. Move a6xx's helpers to ir3_image.c and inline the old unsigned results version. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3501>
* freedreno: Set the immediate flag in a4/a5xx resinfos.Eric Anholt2020-05-263-14/+26
| | | | | | | | Noticed comparing our RESINFO asm to qcom's for the same test, and if I drop this bit their disasm switches from immediate to reg. ldgb seems to have the same behavior. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3501>