summaryrefslogtreecommitdiffstats
path: root/src/freedreno
Commit message (Collapse)AuthorAgeFilesLines
* meson: replace libmesa_util with idep_mesautilEric Engestrom2019-08-031-1/+1
| | | | | | | | | | | This automates the include_directories and dependencies tracking so that all users of libmesa_util don't need to add them manually. Next commit will remove the ones that were only added for that reason. Signed-off-by: Eric Engestrom <[email protected]> Acked-by: Eric Anholt <[email protected]> Tested-by: Vinson Lee <[email protected]>
* freedreno: update registersRob Clark2019-08-022-4/+42
| | | | | | | Pull in some updates of VSC regs Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/drm: convert ring_pool to child_poolRob Clark2019-08-023-6/+29
| | | | | | Worth another couple percent at driver2 Signed-off-by: Rob Clark <[email protected]>
* freedreno/drm: remove idx_lockRob Clark2019-08-023-29/+24
| | | | | | | Since it ends up contended, it is a bit of a bottleneck for workloads with high driver overhead. Worth nearly +10% at gfxbench driver2. Signed-off-by: Rob Clark <[email protected]>
* freedreno: a2xx: implement texture tilingJonathan Marek2019-08-021-1/+1
| | | | | Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: fix for array/reg store vs meta instructionsRob Clark2019-07-291-1/+4
| | | | | | | | | | | | | | | fishgl.com has a shader which does roughly: foo = texture(...); if (bar) foo = texture(...); after lowering phi webs to regs we end up w/ a vec4 reg (array). But since it was not an indirect access, we try to skip the extra mov. This results that the per-component fanout (split) meta instructions store directly to the reg (array). Which doesn't work out in RA. Signed-off-by: Rob Clark <[email protected]>
* freedreno: Fix data race on making the shader's id.Eric Anholt2019-07-291-1/+2
| | | | | | | The value is only used for IR3_DBG_DISASM, but it cleans up the helgrind output. Reviewed-by: Rob Clark <[email protected]>
* freedreno: Take a lock around shader variant creation.Eric Anholt2019-07-292-0/+7
| | | | | | | | | Shaders are shared across contexts in gallium (part of making it so that you get shader compile at link time, for shader-db and to reduce compiles at draw time). So, we need to protect from variant creation for a shader from multiple threads at the same time. Reviewed-by: Rob Clark <[email protected]>
* freedreno: Fix data races with allocating/freeing struct ir3.Eric Anholt2019-07-291-1/+1
| | | | | | | | | | | | | | There is a single ir3_compiler in the screen, and each context may be compiling ir3 shaders, which call ir3_create. ralloc doesn't do any locking on its own, so eventually you can end up racing to break ralloc's linked lists. We really don't want struct ir3 to live as long as the compiler (maybe struct ir3_shader's lifetime, if anything), so you'd better be freeing it anyway. Fixes: 8fe20762433d ("freedreno/ir3: convert over to ralloc") Reviewed-by: Rob Clark <[email protected]>
* anv+tu+radv: delete unusable dev_icd.jsonEric Engestrom2019-07-261-13/+0
| | | | | | | | | | | As per previous commit, Meson doesn't support using uninstalled libs, they're simply not ready until `ninja install` is ran, so delete them. Suggested-by: Jason Ekstrand <[email protected]> Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> # for anv Reviewed-by: Eric Anholt <[email protected]> # for tu Reviewed-by: Bas Nieuwenhuizen <[email protected]> # for radv
* freedreno: Add support for drm-shim.Eric Anholt2019-07-254-0/+224
| | | | | | I'm using this for shader-db analysis on x86_64 systems. Reviewed-by: Rob Clark <[email protected]>
* freedreno: Convert nir_lower_tg4_to_tex to the NIR lowering helper.Eric Anholt2019-07-181-88/+51
| | | | | | Cuts a bunch of boilerplate. Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: Convert load_barycentric_at_sample to the NIR lowering helper.Eric Anholt2019-07-181-48/+30
| | | | | | Cuts out a ton of boilerplate. Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: Convert load_barycentric_at_offset to the NIR lowering helper.Eric Anholt2019-07-181-39/+19
| | | | | | Cuts out a ton of boilerplate. Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: Generate headers from xml filesKristian H. Kristensen2019-07-1024-23892/+14106
| | | | | Reviewed-by: Eric Engestrom <[email protected]> Acked-by: Rob Clark <[email protected]>
* tu: add exported symbols checkEric Engestrom2019-07-101-0/+13
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* nir: Add lower_rotate flag and set to true in all driversSagar Ghuge2019-07-011-0/+2
| | | | | | Signed-off-by: Sagar Ghuge <[email protected]> Suggested-by: Matt Turner <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* freedreno: update generated registersRob Clark2019-07-017-16/+23
| | | | | | Corrects the a3xx texconst state for TILE_MODE. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: small cleanupRob Clark2019-06-281-1/+1
| | | | | | `target` cannot be NULL here. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix missing (ss) in dummy bary.f caseRob Clark2019-06-281-0/+5
| | | | | | | | | | | | In case we need to insert a dummy bary.f for the (ei) flag, it also needs (ss) so we don't release varying storage to the next VS wave before the ldlv completed. Fixes random failures in: dEQP-GLES3.functional.transform_feedback.random.interleaved.lines.* Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: Only upload the used part of UBO0 to the constant buffer.Eric Anholt2019-06-242-5/+13
| | | | | | | | | | We were pessimistically uploading all of it in case of indirection, but we can just bump that when we encounter indirection. total constlen in shared programs: 2529623 -> 2485933 (-1.73%) Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: Stop treating UBO 0 specially in UBO uploading.Eric Anholt2019-06-242-7/+0
| | | | | | | | | ir3_nir_analyze_ubo_ranges() has already told us how much of cb0 we need to upload (all of it, since it will lower indirect UBO 0 accesses from load_ubo back to indirection on the constant buffer). Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* nir: define behavior of nir_op_bfm and nir_op_u/ibfe according to SM5 spec.Daniel Schürmann2019-06-241-2/+0
| | | | | | | | | | | That is: the five least significant bits provide the values of 'bits' and 'offset' which is the case for all hardware currently supported by NIR and using the bfm/bfe instructions. This patch also changes the lowering of bitfield_insert/extract using shifts to not use bfm and removes the flag 'lower_bfm'. Tested-by: Eric Anholt <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* freedreno: Only upload UBO pointers for UBOs that haven't been lowered.Eric Anholt2019-06-211-1/+7
| | | | | | | total constlen in shared programs: 2485933 -> 2462236 (-0.95%) Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: Remove silly return from ir3_optimize_nir().Eric Anholt2019-06-214-12/+8
| | | | | | | | We only ever return the shader we were passed in (but internally modified). Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: Fix up end range of unaligned UBO loads.Eric Anholt2019-06-211-2/+3
| | | | | | | | | | We need the constants uploaded to cover the NIR offset plus the size, not the aligned-down start of our upload range plus the size. Fixes mistaken UBO analysis with mat3 loads. Fixes: 893425a607a6 ("freedreno/ir3: Push UBOs to constant file") Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: Fix UBO load range detection on booleans.Eric Anholt2019-06-211-2/+1
| | | | | | | | NIR 1-bit bool dests will have a bit size of 1, and thus a calculated "bytes" of 0. load_ubo is always loading from dwords in the source. Fixes: 893425a607a6 ("freedreno/ir3: Push UBOs to constant file") Reviewed-by: Rob Clark <[email protected]>
* freedreno: Stop reporting max_const in shader-db.Eric Anholt2019-06-211-3/+1
| | | | | | | We end up uploading constlen regardless, so max_const would only get you slightly improved granularity in const usage in comparison. Reviewed-by: Rob Clark <[email protected]>
* freedreno: Include binning shaders in shader-db.Eric Anholt2019-06-211-1/+3
| | | | | | | We want to see if we've improved our binning VS output, as well as the render VS. Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: fix typoHyunjun Ko2019-06-201-1/+1
| | | | | Fixes: a9b556d3a04 ("freedreno/ir3: check the type of regs of absneg opcode in is_same_type_mov") Reviewed-by: Rob Clark <[email protected]>
* ir3: initialize progress false before ir3_nir_lower_imulTapani Pälli2019-06-141-1/+1
| | | | | | | | | Removes a compiler warning about uninitialized variable. Fixes: c02ffd2700c "ir3: Use the new NIR lowering pass for integer multiplication" Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Eduardo Lima <[email protected]>
* freedreno: update generated headersRob Clark2019-06-117-53/+305
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* ir3: Use the new NIR lowering pass for integer multiplicationEduardo Lima Mitev2019-06-072-17/+16
| | | | | | | | | | | | | | | | | | | Shader-db stats courtesy of Eric Anholt: total instructions in shared programs: 6480215 -> 6475457 (-0.07%) instructions in affected programs: 662105 -> 657347 (-0.72%) helped: 1209 HURT: 13 total constlen in shared programs: 1432704 -> 1427769 (-0.34%) constlen in affected programs: 100063 -> 95128 (-4.93%) helped: 512 HURT: 0 total max_sun in shared programs: 875561 -> 873387 (-0.25%) max_sun in affected programs: 46179 -> 44005 (-4.71%) helped: 1087 HURT: 0 Reviewed-by: Eric Anholt <[email protected]>
* ir3/nir: Add new NIR AlgebraicPass for lowering imulEduardo Lima Mitev2019-06-073-1/+64
| | | | | | | | | | | | | | | | | | | | | | | | Currently, ir3 backend compiler is lowering integer multiplication from: dst = a * b to: dst = (al * bl) + (ah * bl << 16) + (al * bh << 16) by emitting this code: mull.u tmp0, a, b ; mul low, i.e. al * bl madsh.m16 tmp1, a, b, tmp0 ; mul-add shift high mix, i.e. ah * bl << 16 madsh.m16 dst, b, a, tmp1 ; i.e. al * bh << 16 which at that point has very low chances of being optimized. This patch adds a new nir_algebraic.AlgebraicPass to performs this lowering during NIR algebraic optimization passes, giving it a better chance for optimizing the resulting code. Reviewed-by: Eric Anholt <[email protected]>
* ir3/compiler: Handle new alu opcodes 'umul_low' and 'imadsh_mix16'Eduardo Lima Mitev2019-06-071-0/+6
| | | | | | They directly emit ir3_MULL_U and ir3_MADSH_M16 respectively. Reviewed-by: Eric Anholt <[email protected]>
* nir: Combine lower_fmod16/32 back into a single lower_fmod.Kenneth Graunke2019-06-051-2/+2
| | | | | | | | | | | | | | We originally had a single lower_fmod option. In commit 2ab2d2e5, Sam split 32 and 64-bit lowering into separate flags, with the rationale that some drivers might want different options there. This left 16-bit unhandled, so Iago added a lower_fmod16 option in commit ca31df6f. Now that lower_fmod64 is gone (in favor of nir_lower_doubles and nir_lower_dmod), we re-combine lower_fmod16 and lower_fmod32 into a single lower_fmod flag again. I'm not aware of any hardware which need lowering for one bitsize and not the other. Reviewed-by: Marek Olšák <[email protected]>
* gallium: Drop lower_fmod64 from drivers that don't support doubles.Kenneth Graunke2019-06-051-2/+0
| | | | | | | Neither freedreno nor nv50 expose PIPE_CAP_DOUBLES, so there's no fmod64 to be lowered. Reviewed-by: Marek Olšák <[email protected]>
* freedreno/ir3: Extend debug helpers to support TCS/TES/GSKristian H. Kristensen2019-06-053-7/+19
| | | | Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Generalize ir3_shader_disasm()Kristian H. Kristensen2019-06-051-46/+42
| | | | | | | Use a helper function to get the sysval/attribute/varying/output name and make the disam debug output independent of shader stage. Reviewed-by: Rob Clark <[email protected]>
* freedreno: Reuse glsl_get_sampler_coordinate_components().Eric Anholt2019-06-041-25/+5
| | | | | | | | We have the GLSL type, so we can just ask it how many coordinates there are. The GLSL function already has Vulkan cases that we'd probably want eventually. Reviewed-by: Rob Clark <[email protected]>
* freedreno: Improve the pi approximations in trig lowering.Eric Anholt2019-06-041-2/+2
| | | | | | | | | | | | When comparing our sin/cos behavior to the closed source driver, I noticed that we were off by a bit (or, in the case of 1/2pi, 3 bits). Fixes: dEQP-GLES3.functional.shaders.random.trigonometric.vertex.52 dEQP-GLES3.functional.shaders.random.all_features.vertex.0 Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: Fix GCC build error.Vinson Lee2019-06-031-1/+1
| | | | | | | | | | | ../src/freedreno/vulkan/tu_device.c:900:4: error: initializer element is not constant .minImageTransferGranularity = (VkExtent3D) { 1, 1, 1 }, ^ Suggested-by: Kristian Høgsberg <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110698 Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: fix counting and printing for half registers.Hyunjun Ko2019-06-032-7/+18
| | | | | v2: defining 0x100 and use this for setting the FS_OUTPUT_REG.HALF_PRECISION Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: Fix up the half reg source even when src instr==NULLNeil Roberts2019-06-031-3/+2
| | | | | | | | | | | | | | | | Previously the loop for assigning registers was bailing out early if the register had a null source. I think the intention is that in this case it isn’t necessary to assign a register. However it was also missing out the part to fix up the types. This can happen if the instruction is copy propagated to be a move from a constant half-float input register. In that case it still needs to fix up the types. Fixes assert in dEQP-GLES3.functional.shaders.invariance.highp.subexpression_precision_mediump when lowering the precision of the variables. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: Add a 16-bit implementation of nir_op_imulNeil Roberts2019-06-031-9/+15
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: set dst type of alu instructions correctly.Hyunjun Ko2019-06-031-5/+8
| | | | | | | | | | Though it should be fixed in RA pass, it needs to be set correctly from the beginning according to the bitsize of NIR dest. v2: Would be better for mad,fddx,fddy to fixup later in RA pass. [small cleanup of fallout from imov/fmov removal fallout] Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: adjust the bitsize of regs when an array loading.Hyunjun Ko2019-06-032-7/+16
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: convert back to 32-bit values for half constant registers.Hyunjun Ko2019-06-032-4/+54
| | | | | | | | | It seems to handle only 32-bit values for half constant registers within floating point opcodes according to the blob driver. So we need to convert back to 32-bit values from 16-bit values, when a lower precision pass is in effect. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: check the type of regs of absneg opcode in is_same_type_mov.Hyunjun Ko2019-06-031-0/+16
| | | | | | | | | | If the type of dest reg and src reg of absneg opcode are different, it shouldn't be considered as same type mov. This patch becomes meaningful when we start to use mediump information for doing precision lowering to 16bit. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: set proper dst type for uniform according to the type of nir ↵Hyunjun Ko2019-06-032-7/+14
| | | | | | | | | | | dest. eg. uniform mediump vec4 f; This patch means nothing since there's no mediump lowering pass for now, but will be meaningful when the pass land in the near future. Signed-off-by: Rob Clark <[email protected]>