path: root/src/freedreno
Commit message | Author | Age | Files | Lines
* nir: Add explicit signs to image min/max intrinsics | Jason Ekstrand | 2019-08-21 | 4 | -8/+16
    This better matches all the other atomic intrinsics such as those for SSBOs and shared variables where the sign is part of the intrinsic opcode. Both generators (GLSL and SPIR-V) know the sign from the type of the image variable or handle. In SPIR-V, signed min/max are separate opcodes from unsigned.
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3+a6xx: same VBO state for draw/binning | Rob Clark | 2019-08-13 | 5 | -17/+141
    Worth ~+20% on gl_driver2
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: track # of driver params | Rob Clark | 2019-08-13 | 2 | -10/+32
    To avoid emitting unneeded const state.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: drop unneeded ir3_ra() args | Rob Clark | 2019-08-13 | 3 | -9/+3
    Signed-off-by: Rob Clark <[email protected]>
* nir: replace nir_move_load_const() with nir_opt_sink() | Rhys Perry | 2019-08-12 | 1 | -1/+1
    This is mostly the same as nir_move_load_const() but can also move undef instructions, comparisons and some intrinsics (being careful with loops).
    v2: actually delete nir_move_load_const.c
    v3: fix nir_opt_sink() usage in freedreno
    v3: update Makefile.sources
    v4: replace get_move_def with nir_can_move_instr and nir_instr_ssa_def
    v4: handle if uses
    v4: fix handling of nested loops
    v5: re-write adjust_block_for_loops
    v5: re-write setting of use_block for if uses
    Signed-off-by: Rhys Perry <[email protected]>
    Co-authored-by: Daniel Schürmann <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* spirv: Drop lower_workgroup_access_to_offsets | Caio Marcelo de Oliveira Filho | 2019-08-10 | 1 | -1/+0
    Intel drivers are not using this anymore, and turnip still doesn't have Compute Shaders, so it won't make a difference to stop using this option.
    Reviewed-by: Jason Ekstrand <[email protected]>
    Acked-by: Rob Clark <[email protected]>
* mesa: freedreno: Android.registers.mk: Fix up register xml.h file generation | John Stultz | 2019-08-07 | 1 | -3/+34
    The current Android.registers.mk file causes build failures that look like:
        FAILED: external/mesa3d/src/freedreno/Android.registers.mk:49: error: implicit rules are obsolete: out/target/product/linaro_db845c/gen/STATIC_LIBRARIES/libfreedreno_registers_intermediates/registers/%.xml.h
    Caused by the following Android build rule change:
        https://android.googlesource.com/platform/build/+/HEAD/Changes.md#implicit_rules
    I tried to replace this with something similar to the static pattern suggested in the URL above, but ended up getting all the xml.h files generated using only the first a2xx.xml source file. So I've fallen back to explicitly defining the make rules for each.
    Additionally, we needed to provide the proper LOCAL_EXPORT_C_INCLUDE_DIRS and add the defined static library to the components that depend on the register headers.
    Acked-by: Eric Anholt <[email protected]>
    Signed-off-by: John Stultz <[email protected]>
* mesa: Add ir3/ir3_nir_imul.c generation to Android.mk | John Stultz | 2019-08-07 | 1 | -1/+2
    With current master we're seeing build failures with AOSP:
        error: undefined symbol: ir3_nir_lower_imul
    This is due to the ir3_nir_imul.c file not being generated in the Android.mk files.
    This patch simply adds it to the Android build, after which things build and boot OK on db410c.
    Cc: Rob Clark <[email protected]>
    Cc: Emil Velikov <[email protected]>
    Cc: Amit Pundir <[email protected]>
    Cc: Sumit Semwal <[email protected]>
    Cc: Alistair Strachan <[email protected]>
    Cc: Greg Hartman <[email protected]>
    Cc: Tapani Pälli <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Signed-off-by: John Stultz <[email protected]>
* meson: replace libmesa_util with idep_mesautil | Eric Engestrom | 2019-08-03 | 1 | -1/+1
    This automates the include_directories and dependencies tracking so that all users of libmesa_util don't need to add them manually.
    Next commit will remove the ones that were only added for that reason.
    Signed-off-by: Eric Engestrom <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Tested-by: Vinson Lee <[email protected]>
* freedreno: update registers | Rob Clark | 2019-08-02 | 2 | -4/+42
    Pull in some updates of VSC regs
    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* freedreno/drm: convert ring_pool to child_pool | Rob Clark | 2019-08-02 | 3 | -6/+29
    Worth another couple percent at driver2
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/drm: remove idx_lock | Rob Clark | 2019-08-02 | 3 | -29/+24
    Since it ends up contended, it is a bit of a bottleneck for workloads with high driver overhead. Worth nearly +10% at gfxbench driver2.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno: a2xx: implement texture tiling | Jonathan Marek | 2019-08-02 | 1 | -1/+1
    Signed-off-by: Jonathan Marek <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: fix for array/reg store vs meta instructions | Rob Clark | 2019-07-29 | 1 | -1/+4
    fishgl.com has a shader which does roughly:
        foo = texture(...);
        if (bar)
            foo = texture(...);
    After lowering phi webs to regs we end up with a vec4 reg (array). But since it was not an indirect access, we try to skip the extra mov. As a result, the per-component fanout (split) meta instructions store directly to the reg (array), which doesn't work out in RA.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno: Fix data race on making the shader's id. | Eric Anholt | 2019-07-29 | 1 | -1/+2
    The value is only used for IR3_DBG_DISASM, but it cleans up the helgrind output.
    Reviewed-by: Rob Clark <[email protected]>
* freedreno: Take a lock around shader variant creation. | Eric Anholt | 2019-07-29 | 2 | -0/+7
    Shaders are shared across contexts in gallium (part of making it so that you get shader compile at link time, for shader-db and to reduce compiles at draw time). So, we need to protect from variant creation for a shader from multiple threads at the same time.
    Reviewed-by: Rob Clark <[email protected]>
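As a rough illustration of why the lock matters, here is a minimal Python sketch (not the driver's C code) of a shared shader object whose variants are created lazily under a mutex. The `Shader` class, `get_variant` method, and `compiles` counter are hypothetical names invented for this example.

```python
import threading


class Shader:
    """Hypothetical shader object shared between several contexts (threads)."""

    def __init__(self):
        self._variants = {}            # variant key -> compiled variant
        self._lock = threading.Lock()  # guards _variants and compiles
        self.compiles = 0              # how many times we "compiled"

    def get_variant(self, key):
        # Lookup and creation happen under the same lock, so two threads
        # asking for the same key cannot both create the variant.
        with self._lock:
            variant = self._variants.get(key)
            if variant is None:
                self.compiles += 1
                variant = ("variant", key)  # stand-in for a real compile
                self._variants[key] = variant
            return variant
```

Without the lock, two threads could both observe a missing variant and both create one, which is the kind of race the commit guards against.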
* freedreno: Fix data races with allocating/freeing struct ir3. | Eric Anholt | 2019-07-29 | 1 | -1/+1
    There is a single ir3_compiler in the screen, and each context may be compiling ir3 shaders, which call ir3_create. ralloc doesn't do any locking on its own, so eventually you can end up racing to break ralloc's linked lists.
    We really don't want struct ir3 to live as long as the compiler (maybe struct ir3_shader's lifetime, if anything), so you'd better be freeing it anyway.
    Fixes: 8fe20762433d ("freedreno/ir3: convert over to ralloc")
    Reviewed-by: Rob Clark <[email protected]>
* anv+tu+radv: delete unusable dev_icd.json | Eric Engestrom | 2019-07-26 | 1 | -13/+0
    As per previous commit, Meson doesn't support using uninstalled libs, they're simply not ready until `ninja install` is run, so delete them.
    Suggested-by: Jason Ekstrand <[email protected]>
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]> # for anv
    Reviewed-by: Eric Anholt <[email protected]> # for tu
    Reviewed-by: Bas Nieuwenhuizen <[email protected]> # for radv
* freedreno: Add support for drm-shim. | Eric Anholt | 2019-07-25 | 4 | -0/+224
    I'm using this for shader-db analysis on x86_64 systems.
    Reviewed-by: Rob Clark <[email protected]>
* freedreno: Convert nir_lower_tg4_to_tex to the NIR lowering helper. | Eric Anholt | 2019-07-18 | 1 | -88/+51
    Cuts a bunch of boilerplate.
    Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: Convert load_barycentric_at_sample to the NIR lowering helper. | Eric Anholt | 2019-07-18 | 1 | -48/+30
    Cuts out a ton of boilerplate.
    Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: Convert load_barycentric_at_offset to the NIR lowering helper. | Eric Anholt | 2019-07-18 | 1 | -39/+19
    Cuts out a ton of boilerplate.
    Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: Generate headers from xml files | Kristian H. Kristensen | 2019-07-10 | 24 | -23892/+14106
    Reviewed-by: Eric Engestrom <[email protected]>
    Acked-by: Rob Clark <[email protected]>
* tu: add exported symbols check | Eric Engestrom | 2019-07-10 | 1 | -0/+13
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>
* nir: Add lower_rotate flag and set to true in all drivers | Sagar Ghuge | 2019-07-01 | 1 | -0/+2
    Signed-off-by: Sagar Ghuge <[email protected]>
    Suggested-by: Matt Turner <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* freedreno: update generated registers | Rob Clark | 2019-07-01 | 7 | -16/+23
    Corrects the a3xx texconst state for TILE_MODE.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: small cleanup | Rob Clark | 2019-06-28 | 1 | -1/+1
    `target` cannot be NULL here.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix missing (ss) in dummy bary.f case | Rob Clark | 2019-06-28 | 1 | -0/+5
    In case we need to insert a dummy bary.f for the (ei) flag, it also needs (ss) so we don't release varying storage to the next VS wave before the ldlv completed.
    Fixes random failures in:
        dEQP-GLES3.functional.transform_feedback.random.interleaved.lines.*
    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: Only upload the used part of UBO0 to the constant buffer. | Eric Anholt | 2019-06-24 | 2 | -5/+13
    We were pessimistically uploading all of it in case of indirection, but we can just bump that when we encounter indirection.
    total constlen in shared programs: 2529623 -> 2485933 (-1.73%)
    Reviewed-by: Kristian H. Kristensen <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno: Stop treating UBO 0 specially in UBO uploading. | Eric Anholt | 2019-06-24 | 2 | -7/+0
    ir3_nir_analyze_ubo_ranges() has already told us how much of cb0 we need to upload (all of it, since it will lower indirect UBO 0 accesses from load_ubo back to indirection on the constant buffer).
    Reviewed-by: Kristian H. Kristensen <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* nir: define behavior of nir_op_bfm and nir_op_u/ibfe according to SM5 spec. | Daniel Schürmann | 2019-06-24 | 1 | -2/+0
    That is: the five least significant bits provide the values of 'bits' and 'offset' which is the case for all hardware currently supported by NIR and using the bfm/bfe instructions.
    This patch also changes the lowering of bitfield_insert/extract using shifts to not use bfm and removes the flag 'lower_bfm'.
    Tested-by: Eric Anholt <[email protected]>
    Reviewed-by: Connor Abbott <[email protected]>
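To make the "five least significant bits" rule concrete, here is a small Python model of 32-bit `bfm` and unsigned `bfe`. It is a sketch based on the description above, not a copy of NIR's definitions; the function names and argument order are chosen for this example.

```python
MASK32 = 0xFFFFFFFF


def bfm(bits, offset):
    """Bitfield mask: 'bits' ones shifted left by 'offset' (32-bit result)."""
    bits &= 31     # only the five least significant bits are used
    offset &= 31
    return (((1 << bits) - 1) << offset) & MASK32


def ubfe(value, offset, bits):
    """Unsigned bitfield extract of 'bits' bits starting at 'offset'."""
    offset &= 31
    bits &= 31
    if bits == 0:
        return 0
    return ((value & MASK32) >> offset) & ((1 << bits) - 1)
```

One consequence of the masking is that a requested width of 32 wraps to 0, so `ubfe(x, 0, 32)` yields 0 in this model rather than `x`.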
* freedreno: Only upload UBO pointers for UBOs that haven't been lowered. | Eric Anholt | 2019-06-21 | 1 | -1/+7
    total constlen in shared programs: 2485933 -> 2462236 (-0.95%)
    Reviewed-by: Rob Clark <[email protected]>
    Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: Remove silly return from ir3_optimize_nir(). | Eric Anholt | 2019-06-21 | 4 | -12/+8
    We only ever return the shader we were passed in (but internally modified).
    Reviewed-by: Rob Clark <[email protected]>
    Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: Fix up end range of unaligned UBO loads. | Eric Anholt | 2019-06-21 | 1 | -2/+3
    We need the constants uploaded to cover the NIR offset plus the size, not the aligned-down start of our upload range plus the size. Fixes mistaken UBO analysis with mat3 loads.
    Fixes: 893425a607a6 ("freedreno/ir3: Push UBOs to constant file")
    Reviewed-by: Kristian H. Kristensen <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
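The off-by-alignment problem described above can be sketched in a few lines of Python. The 16-byte alignment and the helper names are assumptions made for illustration; the commit itself does not state the alignment or show the code.

```python
ALIGN = 16  # assumed upload granularity in bytes (hypothetical)


def align_down(x, a=ALIGN):
    return x - (x % a)


def align_up(x, a=ALIGN):
    return align_down(x + a - 1, a)


def range_buggy(offset, size):
    """End computed from the aligned-down start: can stop short of the load."""
    start = align_down(offset)
    return start, align_up(start + size)


def range_fixed(offset, size):
    """End computed from the real offset plus size: always covers the load."""
    return align_down(offset), align_up(offset + size)
```

For a 12-byte load at byte offset 8, the load touches bytes 8..19. `range_buggy(8, 12)` gives `(0, 16)` and misses the tail, while `range_fixed(8, 12)` gives `(0, 32)`.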
* freedreno: Fix UBO load range detection on booleans. | Eric Anholt | 2019-06-21 | 1 | -2/+1
    NIR 1-bit bool dests will have a bit size of 1, and thus a calculated "bytes" of 0. load_ubo is always loading from dwords in the source.
    Fixes: 893425a607a6 ("freedreno/ir3: Push UBOs to constant file")
    Reviewed-by: Rob Clark <[email protected]>
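The arithmetic behind this fix is easy to check by hand. The following Python sketch uses hypothetical helper names to contrast a size derived from the destination bit size with one that always counts dwords, as the message says load_ubo does.

```python
def load_bytes_from_bit_size(bit_size, num_components):
    """Size derived from the destination bit size (integer division)."""
    return (bit_size // 8) * num_components


def load_bytes_as_dwords(num_components):
    """Size assuming every component is read as a 4-byte dword."""
    return 4 * num_components
```

With a 1-bit boolean destination, `load_bytes_from_bit_size(1, 1)` is 0, so the load would appear to touch no bytes at all, whereas `load_bytes_as_dwords(1)` is 4.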
* freedreno: Stop reporting max_const in shader-db. | Eric Anholt | 2019-06-21 | 1 | -3/+1
    We end up uploading constlen regardless, so max_const would only get you slightly improved granularity in const usage in comparison.
    Reviewed-by: Rob Clark <[email protected]>
* freedreno: Include binning shaders in shader-db. | Eric Anholt | 2019-06-21 | 1 | -1/+3
    We want to see if we've improved our binning VS output, as well as the render VS.
    Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: fix typo | Hyunjun Ko | 2019-06-20 | 1 | -1/+1
    Fixes: a9b556d3a04 ("freedreno/ir3: check the type of regs of absneg opcode in is_same_type_mov")
    Reviewed-by: Rob Clark <[email protected]>
* ir3: initialize progress false before ir3_nir_lower_imul | Tapani Pälli | 2019-06-14 | 1 | -1/+1
    Removes a compiler warning about uninitialized variable.
    Fixes: c02ffd2700c "ir3: Use the new NIR lowering pass for integer multiplication"
    Signed-off-by: Tapani Pälli <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
    Reviewed-by: Eduardo Lima <[email protected]>
* freedreno: update generated headers | Rob Clark | 2019-06-11 | 7 | -53/+305
    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Kristian H. Kristensen <[email protected]>
* ir3: Use the new NIR lowering pass for integer multiplication | Eduardo Lima Mitev | 2019-06-07 | 2 | -17/+16
    Shader-db stats courtesy of Eric Anholt:
        total instructions in shared programs: 6480215 -> 6475457 (-0.07%)
        instructions in affected programs: 662105 -> 657347 (-0.72%)
        helped: 1209
        HURT: 13
        total constlen in shared programs: 1432704 -> 1427769 (-0.34%)
        constlen in affected programs: 100063 -> 95128 (-4.93%)
        helped: 512
        HURT: 0
        total max_sun in shared programs: 875561 -> 873387 (-0.25%)
        max_sun in affected programs: 46179 -> 44005 (-4.71%)
        helped: 1087
        HURT: 0
    Reviewed-by: Eric Anholt <[email protected]>
* ir3/nir: Add new NIR AlgebraicPass for lowering imul | Eduardo Lima Mitev | 2019-06-07 | 3 | -1/+64
    Currently, the ir3 backend compiler is lowering integer multiplication from:
        dst = a * b
    to:
        dst = (al * bl) + (ah * bl << 16) + (al * bh << 16)
    by emitting this code:
        mull.u tmp0, a, b           ; mul low, i.e. al * bl
        madsh.m16 tmp1, a, b, tmp0  ; mul-add shift high mix, i.e. ah * bl << 16
        madsh.m16 dst, b, a, tmp1   ; i.e. al * bh << 16
    which at that point has very low chances of being optimized.
    This patch adds a new nir_algebraic.AlgebraicPass to perform this lowering during NIR algebraic optimization passes, giving it a better chance for optimizing the resulting code.
    Reviewed-by: Eric Anholt <[email protected]>
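The decomposition in this message can be checked with a short Python model of 32-bit arithmetic. The helper names mirror the `umul_low` and `imadsh_mix16` opcodes mentioned in this log, but their exact semantics here are inferred from the comments in the commit message, so treat this as a sketch rather than the hardware definition.

```python
MASK32 = 0xFFFFFFFF


def umul_low(a, b):
    """al * bl: product of the low 16-bit halves (fits in 32 bits)."""
    return (a & 0xFFFF) * (b & 0xFFFF)


def imadsh_mix16(a, b, c):
    """((ah * bl) << 16) + c, truncated to 32 bits."""
    ah = (a >> 16) & 0xFFFF
    bl = b & 0xFFFF
    return (((ah * bl) << 16) + c) & MASK32


def lowered_imul(a, b):
    """32-bit multiply built from the three-instruction sequence above."""
    tmp0 = umul_low(a, b)            # al * bl
    tmp1 = imadsh_mix16(a, b, tmp0)  # + (ah * bl << 16)
    return imadsh_mix16(b, a, tmp1)  # + (al * bh << 16)
```

The `ah * bh` term is absent because it would be shifted left by 32 bits and therefore never reaches the low 32 bits of the result.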
* ir3/compiler: Handle new alu opcodes 'umul_low' and 'imadsh_mix16' | Eduardo Lima Mitev | 2019-06-07 | 1 | -0/+6
    They directly emit ir3_MULL_U and ir3_MADSH_M16 respectively.
    Reviewed-by: Eric Anholt <[email protected]>
* nir: Combine lower_fmod16/32 back into a single lower_fmod. | Kenneth Graunke | 2019-06-05 | 1 | -2/+2
    We originally had a single lower_fmod option. In commit 2ab2d2e5, Sam split 32 and 64-bit lowering into separate flags, with the rationale that some drivers might want different options there. This left 16-bit unhandled, so Iago added a lower_fmod16 option in commit ca31df6f.
    Now that lower_fmod64 is gone (in favor of nir_lower_doubles and nir_lower_dmod), we re-combine lower_fmod16 and lower_fmod32 into a single lower_fmod flag again. I'm not aware of any hardware which needs lowering for one bitsize and not the other.
    Reviewed-by: Marek Olšák <[email protected]>
* gallium: Drop lower_fmod64 from drivers that don't support doubles. | Kenneth Graunke | 2019-06-05 | 1 | -2/+0
    Neither freedreno nor nv50 expose PIPE_CAP_DOUBLES, so there's no fmod64 to be lowered.
    Reviewed-by: Marek Olšák <[email protected]>
* freedreno/ir3: Extend debug helpers to support TCS/TES/GS | Kristian H. Kristensen | 2019-06-05 | 3 | -7/+19
    Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Generalize ir3_shader_disasm() | Kristian H. Kristensen | 2019-06-05 | 1 | -46/+42
    Use a helper function to get the sysval/attribute/varying/output name and make the disasm debug output independent of shader stage.
    Reviewed-by: Rob Clark <[email protected]>
* freedreno: Reuse glsl_get_sampler_coordinate_components(). | Eric Anholt | 2019-06-04 | 1 | -25/+5
    We have the GLSL type, so we can just ask it how many coordinates there are. The GLSL function already has Vulkan cases that we'd probably want eventually.
    Reviewed-by: Rob Clark <[email protected]>
* freedreno: Improve the pi approximations in trig lowering. | Eric Anholt | 2019-06-04 | 1 | -2/+2
    When comparing our sin/cos behavior to the closed source driver, I noticed that we were off by a bit (or, in the case of 1/2pi, 3 bits).
    Fixes:
        dEQP-GLES3.functional.shaders.random.trigonometric.vertex.52
        dEQP-GLES3.functional.shaders.random.all_features.vertex.0
    Reviewed-by: Kristian H. Kristensen <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno: Fix GCC build error. | Vinson Lee | 2019-06-03 | 1 | -1/+1
    ../src/freedreno/vulkan/tu_device.c:900:4: error: initializer element is not constant
        .minImageTransferGranularity = (VkExtent3D) { 1, 1, 1 },
        ^
    Suggested-by: Kristian Høgsberg <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110698
    Signed-off-by: Vinson Lee <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>