aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/freedreno/ir3
Commit message (Collapse)AuthorAgeFilesLines
* python: Use the print functionMathieu Bridon2018-07-061-3/+5
| | | | | | | | | | | | In Python 2, `print` was a statement, but it became a function in Python 3. Using print functions everywhere makes the script compatible with Python versions >= 2.6, including Python 3. Signed-off-by: Mathieu Bridon <[email protected]> Acked-by: Eric Engestrom <[email protected]> Acked-by: Dylan Baker <[email protected]>
* freedreno/ir3: fix deref conversion falloutRob Clark2018-06-231-13/+13
| | | | Signed-off-by: Rob Clark <[email protected]>
* nir: Remove old-school deref chain supportJason Ekstrand2018-06-221-3/+0
| | | | | | | Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* freedreno/ir3: convert to deref instructionsRob Clark2018-06-223-53/+57
| | | | | | | | Signed-off-by: Rob Clark <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Rework lower_locals_to_regs to use deref instructionsJason Ekstrand2018-06-221-2/+2
| | | | | | | | | | This completely reworks the pass to support deref instructions and delete support for old deref chains Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel,ir3: Re-enable nir_opt_copy_prop_varsJason Ekstrand2018-06-221-1/+1
| | | | | | | | | Now that it's rewritten for deref instructions, we can turn it back on. Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Delete lower_io_typesJason Ekstrand2018-06-221-1/+0
| | | | | | | | | | It's only used by the ir3 stand-alone compiler and Rob said we could delete it. Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* st,ir3,radeonsi: push lower_deref_instrs back into driverRob Clark2018-06-222-3/+3
| | | | | | | | | | | | | vc4+vc5 is not really effected by the deref chain to deref instr conversion, so it no longer needs this pass. For others, now that all the passes mesa/st uses are using deref instructions, push the lowering to deref chains back into driver. Signed-off-by: Rob Clark <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir/lower_samplers: remove legacy versionRob Clark2018-06-221-1/+1
| | | | | | | | Signed-off-by: Rob Clark <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir/lower_samplers: split out _legacy version for deref chainsRob Clark2018-06-221-1/+1
| | | | | | | | | | | | | | | | | | To simplify the transition, and make things bisectable, split out a legacy copy or lower_samplers. This way the i965 and gallium drivers can independently switch over to deref instructions. Since the lower_samplers_as_deref pass is only used by gallium drivers, it can be converted in lock-step with moving the lower_deref_instrs pass, and so does not need a corresponding _legacy clone. This legacy pass will be removed in a future commit. Signed-off-by: Rob Clark <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel,ir3: Disable nir_opt_copy_prop_varsJason Ekstrand2018-06-221-1/+1
| | | | | | | | | | | | This pass doesn't handle deref instructions yet. Making it handle both legacy derefs and deref instructions would be painful. Since it's not important for correctness, just disable it for now. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* anv,i965,radv,st,ir3: Call nir_lower_deref_instrsJason Ekstrand2018-06-222-1/+6
| | | | | | | | | | | This inserts a call to nir_lower_deref_instrs at every call site of glsl_to_nir, spirv_to_nir, and prog_to_nir. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* freedreno/ir3: txf_ms supportRob Clark2018-06-213-7/+51
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix base_vertexRob Clark2018-06-211-0/+1
| | | | | Fixes: c366f422f0a nir: Offset vertex_id by first_vertex instead of base_vertex Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix missing recursion into block conditionRob Clark2018-06-191-0/+4
| | | | | | Fixes a problem seen with dEQP-GLES31.functional.ssbo.layout.single_basic_array.shared.row_major_mat4 Signed-off-by: Rob Clark <[email protected]>
* freedreno: remove per-stateobj dirty_mask'sRob Clark2018-06-191-4/+1
| | | | | | | | These never got updated in fd_context_all_dirty() so actually trying to rely on them (in the case of fd5_emit_images()) ends up in some cases where state is not emitted but should be. Best to just rip this out. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: cubemap image fixesRob Clark2018-06-191-1/+6
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: handle image bufferRob Clark2018-06-191-1/+8
| | | | | | Similar to txf case, we need to insert a 2nd coordinate (zero). Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: handle arrays of imagesRob Clark2018-06-191-6/+30
| | | | | | | | Unlike textures, this doesn't get lowered for us. (Would be nice if they were.. at least until we are ready to deal w/ indirect indexing..) Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: images can be arrays tooRob Clark2018-06-191-19/+54
| | | | | | Seems I previously toally forgot about 2d-arrays, etc.. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: use move_load_const passRob Clark2018-06-191-0/+3
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: use pipe_image_view's cppRob Clark2018-06-111-1/+6
| | | | | | | At least for PIPE_BUFFER, we could get the resource used as (for example) R32F imageBuffer. So using cpp=1 from the rsc is wrong. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix image dimensions offsetRob Clark2018-06-111-1/+1
| | | | | | copy-pasta fail from how SSBO sizes are handled. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: use saml always if we have lodRob Clark2018-06-111-1/+1
| | | | | | | In some cases we get plain tex opcodes (but w/ a lod argument).. in this case always use the saml instruction. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: don't cp absneg into meta:fiRob Clark2018-06-111-0/+4
| | | | | | | | | | | If using a fanin (collect) to collect of consecutive registers together, we can CP mov's into the fanin, but not (abs) or (neg). No places that allow those modifiers are consuming a fanin anyways. But this caused an absneg to be lost between a ldgb and stgb for shaders like: outputs[n] = abs(input[n]) Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: rework size/type conversion instructionsRob Clark2018-06-111-10/+156
| | | | | | With 8b and 16b, there are a lot more to handle. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: propagate HALF flag across fanoutRob Clark2018-06-111-1/+4
| | | | | | | | | | If we have a fanout (split) meta instruction to split the result of a vector instruction, propagate the HALF flag back to the original instruction. Otherwise result ends up in a full precision register while instruction(s) that use the result look in a half-precision register. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add sample-id/sample-mask-inRob Clark2018-06-111-0/+21
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: image atomics use image-store pathRob Clark2018-06-111-0/+8
| | | | | | | | image reads are handled via tex state, whereas image writes and atomics are handled via SSBO state block. Previously we were only considering image write, and not image atomics which also uses the SSBO state block. Signed-off-by: Rob Clark <[email protected]>
* freedreno: Fix ir3_cmdline.c build.Eric Anholt2018-05-011-0/+1
| | | | | | Fixes: 6487e7a30c9e ("nir: move GL specific passes to src/compiler/glsl") Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* nir: move GL specific passes to src/compiler/glslTimothy Arceri2018-05-011-1/+1
| | | | | | | With this we should have no passes in src/compiler/nir with any dependencies on headers from core GL Mesa. Reviewed-by: Alejandro Piñeiro <[email protected]>
* nir: Offset vertex_id by first_vertex instead of base_vertexNeil Roberts2018-04-191-3/+2
| | | | | | | | | | | | | | | | | | base_vertex will be zero for non-indexed calls and in that case we need vertex_id to be offset by the ‘first’ parameter instead. That is what we get with first_vertex. This is true for both GL and Vulkan. The freedreno driver is also setting vertex_id_zero_based on nir_options. In order to avoid breakage this patch switches the relevant code to handle SYSTEM_VALUE_FIRST_VERTEX so that it can retain the same behavior. v2: change a3xx/fd3_emit.c and a4xx/fd4_emit.c from SYSTEM_VALUE_BASE_VERTEX to SYSTEM_VALUE_FIRST_VERTEX (Kenneth). Reviewed-by: Ian Romanick <[email protected]> Cc: Rob Clark <[email protected]> Acked-by: Marek Olšák <[email protected]>
* freedreno/ir3: use lower_global_vars_to_local in cmdline compilerRob Clark2018-04-071-0/+1
| | | | | | | | tgsi_to_nir emits things with arrays as global vars.. and nir->ir3 does lower_locals_to_regs. But nothing was lowering global to local, which breaks compiling tgsi shaders Signed-off-by: Rob Clark <[email protected]>
* nir+drivers: add helpers to get # of src/dest componentsRob Clark2018-04-031-5/+1
| | | | | | | | | Add helpers to get the number of src/dest components for an intrinsic, and update spots that were open-coding this logic to use the helpers instead. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: fix fallout of unused false-depth eliminationRob Clark2018-04-032-17/+19
| | | | | | | | | | Since we were MARK flag for both preventing loops, and tracking whether instructions were used, we could end up in an infinite loop due to bd2ca2bcdd. Instead invert the logic.. mark all instructions UNUSED up front and clear the flag as we visit them. Fixes: bd2ca2bcdd freedreno/ir3: eliminate unused false-deps Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix issue w/ glamor composite shadersRob Clark2018-03-312-2/+36
| | | | | | | | | | | | | | | Fixes an issue that became possible when we started lowering phi webs to regs (a7ea2b4e) (although was not really seen until we also switched to using peephole select pass (ec8bc54a) instead of lowering *all* if/else to select). If texture coord (or anything else that uses create_collect() to collect scalar values in a sequence of scalar registers) was consuming a value produced on either side of an if/else (ie. a phi lowered to nir reg, which in ir3 is an "array" of length 1) then register allocation would happen incorrectly and we'd end up sampling from garbage coordinates. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: more half-precision fixesRob Clark2018-03-312-8/+37
| | | | | | | | Some instructions require src/dst to be in full or half precision register depending on src/dst type. So do a better job of propagating register type. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add helper to create immed of specified sizeRob Clark2018-03-311-4/+11
| | | | | | | We'll also need to be able to create a half-precision immediate. So re-work create_immed(). Prep work for following patch. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: pass ctx instead of block to create_collect()Rob Clark2018-03-311-18/+19
| | | | | | Prep work for following patch. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: eliminate unused false-depsRob Clark2018-03-312-11/+31
| | | | | | | | | | | | | | Previously false-dependencies would get flagged as used, even if the only "use" was a false dep to (for example) prevent a load from being scheduled after a store. In addition to being pointless instructions, in some cases they can cause problems. For example, ldg (and similar instructions) depend on an immed arg getting CP'd into the instruction, but this doesn't happen if an instruction is otherwise unused. Which can result in undefined results (overwriting unintended registers). Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add local_group_sizeRob Clark2018-03-313-2/+12
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: clear SSA flag when assigning "ARRAY" regs tooRob Clark2018-03-311-0/+1
| | | | | | Avoids a misleading "INVALID FLAGS" warning in debug builds. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: print array live rangesRob Clark2018-03-311-4/+10
| | | | | | This is also useful to see if optmsgs are enabled. Signed-off-by: Rob Clark <[email protected]>
* nir: Rename image intrinsics to image_varJason Ekstrand2018-03-232-20/+20
| | | | | | | | | | | Generated with git grep -l nir_intrinsic_image | xargs \ sed -i 's/nir_intrinsic_image/nir_intrinsic_image_var/g' and some manual fixing in nir_intrinsics.h Reviewed-by: Timothy Arceri <[email protected]>
* freedreno/ir3: start dealing with half-precisionRob Clark2018-03-053-30/+81
| | | | | | | | | | | | | | | | | Some instructions, assume src and/or dst is half-precision based on a type field (ie. f32/s32/u32 are full precision but others are half precision). So add some code to sanity check the src/dst registers to catch mixups. Also propagate half-precision flag for SSA sources. The instruction consuming a SSA value needs to be of the same type as the one producing it. This is probably not complete half-precision support, but a useful first step. We do still need to add support for nir alu instructions for converting between half/full precision. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix fixing-up register footprintRob Clark2018-03-052-18/+27
| | | | | | | | | | | | | It isn't just vertex shaders that need to fixup reg footprint for inputs populated before shader starts. This problem showed up with compute shaders. If you have (for example) a localregid sysval, but only the .x component is used, the hw still writes the .yz components, which could overflow into other threads causing corruption. Showed up in cl cts 'basic/test_basic intmath_int'. But in theory the same problem could crop up elsewhere. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: ignore return jumpRob Clark2018-03-051-0/+1
| | | | | | | I think this should also always only occur at the end of a BB (by definition), and the BB successor should be the end block. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: small cleanupRob Clark2018-03-051-3/+3
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: cmdline compiler updates for spv shadersRob Clark2018-03-051-0/+7
| | | | Signed-off-by: Rob Clark <[email protected]>
* nir: add lower_ldexp to nir compiler optionsTimothy Arceri2018-02-281-0/+1
| | | | Reviewed-by: Marek Olšák <[email protected]>