path: root/src/gallium/drivers/freedreno/ir3
Commit message (Author, Age, Files, Lines)
* treewide: s/comparitor/comparator/ (Ilia Mirkin, 2016-12-12, 1 file, -1/+1)
  git grep -l comparitor | xargs sed -i 's/comparitor/comparator/g'
  Just happened to notice this in a patch that was sent and included one of the tokens in question.
  Signed-off-by: Ilia Mirkin <[email protected]>
  Acked-by: Nicolai Hähnle <[email protected]>
* freedreno/a5xx: fix negative branches (Rob Clark, 2016-11-30, 2 files, -1/+6)
  Looks like immed branch offset size increased again.. making what we think is a small negative number look to hw like a huge positive number. And things go badly when shader tries to jump to hyperspace.
  Signed-off-by: Rob Clark <[email protected]>
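
The failure mode in this message can be sketched in C. This is only an illustration, not the ir3 encoder: the field widths used in the usage note (16 and 20 bits) and the helper names `encode_imm` and `decode_imm` are assumptions, since the log does not state the real widths.

```c
#include <stdint.h>

/* Hypothetical helper: keep only the low `bits` bits of a signed offset,
 * as an encoder writing into a fixed-width instruction field would. */
static uint32_t encode_imm(int32_t offset, unsigned bits)
{
    return (uint32_t)offset & ((1u << bits) - 1u);
}

/* Hypothetical helper: read a `bits`-wide field back as a
 * two's-complement signed value. */
static int32_t decode_imm(uint32_t field, unsigned bits)
{
    uint32_t sign = 1u << (bits - 1);
    return (int32_t)((field ^ sign) - sign);
}
```

If the offset -1 is encoded for a 16-bit field but read back as a 20-bit field, the sign bit is no longer where the reader expects it and the result is 65535 rather than -1. Encoding with the wider width restores the round trip.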
* freedreno/ir3: don't offset inloc by 8 (Rob Clark, 2016-11-30, 2 files, -11/+5)
  On a3xx/a4xx, the SP_VS_VPC_DST_REG.OUTLOCn is offset by 8, so we used to add this offset into fs->inputs[n].inloc. But a5xx drops this extra offset-by-8. So instead make inloc zero based and add the offset when we emit OUTLOCn values (for the gen's that need the offset).
  Signed-off-by: Rob Clark <[email protected]>
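
A minimal sketch of the approach described here, assuming a zero-based `inloc` and a bias applied only at emit time. The function name `outloc_for_gen` and the plain integer `gen` parameter are hypothetical and do not come from the driver.

```c
#include <stdint.h>

/* Hypothetical helper: a3xx/a4xx expect OUTLOCn biased by 8, a5xx does not.
 * `inloc` is assumed to be zero based, as the commit message describes. */
static uint32_t outloc_for_gen(uint32_t inloc, unsigned gen)
{
    const uint32_t bias = (gen < 5) ? 8 : 0;
    return inloc + bias;
}
```

Keeping the stored value zero based means only the per-generation emit code needs to know about the bias.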
* freedreno/ir3: add new helper for shader linkage (Rob Clark, 2016-11-30, 1 file, -0/+47)
  Helps simplify things on a5xx, where pos/psize get added to the vs-out map. And anyways, simplifies a3xx and a4xx.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fixup ralloc fallout (Rob Clark, 2016-11-12, 2 files, -2/+2)
  Fixes fallout from acc23b04 ("ralloc: remove memset from ralloc_size"). We were still depending on zero'd allocations in a couple of places.
  Signed-off-by: Rob Clark <[email protected]>
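
The kind of dependency this fix addresses can be illustrated with the C standard allocator. This is an analogy, not Mesa code: `calloc()` stands in for a zeroing allocation and `malloc()` for a non-zeroing one, and `struct node_state` is a hypothetical type.

```c
#include <stdlib.h>

struct node_state {          /* hypothetical struct, for illustration only */
    int   visited;
    void *next;
};

/* Callers rely on the fields starting out as zero, so the allocation must
 * zero them. With malloc() here the fields would be indeterminate. */
static struct node_state *node_state_create(void)
{
    return calloc(1, sizeof(struct node_state));
}
```

Code that silently relied on zeroed memory keeps working only as long as the allocator zeroes. Once that guarantee is removed, each such site has to ask for zeroed memory explicitly.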
* ralloc: use rzalloc where it's necessary (Marek Olšák, 2016-10-31, 1 file, -1/+1)
  No change in behavior. ralloc_size is equivalent to rzalloc_size. That will change though.
  Calls not switched to rzalloc_size:
  - ralloc_vasprintf
  - glsl_type::name allocation (it's filled with snprintf)
  - C++ classes where valgrind didn't show uninitialized values
  I switched most of non-glsl stuff to rzalloc without checking whether it's really needed.
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Tested-by: Edmondo Tommasina <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* nir/i965/anv/radv/gallium: make shader info a pointer (Timothy Arceri, 2016-10-26, 1 file, -1/+1)
  When restoring something from shader cache we won't have and don't want to create a nir_shader this change detaches the two. There are other advantages such as being able to reuse the shader info populated by GLSL IR.
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Make nir_foo_first/last_cf_node return a block instead (Jason Ekstrand, 2016-10-06, 1 file, -7/+4)
  One of NIR's invariants is that control flow lists always start and end with blocks. There's no good reason why we should return a cf_node from these functions since we know that it's always a block. Making it a block lets us remove a bunch of code.
  Signed-off-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Connor Abbott <[email protected]>
* nir: Add a flag to lower_io to force "sample" interpolation (Jason Ekstrand, 2016-09-15, 1 file, -1/+1)
  Signed-off-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Anuj Phogat <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Report progress from nir_lower_phis_to_scalar. (Kenneth Graunke, 2016-09-14, 1 file, -1/+1)
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* nir: Report progress from nir_lower_alu_to_scalar. (Kenneth Graunke, 2016-09-14, 1 file, -1/+1)
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* a3xx: make use of software clipping when hw can't handle it (Ilia Mirkin, 2016-09-03, 2 files, -0/+7)
  The hw clipper only handles up to 6 UCPs. If there are more than 6 UCPs, or a clip vertex, or clip distances are in use, then we must use the fallback discard-based clipping from the frag shader.
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: [email protected]
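
The decision rule in this message can be written as a small predicate. The sketch below is not the driver's code: the function name and its three parameters are hypothetical, and only the limit of 6 UCPs comes from the commit message.

```c
#include <stdbool.h>

#define HW_MAX_UCPS 6   /* limit stated in the commit message */

/* Hypothetical predicate: true when the discard-based fallback in the
 * fragment shader is required instead of the hardware clipper. */
static bool needs_sw_clipping(unsigned num_ucps, bool has_clip_vertex,
                              bool uses_clip_distance)
{
    return num_ucps > HW_MAX_UCPS || has_clip_vertex || uses_clip_distance;
}
```

Exactly 6 user clip planes with no clip vertex and no clip distances still fits the hardware path. Any one of the three conditions is enough to force the fallback.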
* nir: Change nir_shader_get_entrypoint to return an impl. (Kenneth Graunke, 2016-08-25, 1 file, -1/+1)
  Jason suggested adding an assert(function->impl) here. All callers of this function actually want ->impl, so I decided just to change the API.
  We also change the nir_lower_io_to_temporaries API here. All but one caller passed nir_shader_get_entrypoint(), and with the previous commit, it now uses a nir_function_impl internally. Folding this change in avoids the need to change it and change it back.
  v2: Fix one call I missed in ir3_compiler (caught by Eric).
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Connor Abbott <[email protected]>
* ttn: Use nir_load_front_face instead of the TGSI-style input. (Eric Anholt, 2016-08-19, 1 file, -46/+0)
  This reduces the diff between GLSL-to-NIR and TGSI-to-NIR, and gives NIR more optimization to work on.
  Reviewed-by: Kenneth Graunke <[email protected]>
* ttn: Make FRAG_RESULT_DEPTH be a float variable to match gtn and ptn. (Eric Anholt, 2016-08-19, 2 files, -7/+0)
  This lets TTN-using drivers handle FRAG_RESULT_DEPTH the same between all their source paths.
  Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: fix issue with emit_tex() (Rob Clark, 2016-08-13, 1 file, -19/+28)
  For various tex fetch instructions, coord's get fixed up in different ways. But modifying the array returned from get_src() has side-effects if the same SSA src is used again.. the later instruction will see the previous fixups.
  Fix this, and const'ify things to prevent this sort of mistake in the future.
  Noticed by Varad when adding support for txf_ms.
  Signed-off-by: Rob Clark <[email protected]>
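
The aliasing problem and the const-based remedy can be shown with a simplified sketch. The real sources are IR instructions rather than floats. The `get_src()` below is a hypothetical stand-in that only mimics returning shared, cached data, and `fixup_coords` is an invented name.

```c
#include <string.h>

/* Hypothetical stand-in for get_src(): returns shared, cached coordinates
 * that later users of the same SSA value will also see. */
static float cached_coord[4] = { 1.0f, 2.0f, 3.0f, 4.0f };

static const float *get_src(void)
{
    return cached_coord;
}

/* Apply a per-instruction fixup on a private copy. Because `src` points to
 * const data, writing through it would be rejected by the compiler. */
static void fixup_coords(float out[4], float bias)
{
    const float *src = get_src();
    memcpy(out, src, 4 * sizeof(float));
    out[0] += bias;
}
```

Calling `fixup_coords` twice yields the same result both times, because the shared array is never modified.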
* freedreno/ir3: init ir3_shader_key with memset() ([email protected], 2016-07-30, 1 file, -1/+2)
  To silence missing initializers warning
  Signed-off-by: Francesco Ansanelli <[email protected]>
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: move needs_wfi into batch (Rob Clark, 2016-07-30, 1 file, -5/+5)
  This is also used in gmem code, which executes from the "bottom half" (ie. from the flush_queue worker thread), so it cannot be in fd_context.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: Add missing braces in initializer ([email protected], 2016-07-23, 1 file, -1/+1)
  Signed-off-by: Rob Clark <[email protected]>
* compiler: Rename INTERP_QUALIFIER_* to INTERP_MODE_*. (Kenneth Graunke, 2016-07-17, 2 files, -4/+4)
  Likewise, rename the enum type to glsl_interp_mode. Beyond the GLSL front-end, talking about "interpolation modes" seems more natural than "interpolation qualifiers" - in the IR, we're removed from how exactly the source language specifies how to interpolate an input. Also, SPIR-V calls these "decorations" rather than "qualifiers".
  Generated by:
  $ find . -regextype egrep -regex '.*\.(c|cpp|h)' -type f -exec sed -i \
      -e 's/INTERP_QUALIFIER_/INTERP_MODE_/g' \
      -e 's/glsl_interp_qualifier/glsl_interp_mode/g' {} \;
  Signed-off-by: Kenneth Graunke <[email protected]>
  Acked-by: Dave Airlie <[email protected]>
* freedreno/ir3: support glsl linking for cmdline compiler (Rob Clark, 2016-07-02, 1 file, -24/+47)
  For .vert/.frag, now multiple can be specified on the cmdline for purposes of linking, and the last one specified is the one that is fed into the ir3 backend (and dumped along the way if --verbose is specified)
  Without this, varyings in frag shaders would appear as undefined.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: support non-user_buffer consts (Rob Clark, 2016-07-02, 1 file, -1/+1)
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: switch emit_const_bo() to take prsc's (Rob Clark, 2016-07-02, 1 file, -8/+8)
  We can push the unwrap of pipe_resource down.
  Signed-off-by: Rob Clark <[email protected]>
* Remove wrongly repeated words in comments (Giuseppe Bilotta, 2016-06-23, 1 file, -1/+1)
  Clean up misrepetitions ('if if', 'the the' etc) found throughout the comments. This has been done manually, after grepping case-insensitively for duplicate if, is, the, then, do, for, an, plus a few other typos corrected in fly-by
  v2:
  * proper commit message and non-joke title;
  * replace two 'as is' followed by 'is' to 'as-is'.
  v3:
  * 'a integer' => 'an integer' and similar (originally spotted by Jason Ekstrand, I fixed a few other similar ones while at it)
  Signed-off-by: Giuseppe Bilotta <[email protected]>
  Reviewed-by: Chad Versace <[email protected]>
* freedreno/ir3: do idiv lowering after main opt loop (Rob Clark, 2016-06-03, 1 file, -16/+27)
  Give algebraic-opt pass a chance to catch udiv by const power-of-two, before running lower-idiv pass.
  Signed-off-by: Rob Clark <[email protected]>
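
Why the ordering matters can be seen from the arithmetic: an unsigned divide by a constant power of two is just a right shift, so it is worth catching before a general divide lowering expands it. The sketch below is not a NIR pass. The helper names are hypothetical and the divisor is assumed to be non-zero.

```c
#include <stdbool.h>
#include <stdint.h>

static bool is_pow2(uint32_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* log2 of a power of two, counted by shifting down to 1. */
static unsigned log2_pow2(uint32_t v)
{
    unsigned n = 0;
    while (v > 1) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Divide by a non-zero constant: a shift when the divisor is a power of
 * two, otherwise a real division (standing in for the longer sequence a
 * general divide lowering would produce). */
static uint32_t udiv_const(uint32_t x, uint32_t d)
{
    return is_pow2(d) ? x >> log2_pow2(d) : x / d;
}
```

For example, `udiv_const(40, 8)` takes the shift path and `udiv_const(42, 6)` takes the general path.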
* freedreno/ir3: fix coverity warning (Rob Clark, 2016-06-02, 1 file, -1/+3)
  CID 1362453
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: use nir_shader_get_entrypoint() helper (Rob Clark, 2016-06-02, 1 file, -10/+1)
  Should also fix coverity warning: CID 1362454
  Signed-off-by: Rob Clark <[email protected]>
* compiler: Move glsl_to_nir to libglsl.la (Jason Ekstrand, 2016-05-26, 1 file, -1/+1)
  Right now libglsl.la depends on libnir.la so putting it in libnir.la adds a dependency on libglsl.la that goes the wrong direction.
  Reviewed-by: Emil Velikov <[email protected]>
  Reviewed-by: Kristian Høgsberg <[email protected]>
* freedreno/ir3: cmdline compiler for glsl (Rob Clark, 2016-05-25, 1 file, -14/+75)
  Use glsl/libstandalone.la to add support for taking glsl src files (in addition to .tgsi) as input. Then glsl->nir and feed the result into the ir3 backend as normal.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: disable cp for indirect src's (Rob Clark, 2016-05-23, 1 file, -0/+9)
  The variable-indexing tests always had a few random fails, which I usually couldn't reproduce when running tests manually. Somehow recently this got a lot worse.
  I ported a couple of the shaders to GLES to see what blob does, and it also seems to be avoiding to cp indirect srcs. So I guess indirect w/ instructions other than cat1 (mov) are not totally reliable. Let's just switch that off until this is better understood.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: need to lower fmod too (Rob Clark, 2016-05-20, 1 file, -0/+2)
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix compiler warning (Rob Clark, 2016-05-17, 1 file, -0/+1)
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: small standalone compiler cleanup (Rob Clark, 2016-05-15, 1 file, -2/+1)
  Don't hard-code the gpu-id anymore.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: lower fdiv (Rob Clark, 2016-05-15, 1 file, -0/+1)
  Not sure how we didn't hit this already, but since we want fdiv converted into mul + rcp, we should set this.
  Signed-off-by: Rob Clark <[email protected]>
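
The "mul + rcp" form mentioned here is the identity a / b = a * (1 / b). A minimal C sketch follows. `rcp` is a hypothetical stand-in for a hardware reciprocal instruction, and in general the lowered form may differ from a true division in the last bits of the result.

```c
/* Hypothetical stand-in for a hardware reciprocal instruction. */
static float rcp(float x)
{
    return 1.0f / x;
}

/* fdiv expressed as a multiply by the reciprocal of the divisor. */
static float fdiv_lowered(float a, float b)
{
    return a * rcp(b);
}
```

With divisors that are powers of two, such as `fdiv_lowered(6.0f, 2.0f)`, the reciprocal is exact and so is the product.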
* freedreno/ir3: handle VARYING_SLOT_PNTC (Rob Clark, 2016-05-15, 1 file, -0/+12)
  In the glsl->tgsi path, this already gets translated to VAR8, which matches up with rasterizer->sprite_coord_enable.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: disable TGSI specific hacks in nir case (Rob Clark, 2016-05-15, 4 files, -2/+7)
  When we got NIR directly from state tracker (vs using tgsi_to_nir) we need to realize this and skip some TGSI specific hacks.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add support for NIR as preferred IR (Rob Clark, 2016-05-15, 4 files, -17/+41)
  For now under debug flag, since only suitable for debugging/testing.
  Signed-off-by: Rob Clark <[email protected]>
* nir/algebraic: Separate ffma lowering from fusing (Jason Ekstrand, 2016-05-11, 1 file, -0/+1)
  The i965 driver has its own pass for fusing mul+add combinations that's much smarter than what nir_opt_algebraic can do so we don't want to get the nir_opt_algebraic one just because we didn't set lower_ffma.
  Reviewed-by: Kenneth Graunke <[email protected]>
* freedreno/ir3: size input/output arrays properly (Rob Clark, 2016-05-10, 1 file, -3/+14)
  We index into these based on var->data.driver_location, which might have gaps (ie. two inputs, one w/ drvloc 0 and other 2). This shows up in (for example) 'bin/copyteximage 1D', but was only noticed recently due to additional asserts.
  Signed-off-by: Rob Clark <[email protected]>
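
The sizing rule implied here is that an array indexed by location needs "largest location + 1" slots, not "number of variables" slots. The sketch below uses a hypothetical helper, `slots_needed`, and a plain array of locations. It does not reproduce the driver's data structures.

```c
#include <stddef.h>

/* Hypothetical helper: number of slots required so that every entry of
 * `driver_locations` is a valid index, even when locations have gaps. */
static unsigned slots_needed(const unsigned *driver_locations, size_t count)
{
    unsigned n = 0;
    for (size_t i = 0; i < count; i++) {
        if (driver_locations[i] + 1 > n)
            n = driver_locations[i] + 1;
    }
    return n;
}
```

For the case in the message, two inputs at locations 0 and 2, the helper returns 3, whereas counting variables would give 2 and make index 2 out of bounds.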
* freedreno/ir3: lower lrp when operating with double operands (Samuel Iglesias Gonsálvez, 2016-05-10, 1 file, -0/+1)
  Lower lrp when operating with double operands because float version of lrp is also lowered.
  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* freedreno/ir3: fix fallout from new block iterators (Rob Clark, 2016-05-09, 1 file, -1/+1)
  Since this is potentially modifying the block structure of the shader, it needs the _safe() version of the iterator.
  Signed-off-by: Rob Clark <[email protected]>
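
What a "_safe" iterator buys can be shown on an ordinary singly linked list. The sketch below is generic C, not the NIR macros. The point is only that the successor is saved before the current element may be removed.

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Push a new node on the front of the list and return the new head.
 * On allocation failure the list is returned unchanged. */
static struct node *push(struct node *head, int value)
{
    struct node *n = malloc(sizeof(*n));
    if (!n)
        return head;
    n->value = value;
    n->next = head;
    return n;
}

/* Remove nodes with odd values. `next` is read before the current node can
 * be freed, which is the job a _safe() iterator does for its caller. */
static struct node *remove_odd(struct node *head)
{
    struct node **link = &head;
    for (struct node *n = head, *next; n; n = next) {
        next = n->next;
        if (n->value & 1) {
            *link = next;
            free(n);
        } else {
            link = &n->next;
        }
    }
    return head;
}
```

A plain loop that advanced with `n = n->next` after `free(n)` would read freed memory. Saving `next` first avoids that.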
* freedreno/ir3: allow for additional VS sysval inputs (Rob Clark, 2016-05-09, 1 file, -2/+5)
  There are a total of four possible currently, rather than 2. So we need to be prepared for the input array to grow by 16 components.
  We could get away with less if we could pack sysval inputs.. and the way this is handled currently isn't really the nicest thing. But it's a tactical fix for an issue hit in: GL31-CTS.gtf30.GL3Tests.transform_feedback.transform_feedback_vertex_id
  Signed-off-by: Rob Clark <[email protected]>
* ir3: fixup for new nir_foreach_block() (Connor Abbott, 2016-05-05, 1 file, -30/+21)
* freedreno: move shader-stage dirty bits to global dirty flag (Rob Clark, 2016-05-04, 1 file, -2/+2)
  This was always a bit overly complicated, and had some issues (like ctx->prog.dirty not getting reset at the end of the batch). It also required some special hacks to avoid resetting dirty state on binning pass. So just move it all into ctx->dirty (leaving some free bits for future shader stages), and make FD_DIRTY_PROG just be the union of all FD_SHADER_DIRTY_*.
  Signed-off-by: Rob Clark <[email protected]>
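
A single dirty bitmask in which one flag is the union of several per-stage flags might look like the sketch below. The `SK_` names and all bit positions are hypothetical and are not the values used in the driver.

```c
/* Hypothetical flag layout, for illustration only. */
enum dirty_sketch {
    SK_DIRTY_BLEND     = 1u << 0,
    SK_SHADER_DIRTY_VP = 1u << 8,
    SK_SHADER_DIRTY_FP = 1u << 9,
    /* bits 10 and 11 are left free for future shader stages */
    SK_DIRTY_PROG      = SK_SHADER_DIRTY_VP | SK_SHADER_DIRTY_FP
};

/* True when any shader stage is dirty. */
static int prog_dirty(unsigned dirty)
{
    return (dirty & SK_DIRTY_PROG) != 0;
}
```

With everything in one mask, clearing the mask at the end of a batch resets the shader-stage bits along with the rest, so no separate field can be forgotten.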
* freedreno/ir3: remove a couple redundant is_flow()s (Rob Clark, 2016-05-04, 2 files, -2/+2)
  Now that the opc's encode the instruction category (making them unique) we no longer need to check the category in addition to the opc.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: cp small negative integers too (Rob Clark, 2016-05-04, 1 file, -1/+2)
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix # of registers (Rob Clark, 2016-05-04, 1 file, -1/+1)
  The instruction encoding allows for more registers, but at least on a3xx/a4xx they don't actually exist.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: lower immeds to const (Rob Clark, 2016-05-04, 1 file, -0/+70)
  Helps reduce register pressure and instruction counts for immediates that would otherwise require a mov into gpr.
  total instructions in shared programs: 4455332 -> 4369297 (-1.93%)
  total dwords in shared programs: 8807872 -> 8614432 (-2.20%)
  total full registers used in shared programs: 263062 -> 250846 (-4.64%)
  total half registers used in shader programs: 9845 -> 9845 (0.00%)
  total const registers used in shared programs: 1029735 -> 1466993 (42.46%)
            half    full   const   instr  dwords
  helped       0   10415       0   17861    5912
  hurt         0    1157   21458     947      33
  Signed-off-by: Rob Clark <[email protected]>
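
One way to picture the idea is a small table of immediate values, each assigned a slot that an instruction can reference in place of a mov into a register. The sketch below is an assumption-laden illustration, not the ir3 pass: the table type, its size, the de-duplication of repeated values, and the name `immed_to_const` are all hypothetical.

```c
#include <stdint.h>

#define MAX_IMMEDS 64          /* hypothetical table size */

struct immed_table {
    uint32_t values[MAX_IMMEDS];
    unsigned count;
};

/* Return the slot holding `value`, adding it if it is not present yet.
 * Returns -1 when the table is full. */
static int immed_to_const(struct immed_table *t, uint32_t value)
{
    for (unsigned i = 0; i < t->count; i++) {
        if (t->values[i] == value)
            return (int)i;
    }
    if (t->count == MAX_IMMEDS)
        return -1;
    t->values[t->count] = value;
    return (int)t->count++;
}
```

This matches the shape of the statistics in the message: instruction and register counts go down while const usage goes up.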
* freedreno/ir3: add ir3_cp_ctx (Rob Clark, 2016-05-04, 3 files, -12/+22)
  Needed in next commit.. just split out to reduce noise.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: s/Elements/ARRAY_SIZE/ (Brian Paul, 2016-05-03, 1 file, -1/+1)
  Signed-off-by: Brian Paul <[email protected]>
  Reviewed-by: Rob Clark <[email protected]>