summaryrefslogtreecommitdiffstats
path: root/src/glsl/nir
Commit message (Collapse)AuthorAgeFilesLines
* nir/vars_to_ssa: Rework copy set handling in lower_copies_to_load_storeJason Ekstrand2015-11-181-1/+4
| | | | | | | | | | | | | | | | | | | | | | Previously, we walked through a given deref_node's copies and, after lowering the copy away, removed it from both the source and destination copy sets. This commit changes this to only remove it from the other node's copy set (not the one we're lowering). At the end of the loop, we just throw away the copy set for the node we're lowering since that node no longer has any copies. This has two advantages: 1) It's more efficient because we're doing potentially half as many set search operations. 2) It now properly handles copies from a node to itself. Perviously, it would delete the copy from the set when processing the destinatioon and then assert-fail when we couldn't find it for the source. Cc: "11.0" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92588 Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Connor Abbott <[email protected]> (cherry picked from commit 226ba889a0f820b9f4b1132e379620d2688c96e7)
* nir: Properly invalidate metadata in nir_opt_remove_phis().Kenneth Graunke2015-11-071-0/+5
| | | | | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]> Cc: [email protected] (cherry picked from commit 59bbe2681b73c3795b7298e2486d5fde7c464ed5)
* nir: Properly invalidate metadata in nir_lower_vec_to_movs().Kenneth Graunke2015-11-071-0/+5
| | | | | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]> Cc: [email protected] (cherry picked from commit bc3942e2970c60a816cf954b1fa4d416d0852bd9)
* nir: Report progress from lower_vec_to_movs().Jason Ekstrand2015-11-072-7/+23
| | | | | | | | | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> (cherry picked from commit 9f5e7ae9d83ce6de761936b95cd0b7ba4c1219c4) [Emil Velikov] Correctly derive nir_shader from vec_to_movs_state Signed-off-by: Emil Velikov <[email protected]> Conflicts: src/glsl/nir/nir.h src/glsl/nir/nir_lower_vec_to_movs.c
* nir/lower_vec_to_movs: Pass the shader around directlyJason Ekstrand2015-11-071-6/+8
| | | | | | | | | Previously, we were passing the shader around, we were just calling it "mem_ctx". However, the nir_shader is (and must be for the purposes of mark-and-sweep) the mem_ctx so we might as well pass it around explicitly. Reviewed-by: Eduardo Lima Mitev <[email protected]> (cherry picked from commit b7eeced3c724bf5de05290551ced8621ce2c7c52)
* nir: Properly invalidate metadata in nir_opt_copy_prop().Kenneth Graunke2015-11-071-0/+6
| | | | | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]> Cc: [email protected] (cherry picked from commit 0f037bd71ffe083c05cd0867ef54bce91ff84243)
* nir: Properly invalidate metadata in nir_split_var_copies().Kenneth Graunke2015-11-071-0/+5
| | | | | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]> Cc: [email protected] (cherry picked from commit 8bb44510fca5315bbdd61502c72c22c7198c0daf)
* nir: Report progress from nir_split_var_copies().Kenneth Graunke2015-11-072-4/+13
| | | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> (cherry picked from commit dc18b9357b553a972ea439facfbc55e376f1179f)
* nir: Fix a bunch of ralloc parenting errorsJason Ekstrand2015-09-2310-31/+32
| | | | | | | | | | | | | | | As of a10d4937, we would really like things associated with an instruction to be allocated out of that instruction and not out of the shader. In particular, you should be passing the instruction that will ultimately be holding the source into nir_src_copy rather than an arbitrary memory context. We also change the prototypes of nir_dest_copy and nir_alu_src/dest_copy to explicitly take an instruction so we catch this earlier in the future. Cc: "11.0" <[email protected]> Reviewed-by: Thomas Helland <[email protected]> (cherry picked from commit 8c8fc5f8336c8c79e5890265ae6c03271aa94075)
* nir: convert the glsl intrinsic image_size to nir_intrinsic_image_sizeMartin Peres2015-08-202-6/+17
| | | | | | | | | | | | | | | | | v2, review from Francisco Jerez: - make the destination variable as large as what the nir instrinsic defines (4) instead of the size of the return variable of glsl. This is still safe for the already existing code because all the intrinsics affected returned the same amount of components as expected by glsl IR. In the case of image_size, it is not possible to do so because the returned number of component depends on the image type and this case is not well handled by nir. v3: - Style fix Signed-off-by: Martin Peres <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* nir: Use nir_builder in nir_lower_io's get_io_offset().Kenneth Graunke2015-08-191-28/+14
| | | | | | | Much more readable. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Pull nir_lower_io's load_op selection into a helper function.Kenneth Graunke2015-08-191-17/+22
| | | | | | | Makes the function a bit smaller. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Simplify feq(fneg(a), a)) -> feq(a, 0.0)Thomas Helland2015-08-181-0/+1
| | | | | | | | | | The positive and negative value of a float can only be equal to each other if it is -0.0f and 0.0f. This is safe for Nan and Inf, as -Nan != Nan, and -Inf != Inf This gives no changes in my shader-db Signed-off-by: Thomas Helland <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nir: Simplify fne(fneg(a), a) -> fne(a, 0.0)Thomas Helland2015-08-181-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | -NaN != NaN, and -Inf != Inf, so this should be safe. Found while working on my VRP pass. Shader-db results on my IVB: total instructions in shared programs: 1698267 -> 1698067 (-0.01%) instructions in affected programs: 15785 -> 15585 (-1.27%) helped: 36 HURT: 0 GAINED: 0 LOST: 0 Some shaders was found to have the following pattern in NIR: vec1 ssa_26 = fneg ssa_21 vec1 ssa_27 = fne ssa_21, ssa_26 Make that: vec1 ssa_27 = fne ssa_21, 0.0f This is found in Dota2 and Brutal Legend. One shader is cut by 8%, from 323 -> 296 instructons in SIMD8 Signed-off-by: Thomas Helland <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nir: Add a glsl_uint_type() wrapper.Kenneth Graunke2015-08-162-0/+7
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* nir: Add support for CSE on textures.Eric Anholt2015-08-141-4/+39
| | | | | | | | | | | | NIR instruction count results on i965: total instructions in shared programs: 1261954 -> 1261937 (-0.00%) instructions in affected programs: 455 -> 438 (-3.74%) One in yofrankie, two in tropics. Apparently i965 had also optimized all of these out anyway. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Zero out texture instructions when creating them.Eric Anholt2015-08-141-1/+1
| | | | | | | | There are so many flags in textures, that the CSE pass would have a hard time referencing the correct set when figuring out if two texture ops are the same. By zeroing, we can avoid that fragility. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Don't try to scalarize unpack ops.Eric Anholt2015-08-141-0/+15
| | | | | | | | | Avoids regressions in vc4 when trying to do our blending in NIR. v2: Add the other unpack ops I meant to when writing the original commit message. Reviewed-by: Matt Turner <[email protected]>
* nir: Add a nir_opt_undef() to handle csels with undef.Eric Anholt2015-08-142-0/+106
| | | | | | | | | | | | | | | | | | | We may find a cause to do more undef optimization in the future, but for now this fixes up things after if flattening. vc4 was handling this internally most of the time, but a GLB2.7 shader that did a conditional discard and assign gl_FragColor in the else was still emitting some extra code. total instructions in shared programs: 100809 -> 100795 (-0.01%) instructions in affected programs: 37 -> 23 (-37.84%) v2: Use nir_instr_rewrite_src() to update def/use on src[0] (by Thomas Helland). v3: Make sure to flag metadata dirties, and copy the swizzle and abs/neg over to src[0], too (by anholt). Reviewed-by: Thomas Helland <[email protected]> (v2) Tested-by: Thomas Helland <[email protected]> (v2)
* nir: add missing type to type_size_vec4()Timothy Arceri2015-08-051-0/+2
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Add a nir_lower_load_const_to_scalar() pass.Eric Anholt2015-08-042-0/+104
| | | | | | | | This is useful to increase the CSE opportunities for a scalar backend. It avoids regressions when dropping vc4's custom CSE implementation. v2: Cleanups by Matt (decl in the for loop, and unreachable()). Reviewed-by: Matt Turner <[email protected]>
* nir: Add algebraic opt for no-op iand.Eric Anholt2015-08-041-0/+1
| | | | | | I lazily generated some of these in VC4 NIR lowering. Reviewed-by: Iago Toral Quiroga <[email protected]>
* Revert "nir: Use a single bit for the dual-source blend index"Eric Anholt2015-08-041-6/+2
| | | | | This reverts commit ab5b7a0fe659ff6f9c1885d5cb047b6531959506. We use more than one bit of value in tgsi_to_nir.
* nir: Fix output swizzle in get_mul_for_srcSamuel Iglesias Gonsalvez2015-08-031-1/+12
| | | | | | | | | | | | | | | Avoid copying an overwritten swizzle, use the original values. Example: Former swizzle[] = xyzw src->swizzle[] = zyxx The expected output swizzle = zyxx but if we reuse swizzle in the loop, then output swizzle would be zyzz. Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir/nir_lower_io: Add vec4 supportIago Toral Quiroga2015-08-032-27/+78
| | | | | | | | | | The current implementation operates in scalar mode only, so add a vec4 mode where types are padded to vec4 sizes. This will be useful in the i965 driver for its vec4 nir backend (and possbly other drivers that have vec4-based shaders). Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Use a single bit for the dual-source blend indexTimothy Arceri2015-08-031-2/+6
| | | | | | | | | | The only values allowed are 0 and 1, and the value is checked before assigning. This is a copy of 8eeca7a56c that seems to have been made to the glsl ir type after it was copied for use in nir but before nir landed. Reviewed-by: Tapani Pälli <[email protected]>
* nir: Avoid double promotion.Matt Turner2015-07-291-2/+2
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* glsl: Remove MSVC implementations of copysign and isnormal.Matt Turner2015-07-291-13/+1
| | | | Non-Gallium parts of Mesa require MSVC 2013 which provides these.
* i965: add support for ARB_shader_subroutineDave Airlie2015-07-241-0/+1
| | | | | | | | This just adds some missing pieces to nir/i965, it is lightly tested on my Haswell. Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* glsl/types: add new subroutine type (v3.2)Dave Airlie2015-07-231-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | This type will be used to store the name of subroutine types as in subroutine void myfunc(void); will store myfunc into a subroutine type. This is required to the parser can identify a subroutine type in a uniform decleration as a valid type, and also for looking up the type later. Also add contains_subroutine method. v2: handle subroutine to int comparisons, needed for lowering pass. v3: do subroutine to int with it's own IR operation to avoid hacking on asserts (Kayden) v3.1: fix warnings in this patch, fix nir, fix tgsi v3.2: fixup tests Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Signed-off-by: Dave Airlie <[email protected]> tests: fix warnings
* nir: add nir_foreach_instr_safe_reverse()Connor Abbott2015-07-171-0/+2
| | | | | | Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Connor Abbott <[email protected]>
* nir: add nir_instr_is_first() and nir_instr_is_last() helpersConnor Abbott2015-07-171-0/+12
| | | | | | Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Connor Abbott <[email protected]>
* nir: add nir_var_shader_storageIago Toral Quiroga2015-07-146-8/+20
| | | | Reviewed-by: Jordan Justen <[email protected]>
* nir: Fix comment above nir_convert_from_ssa() prototype.Kenneth Graunke2015-07-081-3/+3
| | | | | | | | Connor renamed the parameter, inverting the sense. Update the comment accordingly. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nir/lower_phis_to_scalar: undef is trivially scalarizableRob Clark2015-07-031-0/+1
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nir: Don't allow copying SSA destinationsJason Ekstrand2015-07-021-11/+11
| | | | Reviewed-by: Connor Abbott <[email protected]>
* nir: remove parent_instr from nir_registerConnor Abbott2015-06-303-17/+0
| | | | | | It's no longer used. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: remove nir_src_get_parent_instr()Connor Abbott2015-06-301-10/+0
| | | | | | It's now unused. Reviewed-by: Jason Ekstrand <[email protected]>
* nir/from_ssa: add a flag to not convert everything from SSAConnor Abbott2015-06-302-8/+24
| | | | | | | | | | | | | We already don't convert constants out of SSA, and in our backend we'd like to have only one way of saying something is still in SSA. The one tricky part about this is that we may now leave some undef instructions around if they aren't part of a phi-web, so we have to be more careful about deleting them. v2: rename and flip meaning of flag (Jason) Reviewed-by: Jason Ekstrand <[email protected]>
* nir: cleanup open-coded instruction castsRob Clark2015-06-303-3/+3
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Recognize max(min(a, 1.0), 0.0) as fsat(a).Kenneth Graunke2015-06-251-0/+1
| | | | | | | | | | | | | | We already recognize min(max(a, 0.0), 1.0) as a saturate, but neglected this variant (which is also handled by the GLSL IR pass). shader-db results on Broadwell: total instructions in shared programs: 7363046 -> 7362788 (-0.00%) instructions in affected programs: 11928 -> 11670 (-2.16%) helped: 64 HURT: 0 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Use a switch statement for detecting move-like operations.Kenneth Graunke2015-06-241-6/+14
| | | | | | | Suggested by Jason Ekstrand. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* nir: Allow vec2/vec3/vec4 instructions in the select peephole pass.Kenneth Graunke2015-06-221-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These are basically just moves, so they should be safe as well. When disabling i965's GLSL IR level scalarizer (channel expressions) pass, I started seeing NIR code like this: if ssa_21 { block block_1: /* preds: block_0 */ vec4 ssa_120 = vec4 ssa_82, ssa_83, ssa_84, ssa_30 /* succs: block_3 */ } else { block block_2: /* preds: block_0 */ /* succs: block_3 */ } block block_3: /* preds: block_1 block_2 */ vec4 ssa_33 = phi block_1: ssa_120, block_2: ssa_2 Previously, the GLSL IR scalarizer pass would break the vec4 into a series of fmovs, which were allowed by the peephole pass. But with the vec4 operation, they were not. We want to keep getting selects. Normal i965 on Broadwell: instructions in affected programs: 200 -> 176 (-12.00%) helped: 4 With brw_fs_channel_expressions() disabled: instructions in affected programs: 1832 -> 1646 (-10.15%) helped: 30 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nir: Add barrier intrinsic functionJordan Justen2015-06-122-1/+4
| | | | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Ben Widawsky <[email protected]>
* glsl: Add ir node for barrierChris Forbes2015-06-121-0/+7
| | | | | | | | | v2: * Changes suggested by mattst88 [[email protected]: Add nir support] Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Ben Widawsky <[email protected]>
* nir: use src for ssa helperTimothy Arceri2015-06-031-5/+1
| | | | | Reviewed-by: Thomas Helland <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nir: remove extra semicolonTimothy Arceri2015-06-031-1/+1
| | | | | | Reviewed-by: Thomas Helland <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nir: prevent use-after-free condition in should_lower_phi()Eduardo Lima Mitev2015-06-021-0/+5
| | | | | | | | | | | | | | | | | | lower_phis_to_scalar() pass recurses the instruction dependence graph to determine if all the sources of a given instruction are scalarizable. To prevent cycles, it temporary marks the phi instruction before recursing in, then updates the entry with the resulting value. However, it does not consider that the entry value may have changed after a recursion pass, hence causing a use-after-free situation and a crash. This patch fixes this by reloading the entry corresponding to the 'phi' after recursing and before updating its value. The crash can be reproduced ~20% of times with the dEQP test: dEQP-GLES3.functional.shaders.loops.while_constant_iterations.nested_sequence_fragment Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Fix output swizzle in get_mul_for_srcIago Toral Quiroga2015-05-281-10/+9
| | | | | | | | When we compute the output swizzle we want to consider the number of components in the add operation. So far we were using the writemask of the multiplication for this instead, which is not correct. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Remove sRGB colorspace conversion round-trip.Matt Turner2015-05-221-0/+2
| | | | | | | | | | | | | | | | | | Some shaders in Civilization V and Beyond Earth do pow(pow(x, 2.2), 0.454545) which is converting to and from sRGB colorspace. A more general rule that replaces pow(pow(a, b), c) with pow(a, b * c) actually regresses two shaders in Sun Temple in which the result of the inner pow is used twice, once by another pow and once by another instruction. Also, since 2.2 * 0.454545 isn't exactly one, the more general pattern would have still left us with a pow, and I'm 2.2 * 0.454545 percent sure that's not what they want. instructions in affected programs: 934 -> 886 (-5.14%) helped: 16