path: root/src/compiler/nir
Commit message | Author | Age | Files | Lines
* nir: Recognize open-coded extract_u16. | Matt Turner | 2016-03-04 | 1 | -0/+5

No shader-db changes, but does recognize some extract_u16 which enables the next patch to optimize some code.

Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Recognize open-coded extract_u8. | Matt Turner | 2016-03-04 | 1 | -0/+7

Two shaders that appear in Unigine benchmarks (Heaven and Valley) unpack three bytes from an integer and convert each into a float:

    float((val >> 16u) & 0xffu)
    float((val >>  8u) & 0xffu)
    float((val >>  0u) & 0xffu)

Instead of shifting, masking, and type converting like this:

    shr(8) g15<1>UD g25<8,8,1>UD 0x00000010UD
    and(8) g16<1>UD g15<8,8,1>UD 0x000000ffUD
    mov(8) g17<1>F g16<8,8,1>UD

    shr(8) g18<1>UD g25<8,8,1>UD 0x00000008UD
    and(8) g19<1>UD g18<8,8,1>UD 0x000000ffUD
    mov(8) g20<1>F g19<8,8,1>UD

    and(8) g21<1>UD g25<8,8,1>UD 0x000000ffUD
    mov(8) g22<1>F g21<8,8,1>UD

i965 can simply extract a byte and convert to float in a single instruction:

    mov(8) g17<1>F g25.2<32,8,4>UB
    mov(8) g20<1>F g25.1<32,8,4>UB
    mov(8) g22<1>F g25.0<32,8,4>UB

This patch implements the first step: recognizing byte extraction. A later patch will optimize out the conversion to float.

    instructions in affected programs: 28568 -> 27450 (-3.91%)
    helped: 7
    cycles in affected programs: 210076 -> 203144 (-3.30%)
    helped: 7

This patch decreases the number of instructions in the two Unigine programs by:

    #1721: 4520 -> 4374 instructions (-3.23%)
    #1706: 3752 -> 3582 instructions (-4.53%)

Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
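As an illustration of the equivalence this commit relies on, here is a minimal Python sketch. It is not the rule added by the patch: `extract_u8` and `open_coded_bytes` are hypothetical helpers that only model the arithmetic on 32-bit values, showing that shift-and-mask by 0xff selects the same byte as a direct byte extraction.

```python
def extract_u8(value, byte_index):
    """Hypothetical model of a byte extract: zero-extended byte
    `byte_index` (0 = least significant) of a 32-bit value."""
    raw = (value & 0xFFFFFFFF).to_bytes(4, "little")
    return raw[byte_index]


def open_coded_bytes(val):
    """The shift-and-mask form quoted in the commit message."""
    return ((val >> 16) & 0xFF, (val >> 8) & 0xFF, (val >> 0) & 0xFF)


val = 0x11AABBCC
# Shifting by 16, 8, and 0 bits selects bytes 2, 1, and 0 respectively.
assert open_coded_bytes(val) == (
    extract_u8(val, 2),
    extract_u8(val, 1),
    extract_u8(val, 0),
)
```

Each shift amount is eight times the index of the byte that the extract form names, which is what makes the pattern recognizable.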
* nir: Remove the const_offset from nir_tex_instr | Jason Ekstrand | 2016-02-10 | 5 | -32/+5

When NIR was originally drafted, there was no easy way to determine if something was constant or not. The result was that we had lots of special-casing for constant values such as this. Now that load_const instructions are SSA-only, it's really easy to find constants and this isn't really needed anymore.

Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
* nir/lower_vec_to_movs: Better report channels handled by insert_mov | Jason Ekstrand | 2016-02-10 | 1 | -1/+3

This fixes two issues. First, we had a use-after-free in the case where the instruction got deleted and we tried to return mov->dest.write_mask. Second, in the case where we are doing a self-mov of a register, we delete those channels that are moved to themselves from the write-mask. This means that those channels aren't reported as being handled even though they are. We now stash off the write-mask before removing unneeded channels so that they still get reported as handled.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94073
Reviewed-by: Matt Turner <[email protected]>
Cc: "11.0 11.1" <[email protected]>
* nir: Separate texture from sampler in nir_tex_instr | Jason Ekstrand | 2016-02-09 | 10 | -18/+94

This commit adds the capability to NIR to support separate textures and samplers. As it currently stands, glsl_to_nir only sets the texture deref and leaves the sampler deref alone as it did before, and nir_lower_samplers assumes this. Backends can still assume that they are combined and look only at the texture index. Or, if they wish, they can assume that they are separate because nir_lower_samplers, tgsi_to_nir, and prog_to_nir all set both texture and sampler index whenever a sampler is required (the two indices are the same in this case).

Reviewed-by: Kenneth Graunke <[email protected]>
* nir/tex_instr: Rename sampler to texture | Jason Ekstrand | 2016-02-09 | 11 | -58/+58

We're about to separate the two concepts. When we do, the sampler will become optional. Doing a rename first makes the separation a bit more safe because drivers that depend on GLSL or TGSI behaviour will be fine to just use the texture index all the time.

Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add some braces around loops and ifs | Jason Ekstrand | 2016-02-09 | 1 | -5/+10
* nir: use const_index helpers | Rob Clark | 2016-02-09 | 11 | -24/+23

Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
* gtn: use const_index helpers | Rob Clark | 2016-02-09 | 1 | -8/+9

Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
* nir: const_index helpers | Rob Clark | 2016-02-09 | 4 | -100/+191

Direct access to intr->const_index[n], where different slots have different meanings, is somewhat confusing. Instead, let's put some extra info in nir_intrinsic_infos[] about which slots map to what, and add some get/set helpers. The helpers validate that the field being accessed (base/writemask/etc) is applicable for the intrinsic opc, for some extra safety. And nir_print can use this to dump out decoded const_index fields.

Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
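The idea of validated accessors over a shared slot array can be sketched in Python as below. This is a conceptual model only, not the C helpers the commit adds: the opcode names (`toy_load`, `toy_store`), the slot assignments, and the function names `get_field` and `set_field` are all hypothetical.

```python
# Hypothetical per-opcode table: which const_index slot holds which field.
INTRINSIC_INFOS = {
    "toy_load": {"base": 0},
    "toy_store": {"base": 0, "write_mask": 1},
}


class Intrinsic:
    """Toy stand-in for an intrinsic instruction with raw index slots."""

    def __init__(self, opcode):
        self.opcode = opcode
        self.const_index = [0, 0, 0]


def _slot(intr, field):
    """Look up the slot for `field`, rejecting fields the opcode lacks."""
    slots = INTRINSIC_INFOS[intr.opcode]
    if field not in slots:
        raise ValueError(f"{field!r} is not applicable to {intr.opcode!r}")
    return slots[field]


def get_field(intr, field):
    return intr.const_index[_slot(intr, field)]


def set_field(intr, field, value):
    intr.const_index[_slot(intr, field)] = value


store = Intrinsic("toy_store")
set_field(store, "write_mask", 0b0111)
assert get_field(store, "write_mask") == 0b0111
```

Callers name the field they mean, and a field that the opcode does not define fails loudly rather than silently reading an unrelated slot.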
* nir: remove unused nir_variable fields | Timothy Arceri | 2016-02-09 | 2 | -20/+0

These are used in GLSL IR to remove unused varyings and match transform feedback variables. There is no need to use these in NIR.

Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Recognize open-coded bitfield_reverse. | Matt Turner | 2016-02-08 | 1 | -0/+13

Helps 11 shaders in UnrealEngine4 demos.

I seriously hope they would have given us bitfieldReverse() if we exposed GL 4.0 (but we do expose ARB_gpu_shader5, so why not use that anyway?).

    instructions in affected programs: 4875 -> 4633 (-4.96%)
    cycles in affected programs: 270516 -> 244516 (-9.61%)

I suspect there's a *lot* of room to improve nir_search/opt_algebraic's handling of this. We'd actually like to match, e.g., step2 by matching step1 once and then doing a pointer comparison for the second instance of step1, but unfortunately we generate an enormous tuple instead.

The .text size increases by 6.5% and the .data by 17.5%.

       text    data     bss     dec     hex filename
      22957   45224       0   68181   10a55 nir_libnir_la-nir_opt_algebraic.o
      24461   53160       0   77621   12f35 nir_libnir_la-nir_opt_algebraic.o

I'd be happy to remove this if Unreal4 uses bitfieldReverse() if it is in a GL 4.0 context once we expose GL 4.0.

Reviewed-by: Jason Ekstrand <[email protected]>
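For readers unfamiliar with what an open-coded bit reversal looks like, the following Python sketch shows the classic swap-by-halves sequence on a 32-bit value. It is an assumption that the shaders use this well-known form; the exact expression matched by the patch is not shown in the commit message. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def bitfield_reverse_open_coded(x):
    """Classic five-step 32-bit reversal: swap adjacent bits, then
    pairs, nibbles, bytes, and finally the two 16-bit halves."""
    x &= MASK32
    x = ((x & 0x55555555) << 1) | ((x & 0xAAAAAAAA) >> 1)
    x = ((x & 0x33333333) << 2) | ((x & 0xCCCCCCCC) >> 2)
    x = ((x & 0x0F0F0F0F) << 4) | ((x & 0xF0F0F0F0) >> 4)
    x = ((x & 0x00FF00FF) << 8) | ((x & 0xFF00FF00) >> 8)
    x = ((x << 16) | (x >> 16)) & MASK32
    return x


def bitfield_reverse_reference(x):
    """Reference result: reverse the 32-character binary string."""
    return int(format(x & MASK32, "032b")[::-1], 2)


for sample in (0x00000001, 0x12345678, 0xFF00FF00, 0xDEADBEEF):
    assert bitfield_reverse_open_coded(sample) == bitfield_reverse_reference(sample)
```

Every step reuses the result of the previous one twice, which is consistent with the commit's remark that matching the whole expression as one tuple grows quickly.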
* nir: Handle large unsigned values in opt_algebraic. | Matt Turner | 2016-02-08 | 1 | -4/+1

The next patch adds an algebraic rule that uses the constant 0xff00ff00. Without this change, the build fails with

    return hex(struct.unpack('I', struct.pack('i', self.value))[0])
    struct.error: 'i' format requires -2147483648 <= number <= 2147483647

The hex() function handles integers of any size, and assigning a negative value to an unsigned does what we want in C. The pack/unpack is unnecessary (and as we see, buggy).

Reviewed-by: Dylan Baker <[email protected]>
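The failure described above can be reproduced in isolation. The sketch below uses the expression from the quoted traceback and plain `hex()`; the wrapper names `old_hex` and `new_hex` are hypothetical and are not taken from the patch.

```python
import struct


def old_hex(value):
    """The pack/unpack round trip quoted in the traceback."""
    return hex(struct.unpack('I', struct.pack('i', value))[0])


def new_hex(value):
    """Plain hex(), which accepts integers of any size."""
    return hex(value)


value = 0xff00ff00

try:
    old_hex(value)
    old_failed = False
except struct.error:
    # 0xff00ff00 does not fit in a signed 32-bit 'i' field.
    old_failed = True

assert old_failed
assert new_hex(value) == '0xff00ff00'
# A negative value prints as a negative literal, which C converts
# on assignment to an unsigned variable.
assert new_hex(-1) == '-0x1'
```

The round trip only ever helped for negative inputs, where it produced the two's-complement form directly; for values above 2147483647 it raises.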
* nir: Do opt_algebraic in reverse order. | Matt Turner | 2016-02-08 | 1 | -2/+2

Walking the SSA definitions in order means that we consider the smallest algebraic optimizations before larger optimizations. So if a smaller rule is part of a larger rule, the smaller one will happen first, preventing the larger one from happening.

    instructions in affected programs: 32721 -> 32611 (-0.34%)
    helped: 106

In programs whose nir_optimize loop count changes (129 of them):

    before: 1164 optimization loops
    after: 1071 optimization loops

Of the 129 affected, 16 programs' optimization loop counts increased.

Prevents regressions and annoyances in the next commits.

Reviewed-by: Eduardo Lima Mitev <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
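The ordering problem can be shown with a toy rewriter in Python. Everything here is illustrative: expressions are nested tuples rather than SSA instructions, the two rules are invented for the example, and the traversal functions only stand in for walking definitions in forward or reverse order.

```python
def rewrite_small(expr):
    """Toy small rule: ('shr', x, 16) -> ('extract_u16', x, 1)."""
    if isinstance(expr, tuple) and expr[0] == "shr" and expr[2] == 16:
        return ("extract_u16", expr[1], 1)
    return expr


def rewrite_large(expr):
    """Toy large rule: ('and', ('shr', x, 16), 0xFFFF) -> ('extract_u16', x, 1)."""
    if (isinstance(expr, tuple) and expr[0] == "and" and expr[2] == 0xFFFF
            and isinstance(expr[1], tuple)
            and expr[1][0] == "shr" and expr[1][2] == 16):
        return ("extract_u16", expr[1][1], 1)
    return expr


def apply_rules(expr):
    return rewrite_small(rewrite_large(expr))


def operands_first(expr):
    """Visit operands before the expression that uses them."""
    if not isinstance(expr, tuple):
        return expr
    expr = (expr[0],) + tuple(operands_first(a) for a in expr[1:])
    return apply_rules(expr)


def users_first(expr):
    """Visit the outermost expression before its operands."""
    if not isinstance(expr, tuple):
        return expr
    expr = apply_rules(expr)
    return (expr[0],) + tuple(users_first(a) for a in expr[1:])


expr = ("and", ("shr", "x", 16), 0xFFFF)
# The small rule fires on the operand first and hides the larger pattern.
assert operands_first(expr) == ("and", ("extract_u16", "x", 1), 0xFFFF)
# Visiting the user first lets the larger rule consume the whole expression.
assert users_first(expr) == ("extract_u16", "x", 1)
```

In the operands-first result a redundant `and` survives, because the larger rule no longer sees the `shr` it was written to match.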
* nir: Recognize product of open-coded pow()s. | Matt Turner | 2016-02-08 | 1 | -0/+2

Prevents regressions in the next commit.

Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add opt_algebraic rules for xor with zero. | Matt Turner | 2016-02-08 | 1 | -0/+2

    instructions in affected programs: 668 -> 664 (-0.60%)
    helped: 4

Reviewed-by: Eduardo Lima Mitev <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add lowering support for unpacking opcodes. | Matt Turner | 2016-02-01 | 2 | -0/+32

Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Add lowering support for packing opcodes. | Matt Turner | 2016-02-01 | 4 | -0/+66

Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Add opcodes to extract bytes or words. | Matt Turner | 2016-02-01 | 3 | -0/+28

The uint versions zero extend while the int versions sign extend.

Reviewed-by: Iago Toral Quiroga <[email protected]>
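The difference between zero extension and sign extension can be modelled with a short Python sketch. The single `extract` function and its parameters are hypothetical; it only illustrates the arithmetic on a 32-bit source and is not the opcode definition from the patch.

```python
def extract(value, index, bits, signed):
    """Return field `index` of width `bits` (8 or 16) from `value`.

    Unsigned extraction zero extends; signed extraction treats the
    top bit of the field as a sign bit and sign extends.
    """
    field = (value >> (bits * index)) & ((1 << bits) - 1)
    if signed and field & (1 << (bits - 1)):
        field -= 1 << bits
    return field


word = 0xFFFE80FF
assert extract(word, 0, 8, signed=False) == 255     # byte 0, zero extended
assert extract(word, 0, 8, signed=True) == -1       # byte 0, sign extended
assert extract(word, 1, 8, signed=True) == -128     # byte 1 is 0x80
assert extract(word, 1, 16, signed=False) == 0xFFFE  # high word, zero extended
assert extract(word, 1, 16, signed=True) == -2       # high word, sign extended
```

The two variants agree whenever the top bit of the selected field is clear, and differ only when it is set.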
* glsl: Remove 2x16 half-precision pack/unpack opcodes. | Matt Turner | 2016-02-01 | 1 | -9/+0

i965/fs was the only consumer, and we're now doing the lowering in NIR.

Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Add lowering of nir_op_unpack_half_2x16. | Matt Turner | 2016-02-01 | 2 | -4/+29

Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Make argument order of unop_convert match binop_convert. | Matt Turner | 2016-02-01 | 1 | -10/+10

Strangely the return and parameter types were reversed.

Reviewed-by: Iago Toral Quiroga <[email protected]>
* glsl: move to compiler/ | Emil Velikov | 2016-01-26 | 6 | -7/+78

Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Matt Turner <[email protected]>
Acked-by: Jose Fonseca <[email protected]>
* nir: move to compiler/ | Emil Velikov | 2016-01-26 | 72 | -0/+24141

Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Matt Turner <[email protected]>
Acked-by: Jose Fonseca <[email protected]>