path: root/src/compiler
Commit message | Author | Age | Files | Lines
* nir/builder: Add a helper for storing to variable derefs | Jason Ekstrand | 2016-03-28 | 1 | -0/+16
    Reviewed-by: Rob Clark <[email protected]>
* nir/builder: Add a helper for building fdot instructions | Jason Ekstrand | 2016-03-28 | 1 | -0/+17
    Reviewed-by: Rob Clark <[email protected]>
* nir: Add a variable_foreach_safe helper | Jason Ekstrand | 2016-03-28 | 1 | -0/+3
    Reviewed-by: Rob Clark <[email protected]>
* nir/Makefile: Fix alphabetization | Jason Ekstrand | 2016-03-28 | 2 | -6/+6
    Reviewed-by: Rob Clark <[email protected]>
* glsl: add OES_texture_buffer and EXT_texture_buffer support | Ilia Mirkin | 2016-03-28 | 6 | -24/+46
    Expose the samplerBuffer/imageBuffer types, and allow the various functions to operate on them.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
* glsl: Delete initialized field from uniform storage test. | Kenneth Graunke | 2016-03-28 | 1 | -19/+0
    Timothy deleted this field. Fixes "make check".

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* mesa: remove initialized field from uniform storage | Timothy Arceri | 2016-03-29 | 3 | -10/+0
    The only place this was used was in a gallium debug function that had to be manually enabled.

    Reviewed-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* glsl: reduce buffer block duplication | Timothy Arceri | 2016-03-26 | 4 | -43/+54
    This reduces some of the craziness required for handling buffer blocks.

    The problem is that each shader stage holds its own information about a block in memory. We were copying that information to a program-wide list, but the per-stage information remained, meaning that when a binding was updated we needed to update all versions of it.

    This changes the per-stage blocks to instead point to a single version of the block information in the program list.

    Acked-by: Kenneth Graunke <[email protected]>
* nir: Add a pass to inline functions | Jason Ekstrand | 2016-03-24 | 4 | -0/+274
    This commit adds a new NIR pass that lowers all function calls away by inlining the functions.

    Reviewed-by: Jordan Justen <[email protected]>
* nir/builder: Add helpers for easily inserting copy_var intrinsics | Jason Ekstrand | 2016-03-24 | 1 | -0/+23
    Reviewed-by: Jordan Justen <[email protected]>
* nir: Add return lowering pass | Jason Ekstrand | 2016-03-24 | 4 | -0/+251
    This commit adds a NIR pass for lowering away returns in functions. If the return is in a loop, it is lowered to a break. If it is not in a loop, it's lowered away by moving/deleting code as needed.

    Reviewed-by: Jordan Justen <[email protected]>
* nir: Add a cursor helper for getting a cursor after any phi nodes | Jason Ekstrand | 2016-03-24 | 1 | -0/+16
    Reviewed-by: Jordan Justen <[email protected]>
* nir/builder: Add a helper for inserting jump instructions | Jason Ekstrand | 2016-03-24 | 1 | -0/+7
    Reviewed-by: Jordan Justen <[email protected]>
* nir/cf: Make extracting or re-inserting nothing a no-op | Jason Ekstrand | 2016-03-24 | 1 | -0/+9
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Connor Abbott <[email protected]>
* nir: Add a function for comparing cursors | Jason Ekstrand | 2016-03-24 | 2 | -0/+58
    Reviewed-by: Jordan Justen <[email protected]>
* nir/cf: Handle relinking top-level blocks | Jason Ekstrand | 2016-03-24 | 1 | -2/+5
    This can happen if a function ends in a return instruction and you remove the return.

    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Connor Abbott <[email protected]>
* nir: Add a pass to repair SSA form | Jason Ekstrand | 2016-03-24 | 4 | -0/+163
    Reviewed-by: Jordan Justen <[email protected]>
* nir/vars_to_ssa: Use the new nir_phi_builder helper | Jason Ekstrand | 2016-03-24 | 1 | -359/+134
    The efficiency should be approximately the same. We do a little more work per phi node because we have to sort the predecessors. However, we no longer have to walk the blocks a second time to pop things off the stack. The bigger advantage, however, is that we can now re-use the phi placement and per-block SSA value tracking in other passes.

    As a side benefit, the phi builder actually handles unreachable blocks correctly. The original vars_to_ssa code, because of the way it iterated the blocks and added phi sources, didn't add sources corresponding to predecessors of unreachable blocks. The new strategy employed by the phi builder creates a phi source for each predecessor and should correctly handle unreachable blocks by setting those sources to SSA undefs.

    Reviewed-by: Jordan Justen <[email protected]>
* nir/dominance: Handle unreachable blocks | Jason Ekstrand | 2016-03-24 | 1 | -1/+5
    Previously, nir_dominance.c didn't properly handle unreachable blocks. This can happen if, for instance, you have something like this:

        loop {
           if (...) {
              break;
           } else {
              break;
           }
        }

    In this case, the block right after the if statement will be unreachable. This commit makes two changes to handle this. First, it removes an assert and allows block->imm_dom to be null if the block is unreachable. Second, it properly skips unreachable blocks in calc_dom_frontier_cb.

    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Connor Abbott <[email protected]>
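The situation described in this commit can be sketched outside of Mesa. The following Python is an illustration only, not the code in nir_dominance.c: it runs a standard iterative immediate-dominator computation over a hand-built CFG for the loop above and leaves the immediate dominator of an unreachable block as None. The block names (`entry`, `head`, `after_if`, ...) and the function name are hypothetical.

```python
def immediate_dominators(succs, entry):
    """Iterative immediate-dominator computation over a CFG.

    succs maps each block name to its list of successors.
    Blocks that cannot be reached from `entry` keep None as their
    immediate dominator instead of tripping an assertion.
    """
    # Reverse postorder over the reachable blocks only.
    order, seen = [], set()

    def dfs(block):
        seen.add(block)
        for succ in succs.get(block, []):
            if succ not in seen:
                dfs(succ)
        order.append(block)

    dfs(entry)
    order.reverse()
    index = {block: i for i, block in enumerate(order)}

    # Build predecessor lists for every block, reachable or not.
    preds = {block: [] for block in succs}
    for block, block_succs in succs.items():
        for succ in block_succs:
            preds.setdefault(succ, []).append(block)

    idom = {block: None for block in preds}
    idom[entry] = entry

    def intersect(a, b):
        # Walk both blocks up the dominator tree until they meet.
        while a != b:
            while index[a] > index[b]:
                a = idom[a]
            while index[b] > index[a]:
                b = idom[b]
        return a

    changed = True
    while changed:
        changed = False
        for block in order[1:]:
            # Ignore predecessors that are unreachable or not yet processed.
            cands = [p for p in preds[block]
                     if p in index and idom[p] is not None]
            new_idom = cands[0]
            for pred in cands[1:]:
                new_idom = intersect(new_idom, pred)
            if idom[block] != new_idom:
                idom[block] = new_idom
                changed = True
    return idom


# CFG for: loop { if (...) { break; } else { break; } }
# "after_if" is the block right after the if; nothing jumps to it.
cfg = {
    "entry": ["head"],
    "head": ["then", "else"],
    "then": ["after_loop"],
    "else": ["after_loop"],
    "after_if": ["head"],
    "after_loop": [],
}
idom = immediate_dominators(cfg, "entry")
```

Here `after_if` still has an edge back to the loop header, so a dominance pass has to skip it as a predecessor rather than assume every block has a dominator.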
* nir: Add a phi node placement helper | Jason Ekstrand | 2016-03-24 | 4 | -0/+414
    Right now, we have phi placement code in two places and there are other places where it would be nice to be able to do this analysis. Instead of repeating it all over the place, this commit adds a helper for placing all of the needed phi nodes for a value.

    v2: Add better documentation

    Reviewed-by: Jordan Justen <[email protected]>
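The log does not say which algorithm the helper uses. As a rough sketch of what "placing all of the needed phi nodes for a value" commonly means, the classic approach iterates over dominance frontiers of the blocks that define the value. The Python below assumes the dominance frontiers are already computed; the function name and the example CFGs are hypothetical and are not the nir_phi_builder API.

```python
def place_phis(def_blocks, dom_frontier):
    """Return the set of blocks that need a phi node for one value.

    def_blocks: blocks containing a definition of the value.
    dom_frontier: dict mapping each block to its dominance frontier.
    """
    needs_phi = set()
    worklist = list(def_blocks)
    queued = set(def_blocks)
    while worklist:
        block = worklist.pop()
        for frontier_block in dom_frontier.get(block, ()):
            if frontier_block not in needs_phi:
                needs_phi.add(frontier_block)
                # A phi is itself a new definition, so its own
                # frontier has to be visited as well.
                if frontier_block not in queued:
                    queued.add(frontier_block)
                    worklist.append(frontier_block)
    return needs_phi


# Diamond: entry -> then/else -> merge. Definitions in both arms.
diamond_df = {"entry": [], "then": ["merge"], "else": ["merge"], "merge": []}
diamond_phis = place_phis(["then", "else"], diamond_df)

# Loop: entry -> head -> body -> head, head -> exit. Definition in the body.
loop_df = {"entry": [], "head": ["head"], "body": ["head"], "exit": []}
loop_phis = place_phis(["body"], loop_df)
```

Keeping this in one helper means a pass only has to report where a value is defined and can leave the placement logic shared.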
* nir: fix dangling ssadef->name ptrs | Rob Clark | 2016-03-24 | 3 | -4/+8
    In many places, the convention is to pass an existing ssadef name ptr when constructing/initializing a new nir_ssa_def. But that goes badly (as noticed by garbage in nir_print output) when the original string gets freed.

    Just use ralloc_strdup() instead, and add ralloc_free() in the two places that would care (not that the strings wouldn't eventually get freed anyway).

    Also fix up the nir_search code, which was directly setting ssadef->name, to use the parent instruction as memctx.

    Signed-off-by: Rob Clark <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* glsl: Add propagate_invariance to the other makefile | Jason Ekstrand | 2016-03-23 | 1 | -0/+1
    This fixes the scons build
* nir/glsl: Propagate invariant into NIR alu ops | Jason Ekstrand | 2016-03-23 | 1 | -0/+3
    Reviewed-by: Francisco Jerez <[email protected]>
* glsl/rebalance_tree: Don't handle invariant or precise trees | Jason Ekstrand | 2016-03-23 | 1 | -0/+16
    Reviewed-by: Francisco Jerez <[email protected]>
* glsl/opt_algebraic: Don't handle invariant or precise trees | Jason Ekstrand | 2016-03-23 | 1 | -0/+19
    Reviewed-by: Francisco Jerez <[email protected]>
* glsl: Add a pass to propagate the "invariant" and "precise" qualifiers | Jason Ekstrand | 2016-03-23 | 4 | -0/+128
    Reviewed-by: Francisco Jerez <[email protected]>
* nir/alu_to_scalar: Propagate the "exact" bit | Jason Ekstrand | 2016-03-23 | 1 | -0/+1
    Reviewed-by: Francisco Jerez <[email protected]>
* nir/cse: Properly handle nir_ssa_def.exact | Jason Ekstrand | 2016-03-23 | 1 | -2/+14
    Reviewed-by: Francisco Jerez <[email protected]>
* nir/algebraic: Flag inexact optimizations | Jason Ekstrand | 2016-03-23 | 1 | -59/+62
    Many of our optimizations, while great for cutting shaders down to size, aren't really precision-safe. This commit tries to flag all of the inexact floating-point optimizations so they don't get run on values that are flagged "exact". It's a bit conservative and maybe flags some safe optimizations as unsafe but that's better than missing one.

    Reviewed-by: Francisco Jerez <[email protected]>
* nir/algebraic: Fix fmin detection to match the spec | Jason Ekstrand | 2016-03-23 | 1 | -1/+1
    The previous transformation got the arguments to fmin backwards. When NaNs are involved, the GLSL min/max aren't commutative so it matters.

    Reviewed-by: Francisco Jerez <[email protected]>
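A small Python sketch shows why argument order matters here. It models min the way the GLSL specification words it (return y if y < x, otherwise x); real hardware may treat NaN differently, and the function name is made up for this illustration.

```python
import math


def glsl_min(x, y):
    # GLSL wording for min(x, y): returns y if y < x, otherwise x.
    # Every comparison with NaN is false, so the "otherwise" arm wins.
    return y if y < x else x


nan = float("nan")
nan_first = glsl_min(nan, 1.0)   # 1.0 < NaN is false, so x (NaN) is returned
nan_second = glsl_min(1.0, nan)  # NaN < 1.0 is false, so x (1.0) is returned
```

Swapping the operands changes the result as soon as one of them is NaN, so a pattern that detects min from a compare-and-select has to keep the operands in the order the specification implies.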
* nir/algebraic: Get rid of an invalid fxor optimization | Jason Ekstrand | 2016-03-23 | 1 | -1/+0
    The fxor opcode is required to return 1.0f or 0.0f but the input variable may not be 1.0f or 0.0f.

    Reviewed-by: Francisco Jerez <[email protected]>
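The log does not show which rule was removed. As an illustration only, assume a rewrite of the form fxor(a, 0.0) -> a, and assume fxor treats any non-zero float as true. The Python below models that assumed behavior and shows where such a rewrite breaks.

```python
def fxor(a, b):
    # Assumed model of a float-encoded logical xor: any non-zero input
    # counts as true, and the result is always exactly 1.0 or 0.0.
    return 1.0 if (a != 0.0) != (b != 0.0) else 0.0


# Hypothetical rewrite under test: fxor(a, 0.0) -> a
good_input = 1.0
bad_input = 2.0
matches_for_canonical = fxor(good_input, 0.0) == good_input  # 1.0 vs 1.0
matches_for_other = fxor(bad_input, 0.0) == bad_input        # 1.0 vs 2.0
```

The rewrite only holds when the input is already 1.0 or 0.0, which is the condition the commit message says cannot be relied on.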
* nir/algebraic: Allow for flagging operations as being inexact | Jason Ekstrand | 2016-03-23 | 4 | -2/+26
    Reviewed-by: Francisco Jerez <[email protected]>
* nir/search: Propagate exactness into newly created expressions | Jason Ekstrand | 2016-03-23 | 1 | -4/+5
    Reviewed-by: Francisco Jerez <[email protected]>
* nir/builder: Add a flag for setting exact | Jason Ekstrand | 2016-03-23 | 1 | -0/+9
    Reviewed-by: Francisco Jerez <[email protected]>
* nir: Add an "exact" bit to nir_alu_instr | Jason Ekstrand | 2016-03-23 | 3 | -0/+14
    Reviewed-by: Francisco Jerez <[email protected]>
* nir/clone: Export nir_variable_clone | Jason Ekstrand | 2016-03-23 | 2 | -4/+13
    Reviewed-by: Rob Clark <[email protected]>
* nir/clone: Expose nir_constant_clone | Jason Ekstrand | 2016-03-23 | 2 | -4/+5
    Reviewed-by: Rob Clark <[email protected]>
* nir: Fix whitespace | Jason Ekstrand | 2016-03-23 | 1 | -1/+1
    Reviewed-by: Rob Clark <[email protected]>
* compiler/glsl: allow sequence op as a const expr in gles 1.0 | Lars Hamre | 2016-03-23 | 1 | -1/+3
    Allow the sequence operator to be a constant expression in GLSL ES versions prior to GLSL ES 3.0

    Fixes the following piglit test:
    /all/spec/glsl-es-1.0/compiler/array-sized-by-sequence-in-parenthesis.vert

    This is similar to the logic from process_initializer() which performs the same check for constant variable initialization with sequence operators.

    v2: Fixed regression pointed out by Eduardo Lima Mitev

    Signed-off-by: Lars Hamre <[email protected]>
    Reviewed-by: Eduardo Lima Mitev <[email protected]>
* nir: Don't abs slt and friends | Ian Romanick | 2016-03-22 | 1 | -0/+4
    No shader-db changes, but this is symmetric with the previous commit.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* nir: Don't abs the result of b2f or b2i | Ian Romanick | 2016-03-22 | 1 | -0/+2
    In the results below, 2 SIMD16 shaders in Trine are lost.

    G4X
    total instructions in shared programs: 4012279 -> 4011108 (-0.03%)
    instructions in affected programs: 116776 -> 115605 (-1.00%)
    helped: 339 HURT: 0
    total cycles in shared programs: 84315862 -> 84313584 (-0.00%)
    cycles in affected programs: 1767232 -> 1764954 (-0.13%)
    helped: 274 HURT: 81

    Ironlake
    total instructions in shared programs: 6399073 -> 6396998 (-0.03%)
    instructions in affected programs: 218050 -> 215975 (-0.95%)
    helped: 600 HURT: 0
    total cycles in shared programs: 128892088 -> 128888810 (-0.00%)
    cycles in affected programs: 2867452 -> 2864174 (-0.11%)
    helped: 422 HURT: 137

    Sandy Bridge
    total instructions in shared programs: 8462174 -> 8460759 (-0.02%)
    instructions in affected programs: 178529 -> 177114 (-0.79%)
    helped: 596 HURT: 0
    total cycles in shared programs: 117542276 -> 117534098 (-0.01%)
    cycles in affected programs: 1239166 -> 1230988 (-0.66%)
    helped: 369 HURT: 150

    Ivy Bridge
    total instructions in shared programs: 7775131 -> 7773410 (-0.02%)
    instructions in affected programs: 162903 -> 161182 (-1.06%)
    helped: 590 HURT: 0
    total cycles in shared programs: 65759882 -> 65747268 (-0.02%)
    cycles in affected programs: 1004354 -> 991740 (-1.26%)
    helped: 467 HURT: 141

    Haswell
    total instructions in shared programs: 7107786 -> 7106327 (-0.02%)
    instructions in affected programs: 140954 -> 139495 (-1.04%)
    helped: 590 HURT: 0
    total cycles in shared programs: 64668028 -> 64655322 (-0.02%)
    cycles in affected programs: 967080 -> 954374 (-1.31%)
    helped: 452 HURT: 149
    LOST: 2 GAINED: 0

    Broadwell
    total instructions in shared programs: 8980029 -> 8978287 (-0.02%)
    instructions in affected programs: 197232 -> 195490 (-0.88%)
    helped: 715 HURT: 0
    total cycles in shared programs: 70070448 -> 70055970 (-0.02%)
    cycles in affected programs: 975724 -> 961246 (-1.48%)
    helped: 471 HURT: 111
    LOST: 2 GAINED: 0

    Skylake
    total instructions in shared programs: 9115178 -> 9113436 (-0.02%)
    instructions in affected programs: 203012 -> 201270 (-0.86%)
    helped: 715 HURT: 0
    total cycles in shared programs: 68848660 -> 68834004 (-0.02%)
    cycles in affected programs: 993888 -> 979232 (-1.47%)
    helped: 473 HURT: 116
    LOST: 2 GAINED: 0

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* nir: Simplify 0 < fabs(a) | Ian Romanick | 2016-03-22 | 1 | -0/+6
    Sandy Bridge / Ivy Bridge / Haswell
    total instructions in shared programs: 8462180 -> 8462174 (-0.00%)
    instructions in affected programs: 564 -> 558 (-1.06%)
    helped: 6 HURT: 0
    total cycles in shared programs: 117542462 -> 117542276 (-0.00%)
    cycles in affected programs: 9768 -> 9582 (-1.90%)
    helped: 12 HURT: 0

    Broadwell / Skylake
    total instructions in shared programs: 8980833 -> 8980826 (-0.00%)
    instructions in affected programs: 626 -> 619 (-1.12%)
    helped: 7 HURT: 0
    total cycles in shared programs: 70077900 -> 70077714 (-0.00%)
    cycles in affected programs: 9378 -> 9192 (-1.98%)
    helped: 12 HURT: 0

    G45 and Ironlake showed no change.

    v2: Modify the comments to look more like a proof.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* nir: Simplify 0 >= b2f(a) | Ian Romanick | 2016-03-22 | 1 | -0/+7
    This also prevented some regressions with other patches in my local tree.

    Broadwell / Skylake
    total instructions in shared programs: 8980835 -> 8980833 (-0.00%)
    instructions in affected programs: 45 -> 43 (-4.44%)
    helped: 1 HURT: 0
    total cycles in shared programs: 70077904 -> 70077900 (-0.00%)
    cycles in affected programs: 122 -> 118 (-3.28%)
    helped: 1 HURT: 0

    No changes on earlier platforms.

    v2: Modify the comments to look more like a proof.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
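The log does not spell out what the expression is simplified to. Because b2f only ever produces 0.0 or 1.0, the comparison 0 >= b2f(a) is true exactly when a is false, so one natural replacement is the logical negation of a. The Python below checks that reasoning over both inputs; the function names are illustrative and the replacement is an assumption.

```python
def b2f(a: bool) -> float:
    # Boolean-to-float conversion: true -> 1.0, false -> 0.0.
    return 1.0 if a else 0.0


def original(a: bool) -> bool:
    # The pattern named in the commit subject.
    return 0.0 >= b2f(a)


def simplified(a: bool) -> bool:
    # Assumed replacement: logical negation of the Boolean.
    return not a


# Exhaustive check: a Boolean has only two possible values.
agree = all(original(a) == simplified(a) for a in (False, True))
```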
* nir: Simplify i2b with negated or abs operand | Ian Romanick | 2016-03-22 | 1 | -0/+2
    This enables removing ssa_201 and ssa_202 in sequences like:

        vec1 ssa_200 = flt ssa_199, ssa_194
        vec1 ssa_201 = b2i ssa_200
        vec1 ssa_202 = i2b -ssa_201

    shader-db results:

    Sandy Bridge
    total instructions in shared programs: 8462257 -> 8462180 (-0.00%)
    instructions in affected programs: 3846 -> 3769 (-2.00%)
    helped: 35 HURT: 0
    total cycles in shared programs: 117542934 -> 117542462 (-0.00%)
    cycles in affected programs: 20072 -> 19600 (-2.35%)
    helped: 20 HURT: 1

    Ivy Bridge
    total instructions in shared programs: 7775252 -> 7775137 (-0.00%)
    instructions in affected programs: 3645 -> 3530 (-3.16%)
    helped: 35 HURT: 0
    total cycles in shared programs: 65760522 -> 65760068 (-0.00%)
    cycles in affected programs: 21082 -> 20628 (-2.15%)
    helped: 25 HURT: 2

    Haswell
    total instructions in shared programs: 7108666 -> 7108589 (-0.00%)
    instructions in affected programs: 3253 -> 3176 (-2.37%)
    helped: 35 HURT: 0
    total cycles in shared programs: 64675726 -> 64675272 (-0.00%)
    cycles in affected programs: 21034 -> 20580 (-2.16%)
    helped: 26 HURT: 1

    Broadwell / Skylake
    total instructions in shared programs: 8980912 -> 8980835 (-0.00%)
    instructions in affected programs: 3223 -> 3146 (-2.39%)
    helped: 35 HURT: 0
    total cycles in shared programs: 70077926 -> 70077904 (-0.00%)
    cycles in affected programs: 21886 -> 21864 (-0.10%)
    helped: 21 HURT: 6

    G45 and Ironlake showed no change.

    Signed-off-by: Ian Romanick <[email protected]>
    Suggested-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
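The reasoning behind this simplification is that negating or taking the absolute value of an integer never changes whether it is zero, so i2b sees the same answer either way. The Python sketch below models 32-bit two's-complement integers to check this, including the INT_MIN corner case. All helper names are illustrative and are not Mesa functions.

```python
def to_i32(x: int) -> int:
    # Wrap a Python integer to a signed 32-bit two's-complement value.
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x


def ineg(a: int) -> int:
    return to_i32(-a)


def iabs(a: int) -> int:
    return to_i32(abs(a))


def b2i(c: bool) -> int:
    return 1 if c else 0


def i2b(a: int) -> bool:
    return a != 0


samples = [0, 1, -1, 7, -7, 2**31 - 1, -(2**31)]
neg_ok = all(i2b(ineg(a)) == i2b(a) for a in samples)
abs_ok = all(i2b(iabs(a)) == i2b(a) for a in samples)

# The sequence from the commit message: i2b(-b2i(c)) gives back c,
# so both intermediate values become dead.
roundtrip_ok = all(i2b(ineg(b2i(c))) == c for c in (False, True))
```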
* nir: Lower flrp with Boolean interpolator to bcsel | Ian Romanick | 2016-03-22 | 1 | -2/+5
    On Intel platforms that don't set lower_flrp, using bcsel instead of flrp seems to be a small amount worse. On those platforms, the use of flrp, bcsel, and multiply of b2f is still an active area of research. In review, Matt suggested this is because bcsel turns into CMP+SEL, and because of the flag register we can't schedule instructions well.

    shader-db results:

    G4X / Ironlake
    total instructions in shared programs: 4016538 -> 4012279 (-0.11%)
    instructions in affected programs: 161556 -> 157297 (-2.64%)
    helped: 1077 HURT: 1
    total cycles in shared programs: 84328296 -> 84315862 (-0.01%)
    cycles in affected programs: 4174570 -> 4162136 (-0.30%)
    helped: 926 HURT: 53

    Unsurprisingly, no changes on later platforms.

    Signed-off-by: Ian Romanick <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
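To see why a linear interpolation with a Boolean-derived factor reduces to a select, the Python below assumes flrp(a, b, t) computes a * (1 - t) + b * t and that the factor comes from b2f. With t restricted to 0.0 or 1.0 the result is either a or b. The operand order of the select is derived from that assumed definition, and the check covers finite inputs only.

```python
def b2f(c: bool) -> float:
    return 1.0 if c else 0.0


def flrp(a: float, b: float, t: float) -> float:
    # Assumed definition of linear interpolation.
    return a * (1.0 - t) + b * t


def bcsel(c: bool, x: float, y: float) -> float:
    # Select x when the condition is true, otherwise y.
    return x if c else y


values = [0.0, 1.0, -2.5, 3.75, 1e10]
equivalent = all(
    flrp(a, b, b2f(c)) == bcsel(c, b, a)
    for a in values
    for b in values
    for c in (False, True)
)
```

With infinities or NaNs the multiply by zero would no longer cancel cleanly, which is one reason the equivalence is only claimed here for finite values.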
* glsl: disable varying packing when it's not safe | Timothy Arceri | 2016-03-18 | 4 | -53/+180
    In GL 4.4+ there is no guarantee that interpolation qualifiers will match between stages, so we cannot safely pack varyings using the current packing pass in Mesa.

    We also disable packing on outward-facing interfaces for SSO because in ES we need to retain the unpacked varying information for draw time validation. For desktop GL we could allow packing for SSO in versions < 4.4 but it's just safer not to do so.

    We do however enable packing on individual arrays, structs, and matrices as these are required by the transform feedback code and it is still safe to do so.

    Finally we also enable packing when a varying is only used for transform feedback and it's not an SSO.

    This fixes all remaining rendering issues with the dEQP SSO tests; the only issues remaining with those tests are to do with validation.

    Note: There is still one remaining SSO bug that this patch doesn't fix. There is a chance that VS -> TCS will have mismatching interfaces because we pack VS output in case it's used by transform feedback but don't pack TCS input for performance reasons. This patch will make the situation better but doesn't fix it.

    V4: fix out of order function params after rebase, make sure packing still disabled in tess stages. Update comments as to why we disable packing on SSO.

    V3: ES 3.1 *does* require interpolation to match so don't disable packing there. Rebased on master rather than on enhanced layouts component packing series.

    V2: Make is_varying_packing_safe() a function in the varying_matches class, fix spelling (Matt) and make sure to remove the outer array when dealing with Geom and Tess shaders where appropriate. Lastly fix piglit regression in new piglit test and document the undefined behaviour it depends on: arb_separate_shader_objects/execution/vs-gs-linking.shader_test

    Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* glsl: pass disable_varying_packing bool to the lowering pass | Timothy Arceri | 2016-03-18 | 3 | -15/+24
    This will allow us to choose to ignore the disable which will be useful for more fine grained control over when to enable or disable packing.

    Reviewed-by: Anuj Phogat <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* nir: propagate bitsize information in nir_search | Connor Abbott | 2016-03-17 | 3 | -27/+247
    When we replace an expression we have to compute bitsize information for the replacement. We do this in two passes to validate that bitsize information is consistent and correct: first we propagate bitsize from child nodes to parent, then we do it the other way around, starting from the original instruction's destination bitsize.

    v2 (Iago):
    - Always use nir_type_bool32 instead of nir_type_bool when generating algebraic optimizations. Before we used nir_type_bool32 with constants and nir_type_bool with variables.
    - Fix bool comparisons in nir_search.c to account for bitsized types.

    v3 (Sam):
    - Unpack the double constant value as unsigned long long (8 bytes) in nir_algebraic.py.

    v4 (Sam):
    - Use helpers to get type size and base type from nir_alu_type.

    Signed-off-by: Iago Toral Quiroga <[email protected]>
    Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
    Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: add a bit_size parameter to nir_ssa_dest_init | Connor Abbott | 2016-03-17 | 20 | -54/+112
    v2: Squash multiple commits addressing the new parameter in different files so we don't break the build (Iago)
    v3: Fix tgsi (Samuel)
    v4: Fix nir_clone.c (Samuel)
    v5: Fix vc4 and freedreno (Iago)
    v6 (Sam):
    - Fix build errors in nir_lower_indirect_derefs
    - Use helper to get type size from nir_alu_type.

    Signed-off-by: Iago Toral Quiroga <[email protected]>
    Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
    Tested-by: Rob Clark <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
    Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: rename nir_const_value fields to include bitsize information | Iago Toral Quiroga | 2016-03-17 | 14 | -53/+53
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>