summaryrefslogtreecommitdiffstats
path: root/src/glsl/nir
Commit message (Collapse)AuthorAgeFilesLines
* glsl: reduce memory footprint of uniform_storage structTimothy Arceri2015-10-051-2/+2
| | | | | | | | The uniform will only be of a single type so store the data for opaque types in a single array. Cc: Francisco Jerez <[email protected]> Cc: Ilia Mirkin <[email protected]>
* nir: Add a nir_shader_info::has_transform_feedback_varyings flag.Kenneth Graunke2015-10-042-0/+5
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nir: Introduce new nir_intrinsic_load_per_vertex_input intrinsics.Kenneth Graunke2015-10-043-6/+52
| | | | | | | | | | | | | | | | | | | | | | | | | | | Geometry and tessellation shaders process multiple vertices; their inputs are arrays indexed by the vertex number. While GLSL makes this look like a normal array, it can be very different behind the scenes. On Intel hardware, all inputs for a particular vertex are stored together - as if they were grouped into a single struct. This means that consecutive elements of these top-level arrays are not contiguous. In fact, they may sometimes be in completely disjoint memory segments. NIR's existing load_input intrinsics are awkward for this case, as they distill everything down to a single offset. We'd much rather keep the vertex ID separate, but build up an offset as normal beyond that. This patch introduces new nir_intrinsic_load_per_vertex_input intrinsics to handle this case. They work like ordinary load_input intrinsics, but have an extra source (src[0]) which represents the outermost array index. v2: Rebase on earlier refactors. v3: Use ssa defs instead of nir_srcs, rebase on earlier refactors. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir/lower_io: Make get_io_offset() return a nir_ssa_def * for indirects.Kenneth Graunke2015-10-041-42/+20
| | | | | | | | | | | | | | | get_io_offset() already walks the dereference chain and discovers whether or not we have an indirect; we can just return that rather than computing it a second time via deref_has_indirect(). This means moving the call a bit earlier. By returning a nir_ssa_def *, we can pass back both an existence flag (via NULL checking the pointer) and the value in one parameter. It also simplifies the code somewhat. nir_lower_samplers works in a similar fashion. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add a nir_foreach_variable macroJason Ekstrand2015-10-027-18/+21
| | | | | | | This is a common enough operation that it's nice to not have to think about the arguments to foreach_list_typed every time. Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Move GS data to nir_shader_infoJason Ekstrand2015-10-024-14/+11
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add a a nir_shader_info structJason Ekstrand2015-10-023-0/+53
| | | | | | This commit also adds code to glsl_to_nir and prog_to_nir to fill it out. Reviewed-by: Kenneth Graunke <[email protected]>
* nir/glsl: Take a gl_shader_program and a stage rather than a gl_shaderJason Ekstrand2015-10-022-3/+8
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Fix uninitialized 'progress' variable in nir_lower_system_values.Chris Wilson2015-10-021-1/+1
| | | | | | | | | | | | | | | | | | Commit 0a1adaf11d051b71b4c46aabee2e5342f2d6aef3 (nir: Report progress from nir_lower_system_values().) introduced a bug caught by Valgrind: ==823== Conditional jump or move depends on uninitialised value(s) ==823== at 0xB09020C: convert_block (nir_lower_system_values.c:68) ==823== by 0xB079FB8: foreach_cf_node (nir.c:1310) ==823== by 0xB07A0AF: nir_foreach_block (nir.c:1336) ==823== by 0xB09026B: convert_impl (nir_lower_system_values.c:79) ... ==823== Uninitialised value was created by a stack allocation ==823== at 0xB090249: convert_impl (nir_lower_system_values.c:76) which is trivially fixed by initializing progress. Reviewed-by: Kenneth Graunke <[email protected]>
* nir/remove_phis: handle trivial back-edgesConnor Abbott2015-10-021-0/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some loops may have phi nodes that look like: foo = ... loop { bar = phi(foo, bar) ... } in which case we can remove the phi node and replace all uses of 'bar' with 'foo'. In particular, there are some L4D2 vertex shaders with loops that, after optimization, look like: /* succs: block_1 */ loop { block block_1: /* preds: block_0 block_4 */ vec1 ssa_2195 = phi block_0: ssa_2136, block_4: ssa_994 vec1 ssa_7321 = phi block_0: ssa_8195, block_4: ssa_7321 vec1 ssa_7324 = phi block_0: ssa_8198, block_4: ssa_7324 vec1 ssa_7327 = phi block_0: ssa_8174, block_4: ssa_7327 vec1 ssa_8139 = intrinsic load_uniform () () (232) vec1 ssa_588 = ige ssa_2195, ssa_8139 /* succs: block_2 block_3 */ if ssa_588 { block block_2: /* preds: block_1 */ break /* succs: block_5 */ } else { block block_3: /* preds: block_1 */ /* succs: block_4 */ } block block_4: /* preds: block_3 */ vec1 ssa_994 = iadd ssa_2195, ssa_2150 /* succs: block_1 */ } where after removing the second, third, and fourth phi nodes, the loop becomes entirely dead, and this patch will cause the loop to be deleted entirely. No piglit regressions. Shader-db results on bdw: instructions in affected programs: 5824 -> 5664 (-2.75%) total loops in shared programs: 2234 -> 2202 (-1.43%) helped: 32 Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Connor Abbott <[email protected]>
* nir: Allow nir_lower_io() to only lower one type of variable.Kenneth Graunke2015-10-012-4/+18
| | | | | | | | | We may want to use different type_size functions for (e.g.) inputs vs. uniforms. Passing in -1 for mode ignores this, handling all modes as before. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nir: Don't set dest in SSBO store glsl_to_nir conversionJordan Justen2015-09-291-1/+0
| | | | | | | | | This matches the function signature created in lower_ubo_reference_visitor::ssbo_store which has a void return. Suggested-by: Jason Ekstrand <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* nir: Use a system value for gl_PrimitiveIDIn.Kenneth Graunke2015-09-293-1/+10
| | | | | | | | | | | | | | | | At least on Intel hardware, gl_PrimitiveIDIn comes in as a special part of the payload rather than a normal input. This is typically what we use system values for. Dave and Ilia also agree that a system value would be nicer. At some point, we should change it at the GLSL IR level as well. But that requires changing most of the drivers. For now, let's at least make NIR do the right thing, which is easy. v2: Add a comment about not creating a temporary (suggested by Iago). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* nir: Convert SYSTEM_VALUE_NUM_WORK_GROUPS to a nir intrinsicJordan Justen2015-09-292-0/+5
| | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* nir: Add a function to count the number of vertices a GS emits.Kenneth Graunke2015-09-262-0/+95
| | | | | | | | | | | Some hardware (such as Broadwell) can run geometry shaders more efficiently when the number of vertices emitted is statically known. This pass provides a way to obtain the constant vertex count, or -1 indicating that the vertex count is unknown/non-constant. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* nir: Implement lowered SSBO atomic intrinsicsIago Toral Quiroga2015-09-252-0/+82
| | | | | | | | | | | | The original GLSL IR intrinsics have been lowered to an internal version that accepts a block index and an offset instead of a SSBO reference. v2 (Connor): - Document the sources used by the atomic intrinsics. Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* nir/glsl_to_nir: ignore an instruction's dest if it hasn't anyIago Toral Quiroga2015-09-251-1/+2
| | | | | Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* nir: Implement __intrinsic_load_ssboIago Toral Quiroga2015-09-253-1/+70
| | | | | | | | | | | v2: - Fix ssbo loads with boolean variables. v3: - Simplify the changes (Kristian) Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* nir: modify the instruction insertion in nir_visitor::visit(ir_call *ir)Samuel Iglesias Gonsalvez2015-09-251-4/+10
| | | | | | | | | | | | This patch moves nir_instr_insert_after_cf_list call into each case in the intrinsics switch at nir_visitor::visit(ir_call *ir) and define a nir_dest variable which will be used when handling ir->return_deref after the switch. This patch simplifies the code for nir_intrinsic_load_ssbo implementation changes we are going to do next. Reviewed-by: Kristian Høgsberg <[email protected]>
* nir: Implement __intrinsic_store_ssboIago Toral Quiroga2015-09-252-8/+48
| | | | | | | | v2 (Connor): - Make the STORE() macro take arguments for the extra sources (and their size) and any extra indices required. Reviewed-by: Kristian Høgsberg <[email protected]>
* nir: Implement ir_unop_get_buffer_sizeSamuel Iglesias Gonsalvez2015-09-252-0/+17
| | | | | | | | This is how backends provide the buffer size required to compute the size of unsized arrays in the previous patch Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
* nir: Add new GS intrinsics that maintain a count of emitted vertices.Kenneth Graunke2015-09-233-0/+241
| | | | | | | | | | | | | | | | | | | | | | | | This patch also introduces a lowering pass to convert the simple GS intrinsics to the new ones. See the comments above that for the rationale behind the new intrinsics. This should be useful for i965; it's a generic enough mechanism that I could see other drivers potentially using it as well, so I don't feel too bad about putting it in the generic code. v2: - Use nir_after_block_before_jump for the cursor (caught by Jason Ekstrand - I'd mistakenly used nir_after_block when rebasing this code onto the new NIR control flow API). - Remove the old emit_vertex intrinsic at the end, rather than in the middle (requested by Jason). - Use state->... directly rather than locals (requested by Jason). - Report progress from nir_lower_gs_intrinsics() (requested by me). - Remove "Authors:" section from file comment (requested by Michael Schellenberger Costa). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add unit tests for control flow graphs.Kenneth Graunke2015-09-231-0/+155
| | | | | | Signed-off-by: Kenneth Graunke <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Acked-by: Connor Abbott <[email protected]>
* nir/cf: Fix dominance metadata in the dead control flow pass.Kenneth Graunke2015-09-232-3/+7
| | | | | | | | | | | | | | | | | The NIR control flow modification API churns the block structure, splitting blocks, stitching them back together, and so on. Preserving information about block dominance is hard (and probably not worthwhile). This patch makes nir_cf_extract() throw away all metadata, like we do when adding/removing jumps. We then make the dead control flow pass compute dominance information right before it uses it. This is necessary because earlier work by the pass may have invalidated it. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir/cf: Fix unlink_block_successors to actually unlink the second one.Kenneth Graunke2015-09-231-2/+2
| | | | | | | | | | | Calling unlink_blocks(block, block->successors[0]) will successfully unlink the first successor, but then will shift block->successors[1] down to block->successor[0]. So the successors[1] != NULL check will always fail. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir/cf: Alter block successors before adding a fake link.Kenneth Graunke2015-09-231-16/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Consider the case of "while (...) { break }". Or in NIR: block block_0 (0x7ab640): ... /* succs: block_1 */ loop { block block_1: /* preds: block_0 */ break /* succs: block_2 */ } block block_2: Calling nir_handle_remove_jump(block_1, nir_jump_break) will remove the break. Unfortunately, it would mangle the predecessors and successors. Here, block_2->predecessors->entries == 1, so we would create a fake link, setting block_1->successors[1] = block_2, and adding block_1 to block_2's predecessor set. This is illegal: a block cannot specify the same successor twice. In particular, adding the predecessor would have no effect, as it was already present in the set. We'd then call unlink_block_successors(), which would delete the fake link and remove block_1 from block_2's predecessor set. It would then delete successors[0], and attempt to remove block_1 from block_2's predecessor set a second time...except that it wouldn't be present, triggering an assertion failure. The fix appears to be simple: simply unlink the block's successors and recreate them to point at the correct blocks first. Then, add the fake link. In the above example, removing the break would cause block_1 to have itself as a successor (as it becomes an infinite loop), so adding the fake link won't cause a duplicate successor. v2: Add comments (requested by Connor Abbott) and fix commit message. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir/cf: Conditionally do block_add_normal_succs() in unlink_jump();Kenneth Graunke2015-09-231-6/+6
| | | | | | | | | | | | | | There is a bug where we mess up predecessors/successors due to the ordering of unlinking/recreating edges/adding fake edges. In order to fix that, I need everything in one routine. However, calling block_add_normal_succs() isn't safe from cleanup_cf_node() - it would crash trying to insert phi undefs. So unfortunately I need to add a parameter. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir/cf: Don't break outer-block successors in split_block_beginning().Kenneth Graunke2015-09-231-3/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Consider the following NIR: block block_0; /* succs: block_1 block_2 */ if (...) { block block_1; ... } else { block block_2; } Calling split_block_beginning() on block_1 would break block_0's successors: link_block() sets both successors of a block, so calling link_block(block_0, new_block, NULL) would throw away the second successor, leaving only /* succ: new_block */. This is invalid: the block before an if statement must have two successors. Changing the call to link_block(pred, new_block, pred->successors[0]) would correctly leave both successors in place, but because unlink_block may shift successor[1] to successor[0], it may not preserve the original order. NIR maintains a convention that successor[0] must point to the "then" block, while successor[1] points to the "else" block, so we need to take care to preserve this ordering. This patch creates a new function that swaps out one successor for another, preserving the ordering. It then uses this to fix the issue. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir/cf: Make a helper function for removing a predecessor.Kenneth Graunke2015-09-231-5/+11
| | | | | | | | | I need to do this in a second place, and I'd rather make a helper function than cut and paste the code. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Validate that a block doesn't have two identical successors.Kenneth Graunke2015-09-231-0/+1
| | | | | | | | | | This is invalid, and causes disasters if we try to unlink successors: removing the first will work, but removing the second copy will fail because the block isn't in the successor's predecessor set any longer. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir/lower_vec_to_movs: Don't emit unneeded movsJason Ekstrand2015-09-231-1/+19
| | | | | | | | | | | | | | | | | It's possible that, if a vecN operation is involved in a phi node, that we could end up moving from a register to itself. If swizzling is involved, we need to emit the move but. However, if there is no swizzling, then the mov is a no-op and we might as well not bother emitting it. Shader-db results on Haswell: total instructions in shared programs: 6262536 -> 6259558 (-0.05%) instructions in affected programs: 184780 -> 181802 (-1.61%) helped: 838 HURT: 0 Reviewed-by: Eduardo Lima Mitev <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nir/lower_vec_to_movs: Properly handle source modifiers on vecN opsJason Ekstrand2015-09-231-1/+5
| | | | | | | | | I don't know of any piglit tests that are currently broken. However, there is nothing stopping a vecN instruction from getting source modifiers and lower_vec_to_movs is run after we lower to source modifiers. Reviewed-by: Eduardo Lima Mitev <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nir/lower_alu_to_scalar: Add support for nir_op_fdphJason Ekstrand2015-09-221-0/+18
| | | | | Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add fdph and fdph_replicated opcodesJason Ekstrand2015-09-223-1/+8
| | | | | Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir/lower_alu_to_scalar: Return after lower_reductionJason Ekstrand2015-09-221-1/+1
| | | | | | | | | | We don't use any of the code after the switch anyway. Since we check for num_components == 1 and early-return, it doesn't get executed so everything's ok. However, it makes it much clearer what's going on if we simply do an early return. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir/lower_alu_to_scalar: Use the builderJason Ekstrand2015-09-221-25/+22
| | | | | Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Report progress from nir_normalize_cubemap_coords().Kenneth Graunke2015-09-212-8/+23
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add braces around multi-line loop.Kenneth Graunke2015-09-211-1/+2
| | | | | | | This was correct but not our usual style. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Report progress from nir_lower_system_values().Kenneth Graunke2015-09-212-10/+19
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Report progress from nir_split_var_copies().Kenneth Graunke2015-09-212-4/+13
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Report progress from nir_lower_locals_to_regs().Kenneth Graunke2015-09-212-4/+16
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Report progress from nir_remove_dead_variables().Kenneth Graunke2015-09-212-5/+12
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Report progress from lower_vec_to_movs().Jason Ekstrand2015-09-212-7/+22
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Report progress from nir_lower_globals_vars_to_local().Kenneth Graunke2015-09-212-2/+6
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir/builder: Don't use designated initializersJason Ekstrand2015-09-211-3/+18
| | | | | | | | | | | Designated initializers are not allowed in C++ (not even C++11). Since nir_lower_samplers is now using nir_builder, and nir_lower_samplers is in C++, this breaks the build on some compilers. Aparently, GCC 5 allows it in some limited extent because mesa still builds on my system without this patch. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92052 Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Move system value -> intrinsic mapping into nir.cJason Ekstrand2015-09-213-40/+40
| | | | | | This way they're right next to the map going the other direction. Reviewed-by: Kenneth Graunke <[email protected]>
* nir: rename nir_lower_samplers.c{pp,}Emil Velikov2015-09-211-4/+2
| | | | | | | | | | | | | | | | | | With the only C++ function having its own wrapper we can 'demote' this file to a normal C one. This allows us to get rid of extern C { #include <foo.h> } 'hacks'. Plus some of the headers may use C99 initializers, which are not supported by the ISO standard. This may cause build issue on incremental builds. If so run the following: sed -i -e 's|samplers\.cpp|samplers.c|' src/glsl/nir/.deps/nir_lower_samplers.Plo Fixes: ef8eebc6ad5(nir: support indirect indexing samplers in struct arrays) Signed-off-by: Emil Velikov <[email protected]> Reported-by: Gottfried Haider <[email protected]> Tested-by: Gottfried Haider <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* nir: add C wrapper around glsl_type::record_location_offsetEmil Velikov2015-09-212-0/+9
| | | | | | | | This will allow us to convert nir_lower_sampler.cpp to C. Signed-off-by: Emil Velikov <[email protected]> Tested-by: Gottfried Haider <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* nir: move stdio.h inclusion before extern CEmil Velikov2015-09-211-2/+2
| | | | | | Signed-off-by: Emil Velikov <[email protected]> Tested-by: Gottfried Haider <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* nir/print: fix coverity errorRob Clark2015-09-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Not something actually hit in real life (now state is never non-null, but only case state->syms is null is if nir_print_instr() path). But it was something I overlooked the first time, so might as well fix it. *** CID 1324642: Null pointer dereferences (REVERSE_INULL) /src/glsl/nir/nir_print.c: 299 in print_var_decl() 293 294 fprintf(fp, " (%s, %u)", loc, var->data.driver_location); 295 } 296 297 fprintf(fp, "\n"); 298 >>> CID 1324642: Null pointer dereferences (REVERSE_INULL) >>> Null-checking "state" suggests that it may be null, but it has already been dereferenced on all paths leading to the check. 299 if (state) { 300 _mesa_set_add(state->syms, name); 301 _mesa_hash_table_insert(state->ht, var, name); 302 } 303 } 304 Signed-off-by: Rob Clark <[email protected]>