summaryrefslogtreecommitdiffstats
path: root/src/glsl/nir/nir.h
Commit message (Collapse)AuthorAgeFilesLines
* nir: Add a nir_lower_load_const_to_scalar() pass.Eric Anholt2015-08-041-0/+1
| | | | | | | | This is useful to increase the CSE opportunities for a scalar backend. It avoids regressions when dropping vc4's custom CSE implementation. v2: Cleanups by Matt (decl in the for loop, and unreachable()). Reviewed-by: Matt Turner <[email protected]>
* Revert "nir: Use a single bit for the dual-source blend index"Eric Anholt2015-08-041-6/+2
| | | | | This reverts commit ab5b7a0fe659ff6f9c1885d5cb047b6531959506. We use more than one bit of value in tgsi_to_nir.
* nir/nir_lower_io: Add vec4 supportIago Toral Quiroga2015-08-031-8/+10
| | | | | | | | | | The current implementation operates in scalar mode only, so add a vec4 mode where types are padded to vec4 sizes. This will be useful in the i965 driver for its vec4 nir backend (and possbly other drivers that have vec4-based shaders). Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Use a single bit for the dual-source blend indexTimothy Arceri2015-08-031-2/+6
| | | | | | | | | | The only values allowed are 0 and 1, and the value is checked before assigning. This is a copy of 8eeca7a56c that seems to have been made to the glsl ir type after it was copied for use in nir but before nir landed. Reviewed-by: Tapani Pälli <[email protected]>
* nir: add nir_foreach_instr_safe_reverse()Connor Abbott2015-07-171-0/+2
| | | | | | Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Connor Abbott <[email protected]>
* nir: add nir_instr_is_first() and nir_instr_is_last() helpersConnor Abbott2015-07-171-0/+12
| | | | | | Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Connor Abbott <[email protected]>
* nir: add nir_var_shader_storageIago Toral Quiroga2015-07-141-0/+1
| | | | Reviewed-by: Jordan Justen <[email protected]>
* nir: Fix comment above nir_convert_from_ssa() prototype.Kenneth Graunke2015-07-081-3/+3
| | | | | | | | Connor renamed the parameter, inverting the sense. Update the comment accordingly. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nir: remove parent_instr from nir_registerConnor Abbott2015-06-301-8/+0
| | | | | | It's no longer used. Reviewed-by: Jason Ekstrand <[email protected]>
* nir: remove nir_src_get_parent_instr()Connor Abbott2015-06-301-10/+0
| | | | | | It's now unused. Reviewed-by: Jason Ekstrand <[email protected]>
* nir/from_ssa: add a flag to not convert everything from SSAConnor Abbott2015-06-301-1/+6
| | | | | | | | | | | | | We already don't convert constants out of SSA, and in our backend we'd like to have only one way of saying something is still in SSA. The one tricky part about this is that we may now leave some undef instructions around if they aren't part of a phi-web, so we have to be more careful about deleting them. v2: rename and flip meaning of flag (Jason) Reviewed-by: Jason Ekstrand <[email protected]>
* nir/nir: Use a linked list instead of a hash set for use/def setsJason Ekstrand2015-05-081-7/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit switches us from the current setup of using hash sets for use/def sets to using linked lists. Doing so should save us quite a bit of memory because we aren't carrying around 3 hash sets per register and 2 per SSA value. It should also save us CPU time because adding/removing things from use/def sets is 4 pointer manipulations instead of a hash lookup. Running shader-db 50 times with USE_NIR=0, NIR, and NIR + use/def lists: GLSL IR Only: 586.4 +/- 1.653833 NIR with hash sets: 675.4 +/- 2.502108 NIR + use/def lists: 641.2 +/- 1.557043 I also ran a memory usage experiment with Ken's patch to delete GLSL IR and keep NIR. This patch cuts an aditional 42.9 MiB of ralloc'd memory over and above what we gained by deleting the GLSL IR on the same dota trace. On the code complexity side of things, some things are now much easier and others are a bit harder. One of the operations we perform constantly in optimization passes is to replace one source with another. Due to the fact that an instruction can use the same SSA value multiple times, we had to iterate through the sources of the instruction and determine if the use we were replacing was the only one before removing it from the set of uses. With this patch, uses are per-source not per-instruction so we can just remove it safely. On the other hand, trying to iterate over all of the instructions that use a given value is more difficult. Fortunately, the two places we do that are the ffma peephole where it doesn't matter and GCM where we already gracefully handle duplicates visits to an instruction. Another aspect here is that using linked lists in this way can be tricky to get right. With sets, things were quite forgiving and the worst that happened if you didn't properly remove a use was that it would get caught in the validator. With linked lists, it can lead to linked list corruption which can be harder to track. However, we do just as much validation of the linked lists as we did of the sets so the validator should still catch these problems. While working on this series, the vast majority of the bugs I had to fix were caught by assertions. I don't think the lists are going to be that much worse than the sets. Reviewed-by: Connor Abbott <[email protected]>
* nir: Add a function for rewriting the condition of an if statementJason Ekstrand2015-05-081-0/+1
| | | | Reviewed-by: Connor Abbott <[email protected]>
* nir: Add and use initializer #defines for nir_src and nir_destJason Ekstrand2015-05-081-6/+7
| | | | Reviewed-by: Connor Abbott <[email protected]>
* nir: Move get_const_initializer_load from vars_to_ssa to NIR coreJason Ekstrand2015-04-221-0/+3
| | | | Reviewed-by: Connor Abbott <[email protected]>
* nir/tex: Use the correct return size for query_levels and lodJason Ekstrand2015-04-221-1/+4
| | | | Reviewed-by: Connor Abbott <[email protected]>
* nir: Refactor tex_instr_dest_size to use a switch statementJason Ekstrand2015-04-221-5/+8
| | | | Reviewed-by: Connor Abbott <[email protected]>
* nir: Silence unused parameter warningsIan Romanick2015-04-141-1/+1
| | | | | | | | | | | | | | | | | | | nir/nir.h: In function 'nir_validate_shader': nir/nir.h:1567:56: warning: unused parameter 'shader' [-Wunused-parameter] static inline void nir_validate_shader(nir_shader *shader) { } ^ nir/nir_opt_cse.c: In function 'src_is_ssa': nir/nir_opt_cse.c:165:32: warning: unused parameter 'data' [-Wunused-parameter] src_is_ssa(nir_src *src, void *data) ^ nir/nir_opt_cse.c: In function 'dest_is_ssa': nir/nir_opt_cse.c:171:35: warning: unused parameter 'data' [-Wunused-parameter] dest_is_ssa(nir_dest *dest, void *data) ^ Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nir: Store num_direct_uniforms in the nir_shader.Kenneth Graunke2015-04-111-0/+3
| | | | | | | | | | | | | | Storing this here is pretty sketchy - I don't know if any driver other than i965 will want to use it. But this will make it a lot easier to generate NIR code at link time. We'll probably rework it anyway. (Ian suggested making nir_assign_var_locations_scalar_direct_first simply modify the nir_shader's fields, rather than passing pointers to them. If this stays long term, we should do that. But Jason and I suspect we'll be reworking this area again in the near future.) Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: split out lower_sub from lower_negateRob Clark2015-04-111-0/+2
| | | | | | | | | | Originally you had to have one or the other. But actually I don't want either. (Or rather I want whatever is the minimum # of instructions.) TODO: not sure where the best place to insert a check that driver hasn't set *both* lower_negate and lower_sub? Signed-off-by: Rob Clark <[email protected]>
* nir: Constify nir_lower_sampler's gl_shader_program pointer.Kenneth Graunke2015-04-101-1/+1
| | | | | | | | | Now that we're not generating linker errors, we don't actually modify this. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* nir: Make nir_lower_samplers take a gl_shader_stage, not a gl_program *.Kenneth Graunke2015-04-101-1/+1
| | | | | | | | | | We don't actually need a gl_program struct. We only used it to translate prog->Target (i.e. GL_VERTEX_PROGRAM) to the gl_shader_stage (i.e. MESA_SHADER_VERTEX). We may as well just pass that. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* nir: Move gl_shader_stage enum from mtypes.h to shader_enums.h.Kenneth Graunke2015-04-101-0/+1
| | | | | | | | | I want to use this in some code that doesn't currently include mtypes.h. It seems like a better place for it anyway. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* nir: Make nir_*_instr_create take a nir_shader instead of a void * contextJason Ekstrand2015-04-071-9/+9
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Implement a nir_sweep() pass.Kenneth Graunke2015-04-071-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This pass performs a mark and sweep pass over a nir_shader's associated memory - anything still connected to the program will be kept, and any dead memory we dropped on the floor will be freed. The expectation is that this will be called when finished building and optimizing the shader. However, it's also fine to call it earlier, and many times, to free up memory earlier. v2: (feedback from Jason Ekstrand) - Skip sweeping impl->start_block, as it's already in the CF list. - Don't sweep SSA defs (they're owned by their defining instruction) - Don't steal phi sources (they're owned by nir_phi_instr). - Don't steal tex->src (it's owned by the tex_inst itself) - Don't sweep dereference chains (top-level dereferences are owned by the instruction; sub-dereferences are owned by the parent deref). - Don't sweep sources and destinations (SSA defs are handled as part of the defining instruction, and registers are handled as part of function implementations). - Just steal instructions; don't walk them (no longer required). v3: (feedback from Jason Ekstrand) - Steal indirect sources from nir_src/nir_dest. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: add lowering for idiv/udiv/umodRob Clark2015-04-051-0/+1
| | | | | | | | | | | | | | | | | Based on the algo from NV50LegalizeSSA::handleDIV() and handleMOD(). See also trans_idiv() in freedreno/ir3/ir3_compiler.c (which was an adaptation of the nv50 code from Ilia Mirkin). A python/numpy script which implements the same algorithm (and is possibly useful for debugging or analysis) can be found here: http://people.freedesktop.org/~robclark/div-lowering.py I've tested this on i965 hacked up to insert the idiv lowering pass, and on freedreno with NIR frontend. Signed-off-by: Rob Clark <[email protected]> Tested-by: Eric Anholt <[email protected]> (vc4)
* nir: add option to lower slt/sge/seq/sneRob Clark2015-04-051-0/+3
| | | | | | | | | | | | | | In freedreno these get implemented as the matching f* instruction plus a u2f to convert the result to float 1.0/0.0. But less lines of code to just let nir_opt_algebraic handle this for us, plus opens up some small window for other opt passes to improve (ie. if some shader ended up with both a flt and slt with same src args, for example). v2: use b2f rather than u2f Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nir: Add a cubemap normalizing passJason Ekstrand2015-04-031-0/+2
| | | | | | | | | | | | | | | This commit adds a pass to L1-normalize cube-map coordinates. Some hardware such as i965 requires that largest cube-map coordinate is +-1. We had a pass to perform this normalization in GLSL IR but we need it in NIR for cube maps on ARB programs to work correctly. Reviewed-by: Jordan Justen <[email protected]> v2 (Suggested by Eric): - Do a vector fabs and split into components later - Move to core NIR Reviewed-by: Eric Anholt <[email protected]>
* nir: Add a src_get_parent_instr functionJason Ekstrand2015-04-031-0/+10
| | | | Reviewed-by: Jordan Justen <[email protected]>
* nir: Add a lowering pass for texture projectors.Eric Anholt2015-04-031-0/+1
| | | | | | | Not much hardware wants them these days, and it might give us a chance to do CSE or algebraic at the NIR level. Reviewed-by: Jason Ekstrand <[email protected]>
* nir/algebraic: Add a seperate section for "late" optimizationsJason Ekstrand2015-04-011-0/+1
| | | | | | i965/nir: Use the late optimizations Reviewed-by: Matt Turner <[email protected]>
* nir: Add optional lowering of flrp.Eric Anholt2015-03-271-0/+1
| | | | | Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nir/lower_io: Add a assign_locations function that sorts by [in]direct useJason Ekstrand2015-03-191-0/+4
| | | | | | | v2: Delete the set of indirectly accessed variables when we're done with it v3: Rename from _packed to _scalar Reviewed-by: Connor Abbott <[email protected]>
* nir/lower_io: Make variable location assignment a manual operationJason Ekstrand2015-03-191-0/+3
| | | | | | | | | | Previously, we just assigned variable locations in nir_lower_io. Now, we force the user to assign variable locations for us. This gives the backend a bit more control over where variables are placed. v2: Rename from _packed to _scalar Reviewed-by: Connor Abbott <[email protected]>
* nir: Use a list instead of a hash_table for inputs, outputs, and uniformsJason Ekstrand2015-03-191-3/+3
| | | | | | | | | | | | We never did a single hash table lookup in the entire NIR code base that I found so there was no real benifit to doing it that way. I suppose that for linking, we'll probably want to be able to lookup by name but we can leave building that hash table to the linker. In the mean time this was causing problems with GLSL IR -> NIR because GLSL IR doesn't guarantee us unique names of uniforms, etc. This was causing massive rendering isues in the unreal4 Sun Temple demo. Reviewed-by: Connor Abbott <[email protected]>
* nir: Add native_integers to nir_shader_compiler_options.Kenneth Graunke2015-03-081-0/+6
| | | | | | | | | | | glsl_to_nir, tgsi_to_nir, and prog_to_nir all want to know whether the driver supports native integers. Presumably other passes may as well. Adding this to nir_shader_compiler_options is an easy way to provide that information, as it's accessible via nir_shader::options. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Delete nir_shader::user_structures and num_user_structures.Kenneth Graunke2015-03-081-4/+0
| | | | | | | | | | | Nothing actually uses these, and the only caller of glsl_to_nir() (brw_fs_nir.cpp) always passes NULL for the _mesa_glsl_parse_state pointer, meaning they'll always be NULL and 0, respectively. Just delete them. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir/register: Add a parent_instr fieldJason Ekstrand2015-02-241-1/+10
| | | | | | | | | | | | | This adds a parent_instr field similar to the one for ssa_def. The difference here is that the parent_instr field on a nir_register can be NULL if the register does not have a unique definition or if that definition does not dominate all its uses. We set this field in the out-of-SSA pass so that backends can get SSA-like information even after they have gone out of SSA. Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* nir: Drop dependency on mtypes.h for core NIR.Eric Anholt2015-02-201-1/+3
| | | | | | | | One less new directory necessary for gallium code that wants to interact with NIR. Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* util: Move Mesa's bitset.h to util/.Eric Anholt2015-02-201-1/+1
| | | | Reviewed-by: Jose Fonseca <[email protected]>
* nir: Add a global code motion (GCM) passJason Ekstrand2015-02-191-0/+2
| | | | | | | | | v2 Jason Ekstrand <[email protected]>: - Use nir_dominance_lca for computing least common anscestors - Use the block index for comparing dominance tree depths - Pin things that do partial derivatives Reviewed-by: Reviewed-by: Connor Abbott <[email protected]>
* nir/instr: Change "live" to a more generic "pass_flags" fieldJason Ekstrand2015-02-191-2/+4
| | | | Reviewed-by: Connor Abbott <[email protected]>
* nir: Make nir_[cf_node/instr]_[prev/next] return null if at the endJason Ekstrand2015-02-191-6/+22
| | | | Reviewed-by: Connor Abbott <[email protected]>
* nir/dominance: Add a constant-time mechanism for comparing blocksJason Ekstrand2015-02-191-0/+9
| | | | | | | | | This is mostly thanks to Connor. The idea is to do a depth-first search that computes pre and post indices for all the blocks. We can then figure out if one block dominates another in constant time by two simple comparison operations. Reviewed-by: Connor Abbott <[email protected]>
* nir/dominance: Expose the dominance intersection functionJason Ekstrand2015-02-191-0/+2
| | | | | | | | | | Being able to find the least common anscestor in the dominance tree is a useful thing that we may want to do in other passes. In particular, we need it for GCM. v2: Handle NULL inputs by returning the other block Reviewed-by: Connor Abbott <[email protected]>
* nir: Add a flag for lowering fsat.Eric Anholt2015-02-181-0/+1
| | | | | | | | | | vc4 cse/algebraic-disabled stats: total instructions in shared programs: 44356 -> 44354 (-0.00%) instructions in affected programs: 55 -> 53 (-3.64%) v2: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* nir: Add a flag for lowering ffma.Eric Anholt2015-02-181-0/+1
| | | | | | | | | | | | vc4 cse/algebraic-disabled stats: total uniforms in shared programs: 13966 -> 13791 (-1.25%) uniforms in affected programs: 435 -> 260 (-40.23%) total instructions in shared programs: 44732 -> 44356 (-0.84%) instructions in affected programs: 9599 -> 9223 (-3.92%) v2: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* nir: Add a flag for lowering fneg/ineg.Eric Anholt2015-02-181-0/+2
| | | | | | | | | | | vc4 cse/algebraic-disabled stats: total instructions in shared programs: 44911 -> 44732 (-0.40%) instructions in affected programs: 11371 -> 11192 (-1.57%) v2: Fix broken iabs(isub(0, a)) transformation. v3: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* nir: Add a flag for lowering fsqrt(x) to frcp(frsqrt(x)).Eric Anholt2015-02-181-0/+1
| | | | | | | | | | | | vc4 cse/algebraic-disabled stats: total uniforms in shared programs: 13972 -> 13966 (-0.04%) uniforms in affected programs: 408 -> 402 (-1.47%) total instructions in shared programs: 44973 -> 44911 (-0.14%) instructions in affected programs: 1551 -> 1489 (-4.00%) v2: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* nir: Conditionalize the POW reconstruction on shader compiler options.Eric Anholt2015-02-181-0/+1
| | | | | | | | | | | | | Mesa has a shader compiler struct flagging whether GLSL IR's opt_algebraic and other passes should try and generate certain types of opcodes or patterns. Extend that to NIR by defining our own struct, which is automatically generated from the Mesa struct in glsl_to_nir and provided directly by the driver in TGSI-to-NIR. v2: Split out the previous two prep patches. v3: Rebase to master (no TGSI->NIR present) Reviewed-by: Kenneth Graunke <[email protected]> (v2)