path: root/src
Commit message (Author, Age, Files, Lines)
* i965: Don't consider null dst instructions as matching non-null dst. (Matt Turner, 2015-01-15, 2 files, -2/+4)

  When performing common subexpression elimination on instructions with non-null destinations we emit a MOV to copy the result to a new register that must have no other uses. In the case of:

     cmp.g.f0.0(8) null:D, vgrf43:F, 0.500000f
     ...
     cmp.g.f0.0(8) vgrf113:D, vgrf43:F, 0.500000f

  we put the first instruction in the AEB and decided that we could reuse its result when we found the second. Unfortunately, that meant that we'd emit a MOV from the first's destination, which is null.

  Don't do anything if the entry's destination is null and the instruction's destination is non-null.

  Tested-by: Tapani Pälli <[email protected]>
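The rule in the last paragraph of that message can be sketched as a small predicate. This is only an illustration: `struct sketch_inst`, `sketch_operands_match` and `sketch_can_reuse` are hypothetical names, not the i965 backend's types or functions, and the real pass compares far more state than an opcode and two sources.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a backend instruction. */
struct sketch_inst {
   int opcode;
   int src[2];
   bool dst_is_null;   /* true when the result is discarded (null:D) */
};

/* Same opcode and sources, ignoring the destination. */
static bool
sketch_operands_match(const struct sketch_inst *a, const struct sketch_inst *b)
{
   return a->opcode == b->opcode &&
          a->src[0] == b->src[0] && a->src[1] == b->src[1];
}

/* Returns true when `inst` may reuse the result recorded in `entry`. */
static bool
sketch_can_reuse(const struct sketch_inst *entry, const struct sketch_inst *inst)
{
   assert(entry != NULL && inst != NULL);

   /* A null-destination entry has no register to MOV from, so it must not
    * match an instruction that needs its result written somewhere. */
   if (entry->dst_is_null && !inst->dst_is_null)
      return false;

   return sketch_operands_match(entry, inst);
}
```

Only the null-entry, non-null-instruction pairing is rejected; the other three combinations still match, which is how the message reads.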
* i965/vec4: Make sure that imm writes are to registers in the same file. (Matt Turner, 2015-01-15, 1 file, -2/+8)

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87887
* i965/fs: Emit MADs from (x + abs(y * z)). (Matt Turner, 2015-01-15, 1 file, -3/+15)

  Just use the abs source modifier on both of the multiplicand arguments.

  instructions in affected programs: 300 -> 296 (-1.33%)

  Reviewed-by: Kristian Høgsberg <[email protected]>
* i965/fs: Emit MADs from (x + -(y * z)). (Matt Turner, 2015-01-15, 1 file, -0/+12)

  Just use the negation source modifier on one of the multiplicand arguments.

  total instructions in shared programs: 5889529 -> 5880016 (-0.16%)
  instructions in affected programs: 600846 -> 591333 (-1.58%)

  Reviewed-by: Kristian Høgsberg <[email protected]>
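The identities behind the two MAD commits are -(y * z) = (-y) * z and abs(y * z) = abs(y) * abs(z), so the modifier can move from the product onto the multiplicands. A minimal sketch, assuming a made-up `struct sketch_src` with `negate` and `abs` flags; none of these names come from the i965 backend.

```c
#include <stdbool.h>

/* Hypothetical source operand carrying the two modifiers the commits mention. */
struct sketch_src {
   float value;
   bool negate;
   bool abs;
};

/* Apply abs first, then negate, when reading a source. */
static float
sketch_read(struct sketch_src s)
{
   float v = (s.abs && s.value < 0.0f) ? -s.value : s.value;
   return s.negate ? -v : v;
}

/* mad(x, y, z) = x + y * z, with modifiers applied as sources are read. */
static float
sketch_mad(struct sketch_src x, struct sketch_src y, struct sketch_src z)
{
   return sketch_read(x) + sketch_read(y) * sketch_read(z);
}

/* x + -(y * z): negate one multiplicand. */
static float
sketch_add_neg_mul(float x, float y, float z)
{
   struct sketch_src sx = { x, false, false };
   struct sketch_src sy = { y, true, false };
   struct sketch_src sz = { z, false, false };
   return sketch_mad(sx, sy, sz);
}

/* x + abs(y * z): take abs of both multiplicands. */
static float
sketch_add_abs_mul(float x, float y, float z)
{
   struct sketch_src sx = { x, false, false };
   struct sketch_src sy = { y, false, true };
   struct sketch_src sz = { z, false, true };
   return sketch_mad(sx, sy, sz);
}
```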
* nir/algebraic: Only replace an instruction once (Jason Ekstrand, 2015-01-15, 1 file, -1/+3)

  Without the break, it was possible that an instruction would match multiple expressions. If this happened, you could end up trying to replace it multiple times and get a segfault. This makes it so that, after a successful replacement, it moves on to the next instruction.

  Reviewed-by: Connor Abbott <[email protected]>
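The control flow described there can be sketched as follows. The instruction struct, the two toy patterns and `sketch_run_pass` are all hypothetical; the point is only the `break` after a successful replacement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical instruction: `replaced` marks that it has been rewritten,
 * `times_replaced` counts how often that happened. */
struct sketch_instr {
   int value;
   bool replaced;
   int times_replaced;
};

typedef bool (*sketch_match_fn)(const struct sketch_instr *);

/* Two toy patterns that can both match the same instruction. */
static bool sketch_is_even(const struct sketch_instr *i) { return i->value % 2 == 0; }
static bool sketch_is_positive(const struct sketch_instr *i) { return i->value > 0; }

static void
sketch_replace(struct sketch_instr *instr)
{
   /* Replacing the same instruction twice is the bug the commit describes. */
   assert(!instr->replaced);
   instr->replaced = true;
   instr->times_replaced++;
}

/* Returns the number of instructions that were replaced. */
static int
sketch_run_pass(struct sketch_instr *instrs, size_t count)
{
   static const sketch_match_fn patterns[] = { sketch_is_even, sketch_is_positive };
   int progress = 0;

   for (size_t i = 0; i < count; i++) {
      for (size_t p = 0; p < sizeof(patterns) / sizeof(patterns[0]); p++) {
         if (patterns[p](&instrs[i])) {
            sketch_replace(&instrs[i]);
            progress++;
            break; /* after a replacement, move on to the next instruction */
         }
      }
   }
   return progress;
}
```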
* i965/nir: Do a final copy lowering pass before lowering locals to regs (Jason Ekstrand, 2015-01-15, 1 file, -0/+3)

  Reviewed-by: Connor Abbott <[email protected]>
* nir/vars_to_ssa: Use the copy lowering from lower_var_copies (Jason Ekstrand, 2015-01-15, 1 file, -152/+46)

  Reviewed-by: Connor Abbott <[email protected]>
* nir: Add a pass for lowering copy instructions (Jason Ekstrand, 2015-01-15, 3 files, -0/+227)

  Reviewed-by: Connor Abbott <[email protected]>
* nir/vars_to_ssa: Refactor get_deref_node (Jason Ekstrand, 2015-01-15, 1 file, -20/+25)

  This refactor allows you to more easily get the deref node associated with a given variable. We then use that new functionality in the deref_may_be_aliased function instead of creating a 1-element deref chain.

  Reviewed-by: Connor Abbott <[email protected]>
* nir: Rename lower_variables to lower_vars_to_ssa (Jason Ekstrand, 2015-01-15, 4 files, -6/+6)

  The original name wasn't particularly descriptive. This one indicates that it actually gives you SSA values as opposed to the old pass which lowered variables to registers.

  Reviewed-by: Connor Abbott <[email protected]>
* nir/tex_instr: Add a nir_tex_src struct and dynamically allocate the src array (Jason Ekstrand, 2015-01-15, 7 files, -42/+50)

  This solves a number of problems. First is the ability to change the number of sources that a texture instruction has. Second, it solves the dilemma that may occur if a texture instruction has more than 4 sources.

  Reviewed-by: Connor Abbott <[email protected]>
* nir/validate: Only build in debug mode (Jason Ekstrand, 2015-01-15, 2 files, -0/+11)

  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Reviewed-by: Connor Abbott <[email protected]>
* nir/lower_variables: Improve documentation (Jason Ekstrand, 2015-01-15, 1 file, -27/+79)

  Additional description was added to a variety of places. Also, we no longer use the term "leaf" to describe fully-qualified direct derefs. Instead, we simply use the term "direct" or spell it out completely.

  Reviewed-by: Connor Abbott <[email protected]>
* nir/lower_variables: Use a for loop for get_deref_node (Jason Ekstrand, 2015-01-15, 1 file, -58/+48)

  Reviewed-by: Connor Abbott <[email protected]>
* nir: Use the actual FNV-1a hash for hashing derefs (Jason Ekstrand, 2015-01-15, 2 files, -90/+79)

  We also switch to using loops rather than recursion.

  Reviewed-by: Connor Abbott <[email protected]>
* util/hash_table: Pull the details of the FNV-1a into helpers (Jason Ekstrand, 2015-01-15, 2 files, -13/+23)

  This way the basics of the FNV-1a hash can be reused to easily create other hashing functions.

  Reviewed-by: Eric Anholt <[email protected]>
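A minimal sketch of what such helpers could look like, using the standard 32-bit FNV-1a offset basis and prime. The names `sketch_fnv1a_accumulate` and `sketch_hash_pair` are invented for this example and are not claimed to match the helpers the commit added to util/hash_table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard 32-bit FNV-1a parameters. */
#define SKETCH_FNV32_OFFSET_BASIS 2166136261u
#define SKETCH_FNV32_PRIME 16777619u

/* Fold `size` bytes into a running hash: XOR the byte in, then multiply. */
static uint32_t
sketch_fnv1a_accumulate(uint32_t hash, const void *data, size_t size)
{
   const uint8_t *bytes = data;

   assert(data != NULL || size == 0);
   for (size_t i = 0; i < size; i++) {
      hash ^= bytes[i];
      hash *= SKETCH_FNV32_PRIME;
   }
   return hash;
}

/* Example of another hash function built from the helper: hash two ints
 * by feeding their bytes through the same accumulator in order. */
static uint32_t
sketch_hash_pair(int a, int b)
{
   uint32_t hash = SKETCH_FNV32_OFFSET_BASIS;
   hash = sketch_fnv1a_accumulate(hash, &a, sizeof(a));
   hash = sketch_fnv1a_accumulate(hash, &b, sizeof(b));
   return hash;
}
```

Because the accumulator takes and returns the running hash, a caller can hash a chain of fields one after another without first copying them into a contiguous buffer.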
* nir: Make intrinsic flags into an enum (Jason Ekstrand, 2015-01-15, 1 file, -14/+14)

  This should be much better for debugging as GDB will pick up on the fact that it's an enum and actually tell you what you're looking at instead of giving you some arbitrary hex value you have to go look up.

  Reviewed-by: Connor Abbott <[email protected]>
* nir: Use static inlines instead of macros for list getters (Jason Ekstrand, 2015-01-15, 1 file, -28/+81)

  This should make debugging a lot easier as GDB handles static inlines much better than macros. Also, static inlines are typesafe.

  Reviewed-By: Glenn Kennard <[email protected]>
  Reviewed-by: Connor Abbott <[email protected]>
* nir/variable: Remove the constant_value field (Jason Ekstrand, 2015-01-15, 2 files, -16/+4)

  This was a left-over relic of GLSL IR that we aren't using for anything. If we ever want that value again, we can add it back, but NIR constant folding should be just as good as GLSL IR's if not better pretty soon, so I'm not worried about it.

  Reviewed-by: Connor Abbott <[email protected]>
* nir: Add some documentation (Jason Ekstrand, 2015-01-15, 1 file, -22/+69)

  Signed-off-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Connor Abbott <[email protected]>
* nir/lower_variables: Follow the Cytron paper more closely (Jason Ekstrand, 2015-01-15, 1 file, -26/+69)

  Previously, our variable renaming algorithm, while similar to the one in the Cytron paper, was not the same. While I'm pretty sure it was correct, it will be easier for readers of the code in the variable renaming pass if it follows the paper more closely. This commit removes the automatic stack popping we were doing and replaces it with explicit popping like Cytron does.

  Reviewed-by: Connor Abbott <[email protected]>
* nir/print: Various cleanups recommended by Eric (Jason Ekstrand, 2015-01-15, 1 file, -33/+12)

  Cc: Eric Anholt <[email protected]>
  Reviewed-by: Connor Abbott <[email protected]>
* nir/lower_variables: Add a bunch of comments and re-arrange a few things (Jason Ekstrand, 2015-01-15, 1 file, -57/+170)

  This commit seeks to make the lower_variables pass much more clear by adding a pile of comments and re-arranging a few things. There are no functional or algorithmic changes.

  Reviewed-by: Connor Abbott <[email protected]>
* nir: Rename parallel_copy_copy to parallel_copy_entry and add a foreach macro (Jason Ekstrand, 2015-01-15, 4 files, -46/+55)

  parallel_copy_copy was a silly name. Also, things were getting long and annoying, so I added a foreach macro. For historical reasons, several of the original iterations over parallel copy entries in from_ssa used the _safe variants of the loop. However, all of these no longer ever remove an entry so it's ok to make them all use the normal iterator.

  Reviewed-by: Connor Abbott <[email protected]>
* nir/from_ssa: Clean up parallel copy handling and document it better (Jason Ekstrand, 2015-01-15, 3 files, -66/+92)

  Previously, we were doing a lazy creation of the parallel copy instructions. This is confusing, hard to get right, and involves some extra state tracking of the copies. This commit adds an extra walk over the basic blocks to add the block-end parallel copies up front. This should be much less confusing and, consequently, easier to get right. This commit also adds more comments about parallel copies to help explain what all is going on.

  As a consequence of these changes, we can now remove the at_end parameter from nir_parallel_copy_instr.

  Reviewed-by: Connor Abbott <[email protected]>
* nir: Rename nir_block_following_if to nir_block_get_following_if (Jason Ekstrand, 2015-01-15, 5 files, -5/+5)

  The new name is a little longer but less confusing.

  Reviewed-by: Connor Abbott <[email protected]>
* i965/fs_nir: Handle sample ID, position, and mask better (Jason Ekstrand, 2015-01-15, 2 files, -12/+71)

  Before, we were emitting the full pile of setup instructions for sample_id and sample_pos every time they were used. With this commit, we emit them in their own pass once at the beginning of the shader and simply emit uses later on. When it comes time for setting up VS, we can put setup for its special values in the same pass.

  Reviewed-by: Connor Abbott <[email protected]>
* nir/opcodes: Remove the per_component info field (Jason Ekstrand, 2015-01-15, 3 files, -37/+33)

  Originally, this field was intended for determining if the given instruction acted per-component or if it had mismatching source and destination sizes that would have to be interpreted specially. However, we can easily derive this from output_size == 0, so it's not really that useful. Also, the values we were setting in nir_opcodes.h for this field were completely bogus and it was never used.

  Reviewed-by: Connor Abbott <[email protected]>
* nir/search: Use nir_op_infos to determine if an operation is commutative (Jason Ekstrand, 2015-01-15, 1 file, -33/+2)

  Prior to this commit, we had a big switch statement for this. Now it's baked into the opcode metadata so we can just use that.

  Reviewed-by: Connor Abbott <[email protected]>
* nir/opcodes: Add algebraic properties metadata (Jason Ekstrand, 2015-01-15, 3 files, -71/+89)

  This commit adds some algebraic properties to the metadata of each opcode in NIR. In particular, you now know, just from the metadata, if a given opcode is commutative or associative. This will be useful for algebraic transformation passes that want to be able to match a + b as well as b + a in one go.

  v2: Make algebraic properties all caps. This was more consistent with the intrinsics flags and seems better for flags in general. Also, the enums are now declared with (1 << n) rather than hex values.

  v3: fmin and fmax technically aren't commutative or associative. Things get funny when one of the arguments is a NaN.

  Reviewed-by: Connor Abbott <[email protected]>
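As a rough illustration of the idea, the sketch below attaches (1 << n) flags to a per-opcode info table and uses them to match a + b against b + a. Every identifier is hypothetical (prefixed `sketch_`/`SKETCH_`) and the property assignments are only for the example; they are not taken from nir_opcodes.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag names, declared with (1 << n) as the v2 note describes. */
typedef enum {
   SKETCH_OP_IS_COMMUTATIVE = (1 << 0),
   SKETCH_OP_IS_ASSOCIATIVE = (1 << 1),
} sketch_op_algebraic_property;

typedef enum {
   SKETCH_OP_FADD,
   SKETCH_OP_FSUB,
   SKETCH_OP_FMIN,
   SKETCH_NUM_OPS
} sketch_op;

struct sketch_op_info {
   const char *name;
   unsigned algebraic_properties;
};

static const struct sketch_op_info sketch_op_infos[SKETCH_NUM_OPS] = {
   [SKETCH_OP_FADD] = { "fadd", SKETCH_OP_IS_COMMUTATIVE | SKETCH_OP_IS_ASSOCIATIVE },
   [SKETCH_OP_FSUB] = { "fsub", 0 },
   /* Left unflagged, following the v3 note about NaN arguments. */
   [SKETCH_OP_FMIN] = { "fmin", 0 },
};

/* Match op(a, b) against the pattern op(x, y); when the opcode is flagged
 * commutative, also try the swapped order (y, x). */
static bool
sketch_match(sketch_op op, int a, int b, int x, int y)
{
   assert(op < SKETCH_NUM_OPS);
   if (a == x && b == y)
      return true;
   if (sketch_op_infos[op].algebraic_properties & SKETCH_OP_IS_COMMUTATIVE)
      return a == y && b == x;
   return false;
}
```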
* nir: Make load_const SSA-only (Jason Ekstrand, 2015-01-15, 16 files, -162/+54)

  As it was, we weren't ever using load_const in a non-SSA way. This allows us to substantially simplify the load_const instruction. If we ever need a non-SSA constant load, we can do a load_const and an imov.

  Reviewed-by: Connor Abbott <[email protected]>
* nir: Make nir_ssa_undef_instr_create initialize the destination (Jason Ekstrand, 2015-01-15, 4 files, -13/+11)

  Reviewed-by: Connor Abbott <[email protected]>
* i965/nir: Move the other lowering passes to before out-of-SSA (Jason Ekstrand, 2015-01-15, 1 file, -6/+6)

  Reviewed-by: Connor Abbott <[email protected]>
* nir/lower_system_values: Handle SSA destinations (Jason Ekstrand, 2015-01-15, 1 file, -1/+14)

  Reviewed-by: Connor Abbott <[email protected]>
* nir/lower_atomics: Use/support SSA (Jason Ekstrand, 2015-01-15, 2 files, -21/+35)

  Previously, lower_atomics was non-SSA only. We assert-failed if the destination of an atomic operation intrinsic was an SSA def and we used temporary registers for computing offsets. This commit changes both of these behaviors. We now use SSA values for computing offsets (so we can optimize them) and we handle SSA destinations. We also move the pass to run before we go out of SSA on i965 as it now generates SSA values.

  Reviewed-by: Connor Abbott <[email protected]>
* nir/live_variables: Use the new ssa_def iterator (Jason Ekstrand, 2015-01-15, 1 file, -19/+13)

  Reviewed-by: Connor Abbott <[email protected]>
* nir: Use nir_foreach_ssa_def for setting up ssa destinations (Jason Ekstrand, 2015-01-15, 1 file, -13/+11)

  Before, we were using foreach_dest and switching on whether the destination was an SSA value. This works, except not all destinations are SSA values so we have to special-case ssa_undef instructions. Now that we have a foreach_ssa_def function, we can iterate over all of the register destinations in one pass and iterate over the SSA destinations in a second. This way, if we add other ssa-only instructions, we won't have to worry about adding them to the special case we have for ssa_undef.

  Reviewed-by: Connor Abbott <[email protected]>
* nir: Add a foreach_ssa_def function (Jason Ekstrand, 2015-01-15, 2 files, -0/+43)

  There are some functions whose destinations are SSA-only and so aren't a nir_dest. This provides a function that is capable of iterating over the SSA definitions defined by those functions. If you want registers, you should use the old iterator.

  v2: Kenneth Graunke <[email protected]>:
   - Fix nir_foreach_ssa_def's return value.

  Reviewed-by: Connor Abbott <[email protected]>
* nir/lower_variables: Use a real dominance DFS for variable renaming (Jason Ekstrand, 2015-01-15, 1 file, -4/+5)

  Previously, we were just iterating over the program "in order" which kind-of approximates a DFS, but not really. In particular, we got the following case wrong:

     loop {
        a = 3;
        if (foo) {
           a = 5;
        } else {
           break;
        }
        use(a);
     }

  where use(a) would get 3 instead of 5 because of premature popping of the SSA def stack. Now, since we do an actual DFS, we should evaluate use(a) immediately after a = 5 and we should be ok.

  Reviewed-by: Connor Abbott <[email protected]>
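A heavily simplified sketch of renaming by depth-first walk over a dominance tree, applied to the loop above. It assumes a single variable `a`, integer constants in place of SSA defs, and at most two dominated children per block. The struct and function names are invented, and phi handling is left out entirely. In the example, the block holding use(a) is reachable only through the `a = 5` branch (the else branch breaks), so the `a = 5` block is its parent in the dominance tree.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_CHILDREN 2
#define SKETCH_STACK_DEPTH 8

/* Hypothetical block in a dominance tree.  def_a > 0 means the block
 * assigns that constant to `a`; uses_a means it reads `a` afterwards. */
struct sketch_block {
   int def_a;
   int uses_a;
   int seen_a;   /* value of `a` observed by the use, filled in by the walk */
   struct sketch_block *dom_children[SKETCH_MAX_CHILDREN];
};

struct sketch_def_stack {
   int values[SKETCH_STACK_DEPTH];
   int depth;
};

/* Walk the dominance tree depth-first.  A definition is pushed when its
 * block is entered and popped only after every dominated block is done. */
static void
sketch_rename(struct sketch_block *block, struct sketch_def_stack *stack)
{
   int pushed = 0;

   if (block->def_a > 0) {
      assert(stack->depth < SKETCH_STACK_DEPTH);
      stack->values[stack->depth++] = block->def_a;
      pushed = 1;
   }

   if (block->uses_a) {
      assert(stack->depth > 0);
      block->seen_a = stack->values[stack->depth - 1];
   }

   for (int i = 0; i < SKETCH_MAX_CHILDREN; i++) {
      if (block->dom_children[i] != NULL)
         sketch_rename(block->dom_children[i], stack);
   }

   /* Explicit pop, once the whole dominated subtree has been renamed. */
   if (pushed)
      stack->depth--;
}
```

Walking the blocks in source order instead would leave the `a = 5` block before reaching use(a), which is the premature pop the message describes.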
* nir: Remove predication (Jason Ekstrand, 2015-01-15, 10 files, -321/+18)

  We stopped generating predicates in glsl_to_nir some time ago. Right now, it's all dead untested code that I'm not convinced always worked in the first place. If we decide we want them back, we can revert this patch.

  Reviewed-by: Connor Abbott <[email protected]>
* nir: Make bcsel a fully vector operation (Jason Ekstrand, 2015-01-15, 5 files, -6/+15)

  Previously, the condition was a scalar that applied to all components simultaneously. As of this commit, the condition is a vector and each component is switched separately.

  Reviewed-by: Connor Abbott <[email protected]>
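The new semantics amount to a per-component select. A minimal sketch, with a hypothetical `sketch_bcsel` operating on plain arrays, not NIR values:

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_COMPONENTS 4

/* Per-component select: dst[i] = cond[i] ? a[i] : b[i]. */
static void
sketch_bcsel(unsigned num_components, const bool *cond,
             const float *a, const float *b, float *dst)
{
   assert(num_components <= SKETCH_MAX_COMPONENTS);
   for (unsigned i = 0; i < num_components; i++)
      dst[i] = cond[i] ? a[i] : b[i];
}
```

Under the old scalar behaviour, a single condition value would have selected all of `a` or all of `b`.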
* nir: Call nir_metadata_preserve more places (Jason Ekstrand, 2015-01-15, 8 files, -2/+27)

  Reviewed-by: Connor Abbott <[email protected]>
* nir/metadata: Rename metadata_dirty to metadata_preserve (Jason Ekstrand, 2015-01-15, 8 files, -16/+18)

  nir_metadata_dirty was a terrible name because the parameter it takes is the metadata to be preserved. This is really confusing because it looks like it's doing the opposite of what it is actually doing. Now it's named sensibly.

  Reviewed-by: Connor Abbott <[email protected]>
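To make the naming point concrete: the argument lists what survives, and everything else is treated as invalid. The sketch below assumes a bitmask of valid metadata; the flag names, the struct and `sketch_metadata_preserve` are illustrative and are not copied from NIR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical metadata bits; the names are illustrative. */
typedef enum {
   SKETCH_METADATA_NONE = 0,
   SKETCH_METADATA_BLOCK_INDEX = (1 << 0),
   SKETCH_METADATA_DOMINANCE = (1 << 1),
   SKETCH_METADATA_LIVE_VARIABLES = (1 << 2),
} sketch_metadata;

struct sketch_function_impl {
   unsigned valid_metadata;
};

/* Keep only the metadata the caller says it preserved; anything else
 * that was valid becomes invalid. */
static void
sketch_metadata_preserve(struct sketch_function_impl *impl, unsigned preserved)
{
   assert(impl != NULL);
   impl->valid_metadata &= preserved;
}
```

A pass that changes control flow would pass `SKETCH_METADATA_NONE`, while one that only rewrites instructions within blocks could pass the block-index and dominance bits.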
* i965/fs_nir: Add support for indirect texture arrays (Jason Ekstrand, 2015-01-15, 1 file, -4/+21)

  v2 Jason Ekstrand <[email protected]>:
   - Use the nir_tex_src_sampler_offset source type instead of the sampler_indirect thing that I cooked up before.

  Reviewed-by: Chris Forbes <[email protected]>
* nir: Rework the way samplers are lowered (Jason Ekstrand, 2015-01-15, 1 file, -75/+78)

  v2 Jason Ekstrand <[email protected]>:
   - Use the nir_tex_src_sampler_offset source type instead of the sampler_indirect thing that I cooked up before.

  Reviewed-by: Chris Forbes <[email protected]>
* nir/tex_instr_create: Initialize all 4 sources (Jason Ekstrand, 2015-01-15, 1 file, -1/+1)

  This helps a lot with things like lowering passes that may need to add sources.

  Reviewed-by: Connor Abbott <[email protected]>
* nir/tex_instr: Rename the indirect source type and add an array size (Jason Ekstrand, 2015-01-15, 4 files, -4/+17)

  In particular, we rename nir_tex_src_sampler_index to _sampler_offset and add a sampler_array_size field to nir_tex_instr. This way we can pass the size of sampler arrays through to backends even after removing the variable information and, with it, the type.

  Reviewed-by: Connor Abbott <[email protected]>
* nir: Use a source for uniform buffer indices instead of an index (Jason Ekstrand, 2015-01-15, 3 files, -55/+76)

  In GLSL-to-NIR we were just setting the base index to 0 whenever there was an indirect so having it expressed as a sum makes no sense. Also, while a base offset may make sense for the memory location (first element in the array, etc.) it makes less sense for the actual uniform buffer index. This may change later, but it seems to make more sense for now.

  Reviewed-by: Connor Abbott <[email protected]>
* nir: Constant fold array indirects (Jason Ekstrand, 2015-01-15, 1 file, -8/+76)

  Reviewed-by: Connor Abbott <[email protected]>
* nir: Make texture instruction names more consistent (Jason Ekstrand, 2015-01-15, 11 files, -25/+25)

  This commit renames nir_instr_as_texture to nir_instr_as_tex and renames nir_instr_type_texture to nir_instr_type_tex to be consistent with nir_tex_instr.

  Reviewed-by: Connor Abbott <[email protected]>