mesa.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	nir: Carve out nir_lower_samplers from GLSL code.	Timur Kristóf	2019-09-06	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Lowering samplers is needed to produce NIR that can actually be consumed by some gallium drivers, so it doesn't make sense to keep it only in the GLSL code. This commit introduces nir_lower_samplers to compiler/nir, while maintaining the GL-specific function too. Signed-off-by: Timur Kristóf <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
*	nir: merge and extend nir_opt_move_comparisons and nir_opt_move_load_ubo	Rhys Perry	2019-08-12	1	-2/+1
\| \| \| \| \| \| \| \| \| \|	v2: add to series v3: update Makefile.sources v4: don't remove a comment and break statement v4: use nir_can_move_instr Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	nir: replace nir_move_load_const() with nir_opt_sink()	Rhys Perry	2019-08-12	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is mostly the same as nir_move_load_const() but can also move undef instructions, comparisons and some intrinsics (being careful with loops). v2: actually delete nir_move_load_const.c v3: fix nir_opt_sink() usage in freedreno v3: update Makefile.sources v4: replace get_move_def with nir_can_move_instr and nir_instr_ssa_def v4: handle if uses v4: fix handling of nested loops v5: re-write adjust_block_for_loops v5: re-write setting of use_block for if uses Signed-off-by: Rhys Perry <[email protected]> Co-authored-by: Daniel Schürmann <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	nir/range-analysis: Rudimentary value range analysis pass	Ian Romanick	2019-08-05	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Most integer operations are omitted because dealing with integer overflow is hard. There are a few things that could be smarter if there was a small amount more tracking of ranges of integer types (i.e., operands are Boolean, operand values fit in 16 bits, etc.). The changes to nir_search_helpers.h are included in this patch to simplify reordering the changes to nir_opt_algebraic.py. v2: Memoize range analysis results. Without this, some shaders appear to get stuck in infinite loops. v3: Rebase on many months of Mesa changes, including 1-bit Boolean changes. v4: Rebase on "nir: Drop imov/fmov in favor of one mov instruction". v5: Use nir_alu_srcs_equal for detecting (a*a). Previously just the SSA value was compared, and this incorrectly matched (a.x*a.y). v6: Many code improvements including (but not limited to) better names, more comments, and better use of helper functions. All suggested by Caio. Rework the handling of several opcodes to use a table for mapping source ranges to a result range. This change fixed a bug that caused fmax(gt_zero, ge_zero) to be incorrectly recognized as ge_zero. Slightly tighten the range of fmul by recognizing that x*x is gt_zero if x is gt_zero. Add similar handling for -x*x. v7: Use _______ in the tables as an alias for unknown. Suggested by Caio. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
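The table-driven mapping from source ranges to a result range mentioned in the message above might look roughly like the following C sketch. The three-value enum, the table name and fmax_range() are hypothetical and much smaller than whatever the real pass tracks; NaN is ignored. Only the fmax(gt_zero, ge_zero) case and the _______ alias come from the commit message.

```c
#include <assert.h>

/* Hypothetical, reduced set of ranges; the real pass tracks more. */
enum range { unknown, ge_zero, gt_zero, RANGE_COUNT };

/* Alias that keeps the table rows visually aligned, as described for v7. */
#define _______ unknown

/* fmax_table[a][b] is the range of fmax(a, b): row a, column b. */
static const enum range fmax_table[RANGE_COUNT][RANGE_COUNT] = {
    /*                unknown  ge_zero  gt_zero */
    /* unknown */ { _______, ge_zero, gt_zero },
    /* ge_zero */ { ge_zero, ge_zero, gt_zero },
    /* gt_zero */ { gt_zero, gt_zero, gt_zero },
};

static enum range fmax_range(enum range a, enum range b)
{
    assert(a < RANGE_COUNT && b < RANGE_COUNT);
    return fmax_table[a][b];
}
```

With this layout, fmax_range(gt_zero, ge_zero) yields gt_zero, because the maximum is at least as large as the strictly positive operand.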
*	nir: replace lower_sincos with algebraic opt	Jonathan Marek	2019-07-24	1	-1/+0
\| \| \| \| \| \| \| \|	This version has fewer ops for the same precision. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]> Acked-by: Matt Turner <[email protected]>
*	anv,nir: Move lower_input_attachments pass from ANV to NIR.	Daniel Schürmann	2019-07-08	1	-0/+1
\| \| \| \| \|	Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	nir: add pass to lower load_interpolated_input	Rob Clark	2019-07-02	1	-0/+1
\| \| \| \| \| \|	Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	nir/linker: add gl_nir_link_uniform_blocks.c	Alejandro Piñeiro	2019-06-30	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add the ability to link uniform blocks and shader storage blocks using NIR, intended for ARB_gl_spirv support. Among other things, this linking needs to take into account that everything should work without names, as they may not be present, while the GLSL IR uniform block linking was written with names at its core. The other major difference compared with the GLSL IR linker is that we don't deal with layouts. There are no references to std140, std430, etc. Layouts are expressed through explicit offset, array stride and matrix stride. That simplifies how the buffer sizes are computed, but it also means that we couldn't use the existing methods in glsl_types, so we needed to implement new methods. It is worth noting that this linking iterates over the glsl_types, similarly to what the uniform linking does. A possible future improvement would be to refactor both cases to share more code than they share right now. In GLSL IR there is a class visitor, specialized for each case, for that sharing. As adding a class visitor in C would be more complicated, for now we just iterate in both. Signed-off-by: Alejandro Piñeiro <[email protected]> Signed-off-by: Neil Roberts <[email protected]> Signed-off-by: Antia Puentes <[email protected]> v2: (from Timothy review) * Fix variable name convention * Stop using the _function_name convention * Don't use // for comments * "nir/linker: Keep track of the stages referencing an UBO/SSBO" squashed with this patch v3: (from Caio review) * Don't delete the linked shader on failure * Use rzalloc_array to avoid some explicit initializations Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	glsl/nir: Add optimization pass for access flags	Connor Abbott	2019-06-19	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Right now, this just deduces when we can arbitrarily reorder SSBO and image loads, matching the existing logic in radeonsi's TGSI->LLVM pass. This approach can't handle some things that nir_opt_copy_prop_vars can, but it can handle images, and with GCM it lets us hoist reads outside of loops. We can also pass this information to LLVM which lets it do its own optimizations on it. This is GLSL only as I haven't tested it on Vulkan yet, and it would probably need a few changes to work there. Reviewed-by: Timothy Arceri <[email protected]>
*	nir: add a vectorization pass	Connor Abbott	2019-06-18	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This effectively does the opposite of nir_lower_alus_to_scalar, trying to combine per-component ALU operations with the same sources but different swizzles into one larger ALU operation. It uses a similar model as CSE, where we do a depth-first approach and keep around a hash set of instructions to be combined, but there are a few major differences: 1. For now, we only support entirely per-component ALU operations. 2. Since it's not always guaranteed that we'll be able to combine equivalent instructions, we keep a stack of equivalent instructions around, trying to combine new instructions with instructions on the stack. The pass isn't comprehensive by far; it can't handle operations where some of the sources are per-component and others aren't, and it can't handle phi nodes. But it should handle the more common cases, and it should be reasonably efficient. [Alyssa: Rebase on latest master, updating with respect to typeless moves] Acked-by: Alyssa Rosenzweig <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
*	nir: Rematerialize compare instructions	Ian Romanick	2019-05-31	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On some architectures, Boolean values used to control conditional branches or conditional selection must be propagated into a flag. This generally means that a stored Boolean value must be compared with zero. Rather than force the generation of extra compares with zero, re-emit the original comparison instruction. This can save register pressure by not needing to store the Boolean value. There are several possible areas for future improvement to this pass: 1. Be more conservative. If both sources to the comparison instruction are non-constants, it may be better for register pressure to emit the extra compare. The current shader-db results on Intel GPUs (next commit) lead me to believe that this is not currently a problem. 2. Be less conservative. Currently the pass requires that all users of the comparison match the pattern. The idea is that after the pass is complete, no instruction will use the resulting Boolean value. The only uses will be of the flag value. It may be beneficial to relax this requirement in some cases. 3. Be less conservative. Also try to rematerialize comparisons used for discard_if intrinsics. After changing the way the Intel compiler generates code for discard_if (see MR!935), I tried implementing this already. The changes were pretty small. Instructions were helped in 19 shaders, but, overall, cycles were hurt. A commit "nir: Rematerialize comparisons for nir_intrinsic_discard_if too" is on my fd.o cgit. 4. Copy the preceding ALU instruction. If the comparison is a comparison with zero, and it is the only user of a particular ALU instruction (e.g., (a+b) != 0.0), it may be a further improvement to also copy the preceding ALU instruction. On Intel GPUs, this may enable cmod propagation to make additional progress. v2: Use much simpler method to get the prev_block for an if-statement. Suggested by Tim. Reviewed-by: Matt Turner <[email protected]>
*	nir: implement lowering for fsin and fcos	Vasily Khoruzhick	2019-05-07	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Lower sin and cos using Nick's fast sin/cos approximation from https://web.archive.org/web/20180105155939/http://forum.devmaster.net/t/fast-and-accurate-sine-cosine/9648 It's suitable for GLES2, but it throws warnings in dEQP GLES3 precision tests. Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Qiang Yu <[email protected]> Tested-by: Qiang Yu <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
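For readers unfamiliar with the approximation referenced above, here is a rough C sketch of the parabolic sine formula usually associated with that forum thread. The function names, the 0.225 blend weight and the [-pi, pi] input range are assumptions of this sketch; it is not the NIR lowering code itself, and how the pass reduces its input range is not shown.

```c
#include <assert.h>

#define PI_F 3.14159265358979f

static float absf(float v) { return v < 0.0f ? -v : v; }

/* Parabolic sine approximation, valid for x in [-pi, pi]. */
static float fast_sin(float x)
{
    const float B = 4.0f / PI_F;
    const float C = -4.0f / (PI_F * PI_F);
    const float P = 0.225f;

    assert(x >= -PI_F && x <= PI_F);

    /* First pass: a parabola through (0, 0), (pi/2, 1) and (pi, 0). */
    float y = B * x + C * x * absf(x);

    /* Second pass: blend with the squared parabola to reduce the error. */
    return P * (y * absf(y) - y) + y;
}

/* cos(x) = sin(x + pi/2), wrapped back into [-pi, pi]. */
static float fast_cos(float x)
{
    float shifted = x + 0.5f * PI_F;
    if (shifted > PI_F)
        shifted -= 2.0f * PI_F;
    return fast_sin(shifted);
}
```

The result is only accurate to roughly three decimal places, which is consistent with the note that it suits GLES2 but triggers warnings in stricter precision tests.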
*	nir/flrp: Add new lowering pass for flrp instructions	Ian Romanick	2019-05-06	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	This pass will soon grow to include some optimizations that are difficult or impossible to implement correctly within nir_opt_algebraic. It also includes the ability to generate strictly correct code which the current nir_opt_algebraic lowering lacks (though that could be changed). v2: Document the parameters to nir_lower_flrp. Rebase on top of 3766334923e ("compiler/nir: add lowering for 16-bit flrp") Reviewed-by: Matt Turner <[email protected]>
*	nir: add int_to_float lowering pass	Vasily Khoruzhick	2019-05-07	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	This new pass lowers ints and bools to floats. It allows hardware that doesn't have native integers (e.g. Mali4x0) to use the same code paths as modern hardware. It uses the newly introduced pass to gather SSA types and should be used as late as possible. Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
*	mesa: Makefile.sources: Add nir_lower_fb_read.c to Makefile.sources list	John Stultz	2019-05-06	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In commit a99c360a4630 (nir: add pass to lower fb reads), a new file was added that needs to also be added to the Makefile.sources list used by the Android and SCons build system. Cc: Rob Clark <[email protected]> Cc: Emil Velikov <[email protected]> Cc: Amit Pundir <[email protected]> Cc: Sumit Semwal <[email protected]> Cc: Alistair Strachan <[email protected]> Cc: Greg Hartman <[email protected]> Cc: Tapani Pälli <[email protected]> Cc: Jason Ekstrand <[email protected]> Fixes: a99c360a463 ("nir: add pass to lower fb reads") Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Signed-off-by: John Stultz <[email protected]>
*	nir: Add a SSA type gathering pass	Jason Ekstrand	2019-05-04	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This new pass (which isn't even compile-tested) attempts to determine the ALU type of all the SSA values in a function impl. It takes a greedy approach and assigns intness or floatness to everything it thinks can possibly contain an int or a float. Some values will be labeled as both int and float and some will be labeled as neither and it is up to the caller to decide what to do with this information. However, for a "nice" shader where the original source contained no bit-casts and no implicit bit-casts were introduced by optimizations, there shouldn't be any overlap in the two sets save for the odd CSEd zero constant. Reviewed-by: Vasily Khoruzhick <[email protected]>
*	nir: add rcp(w) lowering for gl_FragCoord	Andreas Baierl	2019-04-29	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	On some hardware (e.g. Mali400) the shader needs to apply some transformations for correct gl_FragCoord handling. The lowering actions look like the following in pseudocode: gl_FragCoord.xyz = gl_FragCoord_orig.xyz; gl_FragCoord.w = 1.0 / gl_FragCoord_orig.w. Add this lowering as a nir pass in preparation for using it in the driver. Signed-off-by: Andreas Baierl <[email protected]> Reviewed-by: Qiang Yu <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
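The pseudocode in the message above corresponds to a very small transformation. A plain C sketch of it, using a hypothetical vec4 struct and function name rather than anything from the NIR pass, could be:

```c
#include <assert.h>

/* Hypothetical four-component vector, for illustration only. */
struct vec4 { float x, y, z, w; };

/* Keep xyz as-is and replace w with its reciprocal. */
static struct vec4 lower_fragcoord_w(struct vec4 orig)
{
    struct vec4 out = orig;

    assert(orig.w != 0.0f);
    out.w = 1.0f / orig.w;
    return out;
}
```

For an input of (10, 20, 0.5, 2) this yields (10, 20, 0.5, 0.5).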
*	nir: Add nir_lower_viewport_transform	Alyssa Rosenzweig	2019-04-14	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	On Mali hardware (supported by Panfrost and Lima), the fixed-function transformation from world-space to screen-space coordinates is done in the vertex shader prior to writing out the gl_Position varying, rather than in dedicated hardware. This commit adds a shared NIR pass for implementing coordinate transformation and lowering gl_Position writes into screen-space gl_Position writes. v2: Run directly on derefs before io/vars are lowered to cleanup the code substantially. Thank you to Qiang for this suggestion! v3: Bikeshed continues. v4: Add to Makefile.sources (per Jason's comment). Bikeshed comment. Ian and Qiang's reviews are from v3, but no real functional changes from v4. Rob's review is from v4. Signed-off-by: Alyssa Rosenzweig <[email protected]> Suggested-by: Qiang Yu <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Qiang Yu <[email protected]> Reviewed-by: Rob Clark <[email protected]>
*	nir: Add a pass for selectively lowering variables to scratch space	Jason Ekstrand	2019-04-12	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	This commit adds new nir_load/store_scratch opcodes which read and write a virtual scratch space. It's up to the back-end to figure out what to do with it and where to put the actual scratch data. v2: Drop const_index comments (by anholt) Reviewed-by: Eric Anholt <[email protected]>
*	glsl/nir: add support for lowering bindless images_derefs	Karol Herbst	2019-04-12	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	v2: handle atomics as well; make use of nir_rewrite_image_intrinsic v3: remove call to nir_remove_dead_derefs v4: (Timothy Arceri) don't actually call lowering yet Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> (v3) Reviewed-by: Marek Olšák <[email protected]>
*	nir: Get rid of global registers	Jason Ekstrand	2019-04-09	1	-1/+0
\| \| \| \| \| \| \| \| \|	We have a pass to lower global registers to locals and many drivers dutifully call it. However, no one ever creates a global register, so it's all dead code. It's time we bury it. Acked-by: Karol Herbst <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	nir: Add partial redundancy elimination for compares	Ian Romanick	2019-03-28	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This pass attempts to detect code sequences like: if (x < y) { z = y - x; ... } and replace them with sequences like: t = x - y; if (t < 0) { z = -t; ... }. On architectures where the subtract can generate the flags used by the if-statement, this saves an instruction. It's also possible that moving an instruction out of the if-statement will allow nir_opt_peephole_select to convert the whole thing to a bcsel. Currently only floating point compares and adds are supported. Adding support for integer will be a challenge due to integer overflow. There are a couple possible solutions, but they may not apply to all architectures. v2: Fix a typo in the commit message and a couple typos in comments. Fix possible NULL pointer deref from result of push_block(). Add missing (-A + B) case. Suggested by Caio. v3: Fix is_not_const_zero to work correctly with types other than nir_type_float32. Suggested by Ken. v4: Add some comments explaining how this works. Suggested by Ken. Reviewed-by: Kenneth Graunke <[email protected]>
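The before and after shapes described in the message above can be written out as two small C functions. This is only an illustration of the source-level pattern, with hypothetical function names; the pass itself operates on NIR instructions, not C.

```c
#include <assert.h>

/* Original form: compare first, then subtract inside the branch. */
static float before_pre(float x, float y)
{
    float z = 0.0f;
    if (x < y)
        z = y - x;
    return z;
}

/* Rewritten form: one subtraction feeds both the compare and the result. */
static float after_pre(float x, float y)
{
    float z = 0.0f;
    float t = x - y;
    if (t < 0.0f)
        z = -t;
    return z;
}

/* Check that both forms agree for a given pair of inputs. */
static void check_equivalent(float x, float y)
{
    assert(before_pre(x, y) == after_pre(x, y));
}
```

Calling check_equivalent(1.0f, 3.0f) exercises the taken branch and check_equivalent(3.0f, 1.0f) the untaken one.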
*	nir: Add a lowering pass for non-uniform resource access	Jason Ekstrand	2019-03-25	1	-0/+1
\| \| \| \|	Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	spirv,nir: lower frexp_exp/frexp_sig inside a new NIR pass	Samuel Pitoiset	2019-03-22	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	This lowering isn't needed for RADV because AMDGCN has two instructions. It will be disabled for RADV in an upcoming series. While we are at it, factorize a little bit. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	nir: Add a new pass to lower array dereferences on vectors	Jason Ekstrand	2019-03-15	1	-0/+1
\| \| \| \| \| \| \| \|	This pass was originally written for lowering TCS output reads and writes but it is also applicable to just about anything including UBOs, SSBOs, and shared variables. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	glsl/nir: Add a pass to lower UBO and SSBO access	Jason Ekstrand	2019-03-15	1	-0/+1
\| \| \| \|	Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	nir: Add a pass to combine store_derefs to same vector	Caio Marcelo de Oliveira Filho	2019-03-13	1	-0/+1
\| \| \| \| \| \| \| \| \|	v2: (all from Jason) Reuse existing function for the end of the block combinations. Check the SSA values are coming from the right place in tests. Document the case when the store to array_deref is reused. Reviewed-by: Jason Ekstrand <[email protected]>
*	nir: Add a pass for lowering IO back to vector when possible	Jason Ekstrand	2019-03-12	1	-0/+1
\| \| \| \| \| \| \| \|	This pass tries to turn scalar and array-of-scalar IO variables into vector IO variables whenever possible. Reviewed-by: Connor Abbott <[email protected]> Cc: "19.0" <[email protected]>
*	nir: Add a stripping pass for improved cacheability	Connor Abbott	2019-03-12	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Oftentimes various nir shaders after lowering will be the same, or almost the same. For example, this can happen when the same shader is linked with different shaders to form different pipelines and cross-stage optimizations don't kick in to change it. We want to avoid running the backend twice on these shaders. We were already doing this with radeonsi, but we were storing a few extra pieces of information that made this much less effective compared to TGSI. The worst offender by far was the program name, which caused most of the cache misses. This pass strips out these pieces of information, controlled by the NIR_STRIP debug env variable. Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	nir/spirv: initial handling of OpenCL.std extension opcodes	Karol Herbst	2019-03-05	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Not complete, mostly just adding things as I encounter them in CTS. But not getting far enough yet to hit most of the OpenCL.std instructions. Anyway, this is better than nothing and covers the most common builtins. v2: add hadd proof from Jason; move some of the lowering into opt_algebraic and create new nir opcodes; simplify nextafter lowering; fix normalize lowering for inf; rework upsample to use nir_pack_bits; add missing files to build systems v3: split lines of iadd/sub_sat expressions Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	nir: Move nir_lower_uniforms_to_ubo to compiler/nir.	Timur Kristóf	2019-03-05	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	The nir_lower_uniforms_to_ubo function is useful outside of mesa/state_tracker, and in fact is needed to produce NIR for drivers that have the PIPE_CAP_PACKED_UNIFORMS capability. Signed-Off-By: Timur Kristóf <[email protected]> Tested-by: Andre Heider <[email protected]> Tested-by: Rob Clark <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	automake: Fix path to generated source	Dylan Baker	2019-01-29	1	-1/+1
\| \| \| \| \| \|	Fixes: b63a1f8e40b6705d6a1d806fbd38dcd197d4229b ("glsl: Create file to contain software fp64 functions") Reviewed-by: Jordan Justen <[email protected]>
*	nir: Add a bool to float32 lowering pass	Jason Ekstrand	2019-01-14	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	From @jekstrand's nir-1-bit-bool branch, with improved ior/inot lowering. ior: fmax instead of fadd allows removing the fsat. inot: seq(x, 0) can be better than fsub(1, x). On a2xx, it works better with the scalar instruction set. Reviewed-by: Jonathan Marek <[email protected]>
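The ior and inot choices named in the message above can be compared directly in C if Booleans are modelled as floats holding 0.0 or 1.0. The helper names are hypothetical and this is a sketch of the arithmetic only, not of the NIR pass.

```c
#include <assert.h>

/* Booleans are represented as floats: 0.0f is false, 1.0f is true. */
static float sat(float v) { return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v); }

/* ior via fadd needs an fsat, because true + true is 2.0. */
static float or_fadd_fsat(float a, float b) { return sat(a + b); }

/* ior via fmax never leaves {0.0, 1.0}, so no fsat is needed. */
static float or_fmax(float a, float b) { return a > b ? a : b; }

/* inot as fsub(1, x). */
static float not_fsub(float x) { return 1.0f - x; }

/* inot as seq(x, 0): 1.0 when x equals zero, otherwise 0.0. */
static float not_seq(float x) { return x == 0.0f ? 1.0f : 0.0f; }

/* Both variants of each operation agree on every Boolean input. */
static void check_bool_lowering(void)
{
    const float vals[2] = { 0.0f, 1.0f };
    for (int i = 0; i < 2; i++) {
        assert(not_fsub(vals[i]) == not_seq(vals[i]));
        for (int j = 0; j < 2; j++)
            assert(or_fadd_fsat(vals[i], vals[j]) == or_fmax(vals[i], vals[j]));
    }
}
```

The variants produce the same values on Boolean inputs, so the choice between them comes down to which instructions are cheaper on the target.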
*	glsl: Create file to contain software fp64 functions	Matt Turner	2019-01-09	1	-1/+2
\| \| \| \| \| \| \|	The following patches will add implementations of various double-precision operations to this file. Reviewed-by: Kenneth Graunke <[email protected]>
*	nir: Add a bool to int32 lowering pass	Jason Ekstrand	2018-12-16	1	-0/+1
\| \| \| \| \| \| \| \|	We also enable it in all of the NIR drivers. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Bas Nieuwenhuizen <[email protected]>
*	nir: Add a pass for lowering integer division by constants	Jason Ekstrand	2018-12-13	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	It's a reasonably well-known fact in the world of compilers that integer divisions by constants can be replaced by a multiply, an add, and some shifts. This commit adds such an optimization to NIR for the easiest case of udiv. Other division operations will be added in following commits. In order to provide some additional driver control, the pass takes a minimum bit size to optimize. Reviewed-by: Ian Romanick <[email protected]>
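As an illustration of the multiply, add and shift replacement mentioned above, here is a C sketch for 32-bit unsigned division by two fixed divisors. The magic constants are the well-known ones for 3 and 7 and are hard-coded for this sketch; how the pass derives its constants for an arbitrary divisor is not shown, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* x / 3: multiply by ceil(2^33 / 3) at 64-bit width, then shift by 33. */
static uint32_t udiv_by_3(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 0xAAAAAAABu) >> 33);
}

/* x / 7: the magic number does not fit in 32 bits, so the high half of
 * the product is corrected with a subtract, a shift and an add. */
static uint32_t udiv_by_7(uint32_t x)
{
    uint32_t t = (uint32_t)(((uint64_t)x * 0x24924925u) >> 32);
    return (((x - t) >> 1) + t) >> 2;
}

/* Compare both routines against the plain division operator. */
static void check_udiv(uint32_t x)
{
    assert(udiv_by_3(x) == x / 3);
    assert(udiv_by_7(x) == x / 7);
}
```

Divisor 3 needs only the multiply and a shift, while divisor 7 shows the case where the extra add is required.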
*	nir: Add a pass for gathering transform feedback info	Jason Ekstrand	2018-10-29	1	-1/+3
\| \| \| \| \| \| \|	This is different from the GL_ARB_spirv pass because it generates a much simpler data structure that isn't tied to OpenGL and mtypes.h. Reviewed-by: Samuel Pitoiset <[email protected]>
*	nir: Separate dead write removal into its own pass	Caio Marcelo de Oliveira Filho	2018-10-15	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead of doing this as part of the existing copy_prop_vars pass. Separation makes it easier to expand the scope of both passes to be more than per-block. For copy propagation, the information about valid copies comes from previous instructions, while the dead write removal depends on information from later instructions ("has any instruction used this deref before it is overwritten?"). Also change the tests to use this pass (instead of copy prop vars). Note that the disabled tests continue to fail, since the standalone pass is still per-block. v2: Remove entries from dynarray instead of marking items as deleted. Use foreach_reverse. (Caio) (all from Jason) Do not cache nir_deref_path. Not worth it for this patch. Clear unused writes when hitting a call instruction. Clean up enumeration of modes for barriers. Move metadata calls to the inner function. v3: For copies, use the vector length to calculate the mask. (all from Jason) Use nir_component_mask_t when applicable. Rename functions for clarity. Consider local vars used by a call to be conservative (SPIR-V has such cases). Comment and assert the assumption that stores and copies are always to a deref that ends with a vector or scalar. Reviewed-by: Jason Ekstrand <[email protected]>
*	nir: Add an array copy optimization	Jason Ekstrand	2018-08-23	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	This peephole optimization looks for a series of load/store_deref or copy_deref instructions that copy an array from one variable to another and turns it into a copy_deref that copies the entire array. The pattern it looks for is extremely specific but it's good enough to pick up on the input array copies in DXVK and should also be able to pick up the sequence generated by spirv_to_nir for a OpLoad of a large composite followed by OpStore. It can always be improved later if needed. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	nir: Add a structure splitting pass	Jason Ekstrand	2018-08-23	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	This pass doesn't really do much now because nir_lower_vars_to_ssa can already see through structures and considers them to be "split". This pass exists to help other passes more easily see through structure variables. If a back-end does implement arrays using scratch or indirects on registers, having more smaller arrays is likely to have better memory efficiency. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	nir/linker: Add the start of a pure-NIR linker for XFB	Neil Roberts	2018-07-31	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	v2: ignore names on purpose, for consistency with other places where we are doing the same (Alejandro) v3: changes proposed by Timothy Arceri, implemented by Alejandro Piñeiro: * Remove redundant 'struct active_xfb_varying' * Update several comments, including spec quotes if needed * Rename struct 'active_xfb_varying_array' to 'active_xfb_varyings' * Rename variable 'array' to 'active_varyings' * Replace one if condition for an assert (<MAX_FEEDBACK_BUFFERS) * Remove BufferMode initialization (was already done) v4: simplify output pointer handling (Timothy) Signed-off-by: Neil Roberts <[email protected]> Signed-off-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	glsl: use only copy_propagation_elements	Caio Marcelo de Oliveira Filho	2018-07-27	1	-1/+0
\| \| \| \| \| \| \| \|	Now that the elements version handles both cases, remove the non-elements version. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Thomas Helland <[email protected]>
*	nir: add builtin builder	Karol Herbst	2018-07-24	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \|	Also move over some of the GLSL builtins we will need for implementing some OpenCL builtins. v2: replace NIR_IMM_FP by nir_imm_floatN_t in ported code; fix up changes caused by swizzle rework Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Karol Herbst <[email protected]>
*	nir/linker: Add a pure NIR implementation of the atomic counter linker	Neil Roberts	2018-07-03	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \|	This is mostly just a straight-forward conversion of link_assign_atomic_counter_resources to C directly using nir variables instead of GLSL IR variables. It is based on the version of link_assign_atomic_counter_resources in 6b8909f2d1906. I’m noting this here to make it easier to track changes and keep the NIR version up-to-date. Reviewed-by: Timothy Arceri <[email protected]>
*	nir: Add a large constants optimization pass	Jason Ekstrand	2018-07-02	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This pass searches for reasonably large local variables which can be statically proven to be constant and moves them into shader constant data. This is especially useful when large tables are baked into the shader source code because they can be moved into a UBO by the driver to reduce register pressure and make indirect access cheaper. v2 (Jason Ekstrand): - Use a size/align function to ensure we get the right alignments - Use the newly added deref offset helpers Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	nir: Delete lower_io_types	Jason Ekstrand	2018-06-22	1	-1/+0
\| \| \| \| \| \| \| \| \| \|	It's only used by the ir3 stand-alone compiler and Rob said we could delete it. Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	nir/lower_samplers: remove legacy version	Rob Clark	2018-06-22	1	-1/+0
\| \| \| \| \| \| \| \|	Signed-off-by: Rob Clark <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	nir/lower_samplers: split out _legacy version for deref chains	Rob Clark	2018-06-22	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	To simplify the transition, and make things bisectable, split out a legacy copy of lower_samplers. This way the i965 and gallium drivers can independently switch over to deref instructions. Since the lower_samplers_as_deref pass is only used by gallium drivers, it can be converted in lock-step with moving the lower_deref_instrs pass, and so does not need a corresponding _legacy clone. This legacy pass will be removed in a future commit. Signed-off-by: Rob Clark <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	nir: Add a concept of per-member structs and a lowering pass	Jason Ekstrand	2018-06-22	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	This adds a concept of "members" to a variable with an interface type. It allows you to specify the full variable data for each member of the interface instead of once for the variable. We also add a lowering pass to lower those variables to a sequence of variables and rewrite all the derefs accordingly. Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	nir: Add a deref path helper struct	Jason Ekstrand	2018-06-22	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This commit introduces a new nir_deref.h header for helpers that are less common and really only needed by a few heavy-duty passes. In this header is a new struct for representing a full deref path which can be walked in either direction. v2 (Jason Ekstrand): - Assert that deref != NULL (Caio) - Fill _short_path with 0xdeadbeef in debug builds when not used (Caio) - Make nir_deref_path a typedef (Rob) Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>