mesa.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message (Collapse)	Author	Age	Files	Lines
*	spirv/cl: support vload/vstore	Karol Herbst	2019-05-04	1	-0/+55
\| \| \| \| \|	Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	nir: Add nir_op_vec helper	Karol Herbst	2019-05-04	3	-22/+14
\| \| \| \| \| \| \| \| \|	with that we can simplify code where nir vectors are created v2: merge both lines in nir_vec Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	nir: Add a nir_builder_alu variant which takes an array of components	Karol Herbst	2019-05-04	1	-14/+36
\| \| \| \| \| \| \|	v2: rename to nir_build_alu_src_arr Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	vtn: handle bitcast with pointer src/dest	Karol Herbst	2019-05-04	3	-29/+45
\| \| \| \| \| \| \|	v2: use vtn_push_ssa and vtn_ssa_value Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	nir: Add a SSA type gathering pass	Jason Ekstrand	2019-05-04	4	-0/+223
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This new pass (which isn't even compile-tested) attempts to determine the ALU type of all the SSA values in a function impl. It takes a greedy approach and assigns intness or floatness to everything it thinks can possibly contain an int or a float. Some values will be labeled as both int and float and some will be labeled as neither and it is up to the caller to decide what to do with this information. However, for a "nice" shader where the original source contained no bit-casts and no implicit bit-casts were introduced by optimizations, there shouldn't be any overlap in the two sets save for the odd CSEd zero constant. Reviewed-by: Vasily Khoruzhick <[email protected]>
*	nir/algebraic: Don't emit empty initializers for MSVC	Connor Abbott	2019-05-04	1	-0/+4
\| \| \| \| \| \| \| \| \|	Just don't emit the transform array at all if there are no transforms v2: - Don't use len(array) > 0 (Dylan) - Keep using ARRAY_SIZE to make the generated C code easier to read (Jason).
*	meson: Don't build glsl cache_test when shader cache is disabled	Dylan Baker	2019-05-03	1	-12/+13
\| \| \| \| \| \| \|	v2: - Use new with_shader_cache variable instead of host_machine.system() == 'windows' Reviewed-by: Eric Anholt <[email protected]>
*	glsl/tests: define ssize_t on windows	Dylan Baker	2019-05-03	1	-0/+4
\| \| \| \|	Reviewed-by: Eric Anholt <[email protected]>
*	glsl: fix general_ir_test with mingw	Dylan Baker	2019-05-03	1	-7/+7
\| \| \| \| \| \| \| \|	Somewhere down in the depths of the mingw headers 'interface' is defined, change it to iface like a similar patch did. Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	nir: fix lower vars to ssa for larger vector sizes.	Dave Airlie	2019-05-03	1	-4/+4
\| \| \| \| \| \| \|	This has a couple of hardcoded vec4 limits in it, change them to the proper sizing to avoid future issues. Reviewed-by: Jason Ekstrand <[email protected]>
*	spirv: fix SpvOpBitSize return value.	Dave Airlie	2019-05-03	1	-3/+1
\| \| \| \| \| \|	The spir-v spec says this returns a bool. Reviewed-by: Jason Ekstrand <[email protected]>
*	nir: fix nir tex print harder	Rob Clark	2019-05-02	1	-6/+5
\| \| \| \| \| \|	Fixes: 691d5a825a6 nir: rework tex instruction printing Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Rob Clark <[email protected]>
*	glsl: fix and clean up NV_compute_shader_derivatives support	Marek Olšák	2019-05-02	1	-54/+24
\| \| \| \| \| \| \|	- make sure compute shader derivatives are exposed for all extensions - unify duplicated code Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	nir: add pass to lower fb reads	Rob Clark	2019-05-02	5	-6/+141
\| \| \| \| \|	Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
*	nir: fix lower_wpos_ytransform in load_frag_coord case	Rob Clark	2019-05-02	1	-10/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Apparently we never hit this path. Or at least haven't for a rather long time. But in either case (load_deref or load_frag_coord), we can just directly use the intrinsic's ssa dest. So stop passing the nir_variable (which would be NULL in the load_frag_coord case) around and instead just use &intr->dest.ssa. (This ofc means we need to setup the cursor to insert after the instruction, which seems to be another bug of the original implementation.) Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
*	nir: rework tex instruction printing	Rob Clark	2019-05-02	1	-8/+10
\| \| \| \| \| \| \|	The extra comma at the end was annoying me. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
*	nir/search: Add debugging code to dump the pattern matched	Connor Abbott	2019-05-02	1	-0/+75
\| \| \| \| \| \|	This was useful while debugging the previous commit. Reviewed-by: Jason Ekstrand <[email protected]>
*	nir/search: Add automaton-based pre-searching	Connor Abbott	2019-05-02	3	-19/+425
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	nir_opt_algebraic is currently one of the most expensive NIR passes, because of the many different patterns we've added over the years. Even though patterns are already sorted by opcode, there are still way too many patterns for common opcodes like bcsel and fadd, which means that many patterns are tried but only a few actually match.

One way to fix this is to add a pre-pass over the code that scans it using an automaton constructed beforehand, similar to the automatons produced by lex and yacc for parsing source code. This automaton has to walk the SSA graph and recognize possible pattern matches. It turns out that the theory to do this is quite mature already, having been developed for instruction selection as well as other non-compiler things. I followed the presentation in the dissertation cited in the code, "Tree algorithms: Two Taxonomies and a Toolkit," trying to keep the naming similar.

To create the automaton, we have to perform something like the classical NFA to DFA subset construction used by lex, but it turns out that actually computing the transition table for all possible states would be way too expensive, with the dissertation reporting times of almost half an hour for an example of size similar to nir_opt_algebraic. Instead, we adopt one of the "filter" approaches explained in the dissertation, which trade much faster table generation and table size for a few more table lookups per instruction at runtime. I chose the filter which resulted in the fastest table generation time, with medium table size. Right now, the table generation takes around .5 seconds, despite being implemented in pure Python, which I think is good enough. Based on the numbers in the dissertation, the other choice might make table compilation time 25x slower to get 4x smaller table size, but I don't think that's worth it.

As of now, we get the following binary size before and after this patch:

text data bss dec hex filename
11979455 464720 730864 13175039 c908ff before i965_dri.so

text data bss dec hex filename
12037835 616244 791792 13445871 cd2aef after i965_dri.so

There are a number of places where I've simplified the automaton by getting rid of details in the LHS patterns rather than complicate things to deal with them. For example, right now the automaton doesn't distinguish between constants with different values. This means that it isn't as precise as it could be, but the decrease in compile time is still worth it -- these are the compilation time numbers for a shader-db run with my (admittedly old) database on Intel skylake:

Difference at 95.0% confidence
-42.3485 +/- 1.375
-7.20383% +/- 0.229926%
(Student's t, pooled s = 1.69843)

We can always experiment with making it more precise later.

Reviewed-by: Jason Ekstrand <[email protected]>
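The pre-search idea can be illustrated with a small, hedged sketch. It is not the code from this commit: the real implementation generates tables ahead of time and uses one of the dissertation's "filter" schemes, whereas this toy version fills its transition table lazily at runtime. The names `make_prefilter`, `subpatterns`, `candidates` and the tuple encoding of expressions are all assumptions made for illustration. As in the commit, leaves are not distinguished by value, so the result is a list of patterns that *may* match and still need the full matcher.

```python
WILDCARD = "_"  # pattern leaf standing for any variable or constant


def subpatterns(pattern):
    """Yield a pattern and every sub-pattern nested inside it."""
    yield pattern
    if isinstance(pattern, tuple):
        for child in pattern[1:]:
            yield from subpatterns(child)


def make_prefilter(patterns):
    """Build a bottom-up tree automaton; states are sets of sub-patterns."""
    universe = {s for p in patterns for s in subpatterns(p)}
    table = {}  # (opcode, child states...) -> state, filled lazily

    def state_of(expr):
        if isinstance(expr, tuple):
            key = (expr[0],) + tuple(state_of(c) for c in expr[1:])
        else:
            key = ("leaf",)  # hypothetical marker for variables/constants
        if key not in table:
            table[key] = frozenset(
                p for p in universe
                if p == WILDCARD or (
                    isinstance(p, tuple)
                    and p[0] == key[0]
                    and len(p) == len(key)
                    and all(sub in st for sub, st in zip(p[1:], key[1:]))
                )
            )
        return table[key]

    def candidates(expr):
        """Return the patterns worth trying with the full matcher."""
        state = state_of(expr)
        return [p for p in patterns if p in state]

    return candidates


patterns = [
    ("fadd", ("fmul", "_", "_"), "_"),
    ("bcsel", "_", "_", "_"),
    ("fadd", "_", ("fneg", "_")),
]
candidates = make_prefilter(patterns)
print(candidates(("fadd", ("fmul", "a", "b"), "c")))
```

Each distinct combination of opcode and child states is computed once and then looked up, which is the property that lets a pre-pass discard most patterns cheaply before the expensive matcher runs.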
*	glsl: fix typo in #warning message	Brian Paul	2019-05-02	1	-1/+1
\| \| \| \|	Trivial. Spotted by Eric Engestrom.
*	glsl: work around MinGW 7.x compiler bug	Brian Paul	2019-05-01	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \|	I'm not sure what triggered this, but building with scons platform=windows toolchain=crossmingw machine=x86 build=profile with MinGW g++ 7.3 or 7.4 causes an internal compiler error. We can work around it by forcing -O1 optimization. Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Neha Bhende <[email protected]>
*	nir: Saturating integer arithmetic is not associative	Ian Romanick	2019-05-01	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In 8-bits,

iadd_sat(iadd_sat(0x7f, 0x7f), -1) = iadd_sat(0x7f, -1) = 0x7e

but,

iadd_sat(0x7f, iadd_sat(0x7f, -1)) = iadd_sat(0x7f, 0x7e) = 0x7f

Fixes: 272e927d0e9 ("nir/spirv: initial handling of OpenCL.std extension opcodes") Reviewed-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
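The counterexample in this message can be reproduced with a short sketch. The Python function below is a hypothetical model of a signed saturating add, written for illustration only; it is not NIR's implementation.

```python
def iadd_sat(a: int, b: int, bits: int = 8) -> int:
    """Signed saturating add: clamp the exact sum to the signed range."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, a + b))


left = iadd_sat(iadd_sat(0x7f, 0x7f), -1)   # (a + b) + c
right = iadd_sat(0x7f, iadd_sat(0x7f, -1))  # a + (b + c)
print(hex(left), hex(right))  # prints "0x7e 0x7f"
```

The two groupings disagree because the inner sum saturates in one order and not in the other, so a reassociating optimization could change the result.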
*	nir: improve convert_yuv_to_rgb	Jonathan Marek	2019-05-01	1	-15/+14
\| \| \| \| \| \| \| \| \| \| \| \|	Use a different arrangement of constants to allow more ffma. A vec4 backend will now use 3 fma for yuv_to_rgb. On freedreno/ir3, it is down from 10 to 7 alu (4 fma, 3 mul, 3 add to 7 fma). Other backends shouldn't be hurt. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Tested-by: Ian Romanick <[email protected]>
*	spirv: add missing SPV_EXT_descriptor_indexing capabilities	Juan A. Suarez Romero	2019-04-30	2	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Add ShaderNonUniformEXT, UniformBufferArrayNonUniformIndexingEXT, SampledImageArrayNonUniformIndexingEXT, StorageBufferArrayNonUniformIndexingEXT, StorageImageArrayNonUniformIndexingEXT, InputAttachmentArrayNonUniformIndexingEXT, UniformTexelBufferArrayNonUniformIndexingEXT and StorageTexelBufferArrayNonUniformIndexingEXT capabilities. Cc: [email protected] Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	spirv: Properly handle SpvOpAtomicCompareExchangeWeak	Caio Marcelo de Oliveira Filho	2019-04-29	1	-75/+82
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The code was handling the Weak variant in some cases, but missing others, e.g. the get_deref_nir_atomic_op. Add all the missing cases with the same behavior of the non-Weak SpvOpAtomicCompareExchange. Note that the Weak variant is basically an alias, as SPIR-V 1.3, Revision 7 says "OpAtomicCompareExchangeWeak Deprecated (use OpAtomicCompareExchange). Has the same semantics as OpAtomicCompareExchange." Reviewed-by: Jason Ekstrand <[email protected]>
*	delete autotools .gitignore files	Eric Engestrom	2019-04-29	9	-45/+0
\| \| \| \| \| \| \| \|	One special case, `src/util/xmlpool/.gitignore` is not entirely deleted, as `xmlpool.pot` still gets generated (eg. by `ninja xmlpool-pot`). Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
*	glsl/linker: check for xfb_offset aliasing	Andres Gomez	2019-04-29	2	-31/+84
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	From page 76 (page 80 of the PDF) of the GLSL 4.60 v.5 spec:

  " No aliasing in output buffers is allowed: It is a compile-time or link-time error to specify variables with overlapping transform feedback offsets."

Currently, this is expected to fail, but it succeeds:

  "
  ...
  layout (xfb_offset = 0) out vec2 a;
  layout (xfb_offset = 0) out vec4 b;
  ...
  "

Fixes the following piglit test:
tests/spec/arb_enhanced_layouts/compiler/transform-feedback-layout-qualifiers/xfb_offset/invalid-overlap.vert

Fixes the following test:
KHR-GL44.enhanced_layouts.xfb_output_overlapping

v2: - Use a data structure to track the used components instead of a nested loop (Ilia).

v3: - Take the BITSET_WORD array out from the gl_transform_feedback_buffer struct and make it local to the validation process (Timothy).
    - Do not use a nested scope for the validation (Timothy).

v4: - Add reference to the fixed piglit test in the commit log.
    - Add reference to the fixed VK-GL-CTS test in the commit log (Tapani).
    - Empty initialize the BITSET_WORD pointers array (Tapani).

Cc: Timothy Arceri <[email protected]> Cc: Ilia Mirkin <[email protected]> Signed-off-by: Andres Gomez <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
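A rough sketch of the "track the used components" idea from v2 follows. It is an illustration under assumptions, not the linker's code: the function name `find_xfb_overlap`, the tuple layout, and the use of a Python integer as the bitset are all hypothetical, and offsets are assumed to be multiples of 4 bytes.

```python
def find_xfb_overlap(outputs):
    """Return the name of the first output overlapping an earlier one.

    outputs: iterable of (name, buffer, byte_offset, byte_size) tuples.
    Returns None when no two outputs alias.
    """
    used = {}  # buffer index -> int used as a bitset, one bit per 4 bytes
    for name, buffer, offset, size in outputs:
        first = offset // 4          # first 4-byte component covered
        count = (size + 3) // 4      # number of components covered
        mask = ((1 << count) - 1) << first
        if used.get(buffer, 0) & mask:
            return name              # some component was already claimed
        used[buffer] = used.get(buffer, 0) | mask
    return None


# The failing case from the commit message: vec2 a and vec4 b at offset 0.
shader = [("a", 0, 0, 8), ("b", 0, 0, 16)]
print(find_xfb_overlap(shader))  # prints "b"
```

One pass with a bitset per buffer replaces the pairwise comparison of every output against every other output.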
*	nir: Add a new nir_cf_list_is_empty_block() helper.	Kenneth Graunke	2019-04-28	1	-0/+15
\| \| \| \| \| \| \|	Helper and name suggested by Eric Anholt. Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	glsl/list: Add an exec_list_is_singular() helper.	Kenneth Graunke	2019-04-28	1	-0/+7
\| \| \| \| \| \| \|	Similar to list_is_singular() in util/list.h. Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	nir: add rcp(w) lowering for gl_FragCoord	Andreas Baierl	2019-04-29	4	-0/+84
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	On some hardware (e.g. Mali400) the shader needs to apply some transformations for correct gl_FragCoord handling. The lowering actions look like the following in pseudocode:

gl_FragCoord.xyz = gl_FragCoord_orig.xyz
gl_FragCoord.w = 1.0 / gl_FragCoord_orig.w

Add this lowering as a nir pass in preparation for using it in the driver. Signed-off-by: Andreas Baierl <[email protected]> Reviewed-by: Qiang Yu <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
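The pseudocode in this message amounts to the following value-level transformation. This is only a sketch on plain tuples; the function name `lower_frag_coord_rcp_w` is hypothetical and the actual pass rewrites NIR instructions rather than computing values.

```python
def lower_frag_coord_rcp_w(frag_coord_orig):
    """Keep x, y, z unchanged and replace w with its reciprocal."""
    x, y, z, w = frag_coord_orig
    return (x, y, z, 1.0 / w)


print(lower_frag_coord_rcp_w((10.5, 20.5, 0.25, 4.0)))
```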
*	glsl: use empty brace initializer	Tapani Pälli	2019-04-26	1	-2/+2
\| \| \| \| \| \| \| \| \|	fixes following warning with clang: warning: suggest braces around initialization of subobject Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
*	nir: use braces around subobject in initializer	Tapani Pälli	2019-04-26	2	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Used same syntax as elsewhere with Mesa sources, verified result against MSVC with godbolt.org. fixes following warning with clang: warning: suggest braces around initialization of subobject v2: empty braces -> braces around subobject (Caio, Kristian) Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
*	nir/algebraic: Optimize integer cast-of-cast	Jason Ekstrand	2019-04-26	1	-0/+42
\| \| \| \| \| \| \|	These have been popping up more and more with the OpenCL work and other bits causing extra conversions to/from 64-bit. Reviewed-by: Karol Herbst <[email protected]>
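Why a cast of a cast can often be collapsed is easy to check numerically. The sketch below models integer conversions with hypothetical helpers `u2u` and `i2i` that take the target bit size as a parameter; the identities shown are general two's-complement facts chosen for illustration, not the specific rules added by this commit.

```python
def u2u(x: int, bits: int) -> int:
    """Unsigned conversion: keep the low `bits` bits."""
    return x & ((1 << bits) - 1)


def i2i(x: int, bits: int) -> int:
    """Signed conversion: keep the low `bits` bits as two's complement."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


samples = [0, 1, 0x7f, 0x80, 0xffff, 0x12345678, 0xffffffff]

# Widening to 64 bits and narrowing back to 32 bits is a no-op.
widen_then_narrow_ok = all(u2u(u2u(x, 64), 32) == x for x in samples)

# Narrowing in two steps gives the same result as narrowing once.
narrow_twice_ok = all(
    i2i(i2i(i2i(x, 32), 16), 8) == i2i(i2i(x, 32), 8) for x in samples
)
print(widen_then_narrow_ok, narrow_twice_ok)
```

Both checks hold because every conversion here only depends on the low bits of its input, so an intermediate wider or narrower type that preserves those bits can be skipped.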
*	nir: fix bit_size in lower indirect derefs.	Dave Airlie	2019-04-26	1	-1/+1
\| \| \| \| \| \| \| \|	This fixes a case where we are expecting 64-bit but generate 32-bit consts and validate gets angry. Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
*	glsl: fix shader_storage_blocks_write_access for SSBO block arrays (v2)	Marek Olšák	2019-04-25	1	-3/+19
\| \| \| \| \| \| \| \| \| \|	This fixes KHR-GL45.compute_shader.resources-max on radeonsi. Fixes: 4e1e8f684bf "glsl: remember which SSBOs are not read-only and pass it to gallium" v2: use is_interface_array, protect against assertion failures in u_bit_consecutive Reviewed-by: Dave Airlie <[email protected]>
*	freedreno/ir3: lower load_barycentric_at_offset	Rob Clark	2019-04-25	1	-0/+3
\| \| \| \| \| \| \| \| \|	Calculates i,j at specified offset within a pixel. A new load_size_ir3 intrinsic is used in conjunction with fddx/fddy to translate the offset into primitive space and adjust the i,j from load_barycentric_pixel accordingly. Signed-off-by: Rob Clark <[email protected]>
*	freedreno/ir3: lower load_barycentric_at_sample	Rob Clark	2019-04-25	1	-0/+7
\| \| \| \| \| \| \|	This lowers load_barycentric_at_sample to load_sample_pos_from_id plus load_barycentric_at_offset. Signed-off-by: Rob Clark <[email protected]>
*	compiler: rename SYSTEM_VALUE_VARYING_COORD	Rob Clark	2019-04-25	2	-3/+12
\| \| \| \| \| \| \|	And add corresponding enums for different sorts of varying interpolation. Signed-off-by: Rob Clark <[email protected]>
*	nir: Add option to lower tex to txl when shaders don't support implicit LOD	Caio Marcelo de Oliveira Filho	2019-04-25	2	-0/+8
\| \| \| \| \| \| \| \| \| \| \|	We already add the LOD src, so go ahead and update the texop as well when this option is set. v2: Make it an option. (Rob Clark) v3: Use a more concise name suggested by Jason. Reviewed-by: Rob Clark <[email protected]>
*	nir: fix nir_remove_unused_varyings()	Timothy Arceri	2019-04-25	1	-18/+33
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We were only setting the used mask for the first component of a varying. Since the linking opts split vectors into scalars this has mostly worked ok. However this causes an issue where for example if we split a struct on one side of the interface but not the other, then we can possibly end up removing the first components on the side that was split and then incorrectly remove the whole struct on the other side of the varying. With this change we simply mark all 4 components for each slot used by a struct. We could possibly make this more fine-grained but that would require a more complex change. This fixes a bug in Strange Brigade on RADV when tessellation is enabled, all credit goes to Samuel Pitoiset for tracking down the cause of the bug. Fixes: f1eb5e639997 ("nir: add component level support to remove_unused_io_vars()") Reviewed-by: Samuel Pitoiset <[email protected]>
*	glsl: handle interactions between EXT_gpu_shader4 and texture extensions	Marek Olšák	2019-04-24	3	-323/+412
\| \| \| \| \| \|	also, EXT_texture_buffer_object has to be enabled separately. Reviewed-by: Eric Anholt <[email protected]>
*	glsl: allow "varying out" for fragment shader outputs with EXT_gpu_shader4	Marek Olšák	2019-04-24	3	-2/+19
\| \| \| \|	Reviewed-by: Eric Anholt <[email protected]>
*	glsl: add texture builtin functions for EXT_gpu_shader4	Marek Olšák	2019-04-24	1	-25/+667
\| \| \| \| \| \| \| \| \|	v2: some fixes to texture functions thanks to piglit tests Reviewed-by: Timothy Arceri <[email protected]> (v1) Reviewed-by: Ian Romanick <[email protected]> (v1) Tested-by: Dieter Nützel <[email protected]> (v1) Reviewed-by: Eric Anholt <[email protected]>
*	glsl: add arithmetic builtin functions for EXT_gpu_shader4	Marek Olšák	2019-04-24	1	-13/+35
\| \| \| \| \| \| \|	Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	glsl: add builtin variables for EXT_gpu_shader4	Marek Olšák	2019-04-24	1	-3/+4
\| \| \| \| \| \| \|	Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	glsl: apply some 1.30 and other rules to EXT_gpu_shader4 as well	Marek Olšák	2019-04-24	3	-8/+12
\| \| \| \|	Reviewed-by: Eric Anholt <[email protected]>
*	glsl: enable types for EXT_gpu_shader4	Chris Forbes	2019-04-24	2	-25/+57
\| \| \| \| \| \| \|	Reviewed-by: Timothy Arceri <[email protected]> Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	glsl: add `unsigned int` type for EXT_GPU_shader4	Marek Olšák	2019-04-24	2	-2/+11
\| \| \| \| \| \| \|	Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	glsl: enable noperspective\|flat\|centroid for EXT_gpu_shader4	Chris Forbes	2019-04-24	1	-3/+3
\| \| \| \| \| \| \|	Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	glsl: add scaffolding for EXT_gpu_shader4	Chris Forbes	2019-04-24	3	-0/+4
\| \| \| \| \| \| \|	Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	glsl: Silence many unused parameter warnings in glsl/ir.h	Ian Romanick	2019-04-23	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \|	Every file that included glsl/ir.h had a warning like:

src/compiler/glsl/ir.h: In member function ‘virtual bool ir_rvalue::is_lvalue(const _mesa_glsl_parse_state*) const’:
src/compiler/glsl/ir.h:236:64: warning: unused parameter ‘state’ [-Wunused-parameter]
virtual bool is_lvalue(const struct _mesa_glsl_parse_state *state = NULL) const
^

Cc: Samuel Pitoiset <[email protected]> Fixes: fa4ebf6b8d9 ("glsl: add _mesa_glsl_parse_state object to is_lvalue()") Reviewed-by: Sagar Ghuge <[email protected]> Reviewed-by: Matt Turner <[email protected]>