mesa.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message (Collapse)	Author	Age	Files	Lines
*	i915: Add XRGB8888 format to intel_screen_make_configs	Derek Foreman	2017-01-13	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is a copy of commit 536003c11e4cb1172c540932ce3cce06f03bf44e except for i915. Original log for the i965 commit follows: Some application, such as drm backend of weston, uses XRGB8888 config as default. i965 doesn't provide this format, but before commit 65c8965d, the drm platform of EGL takes ARGB8888 as XRGB8888. Now that commit 65c8965d makes EGL recognize format correctly so weston won't start because it can't find XRGB8888. Add XRGB8888 format to i965 just as other drivers do. Signed-off-by: Derek Foreman <[email protected]> Acked-by: Boyan Ding <[email protected]> Tested-by: Mark Janes <[email protected]>
*	nir/i965: assert first is always less than 64	Juan A. Suarez Romero	2017-01-12	1	-0/+1
\| \| \| \| \| \|	This fixes a defect detected by Coverity Scan. Reviewed-by: Iago Toral Quiroga <[email protected]>
*	i965/gen7: expose OpenGL 4.2 on Haswell when supported	Juan A. Suarez Romero	2017-01-12	2	-2/+2
\| \| \| \| \| \| \| \| \|	GL_ARB_vertex_attrib_64bit was the last piece missing. v2: update docs (Jordan) Signed-off-by: Juan A. Suarez Romero <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	i965: enable ARB_shader_precision to HSW+	Samuel Iglesias Gonsálvez	2017-01-12	1	-1/+1
\| \| \| \| \| \| \| \|	v2: update docs (Jordan) Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Signed-off-by: Juan A. Suarez Romero <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	i965: unify the code to enable of ARB_gpu_shader_fp64 and ↵	Samuel Iglesias Gonsálvez	2017-01-12	1	-7/+2
\| \| \| \| \| \| \| \| \|	ARB_vertex_attrib_64bit for HSW+ Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Signed-off-by: Juan A. Suarez Romero <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	i965: Enable ARB_vertex_attrib_64bit for Haswell	Alejandro Piñeiro	2017-01-12	1	-1/+3
\| \| \| \| \| \| \| \|	v2: update docs (Jordan) Signed-off-by: Alejandro Piñeiro <[email protected]> Signed-off-by: Juan A. Suarez Romero <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	i965: check for dual slot attributes on any gen	Juan A. Suarez Romero	2017-01-12	1	-2/+1
\| \| \| \| \| \| \|	Those not supporting 64 bit input vertex attributes will have the dual_slot value as false. Reviewed-by: Jordan Justen <[email protected]>
*	i965/vec4: emit correctly load_inputs for 64bit data	Juan A. Suarez Romero	2017-01-12	1	-6/+15
\| \| \| \| \| \| \| \| \| \| \| \| \|	For dvec3 and dvec4 types, a single GRF do not have enough space to allocate two inputs from two different vertices (SIMD4x2). So the GRF only contains first two components for the two vertices, and the next GRF has the remaining components. We want to put all the components for the same vertex in the same register. Thus, we do a shuffle to reorder the data. Reviewed-by: Jordan Justen <[email protected]>
*	i965/vec4: take into account doubles when creating attribute mapping	Alejandro Piñeiro	2017-01-12	1	-4/+9
\| \| \| \| \| \| \| \| \| \|	Doubles needs more that one slot per attribute. So when filling the attribute_map we check if it is a double in order to allocate one extra register. Signed-off-by: Alejandro Piñeiro <[email protected]> Signed-off-by: Juan A. Suarez Romero <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	i965/vec4/nir: vec4 also needs to remap vs attributes	Alejandro Piñeiro	2017-01-12	1	-10/+22
\| \| \| \| \| \| \| \| \| \| \| \| \|	Doubles need extra space, so we would need to do a remapping for vec4 too in order to take that into account. We reuse the already existing remap_vs_attrs, but passing is_scalar, so they could remap accordingly. v2: code-format remap_vs_attrs_params initialization (Matt) Signed-off-by: Alejandro Piñeiro <[email protected]> Signed-off-by: Juan A. Suarez Romero <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	i965/vec4: use attribute slots for first non payload GRF	Alejandro Piñeiro	2017-01-12	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As part of the payload setup, setup_attributes is called with the first GRF that can be used for the attributes (first ones are used for uniforms for example) and returns the first GRF that is not part of the payload. Before this patch, it adds directly the number of attributes. But as with 64-bit attributes can consume more than one slot, that is not valid anymore. This patch change the addition to use the number of slots consumed. gen >= 8 would not be affected, as they use the scalar mode. For that case, the vs configuration is done at fs_visitor::assign_vs_urb_setup. v2: add explanation in commit log (Jordan) Reviewed-by: Jordan Justen <[email protected]>
*	i965: downsize 64PASSTHRU formats to equivalent 32FLOAT formats on gen < 8	Alejandro Piñeiro	2017-01-12	1	-30/+139
\| \| \| \| \| \| \| \| \| \| \| \|	gen < 8 doesn't support 64PASSTHRU formats when emitting vertices. So in order to provide the equivalent functionality, we need to downsize the format to equivalent 32FLOAT, and in some cases (R64G64B64 and R64G64B64A64) submit two 3DSTATE_VERTEX_ELEMENTS for each vertex element. Signed-off-by: Alejandro Piñeiro <[email protected]> Signed-off-by: Juan A. Suarez Romero <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	i965: return PASSTHRU surface types also on gen7	Alejandro Piñeiro	2017-01-12	1	-2/+6
\| \| \| \| \| \| \| \|	Although gen7 doesn't include surface types as a valid conversion format, we return it, as it reflects what we want to achieve, even if we need to workaround it on gen < 8. Reviewed-by: Jordan Justen <[email protected]>
*	i965: Enable predicate support on gen >= 8.	Rafael Antognolli	2017-01-11	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Predication needs cmd parser only on gen7. For newer platforms, it should be available without it. v2 (Ken): rebase on recent changes. Signed-off-by: Rafael Antognolli <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]>
*	i965: Use the nir_move_comparisons pass.	Kenneth Graunke	2017-01-12	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	While the below stats are encouraging this pass will also become very usefull for avoiding regression once brw_do_channel_expressions() and brw_do_vector_splitting() are disabled. On Broadwell: total instructions in shared programs: 13078787 -> 13060898 (-0.14%) instructions in affected programs: 1809827 -> 1791938 (-0.99%) helped: 4527 HURT: 157 total cycles in shared programs: 256562762 -> 256590424 (0.01%) cycles in affected programs: 159749392 -> 159777054 (0.02%) helped: 5583 HURT: 2289 total spills in shared programs: 14929 -> 14923 (-0.04%) spills in affected programs: 62 -> 56 (-9.68%) helped: 1 HURT: 0 total fills in shared programs: 20144 -> 20141 (-0.01%) fills in affected programs: 253 -> 250 (-1.19%) helped: 1 HURT: 3 LOST: 0 GAINED: 2 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Move nir_lower_locals_to_regs a bit later.	Kenneth Graunke	2017-01-12	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	I'm going to add a boolean scheduling pass that I want run late, but after copy propagation and dead code elimination. Yet, I don't want to have to think about registers. So, move the register conversion a little later. No impact on shader-db. Suggested by Jason Ekstrand. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	compiler: Merge shader_info's tcs and tes structs.	Kenneth Graunke	2017-01-10	5	-20/+22
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Annoyingly, SPIR-V lets you specify all of these fields in either the TCS or TES, which means that we need to be able to store all of them for either shader stage. Putting them in a union won't work. Combining both is an easy solution, and given that the TCS struct only had a single field, it's pretty inexpensive. This patch renames the combined struct to "tess" to indicate that it's for tessellation in general, not one of the two stages. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Fix number of slots in SSO mode when there are no user varyings.	Kenneth Graunke	2017-01-09	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We want vue_map->num_slots to be one more than the final slot. When assigning fixed slots, built-in slots, and non-SSO user varyings, we do slot++. This leaves "slot" as one past the most recently assigned slot. But for SSO user varyings, we computed slot based on the varying location value...and left it at that slot value. To work around this inconsistency, I made num_slots be "slot + 1" if separate and "slot" otherwise. The problem is...if there are no user varyings in SSO mode...then we would have done slot++ when assigning built-ins, so it would be off by one. This resulted in loops from 0 to vue_map->num_slots hitting a bonus BRW_VARYING_SLOT_PAD at the end. This used to break the SIMD8 VS/TES backends, but I fixed that in commit 480d6c1653713dcae617ac523b2ca5deee01c845. It's probably safe at this point, but we should fix it anyway. To fix this, do slot++ in all cases. For SSO mode, we overwrite slot for every varying, so this increment only matters on the last varying. Because we process varyings in order, this will set slot to 1 more than the highest assigned slot. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	nir/i965: use two slots from inputs_read for dvec3/dvec4 vertex input attributes	Juan A. Suarez Romero	2017-01-09	6	-29/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	So far, input_reads was a bitmap tracking which vertex input locations were being used. In OpenGL, an attribute bigger than a vec4 (like a dvec3 or dvec4) consumes just one location, any other small attribute. So we mark the proper bit in inputs_read, and also the same bit in double_inputs_read if the attribute is a dvec3/dvec4. But in Vulkan, this is slightly different: a dvec3/dvec4 attribute consumes two locations, not just one. And hence two bits would be marked in inputs_read for the same vertex input attribute. To avoid handling two different situations in NIR, we just choose the latest one: in OpenGL, when creating NIR from GLSL/IR, any dvec3/dvec4 vertex input attribute is marked with two bits in the inputs_read bitmap (and also in the double_inputs_read), and following attributes are adjusted accordingly. As example, if in our GLSL/IR shader we have three attributes: layout(location = 0) vec3 attr0; layout(location = 1) dvec4 attr1; layout(location = 2) dvec3 attr2; then in our NIR shader we put attr0 in location 0, attr1 in locations 1 and 2, and attr2 in location 3 and 4. Checking carefully, basically we are using slots rather than locations in NIR. When emitting the vertices, we do a inverse map to know the corresponding location for each slot. v2 (Jason): - use two slots from inputs_read for dvec3/dvec4 NIR from GLSL/IR. v3 (Jason): - Fix commit log error. - Use ladder ifs and fix braces. - elements_double is divisible by 2, don't need DIV_ROUND_UP(). - Use if ladder instead of a switch. - Add comment about hardware restriction in 64bit vertex attributes. Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: call intel_prepare_render always when reading pixels	Tapani Pälli	2017-01-09	1	-6/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently we do this only in the fallback code (when tiled memcpy version failed) but it needs to be done always so that we have correct read and write buffer in place. No regressions seen in CI. Fixes: dEQP-EGL.functional.buffer_age.* Signed-off-by: Tapani Pälli <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98330 Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
*	i965: Move TES input VUE map calculation out a layer.	Kenneth Graunke	2017-01-07	3	-9/+11
\| \| \| \| \| \| \| \| \| \| \|	In Vulkan, we'll compile the TCS and TES at the same time, so I can just pass the TCS output VUE map to brw_compile_tes as the TES input VUE map. So, we only need to do this in GL. Move it to the GL-specific layer. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Pass NULL for gl_program when compiling TES.	Kenneth Graunke	2017-01-07	1	-1/+1
\| \| \| \| \| \| \| \|	This isn't needed, and Vulkan doesn't have one. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Move TES spacing/domain/topology setup to brw_compile_tes().	Kenneth Graunke	2017-01-07	2	-33/+34
\| \| \| \| \| \| \| \|	Moving this down a layer lets us share code between Vulkan and GL. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Access TES shader info via NIR.	Kenneth Graunke	2017-01-07	1	-6/+6
\| \| \| \| \| \| \| \|	NIR exists in both GL and Vulkan, but gl_program is GL specific. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	mesa: Introduce a compiler enum for tessellation spacing.	Kenneth Graunke	2017-01-07	2	-15/+9
\| \| \| \| \| \| \| \| \| \|	It feels weird using GL_* enums in a Vulkan driver. v2: Fix the TESS_SPACING -> PIPE_TESS_SPACING conversion. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	compiler: Change shader_info->tes.vertex_order into a ccw boolean.	Kenneth Graunke	2017-01-07	1	-10/+3
\| \| \| \| \| \| \| \| \| \|	The vertex order is either clockwise or counterclockwise. We can just store a "ccw" boolean rather than GLenum values. I don't want to use GLenums in a Vulkan driver, and even in GL a simple boolean works fine. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	drirc: Allow extension midshader for Divinity: Original Sin (EE)	Kai Wasserbäch	2017-01-07	1	-0/+4
\| \| \| \| \| \| \| \|	See also <https://bugs.freedesktop.org/show_bug.cgi?id=93551#c27> where this was first observed as a requirement. Signed-off-by: Kai Wasserbäch <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
*	i965/compiler: Use the new nir_opt_copy_prop_vars pass	Jason Ekstrand	2017-01-06	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We run this after nir_lower_vars_to_ssa so that as many load/store_var intrinsics as possible before copy_prop_vars executes. This is because the pass isn't particularly efficient (it does a lot of linear walks of a linked list) so we'd like as much of the work as possible to be done before copy_prop_vars runs. Shader DB results on Sky Lake: total instructions in shared programs: 12020290 -> 12013627 (-0.06%) instructions in affected programs: 26033 -> 19370 (-25.59%) helped: 16 HURT: 13 total cycles in shared programs: 137772848 -> 137549012 (-0.16%) cycles in affected programs: 6955660 -> 6731824 (-3.22%) helped: 217 HURT: 237 total loops in shared programs: 3208 -> 3208 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total spills in shared programs: 4112 -> 4057 (-1.34%) spills in affected programs: 483 -> 428 (-11.39%) helped: 2 HURT: 0 total fills in shared programs: 5519 -> 5102 (-7.56%) fills in affected programs: 993 -> 576 (-41.99%) helped: 2 HURT: 0 LOST: 0 GAINED: 0 Broadwell had similar results. On older hardware, the impact isn't as large because they don't advertise GL 4.5. Of the hurt programs, all but one are hurt by a single instruction and the one is hurt by 3 instructions. All of the helped programs, on the other hand, are helped by at least 3 instructions and one kerbal space program shader is helped by 44.59%. The real star of the show, however, is the Gl43CSDof synmark2 benchmark which has two shaders which are cut by 28% and 40% and the over-all runtime performance of the benchmark on my Sky Lake laptop is improved by around 25-30% (it's a bit hard to be exact due to thermal throttling). Reviewed-by: Timothy Arceri <[email protected]>
*	i965: Rework gl_TessLevel*[] handling to use NIR compact arrays.	Kenneth Graunke	2017-01-06	10	-364/+92
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Treating everything as scalar arrays allows us to drop a bunch of special case input/output munging all throughout the backend. Instead, we just need to remap the TessLevel components to the appropriate patch URB header locations in remap_patch_urb_offsets(). We also switch to treating the TES input versions of these as ordinary shader inputs rather than system values, as remap_patch_urb_offsets() just makes everything work out without special handling. This regresses one Piglit test: arb_tessellation_shader-large-uniforms/GL_TESS_CONTROL_SHADER-array-at-limit The compiler starts promoting the constant arrays assigned to gl_TessLevel* to uniform arrays. Since the shader also has a uniform array that uses the maximum number of uniform components, this puts it over the uniform component limit enforced by the linker. This is arguably a bug in the constant array promotion code (it should avoid pushing us over limits), but is unlikely to penalize any real application. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Inline store_output helper in quads workaround code.	Kenneth Graunke	2017-01-06	1	-14/+10
\| \| \| \| \| \| \| \| \|	It's only used in one place, it ignores the offset parameter currently, and I want to add more parameters...at which point, passing in a bunch of integers seems less obvious than writing it out. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Make unify_interfaces not spread VARYING_BIT_TESS_LEVEL_*.	Kenneth Graunke	2017-01-06	1	-2/+5
\| \| \| \| \| \| \| \| \| \| \|	This is harmless today because gl_TessLevelInner/Outer in the TES is currently treated as system values. However, when we move to treating them as inputs, this would cause a bug: with no TCS present, it would propagate TES reads of VARYING_SLOT_TESS_LEVEL into the VS output VUE map slots. This is totally bogus - those don't even exist in the VS. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Enable several GLES 3.1 extensions on HSW+	Ian Romanick	2017-01-06	1	-3/+3
\| \| \| \| \| \| \| \| \|	The only reason we didn't previously enable this was the dependency on OpenGL ES 3.1. These should have been enabled as soon as HSW got stencil texturing. We also needed to fixup setting MaxViewports. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Always set MaxViewports and related limits	Ian Romanick	2017-01-06	1	-2/+1
\| \| \| \| \| \| \| \| \| \|	Since 9d6ca7c3, there should be no performance hit for having MaxViewports > 1. Always set this context state. This eliminates the need to update this conditional as we add support for OES_viewport_array on older GPUs. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Properly flush in hsw_pause_transform_feedback().	Kenneth Graunke	2017-01-06	1	-0/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes a number of transform feedback tests when run with Linux 4.8, which allows us to use the MI_LOAD_REGISTER_REG command, at which point we started using this new broken path. ES3-CTS.functional.transform_feedback.array_element.interleaved.lines.* and Piglit's arb_transform_feedback2/draw-auto are both fixed by this patch, for example. Thanks to Chris Wilson for catching this mistake! Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99030 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
*	i965: Fix texturing in the vec4 TCS and GS backends.	Kenneth Graunke	2017-01-06	1	-2/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We were failing to zero m0.2 of the sampler message header for TCS and GS messages in the simple case. fs_generator has done this for about a year now, but we missed it in vec4_generator. Fixes ES31-CTS.core.texture_cube_map_array.sampling, GL45-CTS.texture_cube_map_array.sampling, and many dEQP-GLES31.functional.shaders.opaque_type_indexing.sampler subtests: - dynamically_uniform.tessellation_control.isampler3d - dynamically_uniform.tessellation_control.isamplercube - dynamically_uniform.tessellation_control.sampler2d - dynamically_uniform.tessellation_control.usamplercube - dynamically_uniform.tessellation_control.sampler2darray - dynamically_uniform.tessellation_control.isampler2darray - dynamically_uniform.tessellation_control.usampler3d - dynamically_uniform.tessellation_control.usampler2darray - dynamically_uniform.tessellation_control.usampler2d - dynamically_uniform.tessellation_control.sampler3d - dynamically_uniform.tessellation_control.samplercube - dynamically_uniform.tessellation_control.isampler2d - uniform.tessellation_control.isampler3d - uniform.tessellation_control.isamplercube - uniform.tessellation_control.usampler2d - uniform.tessellation_control.usampler3d - uniform.tessellation_control.sampler2darray - uniform.tessellation_control.isampler2darray - uniform.tessellation_control.usampler2darray - uniform.tessellation_control.sampler2d - uniform.tessellation_control.usamplercube - uniform.tessellation_control.sampler3d - uniform.tessellation_control.samplercube - uniform.tessellation_control.isampler2d Cc: [email protected] Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Don't set EmitNoMainReturn.	Kenneth Graunke	2017-01-05	1	-1/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	A while ago, we stopped using Luca's GLSL IR lower_jumps pass in favor of nir_lower_returns(). Marek's commit d3cb79e043338b0e55a3fba8df652f3 put it in do_common_optimization, which resulted in us calling it again. Dropping the EmitNoMainReturn setting makes us skip that pass again. Apparently that pass doesn't work properly, because this fixes Piglit's tests/spec/glsl-1.10/execution/vs-nested-return-sibling-loop. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99287 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	st/mesa/glsl: set SamplersUsed directly in gl_program	Timothy Arceri	2017-01-06	1	-1/+0
\| \| \| \|	Reviewed-by: Eric Anholt <[email protected]>
*	mesa: make _CurrentFragmentProgram a gl_program struct pointer	Timothy Arceri	2017-01-06	1	-6/+2
\| \| \| \| \| \| \| \|	Making this point to a gl_program struct rather than a gl_shader_program struct will allow use to later also make the CurrentProgram array hold gl_program structs which in turn will allow for code simpilifcation. Reviewed-by: Eric Anholt <[email protected]>
*	i965: stop passing gl_shader_program to the precompile and codegen functions	Timothy Arceri	2017-01-06	12	-87/+31
\| \| \| \| \| \| \| \|	We no longer need it. While we are at it we mark the vs, gs, and wm codegen functions as static. Reviewed-by: Eric Anholt <[email protected]>
*	i965: make use of new is_arb_asm flag	Timothy Arceri	2017-01-06	2	-13/+11
\| \| \| \|	Reviewed-by: Eric Anholt <[email protected]>
*	st/mesa/glsl: add new is_arb_asm flag in gl_program	Timothy Arceri	2017-01-06	3	-12/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Set the flag via the _mesa_init_gl_program() and NewProgram() helpers. In i965 we currently check for the existance of gl_shader_program to decide if this is an ARB assembly style program or not. Adding a flag makes the code clearer and will help removes a dependency on gl_shader_program in the i965 codegen functions. Also this will allow use to skip initialising sampler units for linked shaders, we currently memset it to zero again during linking. Reviewed-by: Eric Anholt <[email protected]>
*	i965: pass gl_program directly to brw_compile_tes()	Timothy Arceri	2017-01-06	3	-6/+4
\| \| \| \| \| \|	This is the only thing we use from gl_shader_program so pass it directly. Reviewed-by: Lionel Landwerlin <[email protected]>
*	i965: stop passing gl_shader_program to brw_nir_setup_glsl_uniforms()	Timothy Arceri	2017-01-06	8	-18/+13
\| \| \| \| \| \| \|	We can now just get the data needed from the gl_shader_program_data pointer in gl_program. Reviewed-by: Lionel Landwerlin <[email protected]>
*	i965: pass gl_program to brw_upload_ubo_surfaces()	Timothy Arceri	2017-01-06	6	-22/+20
\| \| \| \| \| \|	There is no need to pass gl_linked_shader anymore. Reviewed-by: Lionel Landwerlin <[email protected]>
*	i965: stop passing gl_shader_program to ↵	Timothy Arceri	2017-01-06	8	-32/+13
\| \| \| \| \| \| \| \| \|	brw_assign_common_binding_table_offsets() We now get everything we need directly from gl_program so there is no need for this. Reviewed-by: Lionel Landwerlin <[email protected]>
*	st/mesa/glsl/i965: move ShaderStorageBlocks to gl_program	Timothy Arceri	2017-01-06	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Having it here rather than in gl_linked_shader allows us to simplify the code. Also it is error prone to depend on the gl_linked_shader for programs in current use because a failed linking attempt will free infomation about the current program. In i965 we could be trying to recompile a shader variant but may have lost some required fields. Reviewed-by: Lionel Landwerlin <[email protected]>
*	st/mesa/glsl/i965: set num_ssbos directly in shader_info	Timothy Arceri	2017-01-06	2	-6/+9
\| \| \| \| \| \| \|	Here we also remove the duplicate field in gl_linked_shader and always get the value from shader_info instead. Reviewed-by: Lionel Landwerlin <[email protected]>
*	st/mesa/glsl/i965: move per stage UniformBlocks to gl_program	Timothy Arceri	2017-01-06	1	-1/+1
\| \| \| \| \| \| \|	This will help allow us to store pointers to gl_program structs in the CurrentProgram array resulting in a bunch of code simplifications. Reviewed-by: Lionel Landwerlin <[email protected]>
*	st/mesa/glsl/i965: set num_ubos directly in shader_info	Timothy Arceri	2017-01-06	2	-4/+4
\| \| \| \| \| \| \|	This also removes the duplicate field in gl_linked_shader, and gets num_ubos from shader_info instead. Reviewed-by: Lionel Landwerlin <[email protected]>
*	st/mesa/glsl/i965: move ImageUnits and ImageAccess fields to gl_program	Timothy Arceri	2017-01-06	7	-41/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Having it here rather than in gl_linked_shader allows us to simplify the code. Also it is error prone to depend on the gl_linked_shader for programs in current use because a failed linking attempt will free infomation about the current program. In i965 we could be trying to recompile a shader variant but may have lost some required fields. We drop the memset on ImageUnits because gl_program is already created using rzalloc(). Reviewed-by: Lionel Landwerlin <[email protected]>