mesa.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message (Collapse)	Author	Age	Files	Lines
*	i965: Disable the unlit centroid workaround on Gen7.	Matt Turner	2016-08-02	1	-3/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Once upon a time (commit 8313f44409) Paul added code for the unlit centroid workaround (WaCopyUnlitCentroidBarys). His commit message claims it fixed the EXT_framebuffer_multisample/interpolation {2,4} {centroid-deriv,centroid-deriv-disabled} piglit tests but does not say on which platform, though he cites the IVB PRM. "3DSTATE_WM [DevIVB, DevHSW]" says "[DevIVB]: Workaround: When Centroid Barycentric mode is required, HW may produce incorrect interpolation results when a 2X2 pixels have unlit pixels." I later disabled it for Haswell (commit f6db414f3c) with no known ill effects. The Sandybridge page does not have this text, but the workarounds database (see WaCopyUnlitCentroidBarys) says the issues applies only to Sandybridge, and in fact in commit 1a2de7dce8fc I note that disabling the workaround on Sandybridge causes the tests Paul originally mentioned to fail. So this is, and always has been, a huge confusing mess. Disabling the workaround indeed causes the tests Paul originally mentioned to fail on Sandybridge but not on Ivybridge/Baytrail. On Ivybridge: total instructions in shared programs: 6914901 -> 6909599 (-0.08%) instructions in affected programs: 106766 -> 101464 (-4.97%) helped: 884 total cycles in shared programs: 70874764 -> 70813774 (-0.09%) cycles in affected programs: 794144 -> 733154 (-7.68%) helped: 688 HURT: 186 LOST: 1 GAINED: 6 Reviewed-by: Kenneth Graunke <[email protected]>
*	i915: Avoid aliasing violation.	Matt Turner	2016-08-01	1	-1/+3
\| \| \| \|	Reviewed-by: Iago Toral Quiroga <[email protected]>
*	i965: fix comparison warning	Timothy Arceri	2016-08-01	1	-1/+1
\| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Fix move_interpolation_to_top() pass.	Kenneth Graunke	2016-07-29	1	-21/+29
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The pass I introduced in commit a2dc11a7818c04d8dc0324e8fcba98d60bae was entirely broken. A missing "break" made the load_interpolated_input case always fall through to "default" and hit a "continue", making it not actually move any load_interpolated_input intrinsics at all. It would only move the simple load_barycentric_* intrinsics, which don't emit any code anyway, making it basically useless. The initial version I sent of the pass worked, but I apparently failed to verify that the simplified version in v2 actually worked. With the obvious fix applied (so we actually tried to move load_interpolated_input intrinsics), I discovered a second bug: we weren't moving the offset SSA def to the top, breaking SSA validation. The new version of the pass actually moves load_interpolated_input intrinsics and all their dependencies, as intended. Papers over GPU hangs on Ivybridge and Baytrail caused by the recent NIR FS input rework by restoring the old behavior. (I'm not honestly sure why they hang with PLN not at the top.) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97083 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965: remove unnecessary null check	Timothy Arceri	2016-07-28	1	-4/+1
\| \| \| \| \| \| \| \|	We would have hit a segfault already if this could be null. Fixes Coverity warning spotted by Matt. Reviewed-by: Matt Turner <[email protected]>
*	glsl: Separate overlapping sentinel nodes in exec_list.	Matt Turner	2016-07-26	3	-3/+3
\| \| \| \| \| \| \| \| \| \| \|	I do appreciate the cleverness, but unfortunately it prevents a lot more cleverness in the form of additional compiler optimizations brought on by -fstrict-aliasing. No difference in OglBatch7 (n=20). Co-authored-by: Davin McCall <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	i965/miptree: Stop multiplying cube depth by 6 in HiZ calculations	Jason Ekstrand	2016-07-26	1	-17/+2
\| \| \| \| \| \| \| \| \|	intel_mipmap_tree::logical_depth0 is now in number of 2D slices so we no longer need to be multiplying by 6. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Cc: "12.0" <[email protected]>
*	i965/miptree/isl: Stop multiplying depth by 6 for cubes	Jason Ekstrand	2016-07-26	1	-5/+0
\| \| \| \| \| \| \| \| \| \|	Now that the logical_depth0 field is in number of 2D slices, we don't need to be multiplying by 6 when creating the surface. It wasn't hurting anything primarily because we get the actual length from the view which was already handling it correctly. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
*	i965/blorp/gen8: Stop multiplying depth by 6 for cubes	Jason Ekstrand	2016-07-26	1	-4/+1
\| \| \| \| \| \| \| \|	intel_mipmap_tree::logical_depth0 is now in 2-D slices so there is no need for us to multiply by 6 when we go to fill out a blorp surface state. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
*	i965: Get rid of the do_lower_unnormalized_offsets pass	Jason Ekstrand	2016-07-22	4	-109/+0
\| \| \| \| \| \| \| \| \|	We can do this in NIR now. No need to keep a GLSL pass lying around for it. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
*	i965/nir: Enable NIR lowering of txf and rect offsets	Jason Ekstrand	2016-07-22	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	This fixes the following piglit tests on gen6+: tex-miplevel-selection textureProjGradOffset 2DRect tex-miplevel-selection textureGradOffset 2DRect tex-miplevel-selection textureGradOffset 2DRectShadow tex-miplevel-selection textureProjGradOffset 2DRect_ProjVec4 tex-miplevel-selection textureProjGradOffset 2DRectShadow Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
*	i965: fix varying output setup	Timothy Arceri	2016-07-23	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \|	Since 7f53fead5c we treat every location as using all four components so we only need special handling for doubles when they cross multiple locations. This fixes a crash in GL45-CTS.enhanced_layouts.varying_locations where the outputs array would overflow when a dmat2 was stored at the max varying location i.e 30. Reviewed-by: Iago Toral Quiroga <[email protected]>
*	i965: Fix "operation operation" in comment.	Kenneth Graunke	2016-07-21	1	-1/+1
\| \| \| \| \| \|	From the redundant redundant department. Reported-by: Michael Schellenberger Costa <[email protected]>
*	i965: Fix shared atomic intrinsics to pay attention to base.	Kenneth Graunke	2016-07-21	1	-1/+12
\| \| \| \| \| \|	Cc: "12.0" <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
*	i965: Include VUE handles for GS with invocations > 1.	Kenneth Graunke	2016-07-21	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We always resort to the pull model for instanced GS inputs. So, we'd better include the VUE handles, or else we can't actually pull anything. Ian reports that on his branch with OES_geometry_shader enabled, this fixes a bunch of dEQP-GLES31.functional.geometry_shading tests:: - instanced.draw_2_instances_geometry_2_invocations - instanced.draw_2_instances_geometry_8_invocations - instanced.draw_4_instances_geometry_2_invocations - instanced.draw_4_instances_geometry_8_invocations - instanced.draw_8_instances_geometry_2_invocations - instanced.draw_8_instances_geometry_8_invocations - instanced.geometry_2_invocations - instanced.geometry_32_invocations - instanced.geometry_8_invocations - instanced.geometry_max_invocations - instanced.geometry_output_different_2_invocations - instanced.geometry_output_different_32_invocations - instanced.geometry_output_different_8_invocations - instanced.geometry_output_different_max_invocations - instanced.invocation_output_vary_by_attribute - instanced.invocation_output_vary_by_texture - instanced.invocation_output_vary_by_uniform - query.primitives_generated_instanced Cc: [email protected] Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Tested-by: Ian Romanick <[email protected]>
*	i965: print error messages if gs fails to compile	Timothy Arceri	2016-07-21	1	-0/+6
\| \| \| \| \| \|	We do this for all other stages. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: enable GL4.4 for Gen8+	Timothy Arceri	2016-07-21	2	-2/+2
\| \| \| \|	Acked-by: Kenneth Graunke <[email protected]>
*	i965: enable ARB_enhanced_layouts for gen6+	Timothy Arceri	2016-07-21	1	-1/+1
\| \| \| \|	Reviewed-by: Edward O'Callaghan <[email protected]>
*	i965/vec4: add packing support for tcs load outputs	Timothy Arceri	2016-07-21	3	-7/+17
\| \| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
*	i965/vec4: add support for packing tes inputs	Timothy Arceri	2016-07-21	1	-4/+10
\| \| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]>
*	i965/vec4: add support for packing tcs outputs	Timothy Arceri	2016-07-21	1	-0/+7
\| \| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
*	i965/vec4: support packing tcs inputs	Timothy Arceri	2016-07-21	2	-2/+7
\| \| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
*	i965/vec4: add component packing for gs	Timothy Arceri	2016-07-21	1	-0/+2
\| \| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
*	i965/vec4: add support for packing vs/gs/tes outputs	Timothy Arceri	2016-07-21	3	-4/+45
\| \| \| \| \| \| \| \| \| \| \|	Here we create a new output_generic_reg array with the ability to store the dst_reg for each component of user defined varyings. This is needed as the previous code only stored the dst_reg based on the varying location which meant packed varyings would overwrite each other. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]>
*	i965/vec4: add support for packing inputs	Timothy Arceri	2016-07-21	1	-0/+2
\| \| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
*	i965: add helper for creating packing writemask	Timothy Arceri	2016-07-21	1	-0/+7
\| \| \| \| \| \| \| \| \| \| \|	For example where n=3 first_component=1 this will give us 0xE (WRITEMASK_YZW). V2: Add assert to check first component is <= 4 (Suggested by Ken) Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
*	i965: add helpers for creating component layout swizzle	Timothy Arceri	2016-07-21	1	-0/+3
\| \| \| \| \| \| \| \| \|	This will be used to swizzle components to the beginning or end of the vector based on the component layout qualifier and whether we are doing a load or store. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
*	i965: enable ARB_enhanced_layouts for gen8+	Timothy Arceri	2016-07-21	1	-0/+1
\| \| \| \|	Acked-by: Edward O'Callaghan <[email protected]>
*	i965: add component packing support for load_output intrinsics	Timothy Arceri	2016-07-21	1	-5/+33
\| \| \| \| \| \| \| \|	Here we use the component qualifier (which is the first component) as an offset when loading output varyings. Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: enable component packing for vs and fs	Timothy Arceri	2016-07-21	4	-25/+16
\| \| \| \| \| \| \| \| \| \| \|	Rather than trying to work out the total number of components used at a location we simply treat all outputs as vec4s. This removes the need for complex code looping over varyings to match packed locations and the need for storing the total number of components used at each location. Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: bring back type_size_vec4_times_4()	Timothy Arceri	2016-07-21	2	-0/+14
\| \| \| \| \| \| \| \|	We will use this for output varyings. To make component packing simpler we will just treat all varyings as vec4s. Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Move VS load_input handling to nir_emit_vs_intrinsic().	Kenneth Graunke	2016-07-20	1	-31/+30
\| \| \| \| \| \| \| \| \| \|	TCS/TES/GS and now FS all handle these in stage-specific functions. CS don't have inputs, so VS was the only one left using this code. Move it to the VS-specific function for clarity. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Delete the FS_OPCODE_INTERPOLATE_AT_CENTROID virtual opcode.	Kenneth Graunke	2016-07-20	4	-10/+0
\| \| \| \| \| \| \| \| \|	We no longer use this message. As far as I can tell, it's fairly useless - the equivalent information is provided in the payload. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Rewrite FS input handling to use the new NIR intrinsics.	Kenneth Graunke	2016-07-20	5	-341/+270
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This eliminates the need to walk the list of input variables, recurse into their types (via logic largely redundant with nir_lower_io), and interpolate all possible inputs up front. The backend no longer has to care about variables at all, which eliminates complications from trying to pack multiple variables into the same location. Instead, each intrinsic specifies exactly what's needed. This should unblock Timothy's work on GL_ARB_enhanced_layouts. Each load_interpolated_input intrinsic corresponds to PLN instructions, while load_barycentric_at_* intrinsics correspond to pixel interpolator messages. The pixel/centroid/sample barycentric intrinsics simply refer to payload fields (delta_xy[]), and don't actually generate any code. Because we use a single intrinsic for both centroid-qualified variables and interpolateAtCentroid(), they become indistinguishable. We stop sending pixel interpolator messages for those, and instead use the payload provided data, which should be considerably faster. On Broadwell: total instructions in shared programs: 9067751 -> 9067570 (-0.00%) instructions in affected programs: 145902 -> 145721 (-0.12%) helped: 422 HURT: 209 total spills in shared programs: 2849 -> 2899 (1.76%) spills in affected programs: 760 -> 810 (6.58%) helped: 0 HURT: 10 total fills in shared programs: 3910 -> 3950 (1.02%) fills in affected programs: 617 -> 657 (6.48%) helped: 0 HURT: 10 LOST: 3 GAINED: 3 The differences mostly appear to be slight changes in MOVs. v2: Use nir_shader_compiler_options::use_interpolated_input_intrinsics flag rather than passing it directly to nir_lower_io. Use the unreachable() macro rather than assert in one place. (Review feedback from Chris Forbes.) Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
*	i965: Move load_interpolated_input/barycentric_* intrinsics to the top.	Kenneth Graunke	2016-07-20	1	-0/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Currently, i965 interpolates all FS inputs at the top of the program. This has advantages and disadvantages, but I'd like to keep that policy while reworking this code. We can consider changing it independently. The next patch will make the compiler generate PLN instructions "on the fly", when it encounters an input load intrinsic, rather than doing it for all inputs at the start of the program. To emulate this behavior, we introduce an ugly pass to move all NIR load_interpolated_input and payload-based (not interpolator message) load_barycentric_* intrinsics to the shader's start block. This helps avoid regressions in shader-db for cases such as: if (...) { ...load some input... } else { ...load that same input... } which CSE can't handle, because there's no dominance relationship between the two loads. Because the start block dominates all others, we can CSE all inputs and emit PLNs exactly once, as we did before. Ideally, global value numbering would eliminate these redundant loads, while not forcing them all the way to the start block. When that lands, we should consider dropping this hacky pass. Again, this pass currently does nothing, as i965 doesn't generate these intrinsics yet. But it will shortly, and I figured I'd separate this code as it's relatively self-contained. v2: Dramatically simplify pass - instead of creating new instructions, just remove/re-insert their list nodes (suggested by Jason Ekstrand). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> [v1] Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Add a pass to demote sample interpolation intrinsics.	Kenneth Graunke	2016-07-20	1	-0/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When working with a non-multisampled render target, asking for "sample" interpolation locations doesn't make sense. We demote them to centroid. In a couple of patches, brw_compute_barycentric_modes will begin looking at these intrinsics to determine the barycentric modes. fs_visitor also will use them to code-generate pixel interpolator messages or payload references. Handling the "but what if it's not MSAA?" logic ahead of time in a NIR pass simplifies things and prevents duplicated logic. This patch doesn't actually do anything useful yet as we don't generate these intrinsics. I decided to keep it separate as it's self-contained, in the hopes of shrinking the "convert everything" patch for reviewers. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Stop muging cube array lengths by 6	Jason Ekstrand	2016-07-20	5	-38/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	From the Sky Lake PRM: "For SURFTYPE_CUBE: For Sampling Engine Surfaces and Typed Data Port Surfaces, the range of this field is [0,340], indicating the number of cube array elements (equal to the number of underlying 2D array elements divided by 6). For other surfaces, this field must be zero." In other words, the depth field for cube maps is in number of cubes not number of 2-D slices so we need to divide by 6. ISL will do this correctly for us assuming that we provide it with the correct array bounds which it expects to be in 2-D slices. It appears as if we've been doing this wrong ever since we first added cube map arrays for Sandy Bridge and the change to ISL made things slightly worse. While we're at it, we now need to remoe the shader hacks we've always done since they were only needed because we were setting the depth field six times too large. v2: Fix the vec4 backend as well (not sure how I missed this). Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
*	i965/miptree: Set logical_depth0 == 6 for cube maps	Jason Ekstrand	2016-07-20	1	-4/+11
\| \| \| \| \| \| \| \| \| \| \| \| \|	This matches what we do for cube maps where logical_depth0 is in number of face-layers rather than number of cubes. This does mean that we will temporarily be setting the surface bounds too loose for cube map textures but we are already setting them too loose for cube arrays and we will be fixing that in the next commit anyway. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Cc: "12.0 11.2 11.1" <[email protected]>
*	i965/miptree: Enforce that height == 1 for 1-D array textures	Jason Ekstrand	2016-07-20	2	-19/+5
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The GL API and mesa internals do this differently than we do. In GL, there is no depth parameter for 1-D arrays and height is used. In the i965 miptree code we do the sane thing and make height == 1 and use depth for number of slices. This makes for a mismatch every time we create a 1-D array texture from GL. Instead of actually solving this problem, we just said "1-D is hard, let's make sure it works no matter which way we pass the parameters" and called it a day. This commit fixes the one GL -> i965 transition point where we weren't already handling 1-D array textures to do the right thing and then replaces the magic fixup code with an assert that you're doing the right thing. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Cc: "12.0 11.2 11.1" <[email protected]>
*	i965: store reference to the context within struct brw_fence (v2)	Emil Velikov	2016-07-20	1	-11/+44
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	As the spec allows for {server,client}_wait_sync to be called without currently bound context, while our implementation requires context pointer. v2: Add a mutex and acquire it for the duration of brw_fence_client_wait() and brw_fence_is_completed() as suggested by Chad. Cc: "11.2 12.0" <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Signed-off-by: Tomasz Figa <[email protected]>
*	i965: Use tex_mocs instead of rb_mocs for GL images.	Kenneth Graunke	2016-07-19	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes a 10-20% performance regression in OglCSDof caused by commit 5a8c89038abab0184ea72664ab390ec6ca58b4d6, which made images (in the image load/store sense) use BDW_MOCS_PTE instead of BDW_MOCS_WB. This seems sketchy, as the default PTE value is supposed to be WB LLC eLLC, which is the same as our MOCS WB setting. It's only supposed to change when using a surface for display, which won't ever happen for images. Something may be wrong in the kernel... Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: Enable MESA_shader_integer_functions on all GLSL 1.30 platforms	Ian Romanick	2016-07-19	2	-6/+15
\| \| \| \| \|	Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965: Don't lower uaddCarry and usubBorrow in both GLSL IR and NIR	Ian Romanick	2016-07-19	1	-3/+1
\| \| \| \| \|	Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965: Update assertion to account for Gen < 7	Ian Romanick	2016-07-19	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \|	Previously SHADER_OPCODE_MULH could only exist on Gen7+, so the assertion assumed the Gen7+ accumulator rules. A future patch will allow this instruction on at least Gen6, so update the assertion. v2: Use get_lowered_simd_width instead of open coding it. Suggested by Curro. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]> [v1]
*	i965: Use LZD to implement nir_op_find_lsb on Gen < 7	Ian Romanick	2016-07-19	2	-3/+45
\| \| \| \| \| \| \|	v2: Rebase on changes to previous two patches. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965: Use LZD to implement nir_op_ifind_msb on Gen < 7	Ian Romanick	2016-07-19	2	-21/+90
\| \| \| \| \| \| \| \| \|	v2: Retype LZD source as UD to avoid potential problems with 0x80000000. Suggested by Matt. Also update comment about problem values with LZD(abs(x)). Suggested by Curro. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965: Use LZD to implement nir_op_ufind_msb	Ian Romanick	2016-07-19	4	-1/+54
\| \| \| \| \| \| \| \| \| \|	This uses one less instruction. v2: Move emit_find_msb_using_lzd out of the visitor classes. Suggested by Curro. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965: Always enable GL_ARB_shading_language_packing	Ian Romanick	2016-07-19	1	-1/+1
\| \| \| \| \| \| \| \| \|	With the existing lowering passes, the functions from this extension become a bunch of bit twiddling operations that have always been supported. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965: Move enable of EXT_shader_integer_mix	Ian Romanick	2016-07-19	1	-1/+2
\| \| \| \| \| \| \| \|	This extension does not depend on the Gen. It only depends on the availability of GLSL 1.30. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965: Skip update_texture_surface when the plane doesn't exist	Jason Ekstrand	2016-07-18	1	-8/+10
\| \| \| \| \| \| \| \| \|	Thanks to rebase fail, recent surface state changes (commits 7e951cd56, 8521ce1a7, and 69c0dc5c53) effectively reverted 727a9b24933 and 367cf3a2e3e which was unintentional. This should bring it back. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Jordan Justen <[email protected]>