mesa.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	i965: Rename define for the PIPE_CONTROL DC flush bit.	Francisco Jerez	2016-02-08	5	-6/+6
\| \| \| \| \| \| \|	Its previous name was somewhat misleading; this really behaves like a RW cache flush rather than an invalidation. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Invalidate state cache before L3 partitioning set-up.	Francisco Jerez	2016-02-08	1	-0/+1
\| \| \| \| \| \| \| \| \| \| \|	The state cache is also L3-backed so it seems sensible to make sure it's clean as we do for other RO caches before repartitioning the L3. This wasn't part of my original L3 partitioning code because I was able to reproduce hangs on Gen7 hardware when the state cache invalidation happened asynchronously with previous 3D rendering, which should no longer be possible after the previous change. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Fix cache pollution race during L3 partitioning set-up.	Francisco Jerez	2016-02-08	1	-8/+23
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We need to split the stalling flush from the RO cache invalidation into a different PIPE_CONTROL command to make sure that the top of the pipe invalidation happens after any previous rendering is complete. Otherwise it's possible for previous rendering to pollute the L3 cache in the short window of time between RO invalidation and the completion of the stalling flush. Fixes rendering artifacts on Unigine Heaven, Metro Last Light Redux and Metro 2033 Redux. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93540 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93599 Tested-by: Darius Spitznagel <[email protected]> Tested-by: Martin Peres <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/fs: Don't emit unnecessary SEL instruction from emit_image_atomic().	Francisco Jerez	2016-02-08	1	-1/+1
\| \| \| \| \| \| \| \| \|	The SEL instruction with predication mode NONE emitted when the atomic operation doesn't need to be predicated is a no-op and might rely on undocumented hardware behaviour. Noticed by chance while looking at the assembly output. Reviewed-by: Matt Turner <[email protected]>
*	i965/vec4: Update vec4 unit tests for commit 01dacc83ff.	Matt Turner	2016-02-08	3	-10/+24
\| \| \| \|	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94050
*	dri/common: include debug_output.h to silence warning	Brian Paul	2016-02-08	1	-0/+1
\|
*	i965/vec4: don't copy ATTR into 3src instructions with complex swizzles	Matt Turner	2016-02-05	1	-4/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The vec4 backend, at the end, does this: if (inst->is_3src()) { for (int i = 0; i < 3; i++) { if (inst->src[i].vstride == BRW_VERTICAL_STRIDE_0) assert(brw_is_single_value_swizzle(inst->src[i].swizzle)); So make sure that we use the same conditions when trying to copy-propagate. UNIFORMs will be converted to vstride 0 in convert_to_hw_regs, but so will ATTRs when interleaved (as will happen in a GS with multiple attributes). Since the vstride is not set at copy-prop time, infer it by inspecting dispatch_mode and reject ATTRs if they have non-scalar swizzles and are interleaved. Fixes assertion errors in dolphin-generated geometry shaders (or misrendering on opt builds) on Sandybridge or on IVB/HSW with INTEL_DEBUG=nodualobj. Co-authored-by: Ilia Mirkin <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93418 Cc: "11.0 11.1" <[email protected]>
*	main: Use a derived value for the default sample count	Neil Roberts	2016-02-05	1	-0/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously the framebuffer default sample count was taken directly from the value given by the application. On the i965 driver on HSW if the value wasn't one that is supported by the hardware it would hit an assert when it tried to program the state for it. This patch fixes it by adding a derived sample count to the state for the default framebuffer. The driver can then quantize this to one of the valid values in its UpdateState handler when the _NEW_BUFFERS state changes. _mesa_geometric_samples is changed to use the new derived value. Fixes the piglit test arb_framebuffer_no_attachments-query Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93957 Cc: Ilia Mirkin <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
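The quantization step described above can be sketched in C as follows. This is only an illustration, not the driver's code: `quantize_samples` and the list of supported counts are hypothetical names and values introduced here, and the round-up policy is an assumption.

```c
#include <stddef.h>

/* Hypothetical helper (not a Mesa function): pick the smallest supported
 * sample count that is >= the requested one, falling back to the largest
 * supported count. `supported` is assumed to be sorted in ascending order. */
static unsigned quantize_samples(unsigned requested,
                                 const unsigned *supported, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (supported[i] >= requested)
            return supported[i];
    }
    /* Requested more than the hardware offers: clamp to the maximum. */
    return n ? supported[n - 1] : 0;
}
```

With an assumed supported list of {1, 2, 4, 8}, a request for 3 samples would be stored as a derived value of 4, so the state-programming code only ever sees counts the hardware accepts.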
*	DRI_CONFIG: Add option to override vendor id	Patrick Rudolph	2016-02-04	1	-0/+5
\| \| \| \| \| \| \|	Add config option override_vendorid to report a fake card in d3dadapter9 drm. Signed-off-by: Patrick Rudolph <[email protected]> Reviewed-by: Axel Davy <[email protected]>
*	i965/fs: Allocate single register at a time for constants.	Matt Turner	2016-02-04	1	-3/+3
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	No instruction counts changed, but: total cycles in shared programs: 64834502 -> 64781530 (-0.08%) cycles in affected programs: 16331544 -> 16278572 (-0.32%) helped: 4757 HURT: 4288 GAINED: 66 LOST: 20 I remember trying this when I first wrote the pass, but it wasn't helpful at the time. Reviewed-by: Francisco Jerez <[email protected]>
*	i965/gen8: Initialize aux_mode to GEN8_SURFACE_AUX_MODE_NONE	Jordan Justen	2016-02-02	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	GEN8_SURFACE_AUX_MODE_NONE is 0, so this is a no-op. Yet, this also makes it clear that we can compare aux_mode to the other GEN8_SURFACE_AUX_MODE_ values. We will want to compare to GEN8_SURFACE_AUX_MODE_HIZ. v2: Some very minor cherry-pick conflicts due to moving it around in the series. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Ben Widawsky <[email protected]> Signed-off-by: Ben Widawsky <[email protected]>
*	Revert "i965: Provide sse2 version for rgba8 <-> bgra8 swizzle"	Roland Scheidegger	2016-02-02	2	-62/+12
\| \| \| \| \| \| \| \| \| \|	This reverts commit ab30426e335116e29473faaafe8b57ec760516ee. Apparently the memory isn't quite as aligned when this gets called as it should be, causing crashes. (Albeit this looks independent from this code, should crash just as well if ssse3 is enabled when compiling without this patch.) https://bugs.freedesktop.org/show_bug.cgi?id=93962
*	i965: Provide sse2 version for rgba8 <-> bgra8 swizzle	Roland Scheidegger	2016-02-02	2	-12/+62
\| \| \| \| \| \| \| \| \| \| \| \| \|	The existing code used ssse3, and because it isn't compiled in a separate file compiled with that, it is usually not used (that, of course, could be fixed...), whereas sse2 is always present at least with 64bit builds. This should be pretty much as fast as the pshufb version, albeit those code paths aren't really used on chips without llc in any case. v2: fix andnot argument order, add comments v3: use pshuflw/hw instead of shifts (suggested by Matt Turner), cut comments Reviewed-by: Matt Turner <[email protected]>
*	i965/gen7+: Use NIR for lowering of pack/unpack opcodes.	Matt Turner	2016-02-01	3	-19/+29
\| \| \| \|	Reviewed-by: Iago Toral Quiroga <[email protected]>
*	i965/vec4: Implement nir_op_pack_uvec2_to_uint.	Matt Turner	2016-02-01	1	-0/+18
\| \| \| \| \| \| \|	And mark nir_op_pack_uvec4_to_uint unreachable, since it's only produced by lowering pack[SU]norm4x8 which the vec4 backend does not need. Reviewed-by: Iago Toral Quiroga <[email protected]>
*	i965/fs: Implement support for extract_word.	Matt Turner	2016-02-01	5	-0/+56
\| \| \| \| \| \|	The vec4 backend will lower it. Reviewed-by: Iago Toral Quiroga <[email protected]>
*	glsl: Remove 2x16 half-precision pack/unpack opcodes.	Matt Turner	2016-02-01	1	-3/+0
\| \| \| \| \| \|	i965/fs was the only consumer, and we're now doing the lowering in NIR. Reviewed-by: Iago Toral Quiroga <[email protected]>
*	i965/fs: Switch from GLSL IR to NIR for un/packHalf2x16 scalarizing.	Matt Turner	2016-02-01	3	-11/+7
\| \| \| \|	Reviewed-by: Iago Toral Quiroga <[email protected]>
*	i965: Make separate nir_options for scalar/vector stages.	Matt Turner	2016-02-01	1	-28/+33
\| \| \| \| \| \| \|	We'll want to have different lowering options set for scalar/vector stages. Reviewed-by: Iago Toral Quiroga <[email protected]>
*	i965: Move brw_compiler_create() to new brw_compiler.c.	Matt Turner	2016-02-01	5	-133/+161
\| \| \| \| \| \| \|	A future patch will want to use designated initializers, which aren't available in C++, but this is C. Reviewed-by: Iago Toral Quiroga <[email protected]>
*	i965/skl: Utilize new 5th bit for gateway messages	Ben Widawsky	2016-01-27	1	-2/+4
\| \| \| \| \| \| \| \|	Modify comment as spotted by Matt, and Chris Forbes Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	glsl: move to compiler/	Emil Velikov	2016-01-26	15	-19/+19
\| \| \| \| \| \|	Signed-off-by: Emil Velikov <[email protected]> Acked-by: Matt Turner <[email protected]> Acked-by: Jose Fonseca <[email protected]>
*	nir: move to compiler/	Emil Velikov	2016-01-26	7	-9/+8
\| \| \| \| \| \|	Signed-off-by: Emil Velikov <[email protected]> Acked-by: Matt Turner <[email protected]> Acked-by: Jose Fonseca <[email protected]>
*	nir: move glsl_types.{cpp,h} to compiler	Emil Velikov	2016-01-26	6	-6/+6
\| \| \| \| \| \| \| \|	Allows us to remove the SCons workaround :-) Signed-off-by: Emil Velikov <[email protected]> Acked-by: Matt Turner <[email protected]> Acked-by: Jose Fonseca <[email protected]>
*	i965/bxt: Fix conservative wm thread counts.	Ben Widawsky	2016-01-25	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When setting the conservative thread counts, I halved everything. That isn't correct for the wm, which has nothing to do with actual thread counts. I suck. BXT only has 1 slice, and there is some ambiguity about subslices, so just reserve the max possible for now. It looks like this might fix: piglit.spec.glsl-1_50.execution.variable-indexing.gs-output-array-vec4-index-wr.bxtm64. I kind of question why that is, but it is what Jenkins says. Mark is currently running some of the other blacklisted tests on this patch (it affects anything requiring scratch space). Cc: mesa-stable <[email protected]> Cc: Neil Roberts <[email protected]> Signed-off-by: Ben Widawsky <[email protected]> Acked-by: Kenneth Graunke <[email protected]> Tested-by: Mark Janes <[email protected]>
*	i965: Implement a drirc workaround for broken dual color blending.	Kenneth Graunke	2016-01-22	8	-9/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	OpenGL's dual color blending feature was specified so that an implementation could support both multiple render targets (MRT) and dual source blending. Fragment shader outputs specify both "location" (the render target number) and "index" (either color 0 or 1). I believe DirectX only has the notion of "location" - if using dual color blending, location 0 or 1 will specify the operands. If not, then location means the render target index. The two features can't be used together. As such, some applications mistakenly try to use <loc = 0, index = 0> and <loc = 1, index = 0> in a shader used for dual color blending with a single render target, rather than the correct <loc = 0, index = 0> and <loc = 0, index = 1>. In particular, Unigine Heaven 4.0 and Valley 1.0 suffer from this bug. Unigine is aware of the problem, and quickly developed a fix, but has not bothered to change the download link on their website to a working copy in over a year. People were still using the broken version and complaining. We tried working around this by disabling dual color blending, but that apparently hurts performance, and people were once again unhappy. On i965, dual source blending is achieved by using different framebuffer write messages than normal rendering. So, we have to compile different code for the two cases. We're not being pedantic: we actually have to know in order to function. Normally, dual source blending is detectable in the shader: if a shader has an output with index = 1, then it's meant for blending, not MRT. With the broken inputs, they're indistinguishable, so we can only tell by looking at the current GL state. This patch implements a new drirc workaround: export dual_color_blend_by_location=true which makes the i965 driver detect when OpenGL state is configured for dual source blending, and recompile the fragment shader to use the right messages. In that case, we allow either location = 1 or index = 1 to specify the second source for the blending equations. It also re-enables GL_ARB_blend_func_extended for Unigine. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92233 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
*	i965/fs: Remove unused count from vs urb setup	Ben Widawsky	2016-01-22	1	-6/+0
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This was originally removed here: commit 031d3501322aee0a1474c7f2a9b79f9fa9947430 Author: Kenneth Graunke <[email protected]> Date: Tue Aug 25 16:59:12 2015 -0700 i965/vs: Unify URB entry size/read length calculations between backends. Then added back: commit bd198b9f0a292a9ff4ffffec3a29bad23d62caba Author: Kenneth Graunke <[email protected]> Date: Fri Aug 14 16:01:33 2015 -0700 i965/vs: Simplify fs_visitor's ATTR file. Note that the authorship dates are out of order, but the above reflects the order of the commit dates. Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i915: correctly parse/set the context flags	Emil Velikov	2016-01-22	1	-0/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With an earlier commit we've split the flags parsing to a separate function, but forgot to update all the dri modules to use it. Noticed when we've enabled KHR_debug for every dri module - fdo#93048 Fixes: 38366c0c6e7 "dri_util: Don't assume __DRIcontext->driverPrivate is a gl_context" Cc: Mark Janes <[email protected]> Cc: "11.0 11.1" <[email protected]> Cc: Kristian Høgsberg <[email protected]> Cc: Ian Romanick <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Mark Janes <[email protected]> Tested-by: Mark Janes <[email protected]>
*	i965/vec4/tcs: Return NULL instead of false in brw_compile_tcs()	Eduardo Lima Mitev	2016-01-21	1	-1/+1
\| \| \| \| \| \| \|	brw_compile_tcs() is expected to return 'const unsigned *', so the compiler complains. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Implement compute sampler state atom.	Francisco Jerez	2016-01-19	4	-1/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes a number of GLES31 CTS failures and hangs on various hardware: ES31-CTS.texture_gather.plain-gather-depth-2d ES31-CTS.texture_gather.plain-gather-depth-2darray ES31-CTS.texture_gather.plain-gather-depth-cube ES31-CTS.texture_gather.offset-gather-depth-2d ES31-CTS.texture_gather.offset-gather-depth-2darray ES31-CTS.layout_binding.sampler2D_layout_binding_texture_ComputeShader ES31-CTS.layout_binding.sampler2DArray_layout_binding_texture_ComputeShader ES31-CTS.explicit_uniform_location.uniform-loc-types-samplers ES31-CTS.compute_shader.resources-texture Some of them were actually passing by luck on some generations even though we weren't uploading sampler state tables explicitly for the compute stage, most likely because they relied on the cached sampler state left from previous rendering to be close enough. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92589 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93312 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93325 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93407 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93725 Reported-by: Marta Lofstedt <[email protected]> Reviewed-by: Marta Lofstedt <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	i965: Trigger CS state reemission when new sampler state is uploaded.	Francisco Jerez	2016-01-19	2	-1/+2
\| \| \| \| \| \| \| \| \| \|	This reuses the NEW_SAMPLER_STATE_TABLE state bit (currently only used on pre-Gen7 hardware) to signal that the sampler state tables have changed in order to make sure that the GPGPU interface descriptor is updated. Reviewed-by: Marta Lofstedt <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	i965/vec4: Spaces around operators.	Matt Turner	2016-01-19	1	-1/+1
\|
*	i965: Inform compiler of variable range to silence warning.	Matt Turner	2016-01-19	1	-1/+2
\| \| \| \| \| \| \|	Extends commit 6531ccb70 to silence the warning in release builds as well. Reviewed-by: Ilia Mirkin <[email protected]>
*	i965: adding missing headers to the dist tarball	Emil Velikov	2016-01-18	1	-0/+2
\| \| \| \|	Signed-off-by: Emil Velikov <[email protected]>
*	i965/fs: Always set channel 2 of texture headers in some stages	Jason Ekstrand	2016-01-15	1	-0/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	In the vertex and fragment stages, the hardware is nice to us and leaves g0.2 zeroed out for us so we can use it for headers. However, in compute, geometry, and tessellation stages, the hardware is not so nice. In particular, for compute shaders on BDW, the hardware places some debug bits in 23:15. As it happens, bit 15 is interpreted by the sampler as the alpha channel mask. This means that if you use a texturing instruction with a header in a compute shader, you may randomly get the alpha channel disabled. Since channel masks affect the return length of the sampler message, this can lead the GPU to expect a different mlen to the one you specified in the shader and this, in turn, hangs your GPU. Cc: "11.1" <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
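A toy C model of the failure described above may help. It only encodes what the message states (bit 15 of header dword 2 acts as the alpha channel mask, and masks change how many channels come back). The function names, and the idea of writing an explicit zero in stages where the hardware does not clear g0.2, are assumptions made for this sketch rather than the driver's actual code.

```c
#include <stdint.h>

/* Bit 15 of header dword 2: alpha channel mask, per the commit message. */
#define HEADER_DW2_ALPHA_MASK (1u << 15)

/* Toy model: number of channels the sampler would return, looking only
 * at the alpha mask bit (other mask bits are ignored in this sketch). */
static unsigned channels_returned(uint32_t header_dw2)
{
    return (header_dw2 & HEADER_DW2_ALPHA_MASK) ? 3u : 4u;
}

/* Hypothetical fix-up: reuse g0.2 only in stages where the hardware is
 * known to leave it zeroed; otherwise write an explicit zero. */
static uint32_t build_header_dw2(uint32_t g0_dw2, int hw_zeroes_g0_dw2)
{
    return hw_zeroes_g0_dw2 ? g0_dw2 : 0u;
}
```

In this model, stray debug bits such as 0x8000 copied straight into the header drop the returned channel count from 4 to 3, which no longer matches what the shader was compiled to expect. Building the dword explicitly keeps the count at 4.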
*	i965/fs/generator: Take an actual shader stage rather than a string	Jason Ekstrand	2016-01-15	7	-11/+14
\| \| \| \| \| \|	Cc: "11.1" <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965/vec4: Use UW type for multiply into accumulator on GEN8+	Jason Ekstrand	2016-01-15	1	-1/+5
\| \| \| \| \| \| \| \|	BDW adds the following restriction: "When multiplying DW x DW, the dst cannot be accumulator." Cc: "11.1,11.0" <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965: Apply add_const_offset_to_base for vec4 VS inputs too.	Kenneth Graunke	2016-01-14	1	-5/+5
\| \| \| \| \| \| \| \|	This shouldn't hurt anything, and I'm about to introduce a pass that will want it. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965: Make add_const_offset_to_base() work at the shader level.	Kenneth Graunke	2016-01-14	1	-17/+21
\| \| \| \| \| \| \| \|	This makes it a pass, hiding the parameter structs and block callbacks so it's simpler to work with. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965: Make an is_scalar boolean in brw_compile_vs().	Kenneth Graunke	2016-01-14	1	-5/+5
\| \| \| \| \| \| \| \|	Shorter than compiler->scalar_stage[MESA_SHADER_VERTEX], which can help with line-wrapping. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965/gen7.5+: Disable resource streamer during GPGPU workloads.	Francisco Jerez	2016-01-14	3	-1/+42
\| \| \| \| \| \| \| \| \| \|	The RS and hardware binding tables are only supported on the 3D pipeline and can lead to corruption if left enabled during a GPGPU workload. Disable it when switching to the GPGPU (or media) pipeline and re-enable it when switching back to the 3D pipeline. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Abdiel Janulgue <[email protected]>
*	i965/gen7: Emit stall and dummy primitive draw after switching to the 3D pipeline.	Francisco Jerez	2016-01-14	1	-0/+24
\| \| \| \| \| \| \| \| \|	This hardware bug can supposedly lead to a hang on IVB and VLV. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/gen4-5: Emit MI_FLUSH as required prior to switching pipelines.	Francisco Jerez	2016-01-14	1	-0/+13
\| \| \| \| \| \| \| \| \|	AFAIK brw_emit_select_pipeline() is only called once during context init on Gen4-5, at which point the pipeline is likely to be already idle so it may just happen to work by luck regardless of the MI_FLUSH. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/gen6-7: Implement stall and flushes required prior to switching pipelines.	Francisco Jerez	2016-01-14	1	-0/+37
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Switching the current pipeline while it's not completely idle or the read and write caches aren't flushed can lead to corruption. Fixes misrendering of at least the following Khronos CTS test: ES31-CTS.shader_image_load_store.basic-allTargets-store-fs The stall and flushes are no longer required on Gen8+. v2: Emit PIPE_CONTROL with non-zero post-sync op before the write cache flush on SNB due to hardware bug. (Ken) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93323 Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/gen8+: Invalidate color calc state when switching to the GPGPU pipeline.	Francisco Jerez	2016-01-14	1	-0/+20
\| \| \| \| \| \| \| \| \| \| \|	This hardware bug can cause a hang on context restore while the current pipeline is set to GPGPU (BDWGFX HSD 1909593). In addition to clearing the valid bit, mark the CC state as dirty to make sure that the CC indirect state pointer is re-emitted when we switch back to the 3D pipeline. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Add state bit to trigger re-emission of color calculator state.	Francisco Jerez	2016-01-14	3	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \|	This will be used on Gen8+ to make sure that the color calculator state pointers are re-emitted when switching back to the 3D pipeline after some GPGPU workload due to a hardware workaround. There are other state bits already defined that could be used to achieve the same effect but they all cause a ton of unrelated state to be re-emitted (e.g. BRW_NEW_STATE_BASE_ADDRESS), so just define a new one, state bits are cheap. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	nir: Lower bitfield_extract.	Matt Turner	2016-01-14	3	-0/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The OpenGL specification for bitfieldExtract() says: The result will be undefined if <offset> or <bits> is negative, or if the sum of <offset> and <bits> is greater than the number of bits used to store the operand. Therefore passing bits=32, offset=0 is legal and defined in GLSL. But the earlier SM5 ubfe/ibfe opcodes are specified to accept a bitfield width ranging from 0-31. As such, Intel and AMD instructions read only the low 5 bits of the width operand, making them not able to implement the GLSL-specified behavior directly. This commit adds ubfe/ibfe operations from SM5 and a lowering pass for bitfield_extract to handle the trivial case of <bits> = 32 as bitfieldExtract: bits > 31 ? value : bfe(value, offset, bits) Fixes: ES31-CTS.shader_bitfield_operation.bitfieldExtract.uvec3_0 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92595 Reviewed-by: Connor Abbott <[email protected]> Tested-by: Marta Lofstedt <[email protected]>
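The lowering quoted in the message above can be modelled in plain C for the unsigned case. This is a sketch, not the NIR pass: `ubfe_hw` is a hypothetical stand-in for an instruction that reads only the low 5 bits of its width and offset operands, and `bitfield_extract_u32` is an illustrative name for the lowered expression.

```c
#include <stdint.h>

/* Hypothetical model of an SM5-style ubfe: only the low 5 bits of the
 * width and offset operands are read, so a width of 32 looks like 0. */
static uint32_t ubfe_hw(uint32_t value, uint32_t offset, uint32_t bits)
{
    uint32_t width = bits & 0x1f;
    uint32_t off = offset & 0x1f;

    if (width == 0)
        return 0;

    /* width is in 1..31 here, so the shift below is well defined. */
    uint32_t mask = (1u << width) - 1u;
    return (value >> off) & mask;
}

/* Lowered form from the commit message:
 *   bits > 31 ? value : bfe(value, offset, bits) */
static uint32_t bitfield_extract_u32(uint32_t value, uint32_t offset,
                                     uint32_t bits)
{
    return bits > 31 ? value : ubfe_hw(value, offset, bits);
}
```

Calling `ubfe_hw(v, 0, 32)` directly returns 0 in this model because the width wraps to zero, whereas the lowered `bitfield_extract_u32(v, 0, 32)` returns `v` unchanged. That is the case the commit handles.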
*	i965: Remove unused hw_must_use_separate_stencil	Ben Widawsky	2016-01-13	3	-5/+1
\| \| \| \| \| \| \| \| \| \|	I spotted this while looking for what needs updating in future platforms. I'm too lazy to go through the git logs, but it was probably missed by Jason when all the brw refactoring happened. Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Drop extra newline from shader compile messages.	Matt Turner	2016-01-13	2	-2/+2
\| \| \| \| \|	Ilia changed shader-db's run.c to not expect messages to contain a newline in shader-db commit 51bbc8035.
*	glsl: Delete the ir_binop_bfm and ir_triop_bfi opcodes.	Kenneth Graunke	2016-01-13	4	-29/+9
\| \| \| \| \| \| \| \| \| \| \| \| \|	TGSI doesn't use these - it just translates ir_quadop_bitfield_insert directly. NIR can handle ir_quadop_bitfield_insert as well. These opcodes were only used for i965, and with Jason's recent patches, we can do this lowering in NIR (which also gains us SPIR-V handling). So there's not much point to retaining this GLSL IR lowering code. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>