mesa.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message	Author	Age	Files	Lines
*	i965: Move brw_cs_fill_local_id_payload() to libi965_compiler	Kristian Høgsberg Kristensen	2015-12-11	4	-40/+43
\| \| \| \| \| \|	This is a helper function for setting up the local invocation ID payload according to the cs_prog_data generated by the compiler. It's intended to be available to users of libi965_compiler so move it there.
*	i965/gen9: Don't do fast clears when GL_FRAMEBUFFER_SRGB is enabled	Neil Roberts	2015-12-11	1	-0/+11
\| \| \| \| \| \| \| \| \|	When GL_FRAMEBUFFER_SRGB is enabled any single-sampled renderbuffers are resolved in intel_update_state because the hardware can't cope with fast clears on SRGB buffers. In that case it's pointless to do a fast clear because it will just be immediately resolved. Reviewed-by: Topi Pohjolainen <[email protected]>
*	i965/gen9: Allow fast clears for non-MSRT SRGB buffers	Neil Roberts	2015-12-11	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	SRGB buffers are not marked as losslessly compressible so previously they would not be used for fast clears. However in practice the hardware will never actually see that we are using SRGB buffers for fast clears if we use the linear equivalent format when clearing and make sure to resolve the buffer as a linear format before sampling from it. This is an important use case because by default the window system framebuffers are created as SRGB so without this fast clears won't be used there. Reviewed-by: Topi Pohjolainen <[email protected]>
*	i965/gen9: Resolve SRGB color buffers when GL_FRAMEBUFFER_SRGB enabled	Neil Roberts	2015-12-11	1	-0/+27
\| \| \| \| \| \| \| \| \| \|	SKL can't cope with the CCS buffer for SRGB buffers. Normally the hardware won't see the SRGB formats because when GL_FRAMEBUFFER_SRGB is disabled these get mapped to their linear equivalents. In order to avoid relying on the CCS buffer when it is enabled this patch now makes it flush the renderbuffers. Reviewed-by: Topi Pohjolainen <[email protected]>
*	i965/gen8+: Don't upload the MCS buffer for single-sampled textures	Neil Roberts	2015-12-11	1	-1/+5
\| \| \| \| \| \| \| \| \| \| \|	For single-sampled textures the MCS buffer is only used to implement fast clears. However the surface always needs to be resolved before being used as a texture anyway so the MCS buffer doesn't actually achieve anything. This is important for Gen9 because in that case SRGB surfaces are not supported for fast clears and we don't want the hardware to see the MCS buffer in that case. Reviewed-by: Topi Pohjolainen <[email protected]>
*	i965/meta-fast-clear: Disable GL_FRAMEBUFFER_SRGB during clear	Neil Roberts	2015-12-11	1	-0/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Adds MESA_META_FRAMEBUFFER_SRGB to the meta save state so that GL_FRAMEBUFFER_SRGB will be disabled when performing the fast clear. That way the render surface state will be programmed with the linear equivalent format during the clear. This is important for Gen9 because the SRGB formats are not marked as losslessly compressible so in theory they aren't supported for fast clears. It shouldn't make any difference whether GL_FRAMEBUFFER_SRGB is enabled for the fast clear operation because the color is not actually written to the framebuffer so there is no chance for the hardware to apply the SRGB conversion on it anyway. Reviewed-by: Topi Pohjolainen <[email protected]>
*	nir: Get rid of *_indirect variants of input/output load/store intrinsics	Jason Ekstrand	2015-12-10	5	-139/+138
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is some special-casing needed in a competent back-end. However, they can do their special-casing easily enough based on whether or not the offset is a constant. In the meantime, having the _indirect variants adds special cases in a number of places where they don't need to be and, in general, only complicates things. To complicate matters, NIR had no way to convert an indirect load/store to a direct one in the case that the indirect was a constant so we would still not really get what the back-ends wanted. The best solution seems to be to get rid of the _indirect variants entirely. This commit is a bunch of different changes squashed together: - nir: Get rid of *_indirect variants of input/output load/store intrinsics - nir/glsl: Stop handling UBO/SSBO load/stores differently depending on indirect - nir/lower_io: Get rid of load/store_foo_indirect - i965/fs: Get rid of load/store_foo_indirect - i965/vec4: Get rid of load/store_foo_indirect - tgsi_to_nir: Get rid of load/store_foo_indirect - ir3/nir: Use the new unified io intrinsics - vc4: Do all uniform loads with byte offsets - vc4/nir: Use the new unified io intrinsics - vc4: Fix load_user_clip_plane crash - vc4: add missing src for store outputs - vc4: Fix state uniforms - nir/lower_clip: Update to the new load/store intrinsics - nir/lower_two_sided_color: Update to the new load intrinsic NIR and i965 changes are Reviewed-by: Kenneth Graunke <[email protected]> NIR indirect declarations and vc4 changes are Reviewed-by: Eric Anholt <[email protected]> ir3 changes are Reviewed-by: Rob Clark <[email protected]> NIR changes are Acked-by: Rob Clark <[email protected]>
*	i965/fs_nir: Refactor store_output, load_input, and load_uniform	Jason Ekstrand	2015-12-10	1	-26/+19
\| \| \| \| \| \| \| \|	There was way too much incrementing of things going on. Instead, let's just start everything off at the right base location, and then increment in the loop. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Check base format to determine whether to use tiled memcpy	Neil Roberts	2015-12-10	2	-6/+8
\| \| \| \| \| \| \| \| \| \| \| \| \|	The tiled memcpy doesn't work for copying from RGBX to RGBA because it doesn't override the alpha component to 1.0. Commit 2cebaac479d4 added a check to disable it for RGBX formats by looking at the TexFormat. However a lot of the rest of the code base is written with the assumption that an RGBA texture can be used internally to implement a GL_RGB texture. If that is done then this check breaks. This patch makes it instead check the base format of the texture which I think more directly matches the intention. Reviewed-by: Jason Ekstrand <[email protected]>
*	i965/gen8: Allow rendering to B8G8R8X8	Neil Roberts	2015-12-10	1	-4/+5
\| \| \| \| \| \| \| \| \|	Since Gen8 this is allowed as a rendering target so we don't need to override it to B8G8R8A8. This is helpful on Gen9+ where using this override causes fast clears not to work. Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Ben Widawsky <[email protected]>
*	i965/gen9: Allow fast clear for MSRT formats matching render	Neil Roberts	2015-12-10	1	-4/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously fast clear was disallowed on Gen9 for MSRTs with the claim that some formats don't work but we didn't understand why. On further investigation it seems the formats that don't work are the ones where the render surface format is being overridden to a different format than the one used for texturing. The one used for texturing is not actually a renderable format. It arguably makes sense that the sampler hardware doesn't handle the fast color correctly in these cases because it shouldn't be possible to end up with a fast cleared surface that is non-renderable. This patch changes the limitation to prevent fast clear for surfaces where the format for rendering is overridden. Reviewed-by: Ben Widawsky <[email protected]>
*	i965/gen9/fast-clear: Handle linear→SRGB conversion	Neil Roberts	2015-12-10	1	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If GL_FRAMEBUFFER_SRGB is enabled when writing to an SRGB-capable framebuffer then the color will be converted from linear to SRGB before being written. There is no chance for the hardware to do this itself because it can't modify the clear color that is programmed in the surface state so it seems pretty clear that the driver should be handling this itself. Note that this wasn't a problem before Gen9 because previously we were only able to do fast clears to 0 or 1 and those values are the same in linear and SRGB space. Reviewed-by: Topi Pohjolainen <[email protected]>
*	i965: Enable ARB_compute_shader extension on supported hardware	Jordan Justen	2015-12-09	2	-5/+8
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Enable ARB_compute_shader on gen7+, on hardware that supports the OpenGL 4.3 requirements of a local group size of 1024. With SIMD16 support, this is limited to Ivy Bridge and Haswell. Broadwell will work with a local group size up to 896 on SIMD16 meaning programs that use this size or lower should run when setting MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/nir: Implement shared variable atomic operations	Jordan Justen	2015-12-09	2	-0/+60
\| \| \| \| \| \| \| \| \|	v3: * Update based on latest SSBO code (Iago) Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Lower shared variable references to intrinsic calls	Jordan Justen	2015-12-09	1	-0/+3
\| \| \| \| \| \|	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Enable shared local memory for CS shared variables	Jordan Justen	2015-12-09	3	-0/+27
\| \| \| \| \| \| \| \| \|	v3: * Check shared variable size at link time Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/fs: Handle nir shared variable store intrinsic	Jordan Justen	2015-12-09	1	-0/+48
\| \| \| \| \| \| \| \| \| \| \| \|	v4: * Apply similar optimization for shared variable stores as 0cb7d7b4b7c32246d4c4225a1d17d7ff79a7526d. This was causing an OpenGLES 3.1 CTS failure, but 867c436ca841b4196b4dde4786f5086c76b20dd7 fixes that. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/fs: Handle nir shared variable load intrinsic	Jordan Justen	2015-12-09	1	-0/+28
\| \| \| \| \| \| \| \| \| \|	v3: * Remove extra #includes (Iago) * Use recently added GEN7_BTI_SLM instead of BRW_SLM_SURFACE_INDEX (curro) Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Disable vector splitting on shared variables	Jordan Justen	2015-12-09	1	-0/+1
\| \| \| \| \| \|	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	meta: Fix a typo in a print message	Andreas Boll	2015-12-09	1	-1/+1
\| \| \| \| \| \| \|	s/Unkown/Unknown/ Signed-off-by: Andreas Boll <[email protected]> Reviewed-by: Brian Paul <[email protected]>
*	i965: Resolve color and flush for all active shader images in intel_update_state().	Francisco Jerez	2015-12-09	1	-0/+18
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Fixes arb_shader_image_load_store/execution/load-from-cleared-image.shader_test. Couldn't reproduce any significant FPS regression in CPU-bound benchmarks from the Finnish benchmarking system on either VLV or BSW after 30 runs with 95% confidence level. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92849 Cc: Chris Wilson <[email protected]> Cc: Jason Ekstrand <[email protected]> Cc: "11.0 11.1" <[email protected]> Tested-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Document inconsistent units the URB size is represented in.	Francisco Jerez	2015-12-09	2	-1/+12
\| \| \| \| \| \|	Every other gen the representation of the URB size was changed and previous ones weren't updated. I'd be willing to write a series normalizing this to be KB on all generations if anybody else cares.
*	i965: Hook up L3 partitioning state atom.	Francisco Jerez	2015-12-09	2	-2/+6
\| \| \| \| \|	Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Work around L3 state leaks during context switches.	Francisco Jerez	2015-12-09	4	-5/+73
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This is going to require some rather intrusive kernel changes to fix properly; in the meantime (and forever on at least pre-v4.1 kernels) we'll have to restore the hardware defaults at the end of every batch in which the L3 configuration was changed to avoid interfering with the DDX and GL clients that use an older non-L3-aware version of Mesa. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]> v2: Optimize look-up of the default configuration by assuming it's the first entry of the L3 config array in order to avoid an FPS regression in GpuTest Triangle and SynMark OglBatch2-7 on most affected platforms. Reviewed-by: Jordan Justen <[email protected]>
*	i965: Add debug flag to print out the new L3 state during transitions.	Francisco Jerez	2015-12-09	3	-0/+19
\| \| \| \| \| \| \|	Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Implement L3 state atom.	Francisco Jerez	2015-12-09	3	-0/+88
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	The L3 state atom calculates the target L3 partition weights when the program bound to some shader stage is modified, and in case they are far enough from the current partitioning it makes sure that the L3 state is re-emitted. v2: Fix for inconsistent units the context URB size is expressed in. Clamp URB size to 1008 KB on SKL due to FF hardware limitation. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Acked-by: Kenneth Graunke <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Calculate appropriate L3 partition weights for the current pipeline state.	Francisco Jerez	2015-12-09	2	-0/+54
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This calculates a rather conservative partitioning of the L3 cache based on the shaders currently bound to the pipeline and whether they use SLM, atomics, images or scratch space. The result is intended to be fine-tuned later on based on other pipeline state. Note that the L3 partitioning calculated for VLV in the non-SLM non-DC case differs from the hardware defaults in that it doesn't include a DC partition and has twice as much RO cache space -- This is an intentional functional change that improves performance in several bandwidth-bound benchmarks on VLV (5% significance): SynMark OglTexFilterAniso by 14.18%, SynMark OglTexFilterTri by 7.15%, Unigine Heaven by 4.91%, SynMark OglShMapPcf by 2.15%, GpuTest Fur by 1.83%, SynMark OglDrvRes by 1.80%, SynMark OglVsTangent by 1.71%, and a few other benchmarks from the Finnish system by less than 1%. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Implement selection of the closest L3 configuration based on a vector of weights.	Francisco Jerez	2015-12-09	1	-0/+95
\| \| \| \| \| \| \| \| \| \| \| \|	The input of the L3 set-up code is a vector giving the approximate desired relative size of each partition. This implements logic to compare the input vector against the table of validated configurations for the device and pick the closest compatible one. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
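The selection described in the message above can be illustrated with a minimal sketch. It is not the driver's code: the partition count, the `struct weights` type, the function names and the choice of L1 distance on normalized vectors are all assumptions made for illustration, and the "compatible" filtering mentioned in the message is left out.

```c
#include <stddef.h>

/* Hypothetical partition count; the real driver's set of partitions may differ. */
#define NUM_PARTITIONS 4

/* Hypothetical container for one weight (or way-count) per partition. */
struct weights {
   float w[NUM_PARTITIONS];
};

/* Sum of the components; all components are assumed to be non-negative. */
static float
weights_sum(const struct weights *v)
{
   float sum = 0.0f;
   for (size_t i = 0; i < NUM_PARTITIONS; i++)
      sum += v->w[i];
   return sum;
}

/* L1 distance between two vectors after scaling each one to sum to 1,
 * so that only the relative sizes of the partitions matter. */
static float
weights_distance(const struct weights *a, const struct weights *b)
{
   const float sa = weights_sum(a), sb = weights_sum(b);
   const float ka = sa > 0.0f ? 1.0f / sa : 0.0f;
   const float kb = sb > 0.0f ? 1.0f / sb : 0.0f;
   float dist = 0.0f;

   for (size_t i = 0; i < NUM_PARTITIONS; i++) {
      const float d = a->w[i] * ka - b->w[i] * kb;
      dist += d < 0.0f ? -d : d;
   }
   return dist;
}

/* Return the index of the table entry closest to the target.
 * The table must contain at least one entry. */
size_t
closest_config(const struct weights *table, size_t count,
               const struct weights *target)
{
   size_t best = 0;
   float best_dist = weights_distance(&table[0], target);

   for (size_t i = 1; i < count; i++) {
      const float dist = weights_distance(&table[i], target);
      if (dist < best_dist) {
         best = i;
         best_dist = dist;
      }
   }
   return best;
}
```

Normalizing both vectors first means a target such as {1, 1, 0, 2} matches a table entry of {16, 16, 0, 32} exactly, which fits the idea of the input being only an approximate relative size per partition.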
*	i965: Define and use REG_MASK macro to make masked MMIO writes slightly more readable.	Francisco Jerez	2015-12-09	4	-3/+9
\| \| \| \| \| \| \| \|	Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
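For readers unfamiliar with masked MMIO writes, here is a small sketch of the idea. It assumes the usual convention for masked registers, where the upper 16 bits of the written dword select which of the lower 16 bits are actually updated. The macro body, the helper names and the example bit are assumptions for illustration, not the definitions from the patch.

```c
#include <stdint.h>

/* Assumed definition: move the write-enable mask into the upper 16 bits. */
#define REG_MASK(value) ((uint32_t)(value) << 16)

/* Hypothetical control bit, used only for this example. */
#define EXAMPLE_ENABLE_BIT (1u << 3)

/* Dword that sets the given bits and leaves all other bits untouched. */
uint32_t
masked_set(uint32_t bits)
{
   return REG_MASK(bits) | bits;
}

/* Dword that clears the given bits and leaves all other bits untouched. */
uint32_t
masked_clear(uint32_t bits)
{
   return REG_MASK(bits);
}
```

Writing `REG_MASK(bit) | bit` rather than an open-coded shift makes it visible at the call site which half of the dword is the mask and which half is the value.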
*	i965/hsw: Enable L3 atomics.	Francisco Jerez	2015-12-09	1	-0/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Improves performance of the arb_shader_image_load_store-atomicity piglit test by over 25x (which isn't a real benchmark, it's just heavy on atomics -- the improvement in a microbenchmark I wrote a while ago seemed to be even greater). The drawback is one needs to be extra-careful not to hang the GPU (in fact the whole system). A DC partition must have been allocated on L3, the "convert L3 cycle for DC to UC" bit may not be set, the MOCS L3 cacheability bit must be set for all surfaces accessed using DC atomics, and the SCRATCH1 and ROW_CHICKEN3 bits must be kept in sync. A fairly recent kernel is required for the command parser to allow writes to these registers. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Acked-by: Kenneth Graunke <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Implement programming of the L3 configuration.	Francisco Jerez	2015-12-09	1	-0/+95
\| \| \| \| \| \|	Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Acked-by: Kenneth Graunke <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Import tables enumerating the set of validated L3 configurations.	Francisco Jerez	2015-12-09	2	-0/+168
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It should be possible to use additional L3 configurations other than the ones listed in the tables of validated allocations ("BSpec » 3D-Media-GPGPU Engine » L3 Cache and URB [IVB+] » L3 Cache and URB [*] » L3 Allocation and Programming"), but it seems sensible for now to hard-code the tables in order to stick to the hardware docs. Instead of setting up the arbitrary L3 partitioning given as input, the closest validated L3 configuration will be looked up in these tables and used to program the hardware. The included tables should work for Gen7-9. Note that the quantities are specified in ways rather than in KB, this is because the L3 control registers expect the value in ways, and because by doing that we can re-use a single table for all GT variants of the same generation (and in the case of IVB/HSW and CHV/SKL across different generations) which generally have different L3 way sizes but allow the same combinations of way allocations. v2: Use slice count from the devinfo structure instead of the gt number to implement get_l3_way_size(). Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Acked-by: Kenneth Graunke <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Add slice count to the brw_device_info structure.	Francisco Jerez	2015-12-09	2	-0/+25
\| \| \| \| \|	Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/gen8: Don't add workaround bits to PIPE_CONTROL stalls if DC flush is set.	Francisco Jerez	2015-12-09	1	-1/+3
\| \| \| \| \| \| \| \| \|	According to the hardware docs a DC flush is sufficient to make CS_STALL happy, there's no need to add STALL_AT_SCOREBOARD whenever it's present. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Define state flag to signal that the URB size has been altered.	Francisco Jerez	2015-12-09	3	-0/+6
\| \| \| \| \| \| \| \|	This will make sure that we recalculate the URB layout anytime the URB size is modified by the L3 partitioning code. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Keep track of whether LRI is allowed in the context struct.	Francisco Jerez	2015-12-09	2	-1/+8
\| \| \| \| \| \| \| \| \| \| \| \|	This stores the result of can_do_pipelined_register_writes() in the context struct so we can find out later whether LRI can be used to program the L3 configuration. v2: * Split change of gen check in can_do_pipelined_register_writes (jljusten) Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Adjust gen check in can_do_pipelined_register_writes	Francisco Jerez	2015-12-09	1	-2/+5
\| \| \| \| \| \| \| \| \| \|	Allow for pipelined register writes for gen < 7. v2: * Split from another patch and adjust comment (jljusten) Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Define symbolic constants for some useful L3 cache control registers.	Francisco Jerez	2015-12-09	1	-0/+53
\| \| \| \| \|	Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Make uniform offsets be in terms of bytes	Jason Ekstrand	2015-12-07	6	-22/+49
\| \| \| \| \| \| \| \| \| \|	This commit makes uniform offsets be in terms of bytes starting with nir_lower_io. They get converted to be in terms of vec4s or floats when we cram them in the UNIFORM register file but reladdr remains in terms of bytes all the way down to the point where we lower it to a pull constant load. Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/nir_uniforms: Replace comps_per_unit with an is_scalar boolean	Jason Ekstrand	2015-12-07	1	-13/+15
\| \| \| \|	Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/nir: Remove unused indirect handling	Jason Ekstrand	2015-12-07	1	-33/+11
\| \| \| \| \| \| \| \| \|	The one and only place where the FS backend allows reladdr is on uniforms. For locals, inputs, and outputs, we lower it away before the backend ever sees it. This commit gets rid of the dead indirect handling code. Cc: "11.0" <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/state: Get rid of dword_pitch arguments to buffer functions	Jason Ekstrand	2015-12-07	6	-38/+19
\| \| \| \| \|	Cc: "11.0" <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/vec4: Use a stride of 1 and byte offsets for UBOs	Jason Ekstrand	2015-12-07	3	-27/+7
\| \| \| \| \| \|	Cc: "11.0" <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92909 Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/fs: Use a stride of 1 and byte offsets for UBOs	Jason Ekstrand	2015-12-07	3	-16/+13
\| \| \| \| \|	Cc: "11.0" <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/vec4: Use byte offsets for UBO pulls on Sandy Bridge	Jason Ekstrand	2015-12-07	3	-10/+31
\| \| \| \| \| \| \| \| \| \| \|	Previously, the VS_OPCODE_PULL_CONSTANT_LOAD opcode operated on vec4-aligned byte offsets on Iron Lake and below and worked in terms of vec4 offsets on Sandy Bridge. On Ivy Bridge, we add a new *LOAD_GEN7 variant which works in terms of vec4s. We're about to change the GEN7 version to work in terms of bytes, so this is a nice unification. Cc: "11.0" <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
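The difference between the two offset units discussed above is a factor of 16, since a vec4 holds four 32-bit components. A minimal sketch of the conversions follows; the helper names are hypothetical and do not come from the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* A vec4 is four 32-bit components, i.e. 16 bytes. */
#define BYTES_PER_VEC4 16u

/* Convert an offset counted in vec4s to an offset counted in bytes. */
uint32_t
vec4_index_to_byte_offset(uint32_t vec4_index)
{
   return vec4_index * BYTES_PER_VEC4;
}

/* Convert a byte offset to a vec4 index, rounding down. */
uint32_t
byte_offset_to_vec4_index(uint32_t byte_offset)
{
   return byte_offset / BYTES_PER_VEC4;
}

/* True if the byte offset lands exactly on a vec4 boundary. */
bool
is_vec4_aligned(uint32_t byte_offset)
{
   return byte_offset % BYTES_PER_VEC4 == 0;
}
```

Using bytes everywhere avoids having to remember which unit a given opcode variant expects, which appears to be the unification the message refers to.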
*	i965: Fix texture views of 2d array surfaces	Ben Widawsky	2015-12-07	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It is legal to have a texture view of a single layer from a 2D array texture; you can sample from it, or render to it. Intel hardware needs to be made aware when it is using a 2d array surface in the surface state. The texture view is just a 2d surface with the backing miptree actually being a 2d array surface. As a result, the previous code would not set the right bit in the surface state since it wasn't considered an array texture. I spotted this early on in debug but brushed it off because it is clearly not needed on other platforms (since they all pass). I have no idea how this works properly on other platforms (I think gen7 introduced the bit in the state, but I am too lazy to check). As such, I have opted not to modify gen7, though I believe the current code is wrong there as well. Thanks to Chris for helping me debug this. v2: Just use the underlying mt's target type to make the array determination. This fixes a bug in the first patch which was incorrectly relying only on non-zero depth (not sure how that had no failures). (Ilia) Cc: Chris Forbes <[email protected]> Reported-by: Mark Janes <[email protected]> (Jenkins) References: https://www.opengl.org/registry/specs/ARB/texture_view.txt Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92609 Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
*	i965: Add brw_device_info::min_ds_entries field.	Kenneth Graunke	2015-12-07	2	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	From the 3DSTATE_URB_DS documentation: "Project: IVB, HSW If Domain Shader Thread Dispatch is Enabled then the minimum number of handles that must be allocated is 10 URB entries." "Project: BDW+ If Domain Shader Thread Dispatch is Enabled then the minimum number of handles that must be allocated is 34 URB entries." When the HS is run in SINGLE_PATCH mode (the only mode we support today), there is no minimum for HS - it's just zero. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
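The two documentation quotes above reduce to a small lookup. The sketch below is only an illustration: the function name is hypothetical, it assumes IVB and HSW are reported as generation 7 and BDW+ as generation 8 or later, and returning 0 when the domain shader is disabled is an assumption rather than something the quoted text states.

```c
#include <stdbool.h>

/* Minimum number of DS URB entries for a given hardware generation.
 * Only meaningful for gen >= 7; earlier generations are not covered. */
unsigned
min_ds_urb_entries(int gen, bool ds_enabled)
{
   /* Assumption: no minimum applies when DS thread dispatch is disabled. */
   if (!ds_enabled)
      return 0;

   /* 34 entries on BDW+ (gen8 and later), 10 on IVB and HSW (gen7). */
   return gen >= 8 ? 34 : 10;
}
```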
*	i965: Add state bits for tess stages	Chris Forbes	2015-12-07	4	-2/+28
\| \| \| \| \| \|	Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965: Add backend structures for tess stages	Chris Forbes	2015-12-07	6	-0/+98
\| \| \| \| \| \|	Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965: Set core tessellation-related limits	Chris Forbes	2015-12-07	1	-2/+6
\| \| \| \| \| \|	Signed-off-by: Chris Forbes <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>