mesa.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message	Author	Age	Files	Lines
*	i965/gen9: Add a condition for starting pixel in fast copy blit	Anuj Phogat	2015-09-28	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \|	This condition restricts the use of fast copy blit to cases where the starting pixels of src and dst are oword (16-byte) aligned. Many piglit tests (if using fast copy blit in Mesa) failed earlier because I missed adding this condition. Fast copy blit is currently enabled for use only with Yf/Ys tiling. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Chad Versace <[email protected]>
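A minimal sketch of the alignment condition described in this commit message, assuming the byte offset of the starting pixel within a row is x * cpp (bytes per pixel). The function names and parameters are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when the starting pixel of a blit lies on an
 * oword (16-byte) boundary. x is the starting column in pixels and cpp is
 * the number of bytes per pixel; both names are assumptions. */
static bool
start_pixel_is_oword_aligned(uint32_t x, uint32_t cpp)
{
   assert(cpp > 0);
   return (x * cpp) % 16 == 0;
}

/* Fast copy blit is considered only when both src and dst start aligned. */
static bool
can_use_fast_copy_blit(uint32_t src_x, uint32_t dst_x, uint32_t cpp)
{
   return start_pixel_is_oword_aligned(src_x, cpp) &&
          start_pixel_is_oword_aligned(dst_x, cpp);
}
```

With 4 bytes per pixel, a blit starting at column 4 (byte 16) would pass this check, while one starting at column 1 (byte 4) would not.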
*	i965: Rename intel_miptree_get_dimensions_for_image()	Anuj Phogat	2015-09-28	5	-10/+14
\| \| \| \| \| \| \| \| \| \| \|	This function isn't specific to miptrees. So, drop the "miptree" from function name. V3: Add a comment explaining how the 1D Array texture height and depth are interpreted by Intel hardware. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Chad Versace <[email protected]>
*	i965/gen9: Fix {src, dst}_pitch alignment check for XY_FAST_COPY_BLT	Anuj Phogat	2015-09-28	1	-11/+7
\| \| \| \| \| \| \| \| \|	I misinterpreted the alignment restriction in XY_FAST_COPY_BLT earlier. Instead of checking pitch for 64KB alignment we need to check it for tile width alignment. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Chad Versace <[email protected]>
*	i965: Fix {src, dst}_pitch alignment check for XY_SRC_COPY_BLT	Anuj Phogat	2015-09-28	1	-2/+7
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	Current code checks the alignment restrictions only for Y tiling. From Broadwell PRM vol 10: "pitch is of 512Byte granularity for Tile-X: This means the tiled-x surface pitch can be (512, 1024, 1536, 2048...)/4 (in Dwords)." This patch adds the restriction for X tiling as well. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Ben Widawsky <[email protected]> Reviewed-by: Chad Versace <[email protected]>
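The X-tiling restriction quoted from the PRM can be sketched as follows. This is only an illustration under stated assumptions: the 512-byte granularity and the division by 4 come from the quoted text, while the function names are hypothetical and do not appear in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Granularity taken from the PRM text quoted in the commit message. */
#define X_TILE_PITCH_GRANULARITY 512u

/* Hypothetical check: an X-tiled pitch must be a non-zero multiple of
 * 512 bytes. */
static bool
x_tiled_pitch_is_valid(uint32_t pitch_bytes)
{
   return pitch_bytes != 0 && pitch_bytes % X_TILE_PITCH_GRANULARITY == 0;
}

/* Tiled pitches are expressed in dwords, so divide the byte pitch by 4. */
static uint32_t
x_tiled_pitch_in_dwords(uint32_t pitch_bytes)
{
   assert(x_tiled_pitch_is_valid(pitch_bytes));
   return pitch_bytes / 4;
}
```

Under these assumptions a 1536-byte pitch is accepted and becomes 384 dwords, whereas a 768-byte pitch is rejected.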
*	i965: Move conversion of {src, dst}_pitch to dwords outside if/else	Anuj Phogat	2015-09-28	1	-16/+9
\| \| \| \| \|	Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Chad Versace <[email protected]>
*	i965: Delete temporary variable 'src_pitch'	Anuj Phogat	2015-09-28	1	-5/+1
\| \| \| \| \|	Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Chad Versace <[email protected]>
*	i965: Use helper function intel_get_tile_dims() in surface setup	Anuj Phogat	2015-09-28	1	-2/+12
\| \| \| \| \| \| \| \| \| \|	It takes care of using the correct tile width if we later use other tiling patterns for aux miptree. V2: Remove the comment about using Yf for aux miptree. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Chad Versace <[email protected]>
*	i965: Use intel_get_tile_dims() to get tile masks	Anuj Phogat	2015-09-28	4	-33/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This will require change in the parameters passed to intel_miptree_get_tile_masks(). V2: Rearrange the order of parameters. (Ben) Change the name to intel_get_tile_masks(). (Topi) V3: Use temporary variables in intel_get_tile_masks() for clarity. Fix mask_y computation. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Chad Versace <[email protected]>
*	i965: Add a helper function intel_get_tile_dims()	Anuj Phogat	2015-09-28	2	-22/+64
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	V2: - Do the tile width/height computations in the new helper function and use it later in intel_miptree_get_tile_masks(). - Change the name to intel_get_tile_dims(). V3: Return the tile_h in number of rows in place of bytes. Document the units of tile_w, tile_h parameters. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Chad Versace <[email protected]>
*	i965/fs: Fix hang on IVB and VLV with image format mismatch.	Francisco Jerez	2015-09-28	1	-4/+38
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	IVB and VLV hang sporadically when an untyped surface read or write message is used to access a surface of format other than RAW, as may happen when there is a mismatch between the format qualifier of the image uniform and the format of the actual image bound to the pipeline. According to the spec this condition gives undefined results but may not lead to program termination (which is one of the possible outcomes of the hang). Fix it by checking at runtime whether the surface is of the right type. Fixes the "arb_shader_image_load_store.invalid/format mismatch" piglit subtest. Reported-by: Mark Janes <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91718 CC: [email protected] Reviewed-by: Ian Romanick <[email protected]>
*	i965/gs: Optimize away the EOT write on Gen8+ with static vertex count.	Kenneth Graunke	2015-09-26	1	-0/+15
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	With static vertex counts, the final EOT write doesn't actually write any data - it's just there to end the thread. Typically, the last thing before ending the thread will be an EmitVertex() call, resulting in a URB write. We can just set EOT on that. Note that this isn't always possible - there might be an intervening SSBO write/image store, or the URB write may have been in a loop. shader-db statistics for geometry shaders only: total instructions in shared programs: 3173 -> 3149 (-0.76%) instructions in affected programs: 176 -> 152 (-13.64%) helped: 8 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965/gs: Allow src0 immediates in GS_OPCODE_SET_WRITE_OFFSET.	Kenneth Graunke	2015-09-26	2	-2/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	GS_OPCODE_SET_WRITE_OFFSET is a MUL with a constant src[1] and special strides. We can easily make the generator handle constant src[0] arguments by instead generating a MOV with the product of both operands. This isn't necessarily a win in and of itself - instead of a MUL, we generate a MOV, which should be basically the same cost. However, we can probably avoid the earlier MOV to put src[0] into a register. shader-db statistics for geometry shaders only: total instructions in shared programs: 3207 -> 3173 (-1.06%) instructions in affected programs: 3207 -> 3173 (-1.06%) helped: 11 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	i965: Implement "Static Vertex Count" geometry shader optimization.	Kenneth Graunke	2015-09-26	5	-4/+28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Broadwell's 3DSTATE_GS contains new "Static Output" and "Static Vertex Count" fields, which control a new optimization. Normally, geometry shaders can output arbitrary numbers of vertices, which means that resource allocation has to be done on the fly. However, if the number of vertices is statically known, the hardware can pre-allocate resources up front, which is more efficient. Thanks to the new NIR GS intrinsics, this is easy. We just call the function introduced in the previous commit to get the vertex count. If it obtains a count, we stop emitting the extra 32-bit "Vertex Count" field in the VUE, and instead fill out the 3DSTATE_GS fields. Improves performance of Gl32GSCloth by 5.16347% +/- 0.12611% (n=91) on my Lenovo X250 laptop (Broadwell GT2) at 1024x768. shader-db statistics for geometry shaders only: total instructions in shared programs: 3227 -> 3207 (-0.62%) instructions in affected programs: 242 -> 222 (-8.26%) helped: 10 v2: Don't break non-NIR paths (just skip this optimization). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	i965: Move GS_THREAD_END mlen calculations out of the generator.	Kenneth Graunke	2015-09-26	2	-2/+2
\| \| \| \| \| \| \| \| \| \|	The visitor was setting a mlen that was wrong for Broadwell, but the generator was ignoring it and doing the right thing regardless. We may as well move the logic fully into the visitor. This will be useful in the next commit as well. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
*	i965: Simplify handling of VUE map changes.	Kenneth Graunke	2015-09-26	4	-42/+17
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The old code was disastrously complex - spread across multiple atoms which may not even run, inspecting the dirty bits to try and decide whether it was necessary to do checks...storing VS information in brw_context...extra flagging... This code tripped me and Carl up very badly when working on the shader cache code. It's very fragile and hard to maintain. Now that geometry shaders only depend on their inputs and don't have to worry about the VS VUE map, we can dramatically simplify this: just compute the VUE map coming out of the geometry shader stage in brw_upload_programs. If it changes, flag it. Done. v2: Also check vue_map.separable. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
*	i965/gs: Remove the dependency on the VS VUE map.	Kenneth Graunke	2015-09-26	2	-11/+14
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Because we only support geometry shaders in core profile, we can safely ignore any driver-extending of VS outputs. Those are: - Legacy userclipping (doesn't exist in core profile) - Edgeflag copying (Gen4-5 only, no GS support) - Point coord replacement (Gen4-5 only, no GS support) - front/back color hacks (Gen4-5 only, no GS support) v2: Rebase; leave a comment about why SSO works. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
*	i965: Don't re-layout varyings for separate shader programs.	Kenneth Graunke	2015-09-26	5	-18/+67
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, our VUE map code always assigned slots to varyings sequentially, in one contiguous block. This was a bad fit for separate shaders - the GS input layout depended on the VS output layout, so if we swapped out vertex shaders, we might have to recompile the GS on the fly - which rather defeats the point of using separate shader objects. (Tessellation would suffer from this as well - we could have to recompile the HS, DS, and GS.) Instead, this patch makes the VUE map for separate shaders use a fixed layout, based on the input/output variable's location field. (This is either specified by layout(location = ...) or assigned by the linker.) Corresponding inputs/outputs will match up by location; if there's a mismatch, we're allowed to have undefined behavior. This may be less efficient - depending on what locations were chosen, we may have empty padding slots in the VUE. But applications presumably use small consecutive integers for locations, so it hopefully won't be much worse in practice. 3% of Dota 2 Reborn shaders are hurt, but only by 2 instructions. This seems like a small price to pay for avoiding recompiles. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
*	i965/vue: Make assign_vue_map() take an explicit slot.	Kenneth Graunke	2015-09-26	1	-16/+19
\| \| \| \| \| \| \| \| \| \| \| \|	Our plan of assigning consecutive slots doesn't work properly for separate shader objects - at least, if we want to avoid recompiling them whenever the interface changes. As a first step, make assign_vue_map take an explicit slot parameter, rather than implicitly incrementing it. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
*	i965: Initialize unused VUE map slots to BRW_VARYING_SLOT_PAD.	Kenneth Graunke	2015-09-26	1	-1/+1
\| \| \| \| \| \| \| \| \| \|	Nothing actually relies on unused slots being initialized to BRW_VARYING_SLOT_COUNT. Soon, we're going to have VUE maps with holes in them, at which point pre-filling with BRW_VARYING_SLOT_PAD makes a lot more sense. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
*	i965: Fix BRW_VARYING_SLOT_PAD handling in the scalar VS backend.	Kenneth Graunke	2015-09-26	1	-4/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	We can't just break for padding slots. Instead, treat them like unwritten output variables, so we handle flushing and incrementing urb_offset correctly. Paul introduced the concept of padding slots back in 2011, but we've never actually used them for anything. So it's unsurprising that the scalar VS backend didn't handle them quite right. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]>
*	i965: Enable ARB_shader_storage_buffer_object extension for gen7+	Samuel Iglesias Gonsalvez	2015-09-25	1	-0/+1
\| \| \| \| \|	Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/nir/vec4: Implement nir_intrinsic_ssbo_atomic_*	Iago Toral Quiroga	2015-09-25	2	-0/+79
\| \| \| \|	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/nir/fs: Implement nir_intrinsic_ssbo_atomic_*	Iago Toral Quiroga	2015-09-25	2	-0/+79
\| \| \| \|	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/nir/vec4: Implement nir_intrinsic_load_ssbo	Iago Toral Quiroga	2015-09-25	1	-0/+54
\| \| \| \|	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/nir/fs: Implement nir_intrinsic_load_ssbo	Iago Toral Quiroga	2015-09-25	1	-0/+62
\| \| \| \|	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/nir/vec4: Implement nir_intrinsic_store_ssbo	Iago Toral Quiroga	2015-09-25	1	-0/+148
\| \| \| \|	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/nir/fs: Implement nir_intrinsic_store_ssbo	Iago Toral Quiroga	2015-09-25	1	-0/+71
\| \| \| \|	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/vec4: Import surface message builder functions.	Francisco Jerez	2015-09-25	2	-0/+273
\| \| \| \| \| \| \| \| \| \| \|	Implement helper functions that can be used to construct and send untyped and typed surface read, write and atomic messages to the shared dataport unit. v2: Split from the FS implementation. v3: Rewrite to avoid evil array_reg, emit_collect and emit_zip. Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/vec4: Import helpers to convert vectors into arrays and back.	Francisco Jerez	2015-09-25	3	-0/+130
\| \| \| \| \| \| \| \| \| \| \| \| \|	These functions handle the conversion of a vec4 into the form expected by the dataport unit in message and message return payloads. The conversion is not always trivial because some messages don't support SIMD4x2 for some generations, in which case a strided copy may be necessary. v2: Split from the FS implementation. v3: Rewrite to avoid evil array_reg, emit_collect and emit_zip. Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/vec4: Introduce VEC4 IR builder.	Francisco Jerez	2015-09-25	2	-0/+603
\| \| \| \| \| \| \| \| \| \| \| \|	See "i965/fs: Introduce FS IR builder." for the rationale. v2: Drop scalarizing VEC4 builder. v3: Take a backend_shader as constructor argument. Improve handling of debug annotations and execution control flags. Rename "instr" variable. Initialize cursor to NULL by default and add method to explicitly point the builder at the end of the program. Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/wm: surfaces should have the API buffer size, not the drm buffer size	Samuel Iglesias Gonsalvez	2015-09-25	1	-2/+2
\| \| \| \| \| \| \| \| \| \| \| \| \|	The returned drm buffer object has a size multiple of 4096 but that should not be exposed to the API user, which is working with a different size. As far as I can see this problem is only visible in the calculation of the length of unsized arrays used in SSBOs, as the implementation of this needs to query the underlying buffer size via a message. Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/wm: emit null buffer surfaces when null buffers are attached	Samuel Iglesias Gonsalvez	2015-09-25	1	-18/+26
\| \| \| \| \| \| \| \| \| \| \|	Otherwise we can expect odd things to happen if, for example, we ask for the size of the attached buffer from shader code, since that might query this value from the surface we uploaded and get random results. Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/fs/nir: implement nir_intrinsic_get_buffer_size	Samuel Iglesias Gonsalvez	2015-09-25	1	-0/+24
\| \| \| \| \| \| \| \| \|	v2: - Remove inst->regs_written assignment as the instruction only writes to one register. Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/fs: Implement FS_OPCODE_GET_BUFFER_SIZE	Samuel Iglesias Gonsalvez	2015-09-25	5	-0/+55
\| \| \| \| \|	Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/vec4/nir: implement nir_intrinsic_get_buffer_size	Samuel Iglesias Gonsalvez	2015-09-25	1	-0/+26
\| \| \| \| \|	Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/vec4: Implement VS_OPCODE_GET_BUFFER_SIZE	Samuel Iglesias Gonsalvez	2015-09-25	5	-0/+44
\| \| \| \| \| \| \| \|	Notice that Skylake needs to include a header in the sampler message so it will need some tweaks to work there. Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	glsl: Add parser/compiler support for unsized array's length()	Samuel Iglesias Gonsalvez	2015-09-25	2	-0/+10
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The unsized array length is computed with the following formula: array.length() = max((buffer_object_size - offset_of_array) / stride_of_array, 0) Of these, only the buffer size needs to be provided by the backends, the frontend already knows the values of the two other variables. This patch identifies the cases where we need to get the length of an unsized array, injecting ir_unop_ssbo_unsized_array_length expressions that will be lowered (in a later patch) to inject the formula mentioned above. It also adds the ir_unop_get_buffer_size expression that drivers will implement to provide the buffer length. v2: - Do not define a triop that will force backends to implement the entire formula, they should only need to provide the buffer size since the other values are known by the frontend (Curro). v3: - Call state->has_shader_storage_buffer_objects() in ast_function.cpp instead of using state->ARB_shader_storage_buffer_object_enable (Tapani). Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
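The length formula in this commit message can be written out directly. This is a sketch only: the function and parameter names are hypothetical, signed 64-bit arithmetic is an assumption made so that a negative intermediate value can be clamped, and the actual lowering operates on GLSL IR rather than on C integers.

```c
#include <assert.h>
#include <stdint.h>

/* array.length() = max((buffer_object_size - offset_of_array) /
 *                      stride_of_array, 0)
 *
 * Only buffer_size has to come from the backend; the offset and stride
 * are known to the frontend. All names here are illustrative. */
static int64_t
unsized_array_length(int64_t buffer_size, int64_t array_offset,
                     int64_t array_stride)
{
   assert(array_stride > 0);
   int64_t length = (buffer_size - array_offset) / array_stride;
   return length > 0 ? length : 0;
}
```

For a 100-byte buffer with the array at offset 16 and a stride of 16, this yields 5 elements; a buffer smaller than the array offset yields 0.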
*	i965/fs: Do not split buffer variables	Iago Toral Quiroga	2015-09-25	1	-0/+1
\| \| \| \| \| \| \| \|	Buffer variables are the same as uniforms, except that they are read/write, so we want the same treatment. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: handle visiting of ir_var_shader_storage variables	Iago Toral Quiroga	2015-09-25	1	-2/+3
\| \| \| \| \|	Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Upload Shader Storage Buffer Object surfaces	Iago Toral Quiroga	2015-09-25	2	-13/+57
\| \| \| \| \| \| \| \| \| \| \|	Since these are a special kind of UBOs we emit them together reusing the same infrastructure, however, we use a RAW surface so we can reuse existing untyped read/write/atomic messages which include a pixel mask header that we need to set to obtain correct behavior with helper invocations of the fragment shader. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Set MaxShaderStorageBuffers for compute shaders	Iago Toral Quiroga	2015-09-25	1	-0/+3
\| \| \| \| \| \| \| \|	v2: - Set it after the driver's MaxShaderStorageBuffers value assignment. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: set ARB_shader_storage_buffer_object related constant values	Samuel Iglesias Gonsalvez	2015-09-25	1	-0/+12
\| \| \| \| \| \| \| \| \| \| \| \|	v2: - Add tessellation shader constants assignment v3: - Set MaxShaderStorageBufferBindings to 36. Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Implement DriverFlags.NewShaderStorageBuffer	Iago Toral Quiroga	2015-09-25	2	-0/+3
\| \| \| \| \| \| \| \|	We use the same dirty state for SSBOs and UBOs because they share the same infrastructure. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Use 64-byte offset alignment for shader storage buffers	Iago Toral Quiroga	2015-09-25	1	-0/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This should be a cacheline (64 bytes) so that we can safely have the CPU and GPU writing the same SSBO on non-cache-coherent systems (our Atom CPUs). With UBOs, the GPU never writes, so there's no problem. For an SSBO, the GPU and the CPU can be updating disjoint regions of the buffer simultaneously and that will break if the regions overlap the same cacheline. v2: - Use cacheline size (64 bytes) instead of 16 bytes (Kristian). - Update commit log and add a comment in the code explaining why we use cacheline size (Ben). Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
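To illustrate why the offset alignment matches the cacheline size, the sketch below rounds an offset up to 64 bytes and tests whether two byte ranges touch a common cacheline. The helper names are hypothetical; only the 64-byte figure comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Cacheline size stated in the commit message. */
#define CACHELINE_SIZE 64u

/* Round an offset up to the next cacheline boundary. */
static uint64_t
align_to_cacheline(uint64_t offset)
{
   return (offset + CACHELINE_SIZE - 1) & ~(uint64_t)(CACHELINE_SIZE - 1);
}

/* True when two non-empty byte ranges touch at least one common
 * cacheline, even if the ranges themselves are disjoint. */
static bool
ranges_share_cacheline(uint64_t a_start, uint64_t a_size,
                       uint64_t b_start, uint64_t b_size)
{
   assert(a_size > 0 && b_size > 0);
   uint64_t a_first = a_start / CACHELINE_SIZE;
   uint64_t a_last = (a_start + a_size - 1) / CACHELINE_SIZE;
   uint64_t b_first = b_start / CACHELINE_SIZE;
   uint64_t b_last = (b_start + b_size - 1) / CACHELINE_SIZE;
   return a_first <= b_last && b_first <= a_last;
}
```

Two disjoint 16-byte regions at offsets 0 and 16 share a cacheline, which is the problematic case with 16-byte alignment; regions that each start on a 64-byte boundary and span whole cachelines do not.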
*	i965/cs: Implement DispatchComputeIndirect support	Jordan Justen	2015-09-24	3	-4/+60
\| \| \| \| \|	Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/vec4: check swizzle before discarding a uniform on a 3src operand	Alejandro Piñeiro	2015-09-24	1	-3/+6
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Without this commit, copy propagation is discarded if it involves a uniform with an instruction that has 3 sources. But 3-source instructions can access scalar values. For example, this is what vec4_visitor::fix_3src_operand() is already doing: if (src.file == UNIFORM && brw_is_single_value_swizzle(src.swizzle)) return src; Shader-db results (unfiltered) on NIR: total instructions in shared programs: 6259650 -> 6241985 (-0.28%) instructions in affected programs: 812755 -> 795090 (-2.17%) helped: 7930 HURT: 0 Shader-db results (unfiltered) on IR: total instructions in shared programs: 6445822 -> 6441788 (-0.06%) instructions in affected programs: 296630 -> 292596 (-1.36%) helped: 2533 HURT: 0 v2: - Updated commit message, using Matt Turner's suggestions - Move the check after we've created the final value, as Jason Ekstrand suggested - Clean up the condition v3: - Move the check back to the original place, to keep things tidy, as suggested by Jason Ekstrand v4: - Fixed missing is_single_value_swizzle() as pointed out by Jason Ekstrand Reviewed-by: Matt Turner <[email protected]>
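A rough sketch of what a single-value swizzle test might look like. It assumes a swizzle packed as four 2-bit component selectors with x in the lowest bits; the macros and the function below are illustrative stand-ins and are not the driver's brw_is_single_value_swizzle().

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed packing: four 2-bit selectors, component 0 in bits 0-1. */
#define SWIZZLE4(a, b, c, d) \
   (((a) << 0) | ((b) << 2) | ((c) << 4) | ((d) << 6))
#define GET_SWZ(swz, idx) (((swz) >> ((idx) * 2)) & 0x3)

/* True when all four channels select the same source component, i.e.
 * the operand behaves as a scalar (for example .yyyy). */
static bool
is_single_value_swizzle(unsigned swizzle)
{
   assert(swizzle <= 0xff);
   unsigned first = GET_SWZ(swizzle, 0);
   for (unsigned i = 1; i < 4; i++) {
      if (GET_SWZ(swizzle, i) != first)
         return false;
   }
   return true;
}
```

Under this encoding, SWIZZLE4(2, 2, 2, 2) (.zzzz) counts as a single value, while the identity swizzle SWIZZLE4(0, 1, 2, 3) (.xyzw) does not.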
*	i965: Respect stride and subreg_offset for ATTR registers	Kristian Høgsberg Kristensen	2015-09-24	1	-1/+4
\| \| \| \| \| \| \| \| \| \| \| \|	When we assign hw regs to attributes, we don't incorporate the stride and subreg_offset from the fs_reg. It's rarely used, but the integer multiplication lowering uses an unusual stride and subreg_offset combination that breaks when one source is an attribute. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91970 Cc: "10.6 11.0" <[email protected]> Signed-off-by: Kristian Høgsberg Kristensen <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	mesa: rework Driver.CopyImageSubData() and related code	Brian Paul	2015-09-24	3	-31/+154
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Previously, core Mesa's _mesa_CopyImageSubData() created temporary textures to wrap renderbuffer sources/destinations. This caused a bit of a mess in the Mesa/gallium state tracker because we had to basically undo that wrapping. Instead, change ctx->Driver.CopyImageSubData() to take both gl_renderbuffer and gl_texture_image src/dst pointers (one being null, the other non-null) so the driver can handle renderbuffer vs. texture as needed. For the i965 driver, we basically moved the code that wrapped textures around renderbuffers from copyimage.c down into the meta and driver code. The old code in copyimage.c also made some questionable calls to _mesa_BindTexture(), etc. which weren't undone at the end. v2 (Jason Ekstrand): Rework the intel bits v3 (Brian Paul): Update the temporary st_CopyImageSubData() function. Reviewed-by: Topi Pohjolainen <[email protected]> Tested-by: Kai Wasserbäch <[email protected]> Tested-by: Nick Sarnie <[email protected]>
*	i965: add ARB_texture_barrier support	Ilia Mirkin	2015-09-23	2	-0/+10
\| \| \| \| \|	Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	i965/gs: Fix extra level of indentation left by the previous commit.	Kenneth Graunke	2015-09-23	2	-115/+111
\| \| \| \| \| \| \| \|	I left a bunch of code indented a level in the previous patch to make the diff easier to read. But now we should fix that. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>