mesa.git - Unnamed repository; edit this file 'description' to name the repository.

	Commit message (Collapse)	Author	Age	Files	Lines
*	i965/nir/fs: Implement nir_intrinsic_ssbo_atomic_*	Iago Toral Quiroga	2015-09-25	2	-0/+79
	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/nir/vec4: Implement nir_intrinsic_load_ssbo	Iago Toral Quiroga	2015-09-25	1	-0/+54
	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/nir/fs: Implement nir_intrinsic_load_ssbo	Iago Toral Quiroga	2015-09-25	1	-0/+62
	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/nir/vec4: Implement nir_intrinsic_store_ssbo	Iago Toral Quiroga	2015-09-25	1	-0/+148
	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/nir/fs: Implement nir_intrinsic_store_ssbo	Iago Toral Quiroga	2015-09-25	1	-0/+71
	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/vec4: Import surface message builder functions.	Francisco Jerez	2015-09-25	2	-0/+273
	Implement helper functions that can be used to construct and send untyped and typed surface read, write and atomic messages to the shared dataport unit.
	v2: Split from the FS implementation.
	v3: Rewrite to avoid evil array_reg, emit_collect and emit_zip.
	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/vec4: Import helpers to convert vectors into arrays and back.	Francisco Jerez	2015-09-25	3	-0/+130
	These functions handle the conversion of a vec4 into the form expected by the dataport unit in message and message return payloads. The conversion is not always trivial because some messages don't support SIMD4x2 for some generations, in which case a strided copy may be necessary.
	v2: Split from the FS implementation.
	v3: Rewrite to avoid evil array_reg, emit_collect and emit_zip.
	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/vec4: Introduce VEC4 IR builder.	Francisco Jerez	2015-09-25	2	-0/+603
	See "i965/fs: Introduce FS IR builder." for the rationale.
	v2: Drop scalarizing VEC4 builder.
	v3: Take a backend_shader as constructor argument. Improve handling of debug annotations and execution control flags. Rename "instr" variable. Initialize cursor to NULL by default and add method to explicitly point the builder at the end of the program.
	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/wm: surfaces should have the API buffer size, not the drm buffer size	Samuel Iglesias Gonsalvez	2015-09-25	1	-2/+2
	The returned drm buffer object has a size that is a multiple of 4096, but that should not be exposed to the API user, who is working with a different size. As far as I can see this problem is only visible in the calculation of the length of unsized arrays used in SSBOs, as the implementation of this needs to query the underlying buffer size via a message.
	Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
	Reviewed-by: Jordan Justen <[email protected]>
	Reviewed-by: Kristian Høgsberg <[email protected]>
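	A minimal sketch of why the two sizes matter, not taken from the patch. It assumes the buffer object size is the API size rounded up to the next multiple of 4096; the helper names and the 16-byte stride are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the allocator rounds the requested size up to a 4096-byte
 * multiple. Hypothetical helper, not a Mesa or libdrm function. */
static inline uint32_t
example_bo_size(uint32_t api_size)
{
   return (api_size + 4095u) & ~4095u;
}

/* Number of whole elements of 'stride' bytes that fit in 'size' bytes. */
static inline uint32_t
example_elements_that_fit(uint32_t size, uint32_t stride)
{
   assert(stride > 0);
   return size / stride;
}
```

	With a 100-byte API buffer and a 16-byte stride, the API size yields 6 elements, while the rounded-up size would yield 256, which is the kind of discrepancy an unsized array length() query would expose.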
*	i965/wm: emit null buffer surfaces when null buffers are attached	Samuel Iglesias Gonsalvez	2015-09-25	1	-18/+26
	Otherwise we can expect odd things to happen if, for example, we ask for the size of the attached buffer from shader code, since that might query this value from the surface we uploaded and get random results.
	Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
	Reviewed-by: Jordan Justen <[email protected]>
	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/fs/nir: implement nir_intrinsic_get_buffer_size	Samuel Iglesias Gonsalvez	2015-09-25	1	-0/+24
	v2:
	- Remove inst->regs_written assignment as the instruction only writes to one register.
	Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/fs: Implement FS_OPCODE_GET_BUFFER_SIZE	Samuel Iglesias Gonsalvez	2015-09-25	5	-0/+55
	Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/vec4/nir: implement nir_intrinsic_get_buffer_size	Samuel Iglesias Gonsalvez	2015-09-25	1	-0/+26
	Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/vec4: Implement VS_OPCODE_GET_BUFFER_SIZE	Samuel Iglesias Gonsalvez	2015-09-25	5	-0/+44
	Notice that Skylake needs to include a header in the sampler message so it will need some tweaks to work there.
	Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
	Reviewed-by: Kristian Høgsberg <[email protected]>
*	glsl: Add parser/compiler support for unsized array's length()	Samuel Iglesias Gonsalvez	2015-09-25	2	-0/+10
	The unsized array length is computed with the following formula:
	array.length() = max((buffer_object_size - offset_of_array) / stride_of_array, 0)
	Of these, only the buffer size needs to be provided by the backends; the frontend already knows the values of the two other variables.
	This patch identifies the cases where we need to get the length of an unsized array, injecting ir_unop_ssbo_unsized_array_length expressions that will be lowered (in a later patch) to inject the formula mentioned above.
	It also adds the ir_unop_get_buffer_size expression that drivers will implement to provide the buffer length.
	v2:
	- Do not define a triop that will force backends to implement the entire formula, they should only need to provide the buffer size since the other values are known by the frontend (Curro).
	v3:
	- Call state->has_shader_storage_buffer_objects() in ast_function.cpp instead of using state->ARB_shader_storage_buffer_object_enable (Tapani).
	Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
	Reviewed-by: Kristian Høgsberg <[email protected]>
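	The formula in this message can be sketched as a plain C function. This only illustrates the arithmetic and is not the lowering pass itself; the function name and the use of signed 32-bit integers are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring the formula in the commit message:
 *   max((buffer_object_size - offset_of_array) / stride_of_array, 0)
 * Signed arithmetic is assumed so that a buffer smaller than the array
 * offset clamps to zero instead of wrapping around. */
static inline int32_t
example_unsized_array_length(int32_t buffer_object_size,
                             int32_t offset_of_array,
                             int32_t stride_of_array)
{
   assert(stride_of_array > 0);
   int32_t len = (buffer_object_size - offset_of_array) / stride_of_array;
   return len > 0 ? len : 0;
}
```

	For instance, a 256-byte buffer whose array starts at offset 64 with a 16-byte stride gives (256 - 64) / 16 = 12 elements, and a 32-byte buffer with the same offset clamps to 0.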
*	i965/fs: Do not split buffer variables	Iago Toral Quiroga	2015-09-25	1	-0/+1
	Buffer variables are the same as uniforms, except that they are read/write, so we want the same treatment.
	Reviewed-by: Jordan Justen <[email protected]>
	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: handle visiting of ir_var_shader_storage variables	Iago Toral Quiroga	2015-09-25	1	-2/+3
	Reviewed-by: Jordan Justen <[email protected]>
	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Upload Shader Storage Buffer Object surfaces	Iago Toral Quiroga	2015-09-25	2	-13/+57
	Since these are a special kind of UBO, we emit them together, reusing the same infrastructure. However, we use a RAW surface so we can reuse existing untyped read/write/atomic messages, which include a pixel mask header that we need to set to obtain correct behavior with helper invocations of the fragment shader.
	Reviewed-by: Jordan Justen <[email protected]>
	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Set MaxShaderStorageBuffers for compute shaders	Iago Toral Quiroga	2015-09-25	1	-0/+3
	v2:
	- Set it after the driver's MaxShaderStorageBuffers value assignment.
	Reviewed-by: Jordan Justen <[email protected]>
	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: set ARB_shader_storage_buffer_object related constant values	Samuel Iglesias Gonsalvez	2015-09-25	1	-0/+12
	v2:
	- Add tessellation shader constants assignment
	v3:
	- Set MaxShaderStorageBufferBindings to 36.
	Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
	Reviewed-by: Jordan Justen <[email protected]>
	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Implement DriverFlags.NewShaderStorageBuffer	Iago Toral Quiroga	2015-09-25	2	-0/+3
	We use the same dirty state for SSBOs and UBOs because they share the same infrastructure.
	Reviewed-by: Jordan Justen <[email protected]>
	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965: Use 64-byte offset alignment for shader storage buffers	Iago Toral Quiroga	2015-09-25	1	-0/+9
	This should be a cacheline (64 bytes) so that we can safely have the CPU and GPU writing the same SSBO on non-cache-coherent systems (our Atom CPUs). With UBOs, the GPU never writes, so there's no problem. For an SSBO, the GPU and the CPU can be updating disjoint regions of the buffer simultaneously and that will break if the regions overlap the same cacheline.
	v2:
	- Use cacheline size (64 bytes) instead of 16 bytes (Kristian).
	- Update commit log and add a comment in the code explaining why we use cacheline size (Ben).
	Reviewed-by: Jordan Justen <[email protected]>
	Reviewed-by: Kristian Høgsberg <[email protected]>
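	To illustrate the cacheline argument, here is a small sketch under stated assumptions: a 64-byte cacheline as in the message, with hypothetical helper names that are not part of the driver.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed cacheline size, taken from the commit message. */
#define EXAMPLE_CACHELINE_SIZE 64u

/* Round an offset up to the next cacheline boundary. */
static inline uint32_t
example_align_to_cacheline(uint32_t offset)
{
   assert(offset <= UINT32_MAX - (EXAMPLE_CACHELINE_SIZE - 1));
   return (offset + EXAMPLE_CACHELINE_SIZE - 1) & ~(EXAMPLE_CACHELINE_SIZE - 1);
}

/* True when two byte offsets land in the same cacheline, which is the
 * situation the alignment requirement is meant to avoid for bindings
 * written by different agents. */
static inline int
example_same_cacheline(uint32_t a, uint32_t b)
{
   return a / EXAMPLE_CACHELINE_SIZE == b / EXAMPLE_CACHELINE_SIZE;
}
```

	With 16-byte alignment, bindings at offsets 16 and 48 would share a cacheline; with 64-byte alignment, consecutive bindings start on distinct lines.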
*	i965/cs: Implement DispatchComputeIndirect support	Jordan Justen	2015-09-24	3	-4/+60
	Signed-off-by: Jordan Justen <[email protected]>
	Reviewed-by: Kristian Høgsberg <[email protected]>
*	i965/vec4: check swizzle before discarding a uniform on a 3src operand	Alejandro Piñeiro	2015-09-24	1	-3/+6
	Without this commit, copy propagation is discarded if it involves a uniform with an instruction that has 3 sources. But 3-source instructions can access scalar values. For example, this is what vec4_visitor::fix_3src_operand() is already doing:
	   if (src.file == UNIFORM && brw_is_single_value_swizzle(src.swizzle))
	      return src;
	Shader-db results (unfiltered) on NIR:
	total instructions in shared programs: 6259650 -> 6241985 (-0.28%)
	instructions in affected programs: 812755 -> 795090 (-2.17%)
	helped: 7930
	HURT: 0
	Shader-db results (unfiltered) on IR:
	total instructions in shared programs: 6445822 -> 6441788 (-0.06%)
	instructions in affected programs: 296630 -> 292596 (-1.36%)
	helped: 2533
	HURT: 0
	v2:
	- Updated commit message, using Matt Turner's suggestions
	- Move the check after we've created the final value, as Jason Ekstrand suggested
	- Clean up the condition
	v3:
	- Move the check back to the original place, to keep things tidy, as suggested by Jason Ekstrand
	v4:
	- Fixed missing is_single_value_swizzle() as pointed out by Jason Ekstrand
	Reviewed-by: Matt Turner <[email protected]>
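	A rough sketch of what a single-value swizzle test could look like. The 2-bits-per-channel encoding and all names here are assumptions for illustration; this is not the driver's brw_is_single_value_swizzle().

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed encoding: four channel selectors (0..3), 2 bits each. */
#define EXAMPLE_SWIZZLE4(a, b, c, d) \
   (((a) << 0) | ((b) << 2) | ((c) << 4) | ((d) << 6))
#define EXAMPLE_GET_SWZ(swz, idx) (((swz) >> ((idx) * 2)) & 0x3)

/* A swizzle selects a single value when all four channels read the same
 * source component (e.g. .xxxx or .zzzz), so the operand behaves like a
 * scalar. */
static inline bool
example_is_single_value_swizzle(unsigned swz)
{
   assert(swz <= 0xff);
   unsigned first = EXAMPLE_GET_SWZ(swz, 0);
   return EXAMPLE_GET_SWZ(swz, 1) == first &&
          EXAMPLE_GET_SWZ(swz, 2) == first &&
          EXAMPLE_GET_SWZ(swz, 3) == first;
}
```

	Under this encoding, .xxxx and .zzzz pass the test while the identity swizzle .xyzw does not.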
*	i965: Respect stride and subreg_offset for ATTR registers	Kristian Høgsberg Kristensen	2015-09-24	1	-1/+4
	When we assign hw regs to attributes, we don't incorporate the stride and subreg_offset from the fs_reg. It's rarely used, but the integer multiplication lowering uses an unusual stride and subreg_offset combination that breaks when one source is an attribute.
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91970
	Cc: "10.6 11.0" <[email protected]>
	Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
	Reviewed-by: Matt Turner <[email protected]>
*	mesa: rework Driver.CopyImageSubData() and related code	Brian Paul	2015-09-24	3	-31/+154
	Previously, core Mesa's _mesa_CopyImageSubData() created temporary textures to wrap renderbuffer sources/destinations. This caused a bit of a mess in the Mesa/gallium state tracker because we had to basically undo that wrapping.
	Instead, change ctx->Driver.CopyImageSubData() to take both gl_renderbuffer and gl_texture_image src/dst pointers (one being null, the other non-null) so the driver can handle renderbuffer vs. texture as needed.
	For the i965 driver, we basically moved the code that wrapped textures around renderbuffers from copyimage.c down into the meta and driver code. The old code in copyimage.c also made some questionable calls to _mesa_BindTexture(), etc. which weren't undone at the end.
	v2 (Jason Ekstrand): Rework the intel bits
	v3 (Brian Paul): Update the temporary st_CopyImageSubData() function.
	Reviewed-by: Topi Pohjolainen <[email protected]>
	Tested-by: Kai Wasserbäch <[email protected]>
	Tested-by: Nick Sarnie <[email protected]>
*	i965: add ARB_texture_barrier support	Ilia Mirkin	2015-09-23	2	-0/+10
	Signed-off-by: Ilia Mirkin <[email protected]>
	Reviewed-by: Jason Ekstrand <[email protected]>
*	i965/gs: Fix extra level of indentation left by the previous commit.	Kenneth Graunke	2015-09-23	2	-115/+111
	I left a bunch of code indented a level in the previous patch to make the diff easier to read. But now we should fix that.
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Jason Ekstrand <[email protected]>
*	i965/gs: Use new NIR intrinsics.	Kenneth Graunke	2015-09-23	4	-26/+48
	By performing the vertex counting in NIR, we're able to elide a ton of useless safety checks around every EmitVertex() call:
	total instructions in shared programs: 3952 -> 3720 (-5.87%)
	instructions in affected programs: 3491 -> 3259 (-6.65%)
	helped: 11
	HURT: 0
	Improves performance in Gl32GSCloth by 0.671742% +/- 0.142202% (n=621) on Haswell GT3e at 1024x768.
	This should also make it easier to implement Broadwell's "Static Vertex Count" feature someday.
	Signed-off-by: Kenneth Graunke <[email protected]>
	Reviewed-by: Jason Ekstrand <[email protected]>
*	i915: Make hw_prim[] const	Ville Syrjälä	2015-09-23	1	-1/+1
	The table used to map the GL primitive to the hw primitive never changes, so make it const.
	Signed-off-by: Ville Syrjälä <[email protected]>
	Reviewed-by: Ian Romanick <[email protected]>
*	mesa: Remove unused HAVE_TRI_STRIP_1 defines	Ian Romanick	2015-09-23	5	-5/+0
	Defined to 0 in a few places, but it's not used anywhere.
	Signed-off-by: Ian Romanick <[email protected]>
	Reviewed-by: Brian Paul <[email protected]>
*	t_dd_dmatmp: Remove HAVE_QUADS support	Ian Romanick	2015-09-23	2	-2/+0
	Two drivers use this file, and neither supports quads.
	No piglit regressions on i915 (G33) or radeon (Radeon 7500).
	Signed-off-by: Ian Romanick <[email protected]>
	Reviewed-by: Brian Paul <[email protected]>
*	t_dd_dmatmp: Remove HAVE_QUAD_STRIPS support	Ian Romanick	2015-09-23	2	-2/+0
	Two drivers use this file, and neither supports quad strips.
	No piglit regressions on i915 (G33) or radeon (Radeon 7500).
	Signed-off-by: Ian Romanick <[email protected]>
	Reviewed-by: Brian Paul <[email protected]>
*	t_dd_dmatmp: Make "count" actually be the count	Ian Romanick	2015-09-23	2	-2/+2
	The value passed in count previously was "vertex after the last vertex to be processed." Calling that "count" was misleading and kind of mean. Looking at the code, many functions immediately do "count-start" to get back the true count. That's just silly.
	If it is better for the loops to be 'for (j = start; j < (start + count); j++)', GCC will do that transformation.
	NOTE: There is some strange formatting left by this patch. That was done to make it more obvious that the before and after code is equivalent. These will be fixed in the next patch.
	No piglit regressions on i915 (G33) or radeon (Radeon 7500).
	v2: Fix a remaining (count-start) in render_quad_strip_verts.
	Signed-off-by: Ian Romanick <[email protected]>
	Reviewed-by: Brian Paul <[email protected]> [v1]
	Cc: "10.6 11.0" <[email protected]>
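	The two calling conventions discussed in this message can be compared with a small sketch. The function names are hypothetical and the loop bodies only count iterations; they stand in for the per-vertex work of the real render functions.

```c
#include <assert.h>
#include <stddef.h>

/* Old convention: the second argument is the index one past the last
 * vertex, even though it was called "count". */
static inline size_t
example_visit_with_end(size_t start, size_t end)
{
   assert(start <= end);
   size_t visited = 0;
   for (size_t j = start; j < end; j++)
      visited++;
   return visited;
}

/* New convention: the second argument is the true number of vertices. */
static inline size_t
example_visit_with_count(size_t start, size_t count)
{
   size_t visited = 0;
   for (size_t j = start; j < start + count; j++)
      visited++;
   return visited;
}
```

	Both process the same vertices when end == start + count, for example (3, 10) under the old convention and (3, 7) under the new one.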
*	i965/vec4: Don't coalesce regs in Gen6 MATH ops if reswizzle/writemask needed	Antia Puentes	2015-09-23	2	-3/+12
	Gen6 MATH instructions cannot execute in align16 mode, so swizzles or writemasking are not allowed.
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92033
	Reviewed-by: Matt Turner <[email protected]>
*	i965/vec4: Detect and delete useless MOVs.	Matt Turner	2015-09-22	1	-0/+22
	With NIR:
	instructions in affected programs: 111508 -> 109193 (-2.08%)
	helped: 507
	Without NIR:
	instructions in affected programs: 28763 -> 28474 (-1.00%)
	helped: 186
	Reviewed-by: Jason Ekstrand <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/vec4: Add support for fdph_replicated	Jason Ekstrand	2015-09-22	1	-0/+5
	Reviewed-by: Matt Turner <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Add defines for tessellation stages	Chris Forbes	2015-09-22	1	-0/+72
	v2 (Ken):
	- Squash together commits for HS, DS, and TE, as well as fixes.
	- Add INTEL_MASK variants so we can use SET_FIELD if we want.
	- Rename GEN7_HS_INSTANCE_CONTROL to GEN7_HS_INSTANCE_COUNT to match the documentation.
	- Add some more fields from the PRMs.
	- Add Broadwell variants.
	Signed-off-by: Chris Forbes <[email protected]>
	Signed-off-by: Kenneth Graunke <[email protected]>
*	i965/vec4: refactor brw_vec4_copy_propagation.	Alejandro Piñeiro	2015-09-22	1	-14/+18
	Now it is more similar to brw_fs_copy_propagation, with three clear stages:
	1) Build up the value we are propagating as if it were the source of a single MOV
	2) Check that we can propagate that value
	3) Build the final value
	Previously everything was somewhat messed up, making the implementation of some specific cases, like knowing if you can propagate from a previous instruction even with type mismatches, even messier (for example, with the need to maintain more than one has_source_modifiers). The refactoring clears things up, and supports the mentioned use case without doing anything extra (for example, only one has_source_modifiers is used).
	Shader-db results for vec4 programs on Haswell:
	total instructions in shared programs: 1683842 -> 1669037 (-0.88%)
	instructions in affected programs: 739837 -> 725032 (-2.00%)
	helped: 6237
	HURT: 0
	v2: using 'arg' index to get the from inst was wrong
	v3: rebased against last change on the previous patch of the series
	v4: don't need to track instructions on struct copy_entry, as we only set the source on a direct copy
	v5: change the approach for a refactoring
	v6: tweaked comments
	Reviewed-by: Jason Ekstrand <[email protected]>
*	i965: fix textureGrad for cubemaps	Tapani Pälli	2015-09-22	1	-19/+182
	Fixes bugs exposed by commit 2b1cdb0eddb73f62e4848d4b64840067f1f70865 in:
	ES3-CTS.gtf.GL3Tests.shadow.shadow_execution_frag
	No regressions observed in deqp, CTS or Piglit.
	v2: address review feedback from Iago Toral:
	- move rho calculation to else branch
	- optimize dx and dy calculation
	- fix documentation inconsistencies
	Signed-off-by: Tapani Pälli <[email protected]>
	Signed-off-by: Kevin Rogovin <[email protected]>
	Reviewed-by: Iago Toral Quiroga <[email protected]>
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91114
	Cc: "10.6 11.0" <[email protected]>
*	i965: Clean up GLSL compiler option setup	Jason Ekstrand	2015-09-21	1	-26/+20
	The only functional change here is that we now set EmitNoIndirectOutput and EmitNoIndirectTemp for compute shaders. Compute shaders don't have outputs per se and we should have been setting EmitNoIndirectTemp all along.
	Reviewed-by: Jordan Justen <[email protected]>
	Reviewed-by: Matt Turner <[email protected]>
*	i965/skl: Use larger URB size where available.	Ben Widawsky	2015-09-21	1	-1/+2
	All SKL SKUs except the lowest one, which has half the L3 size, actually have 384K of URB per slice.
	For once, I can explain how this mistake was made and how it was missed in review... Historically when we enable a platform and put the production sizes, you can simply look at the "smallest" SKU and see what its URB size is (and we assumed it was the 1 slice variant). Since on newer platforms the URB sizes are scaled automatically by HW, this was sufficient. On SKL, this is a bit different as the lowest SKU actually has half of the L3 fused off. GT2 is the 1 slice (not GT1) variant and it has 384K.
	There are no Jenkins tests fixed (or regressions) and we don't expect any fixes here because you can always run with less URB size.
	Thanks to Sarah for bringing this to my attention.
	Cc: Sarah Sharp <[email protected]>
	Signed-off-by: Ben Widawsky <[email protected]>
	Reviewed-by: Jordan Justen <[email protected]>
*	i965: Fix MRF register number assertions for compr4.	Kenneth Graunke	2015-09-21	1	-2/+2
	compr4 is represented by setting the high bit on the MRF number. We need to mask it out before sanity checking the register number.
	Fixes ~8000 assert fails on Ironlake and G45.
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92066
	Signed-off-by: Kenneth Graunke <[email protected]>
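	The masking step described here can be sketched as follows. The flag value (bit 7) and the limit of 16 registers are assumptions chosen for illustration, and the names are hypothetical rather than the driver's own definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed flag: compr4 is encoded in a high bit of the MRF number. */
#define EXAMPLE_MRF_COMPR4 (1u << 7)
/* Assumed number of MRF registers for the generations in question. */
#define EXAMPLE_MAX_MRF 16u

/* Strip the compr4 flag before range-checking the register number, so a
 * flagged but otherwise valid register is not rejected. */
static inline bool
example_mrf_nr_is_valid(unsigned nr)
{
   assert(nr <= 0xff);
   return (nr & ~EXAMPLE_MRF_COMPR4) < EXAMPLE_MAX_MRF;
}
```

	Without the mask, m2 with the compr4 bit set would compare as 130 and trip the range check even though the underlying register is valid.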
*	i965/vec4: Use MRF registers 21-23 for spilling in gen6	Iago Toral Quiroga	2015-09-21	1	-4/+6
	Reviewed-by: Kenneth Graunke <[email protected]>
*	i965/fs: Use MRF registers 21-23 for spilling in gen6	Iago Toral Quiroga	2015-09-21	1	-4/+7
	Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Turn BRW_MAX_MRF into a macro that accepts a hardware generation	Iago Toral Quiroga	2015-09-21	8	-28/+28
	There are some bug reports about shaders failing to compile in gen6 because MRF 14 is used when we need to spill. For example:
	https://bugs.freedesktop.org/show_bug.cgi?id=86469
	https://bugs.freedesktop.org/show_bug.cgi?id=90631
	Discussion in bugzilla pointed to the fact that gen6 might actually have 24 MRF registers available instead of 16, so we could use other MRF registers and avoid these conflicts (we still need to investigate why some shaders need up to MRF 14 anyway, since this is not expected).
	Notice that the hardware docs are not clear about this fact:
	SNB PRM Vol4 Part2's "Table 5-4. MRF Registers Available in Device Hardware" says "Number per Thread" - "24 registers"
	However, SNB PRM Vol4 Part1, 1.6.1 Message Register File (MRF) says: "Normal threads should construct their messages in m1..m15. (...) Regardless of actual hardware implementation, the thread should not assume that MRF addresses above m15 wrap to legal MRF registers."
	Therefore experimentation was necessary to evaluate if we had these extra MRF registers available or not. This was tested in gen6 using MRF registers 21..23 for spilling and doing a full piglit run (all.py) forcing spilling of everything on the FS backend. It was also tested by doing spilling of everything on both the FS and the VS backends with a piglit run of shader.py. In both cases no regressions were observed. In fact, many of these tests were helped in the cases where we forced spilling, since that triggered the same underlying problem described in the bug reports.
	Here are some results using INTEL_DEBUG=spill_fs,spill_vec4 for a shader.py run on gen6 hardware:
	Using MRFs 13..15 for spilling: crash: 2, fail: 113, pass: 6621, skip: 5461
	Using MRFs 21..23 for spilling: crash: 2, fail: 12, pass: 6722, skip: 5461
	This patch sets the ground for later patches to implement spilling using MRF registers 21..23 in gen6.
	Reviewed-by: Kenneth Graunke <[email protected]>
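	A minimal sketch of a generation-dependent register limit, based only on the numbers quoted in this message (24 registers on gen6, 16 otherwise, three registers reserved for spilling). The macro and helper names are hypothetical and may not match the actual definition in the tree.

```c
#include <assert.h>

/* Assumption from the commit message: gen6 exposes 24 MRF registers,
 * other generations that have MRFs expose 16. */
#define EXAMPLE_MAX_MRF(gen) ((gen) == 6 ? 24 : 16)

/* Assumption: the top three registers are reserved for spilling, which
 * gives m21..m23 on gen6 and m13..m15 elsewhere. */
static inline int
example_first_spill_mrf(int gen)
{
   assert(gen > 0);
   return EXAMPLE_MAX_MRF(gen) - 3;
}
```

	Deriving the spill base from the per-generation maximum keeps the two ranges in the message (13..15 and 21..23) consistent with a single rule.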
*	i965: Move MRF register asserts out of brw_reg.h	Iago Toral Quiroga	2015-09-21	4	-7/+16
	In a later patch we will make BRW_MAX_MRF return a different value depending on the hardware generation, but it is inconvenient to add a gen parameter to the brw_reg functions only for the assertions, so move these to places where we have the hardware generation available.
	Ken suggested adding the asserts to brw_set_src0 and brw_set_dest since that would make sure that we catch all uses of MRF registers, even those coming from modules that generate native code directly, like blorp. Unfortunately, this is very late in the process, which can make things harder to debug, so add asserts to the generator as well.
	Reviewed-by: Kenneth Graunke <[email protected]>
*	i965: Maximum allowed size of SEND messages is 15 (4 bits)	Iago Toral Quiroga	2015-09-21	4	-2/+10
	Until now we only used MRFs 1..15 for regular SEND messages, so the message length could not possibly exceed the maximum size. Soon we'll allow the use of MRF registers 1..23 in gen6, so we need to be careful not to build messages that can go beyond the limit. That could occur, specifically, when building URB write messages, which we may need to split into chunks due to their size.
	Previously we would simply go and create a new message when we reached MRF 13 (since 13..15 were reserved for spilling); now we also want to check the size of the message explicitly.
	Besides adding that condition to split URB write messages properly, this patch also adds asserts in the generator. Notice that brw_inst_set_mlen already asserts for this, but asserting in the generators is easy and can make debugging easier in some cases.
	Reviewed-by: Kenneth Graunke <[email protected]>
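	As a rough illustration of the size limit, the sketch below checks a message length against a 4-bit field and counts how many messages a payload would need. The idea that every chunk repeats a fixed-size header is an assumption for this example, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* A 4-bit message length field can encode at most 15 registers. */
#define EXAMPLE_MAX_SEND_MLEN 15u

static inline bool
example_mlen_fits(unsigned mlen)
{
   return mlen <= EXAMPLE_MAX_SEND_MLEN;
}

/* Number of messages needed to send 'payload_regs' registers when each
 * message also carries 'header_regs' header registers (assumed layout). */
static inline unsigned
example_message_count(unsigned payload_regs, unsigned header_regs)
{
   assert(header_regs < EXAMPLE_MAX_SEND_MLEN);
   unsigned per_message = EXAMPLE_MAX_SEND_MLEN - header_regs;
   return (payload_regs + per_message - 1) / per_message;
}
```

	With a one-register header, 14 payload registers fit in a single message, while 15 already require a second one.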
*	i965/vec4/nir: Remove all "this->" snippets	Eduardo Lima Mitev	2015-09-20	1	-16/+15
	For consistency, either we have all class members dereferenced, or none. In this case, very few are, so let's get rid of them all.
	Reviewed-by: Timothy Arceri <[email protected]>
	Reviewed-by: Kenneth Graunke <[email protected]>
*	dri/common: fix gbm-symbols-check regression	Marcin Ślusarz	2015-09-20	1	-1/+1
	Broken by commit c228514c72cb2fd5fb9e510808e29204fc9e7ae1 "dri/common: use sysconfdir when looking for drirc".
	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92054
	Signed-off-by: Marcin Ślusarz <[email protected]>