aboutsummaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers/dri/i965/brw_state.h
Commit message (Collapse)AuthorAgeFilesLines
* i965: Reuse libdrm's header for AUB definitions.Eric Anholt2014-07-021-1/+1
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Pass brw to translate_wrap_mode().Kenneth Graunke2014-06-051-1/+2
| | | | | | | | This lets us do generation checks. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Cc: "10.2" <[email protected]>
* i965/wm: Surface state overrides for configuring w-tiled as y-tiledTopi Pohjolainen2014-05-151-0/+6
| | | | | | | | | | v2: Use intel_mipmap_tree::total_width in order to get correct alignment automatically. Also use "mt->total_height / mt->physical_depth0" as surface height allowing hardware to offset to correct slice. Cc: "10.2" <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Merge gen8_upload_constant_state into gen7_upload_constant_state.Eric Anholt2014-05-021-5/+0
| | | | | | | The two paths are really similar, and the extra conditionals will be dwarfed by the cost of the actual upload. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Update GS state for Broadwell.Kenneth Graunke2014-01-311-0/+1
| | | | | | | | | | | | | | | This is quite similar to the Gen7 code. The main changes: - 48-bit relocations - Thread count is specified as U/2-1 instead of U-1. - An extra DWord (DW9) with clip planes, URB entry output length/offsets - We need to program the "Expected Vertex Count" (VerticesIn) v2: Set the number of binding table entries so they can be prefetched (requested by Eric Anholt). v3: Add a WARN_ONCE for a missing workaround. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Update multisampling state for Broadwell.Kenneth Graunke2014-01-311-0/+1
| | | | | | | | | | | | | | | | | | | | On previous platforms, 3DSTATE_MULTISAMPLE contained the number of samples, pixel location, and the positions of each sample within a pixel for each multisampling mode (4x and 8x). It was also a non-pipelined command, presumably since changing the sample positions is fairly drastic. Broadwell improves upon this by splitting the sample positions out into a separate non-pipelined state packet, 3DSTATE_SAMPLE_PATTERN. With that removed, 3DSTATE_MULTISAMPLE becomes a pipelined state packet. Broadwell also supports 2x and 16x multisampling, in addition to the 4x and 8x supported by Gen7. This patch, however, does not implement 2x and 16x. Signed-off-by: Kenneth Graunke <[email protected]> Signed-off-by: Chad Versace <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Update BLEND_STATE for Broadwell.Kenneth Graunke2014-01-311-0/+1
| | | | | | | | | | | v2: Allow logic ops on all surface types. The UNORM restriction was lifted with Haswell and I simply hadn't noticed. Also, add missing BRW_NEW_STATE_BASE_ADDRESS dirty bit. Both caught by Eric Anholt. v3: Fix swapped per-RT DWord pairs. Eliminates bizarre hacks. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Update SF_CLIP_VIEWPORT for Broadwell.Kenneth Graunke2014-01-311-0/+1
| | | | | | | | | | | It has additional fields to support clipping to the viewport even if guardband clipping is enabled. v2: Update for viewport array changes. v3: No, seriously, update for viewport array changes. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> [v1]
* i965: Rework SURFACE_STATE entries for Broadwell.Kenneth Graunke2014-01-311-0/+3
| | | | | | | | | | | | | | | | | | | | | | v2: Add missing SCS setting in gen8_emit_buffer_surface_state (caught by Eric Anholt). v3: Use stored QPitch rather than recomputing it. v4: Shift QPitch by 2 when setting it in the packet; bits 14:0 store bits 16:2 of the actual value (fixes myriads of cube and array texturing tests). Also, only enable cube face bits for cubemaps (matches Chris Forbes' commit on master). Port to use offset64. v5: s/gl_format/mesa_format/g v6: Fix DW5 of renderbuffer state, which neglected to subtract irb->mt->first_level. Use vertical_alignment() rather than hardcoding 4. Use ffs for multisample counts rather than a large switch statement (all caught/suggested by Eric). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Update SOL state for Broadwell.Kenneth Graunke2014-01-311-0/+1
| | | | | | | | | | | | | | | | | | | | Unlike on Gen7, we can directly set the offset via the state packet. We also -have- to: the kernel SOL reset code won't work anymore. v2: Fix copy and paste mistake in buffer stride setup; drop stale comment (caught by Eric Anholt). Add a perf_debug for missing MOCS setup. v3: Rebase on Paul Berry's changes to CurrentVertexProgram. v4: Fix SO Write Offset handling. We need to set bits 20 and 21 so the hardware both loads and saves the offset. There's also a restriction that 3DSTATE_SO_BUFFER can only be programmed once per buffer between primitives, so the "reset to zero" code needed reworking. Fixes most of the transform feedback Piglit tests. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> [v2]
* i965: Update the code that disables unused shader stages for Broadwell.Kenneth Graunke2014-01-311-0/+1
| | | | | | | v2: Also disable 3DSTATE_WM_CHROMAKEY for safety. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> [v1]
* i965: Rework vertex uploads for Broadwell.Kenneth Graunke2014-01-311-0/+3
| | | | | | | | | | | | | | v2: Emit a dummy 3DSTATE_VF_SGVS packet when not needed. v3: Add WARN_ONCE and perf_debugs requested by Eric Anholt. v4: Program 3DSTATE_SGVS even in the no-elements case so gl_VertexID continues working. Fix 3DSTATE_VF_INSTANCING to not use an element index to access the buffers array. Some ARB_draw_indirect prep work. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Update STATE_BASE_ADDRESS for Broadwell.Kenneth Graunke2014-01-311-0/+1
| | | | | | | | | | | | | | | v2: Fix missing "change" bit on instruction state base address (caught by Haihao Xiang). v3: Add a perf_debug for missing MOCS setup, requested by Eric. v4: Fix buffer sizes. The value, specified at bit 12 and up, is actually measured in 4k pages. We need to round up to the next multiple of 4k. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> [v3] Reviewed-by: Matt Turner <[email protected]> [v4]
* i965: Update 3DSTATE_PS, 3DSTATE_WM, and add 3DSTATE_PS_EXTRA.Kenneth Graunke2014-01-311-0/+3
| | | | | | | | | | | | | | | | v2: Fix setting of GEN8_PSX_ATTRIBUTE_ENABLE after rebases. v3: Add missing binding table entry counts. Don't worry about alpha testing or alpha to coverage when setting the "Kill Pixel" bit; those are specified in 3DSTATE_PS_BLEND (caught by Eric Anholt). Drop unused _NEW_BUFFERS. Tidy comments. v4: Rebase on Paul Berry's changes to CurrentFragmentProgram. v5: Re-enable line stippling. It doesn't crash or anything. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> [v3]
* i965: Rework 3DSTATE_VS for Broadwell.Kenneth Graunke2014-01-311-1/+6
| | | | | | | | | | | | | | | | | | v2: Remove incorrect MOCS shifts; rename urb_entry_write_offset to urb_entry_output_offset to closer match the documentation. v3: Only emit a non-zero constant buffer read length when active. v4: Add missing binding table counts (caught by Eric). v5: Rebase on Paul Berry's changes to CurrentVertexProgram. v6: Drop bogus SBE read length/offset field code. We were programming the wrong values, and our 3DSTATE_SBE code overrides any value we put here anyway with the correct one. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> [v4]
* i965: Add the new 3DSTATE_PS_BLEND state packet.Kenneth Graunke2014-01-311-0/+1
| | | | | | | | | | | v2: Only set GEN8_PS_BLEND_HAS_WRITEABLE_RT if color buffer writes are enabled (caught by Eric Anholt). v3: Set non-blending flags (writeable RT, alpha test, alpha to coverage) for integer formats too. +14 Piglits. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> [v2]
* i965: Replace DEPTH_STENCIL_STATE with Gen8's 3DSTATE_WM_DEPTH_STENCIL.Kenneth Graunke2014-01-311-0/+1
| | | | | | | | | | | | | | | v2: Use stencil->_WriteEnabled instead of setting GEN8_WM_DS_STENCIL_BUFFER_WRITE_ENABLE twice (suggested by Eric). v3: Mask stencil->WriteMask and stencil->ValueMask with 0xff. The field is only 8-bits, so we'd trip the new SET_FIELD assertion when core Mesa gave us a value like 0xFFFFFFFF. The Gen7 code uses structure field widths to implicitly do this truncation. Fixes Piglit tests. v4: Use uint32_t for dw1/dw2, not uint8_t. Worst. Typo. Ever. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> [v2]
* i965: Update SF, SBE, and RASTER state for Broadwell.Kenneth Graunke2014-01-311-0/+3
| | | | | | | | | | | | | | | | The attribute override portion of 3DSTATE_SBE was split out into 3DSTATE_SBE_SWIZ; various bits of 3DSTATE_SF were split out into 3DSTATE_RASTER. v2: Set Force URB Read Offset bit. Eventually the URB read offset should be set in 3DSTATE_VS, but that will require some refactoring. v3: Rebase on viewport array changes. v4: Improve comments about URB read length/offset overrides. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* mesa: change gl_format to mesa_formatMark Mueller2014-01-271-2/+2
| | | | s/\bgl_format\b/mesa_format/g. Use better name for Mesa Formats enum
* s/Tungsten Graphics/VMware/José Fonseca2014-01-171-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tungsten Graphics Inc. was acquired by VMware Inc. in 2008. Leaving the old copyright name is creating unnecessary confusion, hence this change. This was the sed script I used: $ cat tg2vmw.sed # Run as: # # git reset --hard HEAD && find include scons src -type f -not -name 'sed*' -print0 | xargs -0 sed -i -f tg2vmw.sed # # Rename copyrights s/Tungsten Gra\(ph\|hp\)ics,\? [iI]nc\.\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./g /Copyright/s/Tungsten Graphics\(,\? [iI]nc\.\)\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./ s/TUNGSTEN GRAPHICS/VMWARE/g # Rename emails s/[email protected]/[email protected]/ s/[email protected]/[email protected]/g s/jrfonseca-at-tungstengraphics-dot-com/jfonseca-at-vmware-dot-com/ s/jrfonseca\[email protected]/[email protected]/g s/keithw\[email protected]/[email protected]/g s/[email protected]/[email protected]/g s/thomas-at-tungstengraphics-dot-com/thellstom-at-vmware-dot-com/ s/[email protected]/[email protected]/ # Remove dead links s@Tungsten Graphics (http://www.tungstengraphics.com)@Tungsten Graphics@g # C string src/gallium/state_trackers/vega/api_misc.c s/"Tungsten Graphics, Inc"/"VMware, Inc"/ Reviewed-by: Brian Paul <[email protected]>
* i965: Ensure that all necessary state is re-emitted if we run out of aperture.Paul Berry2014-01-131-0/+1
| | | | | | | | | | | | | | | | | | | | | Prior to this patch, if we ran out of aperture space during brw_try_draw_prims(), we would rewind the batch buffer pointer (potentially throwing some state that may have been emitted by brw_upload_state()), flush the batch, and then try again. However, we wouldn't reset the dirty bits to the state they had before the call to brw_upload_state(). As a result, when we tried again, there was a danger that we wouldn't re-emit all the necessary state. (Note: prior to the introduction of hardware contexts, this wasn't a problem because flushing the batch forced all state to be re-emitted). This patch fixes the problem by leaving the dirty bits set at the end of brw_upload_state(); we only clear them after we have determined that we don't need to rewind the batch buffer. Cc: 10.0 9.2 <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove unused depth_mode parameter from translate_tex_format().Kenneth Graunke2013-12-291-1/+0
| | | | | | | | | | | | According to git blame, this hasn't been used in over two years: commit d2235b0f4681f75d562131d655a6d7b7033d2d8b Author: Eric Anholt <[email protected]> Date: Thu Nov 17 17:01:58 2011 -0800 i965: Always handle GL_DEPTH_TEXTURE_MODE through the shader. Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Drop trailing whitespace from the rest of the driver.Kenneth Graunke2013-12-051-5/+5
| | | | | | | Performed via: $ for file in *; do sed -i 's/ *//g'; done Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Fix BRW_BATCH_STRUCT to specify RENDER_RING, not UNKNOWN_RING.Kenneth Graunke2013-12-031-2/+2
| | | | | | | | | | | | | | | | I missed this in the boolean -> enum conversion. C cheerfully casts false -> 0 -> UNKNOWN_RING. On Gen4-5, this causes the render ring prelude hook to get called in the middle of the batch, which is crazy. BRW_BATCH_STRUCT is not used on Gen6+. Fixes regressions since 395a32717df494353703f3581edcd3ba380f16d6 ("i965: Introduce an UNKNOWN_RING state."). Fixes "fips -v glxgears" on Ironlake. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Make swizzle_to_scs non-static.Kenneth Graunke2013-11-161-0/+1
| | | | | | | | | | | | | | We'll need this for Broadwell code as well. Normally, when we make things public, we add the "brw" prefix. I'm not crazy about that in this case, since it deals with prog_instruction.h's SWIZZLE_XYZW values, rather than the BRW_SWIZZLE_XYZW enums. However, I can't think of a better name, and at least the comments and code make it clear. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Chris Forbes <[email protected]> Acked-by: Anuj Phogat <[email protected]>
* i965: Simplify the shader time code by using atomic counter helpers.Francisco Jerez2013-10-291-2/+0
| | | | Reviewed-by: Paul Berry <[email protected]>
* i965: Implement ABO surface state emission.Francisco Jerez2013-10-291-0/+3
| | | | | | | | | | | | The maximum number of atomic buffer objects is somewhat arbitrary, we can change it in the future easily if it turns out it's not enough... v2: Add comments with the relevant mesa dirty bits. Fix usage of BRW_NEW_UNIFORM_BUFFER in the GS ABO state atom. v3: Update binding table layout diagrams. v4: Resolve conflicts with the recent dynamic surface index assignment changes. Reviewed-by: Paul Berry <[email protected]>
* i965: Make a brw_stage_prog_data for storing the SURF_INDEX information.Eric Anholt2013-10-151-6/+0
| | | | | | | | | | | It would be nice to be able to pack our binding table so that programs that use 1 render target don't upload an extra BRW_MAX_DRAW_BUFFERS - 1 binding table entries. To do that, we need the compiled program to have information on where its surfaces go. v2: Rename size to size_bytes to be more explicit. Reviewed-by: Paul Berry <[email protected]>
* i965: Generalize brw_vec4_upload_binding_table() beyond vec4 stages.Kenneth Graunke2013-09-191-4/+5
| | | | | | | | | | | | | | Instead of passing in a brw_vec4_prog_data structure, we can simply pass the one field it needs: the number of entries in the binding table. We also need to pass in the shader time surface index rather than hardcoding SURF_INDEX_VEC4_SHADER_TIME. Since the resulting function is stage-agnostic, this patch removes "vec4_" from the name. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/sf: Consolidate common code for setting up gen6-7 attribute overrides.Paul Berry2013-09-161-3/+6
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gs: Add a state atom to set up geometry shader state.Paul Berry2013-09-111-0/+2
| | | | | | | | | | | | v2: Do not attempt to share the code that uploads 3DSTATE_BINDING_TABLE_POINTERS_GS, 3DSTATE_SAMPLER_STATE_POINTERS_GS, or 3DSTATE_GS with VS. Reviewed-by: Ian Romanick <[email protected]> v3: Add _NEW_TRANSFORM to gen7_gs_state. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen7: Extract a function for setting up a shader stage's constants.Paul Berry2013-09-111-0/+6
| | | | | | | | This will allow us to reuse some code when setting up the geometry shader stage. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gs: Implement support for geometry shader samplers.Paul Berry2013-08-311-0/+1
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gs: make the state atom for compiling Gen7 geometry shaders.Paul Berry2013-08-311-0/+1
| | | | | | Reviewed-by: Kenneth Graunke <[email protected]> v2: Use "unsigned" rather than "GLuint".
* i965/gs: Implement support for geometry shader surfaces.Paul Berry2013-08-311-0/+3
| | | | | | | | | | | | | | This patch implements pull constant upload, binding table upload, and surface setup for geometry shaders, by re-using vertex shader code that was generalized in previous patches. Based on work by Eric Anholt <[email protected]>. v2: Update ditry bits for brw_gs_ubo_surfaces to account for commit 77d8fbc (mesa: add & use a new driver flag for UBO updates instead of _NEW_BUFFER_OBJECT). Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vs: generalize brw_vs_binding_table in preparation for GS.Paul Berry2013-08-311-0/+6
| | | | | | | v2: Use GLbitfield instead of GLbitfield64 in brw_vec4_upload_binding_table. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: generalize brw_vs_pull_constants in preparation for GS.Paul Berry2013-08-311-0/+8
| | | | | | | v2: Use GLbitfield instead of GLbitfield64 in brw_upload_vec4_pull_constants. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gs: Allocate push constant space for use by GS.Paul Berry2013-08-311-3/+1
| | | | | | | | | | | | | | | | | | Previously, we would always use the same push constant allocation regardless of what shader programs were being run: the available push constant space was split into 2 equal size partitions, one for the vertex shader, and one for the fragment shader. Now that we are adding geometry shader support, we need to do something smarter. This patch adjusts things so that when a geometry shader is in use, we split the available push constant space into 3 nearly-equal size partitions instead of 2. Since the push constant allocation is now affected by GL state, it can no longer be set up by brw_upload_initial_gpu_state(); instead it must be set up by a state atom. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: rename legacy gs structs and functions to ff_gs.Paul Berry2013-08-311-1/+1
| | | | | | | | "ff" is for "fixed function". This frees up the name "gs" to refer to user-defined geometry shaders. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965: Split the brw_samplers atom into separate FS/VS stages.Kenneth Graunke2013-08-191-1/+2
| | | | | | | | This allows us to avoid uploading the VS sampler state table if only the fragment program changes. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965: Make upload_sampler_state_table a virtual function.Kenneth Graunke2013-08-191-1/+4
| | | | | | | This allows us to coalesce the brw_samplers and gen7_samplers atoms. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965: Un-hardcode border color table from upload_default_color.Kenneth Graunke2013-08-191-1/+2
| | | | | | | | | When we begin uploading separate sampler state tables for VS and FS, we won't be able to use &brw->wm.sdc_offset[ss_index]. By passing it in as a parameter, we push the problem up to the caller. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Paul Berry <[email protected]>
* i965/gen7+: Mark upload_3dstate_so_decl_list as non-static (v2)Kenneth Graunke2013-08-131-0/+4
| | | | | | | | We will reuse this for Broadwell. v2: Prefix function name with 'gen7'. (chadv) Reviewed-by: Chad Versace <[email protected]>
* i965 Gen4/5: Introduce 'interpolation map' alongside the VUE mapChris Forbes2013-08-011-0/+1
| | | | | | | | | | | | | | | | | | | | | | The interpolation map (in brw->interpolation_mode) is a new auxiliary structure alongside the post-GS VUE map, which describes the interpolation modes for each VUE slot, for use by the clip and SF stages. This patch introduces a new state atom to compute the interpolation map, and adjusts the program keys for the clip and SF stages, but it is not actually used yet. [V1-2]: Signed-off-by: Olivier Galibert <galibert at pobox.com> V3: Updated for vue_map changes, intel -> brw merge, etc. (Chris Forbes) V4: Compute interpolation map as a new state atom rather than tacking it on the front of the clip setup V5: Rework commit message, make interpolation_mode_map a struct. Signed-off-by: Chris Forbes <[email protected]> Reviewed-by: Paul Berry <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Pass brw_context to functions rather than intel_context.Kenneth Graunke2013-07-091-2/+2
| | | | | | | | | | | | | | This makes brw_context available in every function that used intel_context. This makes it possible to start migrating fields from intel_context to brw_context. Surprisingly, this actually removes some code, as functions that use OUT_BATCH don't need to declare "intel"; they just use "brw." Signed-off-by: Kenneth Graunke <[email protected]> Acked-by: Chris Forbes <[email protected]> Acked-by: Paul Berry <[email protected]> Acked-by: Anuj Phogat <[email protected]>
* i965: Drop unused argument to translate_tex_format().Eric Anholt2013-06-261-1/+0
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Emit the depth/stencil state pointer directly, not via atoms.Kenneth Graunke2013-06-111-1/+0
| | | | | | | | | | | | | | | | | | | | | See two commits ago for the rationale. This allows us to delete the whole gen7_cc_state.c file. This does move these commands before the depth stall flushes from brw_emit_depthbuffer, which may be a problem. The documentation for 3DSTATE_DEPTH_BUFFER mentions that depth stall flushes are required before changing any depth/stencil buffer state, but explicitly lists 3DSTATE_DEPTH_BUFFER, 3DSTATE_HIER_DEPTH_BUFFER, 3DSTATE_STENCIL_BUFFER, and 3DSTATE_CLEAR_PARAMS. It does not mention this particular packet (_3DSTATE_DEPTH_STENCIL_STATE_POINTERS). No observed Piglit regressions on Sandybridge or Ivybridge. Together with the last two commits, this makes a cairo-gl benchmark faster by 0.324552% +/- 0.258355% on Ivybridge. No statistically significant change on Sandybridge. (Thanks to Eric for the numbers.) Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Emit the CC state pointer directly rather than via atoms.Kenneth Graunke2013-06-111-1/+0
| | | | | | See the previous commit for the rationale. Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Emit the BLEND_STATE pointer directly rather than via atoms.Kenneth Graunke2013-06-111-1/+0
| | | | | | | | | | | | | | | | | | | | | | | | Previously, we would: 1. Emit the new indirect state. 2. Flag CACHE_NEW_BLEND_STATE. 3. Rely on later state atoms to notice CACHE_NEW_BLEND_STATE and emit a pointer to the new indirect state. This is rather cumbersome: it requires two state atoms instead of one, and there's a strict ordering dependency in the list. Plus, the code gets spread across two functions (or even files in the case of Gen7+). Gen7+ has a packet to update just the blend state pointer, so it makes a lot of sense to simply emit that right away. Gen6 has a combined packet which updates blending, the color calculator, and depth/stencil state; however, each can still be modified independently. This drops the Gen6 micro-optimization where we tried to only emit one packet that changed all three states. State updates are pretty cheap. CACHE_NEW_BLEND_STATE is no longer necessary, so drop it. Signed-off-by: Kenneth Graunke <[email protected]>
* Revert "i965: Disable unused pipeline stages once at startup on Gen7+."Kenneth Graunke2013-06-111-3/+1
| | | | | | | | | | This reverts commit 6c966ccf07bcaf64fba1a9b699440c30dc96e732. Apparently causes GPU hangs. Conflicts: src/mesa/drivers/dri/i965/brw_state.h src/mesa/drivers/dri/i965/brw_state_upload.c