| Commit message | Author | Age | Files | Lines |
| |
Background: Prior to Skylake, and since Ivybridge, Intel hardware has had the
ability to use an MCS (Multisample Control Surface) as auxiliary data in
"compression" operations on the surface. This reduces memory bandwidth. This
hardware was used either for MSAA compression or for fast clear operations. On
Gen8, a similar mechanism exists to allow the HiZ buffer to be sampled from, and
therefore this feature is sometimes referred to more generally as "AUX buffers".
Skylake adds the ability to have the display engine directly source compressed
surfaces on top of the ability to sample from them. Inference dictates that
enabling this display feature adds a restriction on the formats which can
actually be compressed. This is backed up by a blurb in the AUX_CCS_D section
of RENDER_SURFACE_STATE: "In addition, if the surface is bound to the
sampling engine, Surface Format must be supported for Render Target Compression
for surfaces bound to the sampling engine." The current set of supported surface
formats seems to be a subset of what previous gens allowed (see the next patch).
Also, if I had to guess, I would guess that future gens add support for more
surface formats. To make handling this a bit easier to read, and more
future-proof, the support for this is moved into the surface formats table.

Along with the modifications to the table, a helper function is also provided to
determine if a surface format is CCS_E compatible. Because fast clears are
currently disabled on SKL, we can plumb the helper all the way through here and
not actually have anything break.
v2:
- rename ccs to ccs_e; Requested-by: Chad
- rename lossless_compression to lossless_compression Requested-by: Chad
- change meaning of brw_losslessly_compressible_format Requested-by: Chad
- related changes to the code to reflect this.
- remove excess ccs (Chad)
v3:
- Commit message changes (Topi)
- Const some things which could be const (Topi)
Requested-by: Chad Versace <[email protected]>
Requested-by: Neil Roberts <[email protected]>
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
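A minimal sketch of the idea, with illustrative names rather than the exact
in-tree table layout: each surface-format row gains a column recording the
first gen that can losslessly compress the format, and a small helper
consults it.

   #include <stdbool.h>
   #include <stdint.h>

   /* Sketch only: one extra column per format row; gens stored times ten
    * here (90 == Skylake), mirroring the style of the other columns. */
   struct surface_format_info {
      int ccs_e;   /* first gen (x10) supporting CCS_E, 0 = never */
      /* ... existing sampling/rendering/filtering columns ... */
   };

   extern const struct surface_format_info surface_formats[];

   /* Is this format losslessly compressible on the current hardware? */
   static bool
   format_supports_ccs_e(int gen_x10, uint32_t brw_format)
   {
      const struct surface_format_info *f = &surface_formats[brw_format];
      return f->ccs_e != 0 && gen_x10 >= f->ccs_e;
   }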
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
| |
A while back, we moved to directly emitting the Gen7+ state when
constructing the binding tables. These flags are only used on
Gen4-6, which emit all the binding table pointers at once.
We gain nothing by having separate flags, so combine them.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
| |
There is only one function that can be called, which is well known at
compilation time.
The abstraction used here seems unnecessary, so let's use a direct call
to brw_stage_prog_data_free() when appropriate, cutting down the size of
struct brw_cache.
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
| |
This is more efficient, as it means we no longer have to upload the same
set of ABO surfaces to all stages in the program.
This also fixes a bug where, since commit c0cd5b, var->data.binding was
being used as a replacement for the atomic buffer index; the two don't
have to be the same value, they just happened to end up the same when
binding is 0.
Reviewed-by: Francisco Jerez <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Cc: Ilia Mirkin <[email protected]>
Cc: Alejandro Piñeiro <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90175
| |
v2: externalize pred_ctrl_align16 from brw_disasm.c instead of adding
a copy in brw_vec4.c, as suggested by Matt Turner
Reviewed-by: Matt Turner <[email protected]>
| |
At this point, the compiler API has been substantially simplified. In the
spirit of Kristian's work toward a separate compiler library, this commit
makes a single header file that contains, more or less, the entire
compiler API.
There's still a bit of cleanup to do, particularly in the area of geometry
shaders. However, this gets us much closer to having a separate compiler.
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
Reviewed-by: Kristian Høgsberg <[email protected]>
| |
For scalar VS, I'll need this in brw_fs.cpp as well. It seems silly to
redeclare it in three places.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
| |
Both the vec4 and scalar VS backends had virtually identical URB entry
size and read length calculations. We can move those up a level to
backend-agnostic code and reuse it for both.
Unfortunately, the backends need to know nr_attributes to compute
first_non_payload_grf, so I had to store that in prog_data. We could
use urb_read_length, but that's nr_attributes rounded up to a multiple
of two, so doing so would waste a register in some cases.
There's more code to be removed in the vec4 backend, but that will
come in a follow-on patch.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
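A sketch of the shared helper, under the assumption (stated above) that the
URB delivers attributes in vec4 pairs; names are illustrative:

   /* Read length is nr_attributes rounded up to a multiple of two, in
    * units of pairs.  Keeping nr_attributes itself in prog_data lets the
    * backends compute first_non_payload_grf without the rounding waste. */
   static inline unsigned
   vs_urb_read_length(unsigned nr_attributes)
   {
      return (nr_attributes + 1) / 2;
   }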
| |
This function computes the next power of two, but at least 1024. We can
do that by bitwise or'ing in 1023 and calling util_next_power_of_two().
We use brw_get_scratch_size() from the compiler, so we need it out of
brw_program.c. We could move it to brw_shader.cpp, but let's make it a
small inline function instead.
Reviewed-by: Topi Pohjolainen <[email protected]>
Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
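The resulting inline is essentially a one-liner; a sketch, with
util_next_power_of_two() being the existing utility helper:

   #include <stdint.h>
   #include "util/u_math.h"   /* util_next_power_of_two(); path approximate */

   /* OR-ing in 1023 forces the result up to at least 1024 while leaving
    * larger sizes to round up to their own next power of two. */
   static inline uint32_t
   brw_get_scratch_size(int size)
   {
      return util_next_power_of_two(size | 1023);
   }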
| |
The initial motivation for this patch was to avoid calling
brw_cs_prog_local_id_payload_dwords() in gen7_cs_state.c from the
compiler. This commit ends up refactoring things a bit more so as to
split out the logic to build the local id payload to brw_fs.cpp. This
moves the payload building closer to the compiler code that uses the
payload layout and makes it available to other users of the compiler.
Reviewed-by: Topi Pohjolainen <[email protected]>
Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
| |
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Ilia Mirkin <[email protected]>
| |
In theory we can't break this assertion since the compiler frontend checks
that we don't exceed any of the individual limits, but it does not hurt to
be extra safe.
Reviewed-by: Kenneth Graunke <[email protected]>
| |
These share space with UBO surfaces, but we need to make sure we
allocate enough space for both sets (12 of each).
Reviewed-by: Kenneth Graunke <[email protected]>
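In other words, the shared binding-table block has to be sized for both
sets; roughly (macro names hypothetical):

   #define MAX_UBOS_PER_STAGE   12
   #define MAX_SSBOS_PER_STAGE  12

   /* UBOs and SSBOs share one run of binding-table entries, so the block
    * is sized for both sets rather than just the UBOs. */
   #define UBO_SSBO_SURFACE_COUNT (MAX_UBOS_PER_STAGE + MAX_SSBOS_PER_STAGE)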
| |
Instead of using hard-coded values.
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Instead of using hard-coded values.
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
| |
They are no longer used.
Reviewed-by: Kenneth Graunke <[email protected]>
| |
They haven't been used since 1bba29ed403e735ba0bf04ed8aa2e571884fcaaf so
there's no good reason to keep them around.
Reviewed-by: Kenneth Graunke <[email protected]>
| |
This will only be set up when the prog_data uses_num_work_groups
boolean is set.
At this point nothing will set uses_num_work_groups, but soon code
will set it when emitting code for the intrinsic that loads
gl_NumWorkGroups.
We can't emit this surface information earlier at the start of the
DispatchCompute* call because we may not have generated the program
yet. Until we generate the program, we don't know if the
gl_NumWorkGroups variable is accessed.
We also can't emit the surface as part of the brw_cs_state atom,
because we might not need the surface if gl_NumWorkGroups is not used
by the program.
Lastly, we cannot emit the surface later (after state upload) in the
DispatchCompute* call, because it needs to be run before the
brw_cs_state atom is emitted, since it changes the surface state.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
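A rough sketch of the guard being described, with hypothetical function
names; the point is that it runs in the dispatch path after program
generation and before the brw_cs_state atom:

   /* Called from the DispatchCompute* path, after brw_upload_programs()
    * and before state upload emits brw_cs_state. */
   static void
   emit_num_work_groups_surface(struct brw_context *brw)
   {
      const struct brw_cs_prog_data *prog_data = brw->cs.prog_data;

      if (!prog_data->uses_num_work_groups)
         return;   /* the program never reads gl_NumWorkGroups */

      /* ... emit a buffer surface pointing at the three work-group
       * counts and mark the CS surface state dirty so brw_cs_state
       * picks up the change ... */
   }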
| |
If glDispatchComputeIndirect is used, then the value for this variable
must be read from the indirect BO.
To allow the same generated code to support both glDispatchComputeIndirect
and glDispatchCompute, we will also set up a BO for the number of work
groups using the intel_upload_data mechanism. This will only be required
if the gl_NumWorkGroups variable is accessed.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
| |
We will need this in an atom to set up a surface to read the
gl_NumWorkGroups values from.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
| |
Broadwell's 3DSTATE_GS contains new "Static Output" and "Static Vertex
Count" fields, which control a new optimization. Normally, geometry
shaders can output arbitrary numbers of vertices, which means that
resource allocation has to be done on the fly. However, if the number
of vertices is statically known, the hardware can pre-allocate resources
up front, which is more efficient.
Thanks to the new NIR GS intrinsics, this is easy. We just call the
function introduced in the previous commit to get the vertex count.
If it obtains a count, we stop emitting the extra 32-bit "Vertex Count"
field in the VUE, and instead fill out the 3DSTATE_GS fields.
Improves performance of Gl32GSCloth by 5.16347% +/- 0.12611% (n=91)
on my Lenovo X250 laptop (Broadwell GT2) at 1024x768.
shader-db statistics for geometry shaders only:
total instructions in shared programs: 3227 -> 3207 (-0.62%)
instructions in affected programs: 242 -> 222 (-8.26%)
helped: 10
v2: Don't break non-NIR paths (just skip this optimization).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
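A sketch of how the count flows, assuming nir_gs_count_vertices() is the
helper from the previous commit; the state-field programming is
paraphrased:

   /* Compile time: -1 means the vertex count couldn't be proven static. */
   prog_data->static_vertex_count = nir_gs_count_vertices(nir);

   /* State emission time: */
   if (gs_prog_data->static_vertex_count != -1) {
      /* Program the 3DSTATE_GS "Static Output" / "Static Vertex Count"
       * fields and skip the extra 32-bit Vertex Count VUE slot. */
   } else {
      /* Dynamic allocation as before (the non-NIR paths stay here). */
   }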
| |
The old code was disastrously complex - spread across multiple atoms
which may not even run, inspecting the dirty bits to try and decide
whether it was necessary to do checks...storing VS information in
brw_context...extra flagging...
This code tripped me and Carl up very badly when working on the
shader cache code. It's very fragile and hard to maintain.
Now that geometry shaders only depend on their inputs and don't have
to worry about the VS VUE map, we can dramatically simplify this:
just compute the VUE map coming out of the geometry shader stage
in brw_upload_programs. If it changes, flag it. Done.
v2: Also check vue_map.separable.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
| |
Previously, our VUE map code always assigned slots to varyings
sequentially, in one contiguous block.
This was a bad fit for separate shaders - the GS input layout depended
on the VS output layout, so if we swapped out vertex shaders, we might
have to recompile the GS on the fly - which rather defeats the point of
using separate shader objects. (Tessellation would suffer from this
as well - we could have to recompile the HS, DS, and GS.)
Instead, this patch makes the VUE map for separate shaders use a fixed
layout, based on the input/output variable's location field. (This is
either specified by layout(location = ...) or assigned by the linker.)
Corresponding inputs/outputs will match up by location; if there's a
mismatch, we're allowed to have undefined behavior.
This may be less efficient - depending what locations were chosen, we
may have empty padding slots in the VUE. But applications presumably
use small consecutive integers for locations, so it hopefully won't be
much worse in practice.
3% of Dota 2 Reborn shaders are hurt, but only by 2 instructions.
This seems like a small price to pay for avoiding recompiles.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
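A sketch of the two assignment strategies (vue_map/slots_valid as in the
existing VUE map code; first_generic_slot and the loop structure are
illustrative):

   /* Non-separable: pack written varyings into consecutive slots, so the
    * map depends on exactly which varyings the other stage produces. */
   for (int varying = 0; varying < VARYING_SLOT_MAX; varying++) {
      if (slots_valid & BITFIELD64_BIT(varying))
         vue_map->varying_to_slot[varying] = slot++;
   }

   /* Separable: a fixed slot per location, independent of the shader on
    * the other side, at the cost of possible holes in the VUE. */
   for (int varying = VARYING_SLOT_VAR0; varying < VARYING_SLOT_MAX; varying++) {
      if (slots_valid & BITFIELD64_BIT(varying))
         vue_map->varying_to_slot[varying] =
            first_generic_slot + (varying - VARYING_SLOT_VAR0);
   }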
| |
Since these are a special kind of UBO, we emit them together, reusing the
same infrastructure. However, we use a RAW surface so we can reuse the
existing untyped read/write/atomic messages, which include a pixel mask
header that we need to set to obtain correct behavior with helper
invocations of the fragment shader.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
| |
Enable barrier in MEDIA_INTERFACE_DESCRIPTOR if the program uses the
barrier() GLSL function.
On Ivy Bridge and Haswell, this allows the piglit test
tests/spec/arb_compute_shader/execution/simple-barrier-atomics.shader_test
to pass. On gen8, this enables a similar test with a local group size
of 896 to pass.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
| |
We only support geometry shaders in core profiles, where gl_ClipVertex
doesn't exist. Presumably the even older behavior of clipping to
gl_Position isn't supported either. In fact, GLSL 1.50 page 76 claims:
"The shader must also set all values in gl_ClipDistance that have been
enabled via the OpenGL API, or results are undefined."
So we don't need to handle legacy clipping in geometry shaders. I think
Paul added this back when we were considering supporting the old
GL_ARB_geometry_shader4 extension.
This removes a non-orthogonal state dependency on GS compilation.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
| |
brw_upload_cs_push_constants was based on gen6_upload_push_constants.
v2:
* Add FINISHME comments about more efficient ways to push uniforms
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
| |
load/store.
v2: Store early fragment test mode in brw_wm_prog_data instead of
getting it from core mesa data structures (Ken).
Reviewed-by: Kenneth Graunke <[email protected]>
| |
v2: Add CS support. Move the image_params array back to
brw_stage_prog_data.
Reviewed-by: Topi Pohjolainen <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
| |
This will be used to pass image meta-data to the shader when we cannot
use typed surface reads and writes. All entries except surface_idx
and size are otherwise unused and will get eliminated by the uniform
packing pass. size will be used for bounds checking with some image
formats and will be useful for ARB_shader_image_size too. surface_idx
is always used.
v2: Add CS support. Move the image_params array back to
brw_stage_prog_data.
v3: Improve documentation.
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
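The meta-data block being described has roughly this shape (field list and
array sizes recalled from the tree, so treat as a sketch):

   #include <stdint.h>

   struct brw_image_param {
      uint32_t surface_idx;   /* always used */
      uint32_t offset[2];     /* intra-tile offset applied to coordinates */
      uint32_t size[3];       /* used for bounds checking / image size */
      uint32_t stride[4];     /* row/image strides in the underlying BO */
      uint32_t tiling[3];     /* tiling parameters for address calculation */
      uint32_t swizzling[2];  /* bit-6 swizzling parameters */
   };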
| |
v2: Add SKL support.
Reviewed-by: Jason Ekstrand <[email protected]>
| |
Fixes the following compiler warning:
brw_cs.cpp:386:27: warning: comparison between signed and unsigned
integer expressions [-Wsign-compare]
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
| |
This patch implements the binding table enable command, which is also
used to allocate a binding table pool into which hardware-generated
binding table entries are flushed. Each binding table offset in the
binding table pool is unique per shader stage enabled within a batch.
Also insert the required brw_tracked_state objects to enable
hw-generated binding tables in normal render path.
v2: - Use MOCS in binding table pool alloc for GEN8
- Fix spurious offset when allocating binding table pool entry
and start from zero instead.
v3: - Include GEN8 fix for spurious offset above.
v4: - Fixup wrong packet length in enable/disable hw-binding table
for GEN8 (Ville).
- Don't invoke the HW-binding table disable command when we don't
have the resource streamer (Chris).
v5: - Reorder the state cache invalidate flush so it happens in-between
enabling hw-generated binding tables and the previous sw-binding
table GPU state (Chris).
v6: - Do the same fix in v5 for gen7_disable_hw_binding_tables().
- Adhere to coding guidelines and make comments more informative.
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
| |
Check first if the hardware and kernel support the resource streamer. If
so, tell the kernel to set the resource streamer enable bit on
MI_BATCHBUFFER_START by specifying the I915_EXEC_RESOURCE_STREAMER
execbuffer flag.
v2: - Use new I915_PARAM_HAS_RESOURCE_STREAMER ioctl to check if kernel
supports RS (Ken).
- Add brw_device_info::has_resource_streamer and toggle it for
Haswell, Broadwell, Cherryview, Skylake, and Broxton (Ken).
v3: - Update I915_PARAM_HAS_RESOURCE_STREAMER to match updated kernel.
v4: - Always inspect the getparam.value (Chris Wilson).
v5: - Fold redundant devinfo->has_resource_streamer check in context create
into init screen.
Cc: [email protected]
Cc: [email protected]
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
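A sketch of the detection path, using the standard i915 getparam ioctl
(error handling trimmed):

   #include <stdbool.h>
   #include <xf86drm.h>
   #include <i915_drm.h>

   static bool
   hw_has_resource_streamer(int fd)
   {
      int value = 0;
      struct drm_i915_getparam gp = {
         .param = I915_PARAM_HAS_RESOURCE_STREAMER,
         .value = &value,
      };

      /* Always inspect the returned value, not just the ioctl result
       * (the v4 note above). */
      if (drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, &gp) != 0)
         return false;
      return value != 0;
   }

   /* At submission time the flag is then OR'd into the execbuffer flags:
    *    flags |= I915_EXEC_RESOURCE_STREAMER;
    */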
| |
Previously, OUT_BATCH was just a macro around an inline function which
does
   brw->batch.map[brw->batch.used++] = dword;
When making consecutive calls to intel_batchbuffer_emit_dword() the
compiler isn't able to recognize that we're writing consecutive memory
locations or that it doesn't need to write batch.used back to memory
each time.
We can avoid both of these problems by making a local pointer to the
next location in the batch in BEGIN_BATCH().
Cuts 18k from the .text size.
text data bss dec hex filename
4946956 195152 26192 5168300 4edcac i965_dri.so before
4928956 195152 26192 5150300 4e965c i965_dri.so after
This series (including commit c0433948) improves performance of Synmark
OglBatch7 by 8.01389% +/- 0.63922% (n=83) on Ivybridge.
Reviewed-by: Chris Wilson <[email protected]>
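A condensed sketch of the before/after (hypothetical type and macro names;
the real macros also handle debug bookkeeping and out-of-space checks):

   #include <stdint.h>

   struct batch {
      uint32_t *map;
      unsigned used;
   };

   /* Before: every dword re-loads and re-stores batch->used. */
   static inline void
   emit_dword(struct batch *batch, uint32_t dword)
   {
      batch->map[batch->used++] = dword;
   }

   /* After: BEGIN_BATCH caches a local cursor, consecutive OUT_BATCH
    * calls become plain consecutive stores, and ADVANCE_BATCH writes
    * the cursor back once. */
   #define BEGIN_BATCH(batch, n)  uint32_t *out = (batch)->map + (batch)->used
   #define OUT_BATCH(dword)       (*out++ = (dword))
   #define ADVANCE_BATCH(batch)   ((batch)->used = out - (batch)->map)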
| |
It's only used inside #ifdef DEBUG. Cuts ~1.7k of .text, and more
importantly prevents a larger code size regression in the next commit
when the .used field is replaced and calculated on demand.
text data bss dec hex filename
4945468 195152 26192 5166812 4ed6dc i965_dri.so before
4943740 195152 26192 5165084 4ed01c i965_dri.so after
And surround the emit and total fields with #ifdef DEBUG to prevent
such mistakes from happening again.
Reviewed-by: Iago Toral Quiroga <[email protected]>
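The guard amounts to something like this (placement and comments
illustrative):

   struct intel_batchbuffer {
      /* ... map, used, reserved space, ... */
   #ifdef DEBUG
      /* Only consulted by debug annotation code; hiding these in release
       * builds turns accidental uses into compile errors. */
      uint16_t emit;    /* batch offset recorded by the last BEGIN_BATCH */
      uint16_t total;   /* dword count declared by that BEGIN_BATCH */
   #endif
   };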
| |
With the exception of gen8, the sole user of the workaround BO is the
code that emits pipe controls. Move it out of the purview of the
batchbuffer and into the pipe control code.
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Martin Peres <[email protected]>
| |
On Gen9+ there is a new bit in 3DSTATE_PS_EXTRA that must be set if
the shader sends a message to the pixel interpolator. This fixes the
interpolateAt* tests on SKL, apart from interpolateatsample-nonconst,
but that is not implemented anywhere so it's not a regression.
Reviewed-by: Ben Widawsky <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "10.6 10.5" <[email protected]>
| |
The thread counts and URB information are all speculative numbers that were
based on some CHV numbers at the time.
v2:
Originally this patch had PCI IDs. I've moved that to a new patch at the end of
the series.
Remove is_cherryview hack.
Add PCI ids. These match the ones defined in the kernel. The only one tested by
us is 0x0a84.
Capitalize the hex string (Mark)
Signed-off-by: Ben Widawsky <[email protected]>
Tested-by: "Lecluse, Philippe" <[email protected]>
Reviewed-by: Mark Janes <[email protected]>
| |
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Start trimming the fat from intel_batchbuffer.c, first by moving the set
of routines for emitting PIPE_CONTROLs (along with the lore concerning
hardware workarounds) to a separate brw_pipe_control.c.
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Previously, each shader took 3 shader time indices which were potentially
at arbitrary points in the shader time buffer. Now, each shader gets a
single index which refers to 3 consecutive locations in the buffer. This
simplifies some of the logic at the cost of having a magic 3 in a few places.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
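Concretely, the addressing becomes (names, stride, and slot meanings
illustrative):

   #define SHADER_TIME_STRIDE 64   /* e.g. one slot per cacheline */

   /* One index per shader; its three related counters (e.g. accumulated
    * time, "written" count, "reset" count) live at consecutive slots. */
   unsigned base    = shader_time_index * 3 * SHADER_TIME_STRIDE;
   unsigned time    = base + 0 * SHADER_TIME_STRIDE;
   unsigned written = base + 1 * SHADER_TIME_STRIDE;
   unsigned reset   = base + 2 * SHADER_TIME_STRIDE;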
| |
This creates the options at screen creation time and then we just copy
them into the context at context creation time. We also move is_scalar to
the brw_compiler structure.
We also end up manually setting some values that the core would have set
by default for us. Fortunately, there are only two non-zero shader compiler
option defaults that we aren't overriding anyway, so this isn't a big deal.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
| |
This function will be utilised in later patches.
V2: Make both pointers constants (Topi)
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
| |
We used to store the GS dispatch mode in brw_gs_prog_data while
separately storing the VS dispatch mode in brw_vue_prog_data::simd8.
This patch introduces an enum to represent all possible dispatch modes,
and stores it in brw_vue_prog_data::dispatch_mode, unifying the two.
Based on a suggestion by Matt Turner.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
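The unified enum might look roughly like this (names and values recalled
from that era of the tree, so treat as a sketch):

   enum shader_dispatch_mode {
      DISPATCH_MODE_4X1_SINGLE = 0,        /* classic vec4 VS */
      DISPATCH_MODE_4X2_DUAL_INSTANCE = 1, /* vec4 GS, dual instance */
      DISPATCH_MODE_4X2_DUAL_OBJECT = 2,   /* vec4 GS, dual object */
      DISPATCH_MODE_SIMD8 = 3,             /* scalar SIMD8 */
   };

   /* brw_vue_prog_data then stores one of these in dispatch_mode instead
    * of the GS-specific field plus the VS "simd8" bool. */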