used for CL kernels
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
---
Signed-off-by: Karol Herbst <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
---
Signed-off-by: Karol Herbst <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
---
Signed-off-by: Karol Herbst <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
---
Signed-off-by: Karol Herbst <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
---
This basically reverts c2bc0aa7b188.
By running the opts we reduce memory use in Team Fortress 2
from 1.5GB to 1.3GB between start-up and the game menu.
This will likely increase Deus Ex start-up times, as per commit
c2bc0aa7b188. However, 32-bit games like Team Fortress 2 can
currently run out of memory on low-memory systems, so that seems
more important.
Reviewed-by: Marek Olšák <[email protected]>
---
Intel's blending hardware does not properly return 1.0 for destination
alpha for RGBX formats; it requires the factors to be overridden to
either zero or one. Broadcom vc4 and v3d also could use this override.
While overriding these factors is safe in general, Nouveau and Radeon
would prefer not to. Their blending hardware already returns correct
values for RGB/RGBX formats, and would like to avoid the resulting
per-buffer blending and independent blend factors (rgb != a) since it
can cause additional overhead.
I considered simply handling this in the driver, but it's not as nice.
pipe_blend_state doesn't have any format information, so we'd need the
hardware blend state to depend on both pipe_blend_state and
pipe_framebuffer_state. Furthermore, Intel GPUs don't have a native
RGBX_SNORM format, so I avoid exposing one, which makes Gallium fall
back to RGBA_SNORM. The pipe_surfaces we get in the driver have an RGBA
format, making it impossible to tell that there shouldn't be an alpha
channel. One could argue that st not handling it in that case is a bug.
To work around this, we'd have to expose RGBX pipe formats, mapped to
RGBA hardware formats, and add format swizzling special cases. All
doable, but it ends up being more code than I'd like.
st_atom_blend already has access to the right information and it's
trivial to accomplish there, so we just add a cap bit and do that.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
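For illustration, a minimal sketch of the override st_atom_blend can apply
when the cap is set and the bound render target has no alpha channel (the
helper name is hypothetical; the PIPE_BLENDFACTOR_* values are Gallium's):

    /* Remap factors that read destination alpha, which an RGBX
     * destination cannot provide reliably on this hardware. */
    static unsigned
    fix_xrgb_alpha(unsigned factor)
    {
       switch (factor) {
       case PIPE_BLENDFACTOR_DST_ALPHA:
          return PIPE_BLENDFACTOR_ONE;     /* dst alpha reads as 1.0 */
       case PIPE_BLENDFACTOR_INV_DST_ALPHA:
          return PIPE_BLENDFACTOR_ZERO;    /* 1.0 - 1.0 == 0.0 */
       default:
          return factor;
       }
    }

Applied per render target, this is what makes the blend state depend on the
framebuffer format - exactly the information st_atom_blend already has.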
---
Gallium historically has treated pipeline statistics queries as a single
query, PIPE_QUERY_PIPELINE_STATISTICS, which returns a block of 11
values. This was originally patterned after the D3D1x API. Much later,
Brian introduced an OpenGL extension that exposed these counters - but
it exposes 11 separate queries, each of which returns a single value.
Today, st/mesa simply reads back the full block of 11 values and
returns the single one requested.
While pipeline statistics counters aren't typically performance
critical, this is still not a great fit. A D3D1x->GL translator might
request all 11 counters by creating 11 separate GL queries...which
Gallium would map to reads of all 11 values each time, resulting in a
total of 121 counter reads. That's not ideal.
This patch adds a new cap, PIPE_CAP_QUERY_PIPELINE_STATISTICS_SINGLE,
and corresponding query type PIPE_QUERY_PIPELINE_STATISTICS_SINGLE.
When calling create_query(), q->index should be set to one of the
PIPE_STAT_QUERY_* enums to select a counter. Unlike the block query,
this returns the value in pipe_query_result::u64 (as it's a single
value) instead of the pipe_query_data_pipeline_statistics group.
We update st/mesa to expose ARB_pipeline_statistics_query if either
capability is set, preferring the new SINGLE variant when available.
Thanks to Roland, Ilia, and Marek for helping me sort this out.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
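A hedged usage sketch (assumes the Gallium context/screen vtables of this
era; the counter chosen is arbitrary):

    /* Read one counter via the new SINGLE query type. */
    if (screen->get_param(screen, PIPE_CAP_QUERY_PIPELINE_STATISTICS_SINGLE)) {
       struct pipe_query *q =
          pipe->create_query(pipe, PIPE_QUERY_PIPELINE_STATISTICS_SINGLE,
                             PIPE_STAT_QUERY_PS_INVOCATIONS /* q->index */);
       union pipe_query_result result;

       pipe->begin_query(pipe, q);
       /* ... draw ... */
       pipe->end_query(pipe, q);
       pipe->get_query_result(pipe, q, /*wait=*/true, &result);
       /* The single value comes back in result.u64, not in the
        * pipe_query_data_pipeline_statistics block. */
       pipe->destroy_query(pipe, q);
    }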
---
This just changes the order of the switch statements, so we only
look at target if the query type is PIPE_QUERY_PIPELINE_STATISTICS.
The next commit will introduce a new SINGLE query type which can be
used for the same GL query types, and it won't want this processing.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
---
Gallium handles pipeline statistics queries as a single query
(PIPE_QUERY_PIPELINE_STATISTICS) which returns a struct with 11 values.
Sometimes it's useful to refer to each of those values individually,
rather than as a group. To avoid hardcoding numbers, we define a new
enum for each value. Here, the name and enum value correspond to the
index in the struct pipe_query_data_pipeline_statistics result.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
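For reference, the enum presumably looks like this - reconstructed from the
field order of pipe_query_data_pipeline_statistics, so treat the exact
spellings as approximate:

    enum pipe_statistics_query_index {
       PIPE_STAT_QUERY_IA_VERTICES,
       PIPE_STAT_QUERY_IA_PRIMITIVES,
       PIPE_STAT_QUERY_VS_INVOCATIONS,
       PIPE_STAT_QUERY_GS_INVOCATIONS,
       PIPE_STAT_QUERY_GS_PRIMITIVES,
       PIPE_STAT_QUERY_C_INVOCATIONS,
       PIPE_STAT_QUERY_C_PRIMITIVES,
       PIPE_STAT_QUERY_PS_INVOCATIONS,
       PIPE_STAT_QUERY_HS_INVOCATIONS,
       PIPE_STAT_QUERY_DS_INVOCATIONS,
       PIPE_STAT_QUERY_CS_INVOCATIONS,
    };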
---
The original idea was that the backend compiler could eliminate
surfaces, so we would have it mark which ones are actually used,
then shrink the binding table accordingly. Unfortunately, it's a
pretty blunt mechanism - it can only prune things from the end,
not the middle - since we decide the layout before we even start
the backend compiler, and only limit the size. It also basically
gives up if it sees indirect array access.
Besides, we do the vast majority of our surface elimination in NIR
anyway, not the backend - and I don't see that trend changing any
time soon. Vulkan abandoned this plan a long time ago, and I don't
use it in Iris, but it's still been kicking around in i965.
I hacked shader-db to print the binding table size in bytes, and
observed no changes with this patch. So, this code appears to do
nothing useful.
Acked-by: Jason Ekstrand <[email protected]>
---
If the TCS and TES are linked together, we can simply replace the TES's
gl_PatchVerticesIn system value with a constant, possibly allowing extra
optimization or letting the driver avoid uploading a special value.
Reviewed-by: Timothy Arceri <[email protected]>
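A sketch of the lowering idea, using the NIR APIs of this era (the walk is
illustrative, not the exact pass):

    /* Replace every load of gl_PatchVerticesIn in the TES with the
     * TCS's known output patch size. */
    static void
    lower_tes_patch_vertices(nir_shader *tes, unsigned tcs_vertices_out)
    {
       nir_foreach_function(func, tes) {
          if (!func->impl)
             continue;

          nir_builder b;
          nir_builder_init(&b, func->impl);

          nir_foreach_block(block, func->impl) {
             nir_foreach_instr_safe(instr, block) {
                if (instr->type != nir_instr_type_intrinsic)
                   continue;
                nir_intrinsic_instr *intr = nir_instr_as_intrinsic(instr);
                if (intr->intrinsic != nir_intrinsic_load_patch_vertices_in)
                   continue;

                /* Rewrite the system value load to an immediate. */
                b.cursor = nir_before_instr(instr);
                nir_ssa_def *count = nir_imm_int(&b, tcs_vertices_out);
                nir_ssa_def_rewrite_uses(&intr->dest.ssa,
                                         nir_src_for_ssa(count));
                nir_instr_remove(instr);
             }
          }
       }
    }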
---
This will allow drivers to pin shader buffers if necessary.
i965 and anv do not need to do this today, but iris will.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
---
Currently, BLORP expects drivers to provide two functions for dealing
with buffers: blorp_emit_reloc and blorp_surface_reloc. Both record a
relocation and combine the BO address and offset into a full 64-bit
address. Traditionally, blorp_surface_reloc has written that combined
address to an implicitly-known buffer where surface states are stored.
(In contrast, blorp_emit_reloc returns the value.)
The upcoming Iris driver stores surface states in multiple buffers,
which makes it impossible for blorp_surface_reloc to write the combined
address - it only takes an offset, not the actual buffer to write to.
This commit adds a third function, blorp_get_surface_address, which
combines and returns an address that is then passed to ISL's surface
state fill functions. Softpin-only drivers can return a real address
here and skip writing it in blorp_surface_reloc. Relocation-based
drivers have options. They can simply return 0 from the new
function, and continue writing the address from blorp_surface_reloc.
Or, they can return a presumed address from blorp_get_surface_address,
and have other relocation processing write the real value later.
For now, i965 and anv simply return 0.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
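For a relocation-based driver, the minimal implementation is the trivial one.
A sketch; per the message above, this is roughly what the i965/anv stubs
amount to:

    static uint64_t
    blorp_get_surface_address(struct blorp_batch *batch,
                              struct blorp_address address)
    {
       (void)batch;
       (void)address;
       /* Return a dummy address; blorp_surface_reloc still writes the
        * real combined address via the relocation machinery. */
       return 0;
    }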
---
Brown bag fix...
---
This patch moves the intel_tiled_memcpy[_sse41] libraries to isl, renames
some functions and types, and makes the required build system changes for
meson, automake, and Android. No functional changes are introduced.
v2: code cleanups, move isl_get_memcpy_type to i965 (Jason)
v3: move isl_mem_copy_fn to priv header, cleanups (Jason, Dylan)
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
---
Now that we have software implementations of ARB_gpu_shader_int64 and
ARB_gpu_shader_fp64 we can unconditionally enable these extensions.
Reviewed-by: Kenneth Graunke <[email protected]>
---
Reviewed-by: Kenneth Graunke <[email protected]>
---
The following patches will add implementations of various
double-precision operations to this file.
Reviewed-by: Kenneth Graunke <[email protected]>
---
We have found some pipe_surface leaks internally.
This is the same code as surface_destroy in radeonsi.
Ideally, surface_destroy would be in pipe_screen.
Cc: 18.3 <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
---
Reviewed-by: Brian Paul <[email protected]>
---
Reviewed-by: Brian Paul <[email protected]>
---
From glibc printf(3):
   Z      A nonstandard synonym for z that predates the appearance of z.
          Do not use in new code.
Z may not exist on non-glibc systems. Prefer the standard symbol.
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
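For clarity, the portable spelling next to the glibc-only one (standalone
example):

    #include <stdio.h>

    int main(void)
    {
       size_t n = sizeof(long);

       printf("%zu\n", n);   /* C99 'z' length modifier: portable */
    #ifdef __GLIBC__
       printf("%Zu\n", n);   /* legacy glibc-only synonym; avoid */
    #endif
       return 0;
    }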
---
The naming is a bit confusing no matter how you look at it. Within SPIR-V,
"global" memory is memory accessible from all threads. GLSL "global" memory
normally refers to shader-thread-private memory declared at global scope. As
we already use "shared" for memory shared across all threads of a work group,
the solution everybody can be happy with is to rename "global" to
"private" and use "global" later for memory usually stored within system-
accessible memory (be it VRAM, or system RAM if keeping SVM in mind).
GLSL "local" memory is memory only accessible within a function, while SPIR-V
"local" memory is memory accessible within the same workgroup.
v2: rename local to function as well
v3: rename vtn_variable_mode_local as well
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
---
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
---
For now, it's hidden behind a cap. Hopefully, we can eventually drop
that along with all the manual offset code in spirv_to_nir.
Reviewed-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Tested-by: Bas Nieuwenhuizen <[email protected]>
---
We're going to want to do more deref optimizations going forward and
this gives us a central place to do them. Also, cast propagation will
get a bit more complicated with the addition of ptr_as_array derefs.
Reviewed-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
---
SPIR-V allows for matrix and array types to be decorated with explicit
byte stride decorations and matrix types to be decorated row- or
column-major. This commit adds support to glsl_type to encode this
information. Because this doesn't work nicely with std430 and std140
alignments, we add asserts to ensure that we don't use any of the std430
or std140 layout functions with explicitly laid out types.
In SPIR-V, the layout information for matrices is applied to the parent
struct member instead of to the matrix type itself. However, this gets
rather clumsy when you're walking derefs trying to compute offsets
because, the moment you hit a matrix, you have to crawl back up the
deref chain and find the struct. Instead, we take the same path here as we've
taken in spirv_to_nir and put the decorations on the matrix type itself.
This also subtly adds support for strided vector types. These don't
come up in SPIR-V directly but you can get one as the result of taking a
column from a row-major matrix or a row from a column-major matrix.
Reviewed-by: Alejandro Piñeiro <[email protected]>
---
Previously, NIR had a single nir_var_uniform mode used for atomic
counters, UBOs, samplers, images, and normal uniforms. This commit
splits this into nir_var_uniform and nir_var_ubo where nir_var_uniform
is still a bit of a catch-all, but nir_var_ubo is specific to UBOs.
While we're at it, we also rename shader_storage to ssbo to follow the
convention.
We need this so that we can distinguish between normal uniforms and UBO
access at the deref level without going all the way back to the
variable and seeing if it has an interface type.
Reviewed-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
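A sketch of the payoff (helper name assumed; nir_deref_instr carried a
single mode field at this point):

    /* With a dedicated mode, classifying an access is a mode check on
     * the deref itself instead of a walk back to the variable. */
    static bool
    deref_is_ubo_access(const nir_deref_instr *deref)
    {
       return deref->mode == nir_var_ubo;
    }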
---
The functional change here is moving the nir_lower_io_to_scalar_early()
calls inside st_nir_link_shaders() and moving the st_nir_opts() call
after the call to nir_lower_io_arrays_to_elements().
This fixes a bug with the following piglit test due to the current code
not cleaning up dead code after we lower arrays. This was causing an
assert in the new duplicate varyings link time opt introduced in
70be9afccb23.
tests/spec/glsl-1.10/execution/vsfs-unused-array-member.shader_test
Moving the nir_lower_io_to_scalar_early() calls also allows us to tidy
up the code a little and merge some loops.
Reviewed-by: Eric Anholt <[email protected]>
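A hedged sketch of the new ordering inside st_nir_link_shaders() (call names
from the message; the 'scalar' flag and exact signatures are assumed):

    /* Scalarize I/O early, lower arrays of varyings to elements, then
     * run the opts so dead code is gone before the duplicate-varying
     * link-time opt inspects the shaders. */
    nir_lower_io_to_scalar_early(producer, nir_var_shader_out);
    nir_lower_io_to_scalar_early(consumer, nir_var_shader_in);
    nir_lower_io_arrays_to_elements(producer, consumer);
    st_nir_opts(producer, scalar);
    st_nir_opts(consumer, scalar);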
---
Even with the previous commit, hangs are still happening. The problem
there is that the VF cache invalidate happens immediately, without
waiting for previous rendering to complete. What happens is that we
invalidate the cache the moment the PIPE_CONTROL is parsed but we
still have old rendering in the pipe which continues to pull data into
the cache with the old high address bits. The later rendering with the
new high address bits then doesn't have the clean cache that it
expects/needs.
v2: Update commit message/explanation with Jason's
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Fixes: a363bb2cd0e2a1 ("i965: Allocate VMA in userspace for full-PPGTT systems.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109072
---
These buffers are using VB slots and should be included in the
workaround decision.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Fixes: a363bb2cd0e2a1 ("i965: Allocate VMA in userspace for full-PPGTT systems.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109072
---
Documentation of the 3DSTATE_VERTEX_BUFFERS packet says this is only
needed before ICL.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
---
The following patches will add support for an additional
optimisation so this function will no longer just optimise varying
constants.
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
---
This will help the new opt introduced in the following patches, which
allows us to remove extra duplicate varyings.
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
---
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
---
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
---
Not sure if this ever worked, but the current logic for setting the
min/max index is definitely wrong for indexed draws. While we're at it,
bring in all the usual logic from the non-indirect drawing path.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109086
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
---
Nobody uses this, so let's drop it. This makes the helper callable
from places without a gl_program.
Reviewed-by: Marek Olšák <[email protected]>
---
DrawPixels lowering, for example, adds new varyings that need to be
accounted for in inputs_read. The earlier info gathering at link time
cannot account for this.
Reviewed-by: Marek Olšák <[email protected]>
---
They're now identical, so we can just compile it once.
Reviewed-by: Marek Olšák <[email protected]>
---
Now that we always copy color, we can just use the util function.
Reviewed-by: Marek Olšák <[email protected]>
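Presumably something along these lines (util_make_vertex_passthrough_shader
is the real Gallium helper; the wrapper name and semantic choices here are
assumed):

    /* One passthrough VS for all DrawPixels paths: position, color,
     * and texcoord are all copied through unconditionally. */
    static void *
    make_drawpix_vs(struct pipe_context *pipe)
    {
       const uint semantic_names[] = { TGSI_SEMANTIC_POSITION,
                                       TGSI_SEMANTIC_COLOR,
                                       TGSI_SEMANTIC_GENERIC };
       const uint semantic_indexes[] = { 0, 0, 0 };

       return util_make_vertex_passthrough_shader(pipe, 3, semantic_names,
                                                  semantic_indexes, false);
    }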
---
The glDrawPixels passthrough vertex shader copies position and texcoord
vertex attributes to varying outputs. It also optionally copies a third
gl_Color attribute, which sometimes is unnecessary. Until now, we've
compiled separate variants of the shader, one of which does this extra
copy, and the other of which doesn't. We have done this since 2007.
But, the vertex shader runs for a whopping four vertices, and so the
cost of copying a single input to an output is likely inconsequential.
In theory, we could bind one fewer vertex element - but we always bind
all three regardless. So, we don't even get that savings.
This patch unifies the two, so we always copy the optional color
and save having to compile a second variant. It also makes the VS input
interface match up with the vertex element state without any dead
(unused) input attributes.
Reviewed-by: Marek Olšák <[email protected]>
---
Dead since 2015 (commit 5142564734bd68f165b02e29e384ebbcf91cce38).
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
---
A long time ago, when this was first implemented, not having a sampler
bound would cause problems on Fermi. I didn't work out the reasons, but
the solution was simple -- just put the samplers back in.
Since then, regular texturing paths appear to have lost their associated
samplers, which required a fuller investigation and fix in nouveau. Now
that this is done, this code should no longer need a sampler state for
fetching texels from a buffer texture.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
---
Gen9-10 have fewer than 4 subslices per slice, so they need this to be
rounded up. Gen11 isn't documented as needing this hack, and it can
also have more than 4 subslices, so the hack actually can break things.
Reviewed-by: Anuj Phogat <[email protected]>
---
On some GPUs, especially older Intel GPUs, some math instructions are
very expensive. On those architectures, don't reduce flow control to a
csel if one of the branches contains one of these expensive math
instructions.
This prevents a bunch of cycle count regressions on pre-Gen6 platforms
with a later patch (intel/compiler: More peephole select for pre-Gen6).
v2: Remove stray #if block. Noticed by Thomas.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Thomas Helland <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
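A sketch of what "contains expensive math" might check (the op list is
assumed and illustrative; the real heuristic is driven by the new
nir_opt_peephole_select parameters):

    static bool
    block_contains_expensive_alu(nir_block *block)
    {
       nir_foreach_instr(instr, block) {
          if (instr->type != nir_instr_type_alu)
             continue;

          switch (nir_instr_as_alu(instr)->op) {
          case nir_op_fdiv:
          case nir_op_frcp:
          case nir_op_frsq:
          case nir_op_fsqrt:
          case nir_op_fsin:
          case nir_op_fcos:
             return true;   /* costly on older Intel parts */
          default:
             break;
          }
       }
       return false;
    }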
---
That flow control may be trying to avoid invalid loads. On at least
some platforms, those loads can also be expensive.
No shader-db changes on any Intel platform (even with the later patch
"intel/compiler: More peephole select").
v2: Add an 'indirect_load_ok' flag to nir_opt_peephole_select. Suggested
by Rob. See also the big comment in src/intel/compiler/brw_nir.c.
v3: Use nir_deref_instr_has_indirect instead of deref_has_indirect (from
nir_lower_io_arrays_to_elements.c).
v4: Fix inverted condition in brw_nir.c. Noticed by Lionel.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
---
Reviewed-by: Marek Olšák <[email protected]>
---
Gen9 hardware requires some workarounds to disable preemption depending
on the type of primitive being emitted.
We implement this by adding a function that checks the primitive type
and number of instances right before the 3DPRIMITIVE.
For now, we just ignore blorp. The only primitive it emits is
3DPRIM_RECTLIST, and since it's not listed in the workarounds, we can
safely leave preemption enabled when it happens. Or it will be disabled
by a previous 3DPRIMITIVE, which should be fine too.
v3:
- Apply missing workarounds for instanced rendering and line loop (Ken)
- Move workaround code to brw_draw_single_prim()
Signed-off-by: Rafael Antognolli <[email protected]>
Cc: Kenneth Graunke <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
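The shape of the check, hedged (the primitive/workaround list below is
abridged and assumed, not the authoritative set from the docs):

    /* Called right before emitting 3DPRIMITIVE on Gen9. */
    static bool
    object_preemption_allowed(uint32_t hw_prim, unsigned num_instances)
    {
       switch (hw_prim) {
       case _3DPRIM_LINELOOP:
       case _3DPRIM_TRIFAN:
          return false;               /* listed in the workarounds */
       default:
          return num_instances <= 1;  /* instanced draws need it off */
       }
    }

    /* ...then toggle preemption only when the computed state differs
     * from what the previous draw programmed. */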