mesa.git

	Commit message (Collapse)	Author	Age	Files	Lines
*	glsl: remember which SSBOs are not read-only and pass it to gallium	Marek Olšák	2019-04-04	2	-1/+7
\| \| \| \|	Reviewed-by: Timothy Arceri <[email protected]>
*	gallium: add writable_bitmask parameter into set_shader_buffers	Marek Olšák	2019-04-04	2	-3/+3
\| \| \| \| \| \| \|	to indicate write usage per buffer. This is just a hint (it will be used by radeonsi). Reviewed-by: Timothy Arceri <[email protected]>
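The writable_bitmask hint described above can be pictured as one bit per bound buffer. The following is a minimal sketch only: the struct and function names are hypothetical and do not reflect the real GLSL linker or gallium data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-buffer record; the real Mesa structures differ. */
struct ssbo_binding {
   bool read_only;   /* true if the shader never writes this buffer */
};

/* Build a mask with bit i set when buffer i may be written. */
static uint32_t
build_writable_bitmask(const struct ssbo_binding *bufs, unsigned count)
{
   assert(count <= 32);   /* one bit per buffer in a 32-bit mask */

   uint32_t mask = 0;
   for (unsigned i = 0; i < count; i++) {
      if (!bufs[i].read_only)
         mask |= UINT32_C(1) << i;
   }
   return mask;
}
```

Because the mask is only a hint, a driver that ignores it must still behave correctly; a driver that uses it can skip write tracking for buffers whose bit is clear.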
*	st/mesa: Fix GL_MAP_COLOR with glDrawPixels GL_COLOR_INDEX	Danylo Piliaiev	2019-04-04	1	-2/+32
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Documentation for glDrawPixels with GL_COLOR_INDEX says: "If the GL is in color index mode, and if GL_MAP_COLOR is true, the index is replaced with the value that it references in lookup table GL_PIXEL_MAP_I_TO_I" We are always in RGBA mode and there is nothing in documentation about GL_MAP_COLOR in RGBA mode for GL_COLOR_INDEX. Scale and bias are also only applicable for RGBA format and not mentioned for GL_COLOR_INDEX. Thus the behaviour will be on par with i965. Fixes: gl-1.0-drawpixels-color-index Signed-off-by: Danylo Piliaiev <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
*	st/nir: run st_nir_opts after 64bit ops lowering	Tapani Pälli	2019-04-04	1	-1/+1
\| \| \| \| \| \| \| \|	CID: 1444309 Fixes: 9ab1b1d0227 "st/nir: Move 64-bit lowering later" Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
*	gallium: implement ARB/KHR_parallel_shader_compile	Marek Olšák	2019-04-01	1	-1/+58
\|
*	mesa: implement ARB/KHR_parallel_shader_compile	Marek Olšák	2019-04-01	8	-0/+44
\| \| \| \|	Tested by piglit.
*	meson: strip rpath from megadrivers	Eric Engestrom	2019-04-01	1	-0/+3
\| \| \| \| \| \| \| \| \| \|	More specifically, use the library file that has been post-processed by Meson when creating the hardlinks. Bugs: https://bugs.freedesktop.org/show_bug.cgi?id=108766 Fixes: 3218056e0eb375eeda47 "meson: Build i965 and dri stack" Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
*	i965: perf: update render basic configs for big core gen9/gen10	Lionel Landwerlin	2019-04-01	8	-23/+24
\| \| \| \| \| \| \| \| \|	This update allows an MI_LRI to trigger an OA report write in the global OA buffer. This isn't really useful for us; we just stay close to the internal public configs. Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
*	i965: perf: add ring busyness metric for cfl gt2	Lionel Landwerlin	2019-04-01	1	-1/+165
\| \| \| \| \|	Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
*	i965: perf: enable Icelake metrics	Lionel Landwerlin	2019-03-31	3	-3/+11
\| \| \| \| \|	Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
*	i965: perf: add Icelake metrics	Lionel Landwerlin	2019-03-31	1	-0/+11899
\| \| \| \| \|	Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
*	i965: perf: sklgt2: drop programming of an unused NOA register	Lionel Landwerlin	2019-03-31	1	-11/+6
\| \| \| \| \|	Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
*	i965: perf: hsw: drop register programming not needed on HSW	Lionel Landwerlin	2019-03-31	1	-2/+1
\| \| \| \| \| \| \|	This register is flagged as IVB only in the documentation. Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
*	i965: perf: chv: fixup counters names	Lionel Landwerlin	2019-03-31	1	-25/+25
\| \| \| \| \|	Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
*	i965: perf: add PMA stall metrics	Lionel Landwerlin	2019-03-31	10	-10/+1140
\| \| \| \| \| \| \| \|	These are new metrics for Gen8/9 to measure the effect of the PMA stall workaround fix. Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
*	i965: perf: sklgt2: update memory write config	Lionel Landwerlin	2019-03-31	1	-7/+49
\| \| \| \| \| \| \| \|	This reworks the programming between older pre-production steppings and newer ones. Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
*	i965: perf: sklgt2: update compute metrics config	Lionel Landwerlin	2019-03-31	1	-8/+2
\| \| \| \| \| \| \| \|	This unifies some of the programming between pre-production stepping and production ones. Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
*	i965: perf: sklgt2: update a priority for register programming	Lionel Landwerlin	2019-03-31	1	-2/+2
\| \| \| \| \| \| \|	This makes no difference in terms of programming; it's just a cleanup. Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
*	nir: add lower_all_io_to_elements	Rob Clark	2019-03-30	1	-0/+1
\| \| \| \| \| \| \|	I need this part of lower_all_io_to_temps but without the actual lowering to temps part. Signed-off-by: Rob Clark <[email protected]>
*	i965,iris/blorp: do not blit 0-sizes	Sergii Romantsov	2019-03-30	1	-1/+9
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	There is no point in blitting 0-sized sources or destinations. Additionally, it may cause segfaults on i965. v2: Function call replaced with inline check v3: Added check to avoid division by zero (L. Landwerlin) v4: Added similar check for Iris (L. Landwerlin) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110239 Signed-off-by: Sergii Romantsov <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
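The early-out for empty blits might look roughly like the sketch below. The rectangle type and function names are hypothetical; the actual blorp entry points take their coordinates differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rectangle type used only for this illustration. */
struct blit_rect {
   int x0, y0, x1, y1;
};

static bool
rect_is_empty(const struct blit_rect *r)
{
   return r->x0 == r->x1 || r->y0 == r->y1;
}

/* Returns true if the blit has work to do, false if it can be skipped.
 * Checking up front means later code that divides by a width or height
 * never sees a zero extent. */
static bool
blit_is_needed(const struct blit_rect *src, const struct blit_rect *dst)
{
   assert(src != NULL && dst != NULL);
   return !rect_is_empty(src) && !rect_is_empty(dst);
}
```

A caller would test blit_is_needed() first and return immediately when it reports false.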
*	i965/blorp: Remove unused parameter from blorp_surf_for_miptree.	Rafael Antognolli	2019-03-28	1	-24/+12
\| \| \| \| \| \|	It seems pretty useless nowadays. Reviewed-by: Jason Ekstrand <[email protected]>
*	st/mesa: Fix blitting from GL_DEPTH_STENCIL to GL_STENCIL_INDEX	Kenneth Graunke	2019-03-28	1	-0/+1
\| \| \| \| \| \| \| \| \|	Fixes assertion failures in Piglit's "framebuffer-blit-levels {draw,read} stencil" tests on iris. Also fixes assert failures in frameretrace, which tries to ReadPixels the stencil values (only) from a Z24S8 depth/stencil attachment. Reviewed-by: Kristian H. Kristensen <[email protected]>
*	st/nir: Free the GLSL IR after linking.	Kenneth Graunke	2019-03-28	1	-0/+4
\| \| \| \| \| \| \| \| \|	i965 does this, and st's tgsi path does this. st/nir did not. Cuts 138MB of memory from a DiRT Rally trace, which is about 44% of the total GLSL IR memory. Reviewed-by: Timothy Arceri <[email protected]>
*	st/glsl_to_nir: Calculate num_uniforms from NumParameterValues	Kristian H. Kristensen	2019-03-27	1	-5/+5
\| \| \| \| \| \| \| \| \| \| \| \|	We don't need to determine the number of uniform slots here, it's already available as prog->Parameters->NumParameterValues. The way we previously determined the number of slots was also broken for PackedDriverUniformStorage, where we would add loc (in dwords) and type_size() (in vec4s). Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Rob Clark <[email protected]>
*	i965,iris,anv: Make alpha to coverage work with sample mask	Danylo Piliaiev	2019-03-25	1	-6/+12
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	From "Alpha Coverage" section of SKL PRM Volume 7: "If Pixel Shader outputs oMask, AlphaToCoverage is disabled in hardware, regardless of the state setting for this feature." From OpenGL spec 4.6, "15.2 Shader Execution": "The built-in integer array gl_SampleMask can be used to change the sample coverage for a fragment from within the shader." From OpenGL spec 4.6, "17.3.1 Alpha To Coverage": "If SAMPLE_ALPHA_TO_COVERAGE is enabled, a temporary coverage value is generated where each bit is determined by the alpha value at the corresponding sample location. The temporary coverage value is then ANDed with the fragment coverage value to generate a new fragment coverage value." Similar wording can be found in Vulkan spec 1.1.100 "25.6. Multisample Coverage". Thus we need to compute alpha to coverage dithering manually in the shader and replace the sample mask store with the bitwise-AND of the sample mask and the alpha to coverage dithering. The following formula is used to compute the final sample mask: m = int(16.0 * clamp(src0_alpha, 0.0, 1.0)); dither_mask = 0x1111 * ((0xfea80 >> (m & ~3)) & 0xf) \| 0x0808 * (m & 2) \| 0x0100 * (m & 1); sample_mask = sample_mask & dither_mask. Credits to Francisco Jerez <[email protected]> for creating it. It gives a number of ones proportional to the alpha for the 2, 4, 8 or 16 least significant bits of the result. GEN6 hardware does not have an issue with simultaneous usage of sample mask and alpha to coverage; however, due to the wrong sending order of oMask and src0_alpha, it is still affected by it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109743 Signed-off-by: Danylo Piliaiev <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
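The dither formula quoted in the message above can be checked on the CPU. This is a plain C transcription for illustration, not the shader code the drivers emit; the function names and the bit-counting helper are additions made for this sketch.

```c
#include <assert.h>
#include <stdint.h>

static float
clamp01(float x)
{
   return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Transcription of the formula from the commit message:
 *   m = int(16.0 * clamp(src0_alpha, 0.0, 1.0))
 *   dither_mask = 0x1111 * ((0xfea80 >> (m & ~3)) & 0xf) |
 *                 0x0808 * (m & 2) | 0x0100 * (m & 1)
 */
static uint32_t
alpha_to_coverage_dither_mask(float src0_alpha)
{
   const int m = (int)(16.0f * clamp01(src0_alpha));
   assert(m >= 0 && m <= 16);

   return 0x1111u * ((0xfea80u >> (m & ~3)) & 0xfu) |
          0x0808u * (uint32_t)(m & 2) |
          0x0100u * (uint32_t)(m & 1);
}

/* sample_mask = sample_mask & dither_mask */
static uint32_t
apply_alpha_to_coverage(uint32_t sample_mask, float src0_alpha)
{
   return sample_mask & alpha_to_coverage_dither_mask(src0_alpha);
}

/* Helper for inspecting the result: number of set bits in v. */
static unsigned
count_set_bits(uint32_t v)
{
   unsigned n = 0;
   for (; v != 0; v &= v - 1)
      n++;
   return n;
}
```

Over the full 16 bits, the mask for a given m has exactly m bits set. For a 4-sample target with all samples covered, apply_alpha_to_coverage(0xf, 0.5f) keeps 2 of the 4 low bits.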
*	st/mesa: fix texture deletion context mix-up issues (v2)	Brian Paul	2019-03-25	1	-12/+39
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When we destroy a context, we need to temporarily make that context the current one for the thread. That's because during context tear-down we make many calls to _mesa_reference_texobj(&texObj, NULL). Note there's no context parameter. If the texture's refcount goes to zero and we need to delete it, we use the thread's current context. But if that context isn't the context we're tearing down, we get into trouble when deallocating sampler views. See patch 593e36f956 ("st/mesa: implement "zombie" sampler views (v2)") for background information. Also, we need to release any sampler views attached to the fallback textures. Fixes a crash on exit with a glretrace of the Nobel Clinician application. v2: at end of st_destroy_context(), check if save_ctx == ctx and unbind the context if so. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Neha Bhende <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
*	android: static link with libexpat with Android O+	Kishore Kadiyala	2019-03-25	1	-1/+9
\| \| \| \| \| \| \| \| \| \| \| \| \|	In Android O, Mesa needs to statically link libexpat so that it's in the same VNDK namespace. v2: apply change also to anv driver (Tapani) v3: use += in anv change (Eric Engestrom) Change-Id: I82b0be5c817c21e734dfdf5bfb6a9aa1d414ab33 Signed-off-by: Kishore Kadiyala <[email protected]> Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
*	st/mesa: fix warnings about implicit conversion on enumeration type	Tapani Pälli	2019-03-25	2	-2/+2
\| \| \| \| \| \| \| \|	These enums match but compiler warns about implicit conversion. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
*	st/mesa: fix compilation warning on storage_flags_to_buffer_flags	Tapani Pälli	2019-03-25	1	-1/+1
\| \| \| \| \| \| \| \|	(warning: 'const' type qualifier on return type has no effect) Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
*	spirv: Add an execution environment to the options	Caio Marcelo de Oliveira Filho	2019-03-23	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	Also updates gl_spirv to pick the right one. At the moment nothing uses it, but upcoming functionality part of ARB_gl_spirv will use it, and we also later can be more assertful when handling certain features for each of the execution environments. Reviewed-by: Alejandro Piñeiro <[email protected]> Acked-by: Karol Herbst <[email protected]>
*	mesa/st: use ESSL cap to enable gpu_shader5	Rob Clark	2019-03-22	1	-3/+14
\| \| \| \| \| \| \| \| \| \| \| \| \|	For GLES2+ contexts, enable EXT_gpu_shader5 if the driver exposes a sufficiently high ESSL feature level, even if the GLSL feature level isn't high enough. This allows drivers to support EXT_gpu_shader5 in GLES contexts before they support all the additional features of ARB_gpu_shader5 in GL contexts. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	mesa: Fix GL_NUM_DEVICE_UUIDS_EXT	Józef Kucia	2019-03-22	1	-0/+3
\| \| \| \| \|	Cc: [email protected] Reviewed-by: Tapani Pälli <[email protected]>
*	gallium: Add PIPE_BARRIER_UPDATE_BUFFER and UPDATE_TEXTURE bits.	Kenneth Graunke	2019-03-19	1	-15/+19
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The glMemoryBarrier() function makes shader memory stores ordered with respect to things specified by the given bits. Until now, st/mesa has ignored GL_TEXTURE_UPDATE_BARRIER_BIT and GL_BUFFER_UPDATE_BARRIER_BIT, saying that drivers should implicitly perform the needed flushing. This seems like a pretty big assumption to make. Instead, this commit opts to translate them to new PIPE_BARRIER bits, and adjusts existing drivers to continue ignoring them (preserving the current behavior). The i965 driver performs actions on these memory barriers. Shader memory stores go through a "data cache" which is separate from the render cache and other read caches (like the texture cache). All memory barriers need to flush the data cache (to ensure shader memory stores are visible), and possibly invalidate read caches (to ensure stale data is no longer visible). The driver implicitly flushes for most caches, but not for data cache, since ARB_shader_image_load_store introduced MemoryBarrier() precisely to order these explicitly. I would like to follow i965's approach in iris, flushing the data cache on any MemoryBarrier() call, so I need st/mesa to actually call the pipe->memory_barrier() callback. Fixes KHR-GL45.shader_image_load_store.advanced-sync-textureUpdate and Piglit's spec/arb_shader_image_load_store/host-mem-barrier on the iris driver. Roland said this looks reasonable to him. Reviewed-by: Eric Anholt <[email protected]>
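The translation described above can be sketched as follows. The two GL_* values are the ones in the OpenGL headers; the PIPE_BARRIER_* bit positions used here are placeholders chosen for this example, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the OpenGL headers. */
#ifndef GL_TEXTURE_UPDATE_BARRIER_BIT
#define GL_TEXTURE_UPDATE_BARRIER_BIT 0x00000100
#endif
#ifndef GL_BUFFER_UPDATE_BARRIER_BIT
#define GL_BUFFER_UPDATE_BARRIER_BIT  0x00000200
#endif

/* Placeholder bit positions for this sketch, not gallium's real values. */
#define PIPE_BARRIER_UPDATE_BUFFER  (1u << 0)
#define PIPE_BARRIER_UPDATE_TEXTURE (1u << 1)

/* State tracker side: pass the update bits on instead of dropping them. */
static unsigned
translate_update_barriers(uint32_t gl_barriers)
{
   /* The placeholder bits must stay distinct for the mapping to be lossless. */
   assert((PIPE_BARRIER_UPDATE_BUFFER & PIPE_BARRIER_UPDATE_TEXTURE) == 0);

   unsigned flags = 0;
   if (gl_barriers & GL_BUFFER_UPDATE_BARRIER_BIT)
      flags |= PIPE_BARRIER_UPDATE_BUFFER;
   if (gl_barriers & GL_TEXTURE_UPDATE_BARRIER_BIT)
      flags |= PIPE_BARRIER_UPDATE_TEXTURE;
   return flags;
}

/* Driver side: a driver that flushes implicitly masks the new bits off
 * and only does work if something else remains. */
static bool
implicit_flush_driver_has_work(unsigned flags)
{
   flags &= ~(PIPE_BARRIER_UPDATE_BUFFER | PIPE_BARRIER_UPDATE_TEXTURE);
   return flags != 0;
}
```

With this split, the decision to ignore the update bits moves from the state tracker into each driver, so a driver that needs an explicit data cache flush can act on them.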
*	i965/icl: Add WA_2204188704 to disable pixel shader panic dispatch	Anuj Phogat	2019-03-19	2	-0/+10
\| \| \| \| \| \|	Signed-off-by: Anuj Phogat <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
*	st/mesa: stop using pipe_sampler_view_release()	Brian Paul	2019-03-17	2	-5/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \|	In all instances here we can replace pipe_sampler_view_release(pipe, view) with pipe_sampler_view_reference(view, NULL) because the views in question are private to the state tracker context. So there's no danger of freeing a sampler view with the wrong context. Testing done: google chrome, misc GL demos, games Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Neha Bhende <[email protected]> Reviewed-by: Mathias Fröhlich <[email protected]> Reviewed-By: Jose Fonseca <[email protected]>
*	st/mesa: implement "zombie" shaders list	Brian Paul	2019-03-17	3	-20/+166
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	As with the preceding patch for sampler views, this patch does basically the same thing but for shaders. However, reference counting isn't needed here (instead of calling cso_delete_XXX_shader() we call st_save_zombie_shader()). The Redway3D Watch is one app/demo that needs this change. Otherwise, the vmwgfx driver generates an error about trying to destroy a shader ID that doesn't exist in the context. Note that if PIPE_CAP_SHAREABLE_SHADERS = TRUE, then we can use/delete any shader with any context and this mechanism is not used. Tested with: google-chrome, google earth, Redway3D Watch/Turbine demos and a few Linux games. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Neha Bhende <[email protected]> Reviewed-by: Mathias Fröhlich <[email protected]> Reviewed-By: Jose Fonseca <[email protected]>
*	st/mesa: implement "zombie" sampler views (v2)	Brian Paul	2019-03-17	5	-4/+131
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When st_texture_release_all_sampler_views() is called the texture may have sampler views belonging to several contexts. If we unreference a sampler view and its refcount hits zero, we need to be sure to destroy the sampler view with the same context which created it. This was not the case with the previous code which used pipe_sampler_view_release(). That function could end up freeing a sampler view with a context different than the one which created it. In the case of the VMware svga driver, we detected this but leaked the sampler view. This led to a crash with google-chrome when the kernel module had too many sampler views. VMware bug 2274734. Alternately, if we try to delete a sampler view with the correct context, we may be "reaching into" a context which is active on another thread. That's not safe. To fix these issues this patch adds a per-context list of "zombie" sampler views. These are views which are to be freed at some point when the context is active. Other contexts may safely add sampler views to the zombie list at any time (it's mutex protected). This avoids the context/view ownership mix-ups we had before. Tested with: google-chrome, google earth, Redway3D Watch/Turbine demos and a few Linux games. If anyone can recommend some other multi-threaded, multi-context GL apps to test, please let me know. v2: avoid potential race issue by always adding sampler views to the zombie list if the view's context doesn't match the current context, ignoring the refcount. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Neha Bhende <[email protected]> Reviewed-by: Mathias Fröhlich <[email protected]> Reviewed-By: Jose Fonseca <[email protected]>
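The per-context zombie list can be sketched with a mutex-protected singly linked list. Everything below is a simplified stand-in: the types and function names are hypothetical, reference counting is left out, and pthread mutexes are used here only for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

struct context;

/* Hypothetical stand-in for a sampler view created by one context. */
struct view {
   struct context *owner;   /* context that created the view */
   struct view *next;       /* link in the owner's zombie list */
};

struct context {
   pthread_mutex_t zombie_mutex;   /* protects the zombies list */
   struct view *zombies;
   unsigned destroyed;             /* views this context has destroyed */
};

static void
context_init(struct context *ctx)
{
   pthread_mutex_init(&ctx->zombie_mutex, NULL);
   ctx->zombies = NULL;
   ctx->destroyed = 0;
}

static struct view *
view_create(struct context *owner)
{
   struct view *v = malloc(sizeof(*v));
   if (v) {
      v->owner = owner;
      v->next = NULL;
   }
   return v;
}

/* Must only run while ctx is current on the calling thread. */
static void
view_destroy(struct context *ctx, struct view *v)
{
   ctx->destroyed++;
   free(v);
}

/* Release a view from whichever context is current on this thread. */
static void
view_release(struct context *current, struct view *v)
{
   assert(v != NULL);
   struct context *owner = v->owner;

   if (owner == current) {
      view_destroy(current, v);
      return;
   }

   /* Wrong context: queue the view on its owner's list instead. */
   pthread_mutex_lock(&owner->zombie_mutex);
   v->next = owner->zombies;
   owner->zombies = v;
   pthread_mutex_unlock(&owner->zombie_mutex);
}

/* Called by the owning context at some point while it is current. */
static void
context_free_zombies(struct context *ctx)
{
   pthread_mutex_lock(&ctx->zombie_mutex);
   struct view *list = ctx->zombies;
   ctx->zombies = NULL;
   pthread_mutex_unlock(&ctx->zombie_mutex);

   while (list) {
      struct view *next = list->next;
      view_destroy(ctx, list);
      list = next;
   }
}
```

Detaching the whole list under the lock and destroying the views afterwards keeps the critical section short, so other threads adding zombies are not blocked while views are freed.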
*	mesa: Add assert to _mesa_primitive_restart_index.	Mathias Fröhlich	2019-03-15	1	-0/+3
\| \| \| \| \| \| \|	Make sure the index_size parameter is meant to be in bytes. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Mathias Fröhlich <[email protected]>
*	vbo: Fix GL_PRIMITIVE_RESTART_FIXED_INDEX in display list compiles.	Mathias Fröhlich	2019-03-15	1	-5/+9
\| \| \| \| \| \| \| \|	The maximum value primitive restart index is different for each index data type. Use the appropriate fixed restart index value. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Mathias Fröhlich <[email protected]>
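The fixed restart index is the all-ones value of the index type, so it depends on the index size. A minimal sketch, with a hypothetical function name and the index size given in bytes:

```c
#include <assert.h>
#include <stdint.h>

/* Fixed primitive restart index for an index type of the given size.
 * index_size is in bytes: 1 (ubyte), 2 (ushort) or 4 (uint). */
static uint32_t
fixed_restart_index(unsigned index_size)
{
   assert(index_size == 1 || index_size == 2 || index_size == 4);

   switch (index_size) {
   case 1:
      return UINT32_C(0xff);
   case 2:
      return UINT32_C(0xffff);
   default:
      return UINT32_C(0xffffffff);
   }
}
```

Comparing a 16-bit index against 0xffffffff would never match, which is why one shared constant cannot serve all three index types.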
*	vbo: Fix basevertex handling in display list compiles.	Mathias Fröhlich	2019-03-15	1	-5/+12
\| \| \| \| \| \| \| \| \|	The standard requires that the primitive restart comparison happens before the basevertex value is added. Do this now, and leave a reference to the standard explaining why the comparison happens at this place. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Mathias Fröhlich <[email protected]>
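The ordering requirement can be illustrated with a small loop that tests each raw index against the restart index first and only then adds basevertex. The function name and signature are hypothetical; the display list code is structured differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Walk the index buffer, writing resolved vertex ids to out.
 * Returns the number of vertex ids written; *restarts receives the
 * number of restart markers seen. */
static size_t
resolve_indices(const uint32_t *indices, size_t count,
                uint32_t restart_index, int32_t basevertex,
                uint32_t *out, size_t *restarts)
{
   assert(indices != NULL && out != NULL && restarts != NULL);

   size_t n = 0;
   *restarts = 0;

   for (size_t i = 0; i < count; i++) {
      /* Compare the raw index, before basevertex is applied. */
      if (indices[i] == restart_index) {
         (*restarts)++;
         continue;
      }
      out[n++] = indices[i] + (uint32_t)basevertex;
   }
   return n;
}
```

If the addition came first, an index of 0xfffe with a basevertex of 1 would be mistaken for a 16-bit restart marker; with the comparison first it resolves to vertex 0xffff as intended.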
*	mesa: Use mapping tools in debug prints.	Mathias Fröhlich	2019-03-15	1	-45/+12
\| \| \| \| \|	Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Mathias Fröhlich <[email protected]>
*	mesa: Remove _ae_{,un}map_vbos and dependencies.	Mathias Fröhlich	2019-03-15	2	-100/+0
\| \| \| \| \| \| \| \| \|	Since mapping and unmapping the buffer objects in a VAO is handled directly from the VAO, this part of the _NEW_ARRAY state is no longer used. So remove this part of array element state. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Mathias Fröhlich <[email protected]>
*	mesa: Replace _ae_{,un}map_vbos with _mesa_vao_{,un}map_arrays	Mathias Fröhlich	2019-03-15	2	-13/+11
\| \| \| \| \| \| \| \| \| \|	Due to the use of bitmaps, the _mesa_vao_{,un}map_arrays functions should provide comparable runtime efficiency to the currently used _ae_{,un}map_vbos functions. So use these functions, which enables further cleanup. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Mathias Fröhlich <[email protected]>
*	mesa: Use _mesa_array_element in dlist save.	Mathias Fröhlich	2019-03-15	1	-4/+19
\| \| \| \| \| \| \| \| \| \|	Make use of the newly factored out _mesa_array_element function in display list compilation. For now that duplicates the primitive restart logic. But that turns out to need a fix in display list handling anyhow. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Mathias Fröhlich <[email protected]>
*	mesa: Factor out _mesa_array_element.	Mathias Fröhlich	2019-03-15	2	-19/+32
\| \| \| \| \| \| \| \| \|	The factored out function handles emitting the vertex attributes at the given index. The now publicly accessible function gets used in the following patches. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Mathias Fröhlich <[email protected]>
*	mesa: Implement helper functions to map and unmap a VAO.	Mathias Fröhlich	2019-03-15	2	-0/+102
\| \| \| \| \| \| \| \| \| \|	Provide a set of functions that maps or unmaps all VBOs held in a VAO. The functions will be used in the following patches. v2: Update comments. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Mathias Fröhlich <[email protected]>
*	st/mesa: Let NIR lower UBO and SSBO access when we have it	Jason Ekstrand	2019-03-15	2	-1/+11
\| \| \| \|	Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	i965: Stop setting LowerBufferInterfaceBlocks	Jason Ekstrand	2019-03-15	1	-0/+4
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Instead, we do UBO and SSBO deref lowering in NIR after we've given it a chance to optimize SSBO access: Shader-db results on Kaby Lake: total instructions in shared programs: 15235775 -> 15235484 (<.01%) instructions in affected programs: 14992 -> 14701 (-1.94%) helped: 19 HURT: 20 total cycles in shared programs: 339220331 -> 339027307 (-0.06%) cycles in affected programs: 79831981 -> 79638957 (-0.24%) helped: 540 HURT: 602 total loops in shared programs: 4402 -> 4348 (-1.23%) loops in affected programs: 186 -> 132 (-29.03%) helped: 27 HURT: 0 total spills in shared programs: 23261 -> 23234 (-0.12%) spills in affected programs: 38 -> 11 (-71.05%) helped: 1 HURT: 0 total fills in shared programs: 31442 -> 31371 (-0.23%) fills in affected programs: 98 -> 27 (-72.45%) helped: 1 HURT: 0 LOST: 12 GAINED: 12 Most of the help and hurt in instruction counts was just churn caused by re-ordering of optimizations and the fact that the NIR deref lowering code is emitting slightly different instructions. Nothing was hurt by more than three instructions and most things weren't helped by more than four. The primary exception to this is one Car Chase shader: shaders/non-free/gfxbench4/carchase/341.shader_test CS SIMD32: 1144 -> 821 (-28.23%) There is also one compute shader in Manhattan 3.1 and a fragment shader in the UE4 Shooter Game demo that now get a loop partially unrolled. Those showed up in the results as hurt instructions but were manually removed to get the results above. 
The lost/gained were a dozen Car Chase shaders that went from SIMD8 to SIMD16 thanks to improved register pressure: shaders/non-free/gfxbench4/carchase/366.shader_test CS shaders/non-free/gfxbench4/carchase/368.shader_test CS shaders/non-free/gfxbench4/carchase/370.shader_test CS shaders/non-free/gfxbench4/carchase/372.shader_test CS shaders/non-free/gfxbench4/carchase/376.shader_test CS shaders/non-free/gfxbench4/carchase/378.shader_test CS shaders/non-free/gfxbench4/carchase/380.shader_test CS shaders/non-free/gfxbench4/carchase/382.shader_test CS shaders/non-free/gfxbench4/carchase/384.shader_test CS shaders/non-free/gfxbench4/carchase/388.shader_test CS shaders/non-free/gfxbench4/carchase/4.shader_test CS shaders/non-free/gfxbench4/carchase/6.shader_test CS Given how much it appeared to be improved, I ran Car Chase on my laptop. Unfortunately, I wasn't able to see any measurable improvement. It might be helped by 1-2% but it's in the noise. It does render correctly as far as I can tell so the improvement is legitimate. All of the loops that got deleted were in dolphin uber shaders. I've had no opportunity to test them for correctness or performance. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	mesa/st: Fix leaks of TGSI tokens in VP variants.	Eric Anholt	2019-03-14	1	-14/+20
\| \| \| \| \| \| \| \| \| \|	Starting a glxgears and closing it, I was seeing a lot of leaked TGSI for the fixed function VPs. v2: drop unused delete_ir() arg. Fixes: 3b4929ec6e64 ("st/mesa: Copy VP TGSI tokens if they exist, even for NIR shaders.") Reviewed-by: Kenneth Graunke <[email protected]>
*	mesa/st: Make sure that prog_to_nir NIR gets freed.	Eric Anholt	2019-03-14	1	-0/+6
\| \| \| \| \| \| \| \| \| \|	GLSL NIR gets freed on relink by _mesa_delete_program(), but for ARB programs we need to free the old NIR when PSN is used to set up new NIR in the same gl_program. Additionally, set the base .nir field so that it will get freed by _mesa_delete_program(). Fixes: 3d7611e9a6c6 ("st/nir: use NIR for asm programs") Reviewed-by: Kenneth Graunke <[email protected]>