mesa.git

	Commit message	Author	Age	Files	Lines
*	iris: Track per-stage bind history, reduce work accordingly	Kenneth Graunke	2019-09-18	4	-6/+16
	We now track per-stage bind history for constant and shader buffers, shader images, and sampler views by adding an extra res->bind_stages field to go with res->bind_history. This lets us flag IRIS_DIRTY_CONSTANTS for only the specific stages involved, and also skip some CPU overhead in iris_rebind_buffer. Cuts 4% of 3DSTATE_CONSTANT_XS packets in a Shadow of Mordor trace on Icelake. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
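The idea of pairing a bind-history mask with a per-stage mask can be sketched in plain C. This is only an illustration under assumptions: the struct, the enum, the flag values, and the function names below are hypothetical stand-ins and are not the driver's actual definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real driver's types and flag values differ. */
enum stage { STAGE_VS, STAGE_FS, STAGE_CS, STAGE_COUNT };

#define BIND_CONSTANT_BUFFER (1u << 0)   /* illustrative bind-history bit */
#define DIRTY_CONSTANTS(s)   (1u << (s)) /* one illustrative dirty bit per stage */

struct resource {
   uint32_t bind_history; /* which kinds of binding this buffer has ever had */
   uint32_t bind_stages;  /* which shader stages it has ever been bound to */
};

/* Record a constant-buffer bind for one stage. */
static void bind_constant_buffer(struct resource *res, enum stage s)
{
   assert(s < STAGE_COUNT);
   res->bind_history |= BIND_CONSTANT_BUFFER;
   res->bind_stages |= 1u << s;
}

/* Return the dirty bits to flag when the buffer's contents change:
 * only the stages that ever saw this buffer, not every stage. */
static uint32_t dirty_bits_for_write(const struct resource *res)
{
   uint32_t dirty = 0;

   if (!(res->bind_history & BIND_CONSTANT_BUFFER))
      return 0;

   for (int s = 0; s < STAGE_COUNT; s++) {
      if (res->bind_stages & (1u << s))
         dirty |= DIRTY_CONSTANTS(s);
   }
   return dirty;
}
```

A buffer that was only ever bound to the fragment stage would then dirty one stage's constants on a write instead of all of them.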
*	iris: Don't flag IRIS_DIRTY_BINDINGS for constant usage history	Kenneth Graunke	2019-09-18	1	-2/+1
	The underlying buffer isn't changing - so we don't need to update any SURFACE_STATE descriptors - we just might have new constants, meaning we need to re-emit 3DSTATE_CONSTANT_XS. On Gen9, this means we need to update 3DSTATE_BINDING_TABLE_POINTERS_XS too, but that's now handled by the explicit check in the previous patch. On Gen9, this should cause us to re-emit the binding table /pointer/ on writing to a buffer with PIPE_BIND_CONSTANT_BUFFER, rather than emitting a whole new /table/. On Gen8 and Gen11, this avoids binding table churn altogether. Cuts 61% of 3DSTATE_BINDING_TABLE_POINTERS_XS packets in a Shadow of Mordor trace on Icelake. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	iris: Explicitly emit 3DSTATE_BTP_XS on Gen9 with DIRTY_CONSTANTS_XS	Kenneth Graunke	2019-09-18	1	-1/+6
	Right now, we usually flag both IRIS_DIRTY_{CONSTANTS,BINDINGS}_XS, because we have SURFACE_STATE for constant buffers in case the shaders access them via pull mode. But this flagging is overkill in many cases. Gen8 and Gen11 don't need it at all. Gen9 doesn't need that large of a hammer in all cases. Just handle it explicitly so the right thing happens. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	iris: Flag IRIS_DIRTY_BINDINGS_XS on constant buffer rebinds	Kenneth Graunke	2019-09-18	1	-1/+2
	We upload a new SURFACE_STATE for the UBO/SSBO in question, which means that we need new binding tables as well. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	radv: Add DFSM support.	Bas Nieuwenhuizen	2019-09-18	1	-5/+17
	Apparently we already enabled it without having support ... Not sure if we also need to set disable_start_of_prim when the PS has memory writes, but this mirrors radeonsi. Doubles fillrate in my dual_quad_bench from ~16 pixels/cycle to ~32 pixels/cycle on a Raven. Reviewed-by: Samuel Pitoiset <[email protected]>
*	radv: Disable dfsm by default even on Raven.	Bas Nieuwenhuizen	2019-09-18	2	-3/+4
	When actually implementing it, Talos on low is still 3% slower. Reviewed-by: Samuel Pitoiset <[email protected]>
*	radv: Only break batch on framebuffer change with dfsm.	Bas Nieuwenhuizen	2019-09-18	1	-1/+1
	Reviewed-by: Samuel Pitoiset <[email protected]>
*	nir/opt_if: Fix undef handling in opt_split_alu_of_phi()	Connor Abbott	2019-09-18	1	-55/+20
	The pass assumed that "Most ALU ops produce an undefined result if any source is undef" which is completely untrue. Due to how we lower if statements to selects and then optimize on those selects later, we simply cannot make that assumption. In particular this pass tried to replace an ior of undef and true, which had been generated by optimizing a select which itself came from flattening an if statement, to undef causing a miscompilation for a CTS test with radeonsi NIR.

	We fix this by always doing what the non-undef path did, i.e. duplicate the instruction twice. If there are cases where the instruction before the loop can be folded away due to having an undef source, we should add these to opt_undef instead.

	The comment above the pass says that if the phi source from before the loop is undef, and we can fold the instruction before the loop to undef, then we can ignore sources of the original instruction that don't dominate the block before the loop because we don't need them to create the instruction before the loop. This is incorrect, because the instruction at the bottom of the loop would get those sources from the wrong loop iteration. The code never actually did what the comment said, so we only have to update the comment to match what the pass actually does. We also update the example to more closely match what most actual loops look like after vtn and peephole_select.

	There are no shader-db changes with i965, radeonsi NIR, or radv. With anv and my vkpipeline-db there's only one change:

	total instructions in shared programs: 14125290 -> 14125300 (<.01%)
	instructions in affected programs: 2598 -> 2608 (0.38%)
	helped: 0
	HURT: 1

	total cycles in shared programs: 2051473437 -> 2051473397 (<.01%)
	cycles in affected programs: 36697 -> 36657 (-0.11%)
	helped: 1
	HURT: 0

	Fixes KHR-GL45.shader_subroutine.control_flow_and_returned_subroutine_values_used_as_subroutine_input with radeonsi NIR.
*	gl: drop incorrect pkg-config file for glvnd	Eric Engestrom	2019-09-18	1	-12/+2
	Akin to 1a25980c469b38d2c645 ("egl: drop incorrect pkg-config file for glvnd") and b01524fff05eef66e8cd ("meson: don't build libGLES*.so with GLVND"), this removes a pkg-config file that shouldn't have been there in the first place, but was needed because of that GLVND bug. Now that the glvnd bug has been fixed, it became apparent that nobody had removed this gl.pc pkg-config file, so let's do just that :) Suggested-by: Matt Turner <[email protected]> Cc: [email protected] Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	nir/opcodes: Clear variable names confusion	Andres Gomez	2019-09-18	1	-10/+15
	Having Python and C variables share a name in the same block of code makes the code a bit confusing to read. Make it explicit that the Python bit_size variable refers to the destination bit size. Suggested-by: Caio Marcelo de Oliveira Filho <[email protected]> Signed-off-by: Andres Gomez <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	radv: never kill a NGG GS shader	Rhys Perry	2019-09-18	1	-1/+3
	Seems to fix a hang with excessive vertex emissions when NGG is used for GS. Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radv/gfx10: fix VK_KHR_pipeline_executable_properties with NGG GS	Samuel Pitoiset	2019-09-18	1	-4/+13
	No GS copy shader if a pipeline enables NGG GS. This fixes dEQP-VK.pipeline.executable_properties.graphics.geometry_stage. Fixes: 86864eedd2d ("radv: Implement radv_GetPipelineExecutablePropertiesKHR.") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radeonsi: include drm_fourcc.h to fix the build	Marek Olšák	2019-09-18	1	-0/+1
*	radeonsi: implement pipe_screen::resource_get_param	Marek Olšák	2019-09-18	1	-22/+78
	v2: return DRM_FORMAT_MOD_INVALID from the function Reviewed-by: Kenneth Graunke <[email protected]> (v1)
*	gallium: extend resource_get_param to be as capable as resource_get_handle	Marek Olšák	2019-09-18	7	-16/+56
	Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	ac: move ac_get_num_physical_vgprs into radeon_info	Marek Olšák	2019-09-18	3	-13/+5
	Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
*	ac: move ac_get_num_physical_sgprs into radeon_info	Marek Olšák	2019-09-18	5	-17/+17
	Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
*	ac: move ac_get_max_wave64_per_simd into radeon_info	Marek Olšák	2019-09-18	4	-18/+6
	Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
*	ac: move num_sdp_interfaces into radeon_info	Marek Olšák	2019-09-18	4	-30/+17
	Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
*	ac: move PBB MAX_ALLOC_COUNT into radeon_info	Marek Olšák	2019-09-18	4	-62/+35
	Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
*	etnaviv: fix two-sided stencil	Jonathan Marek	2019-09-18	5	-30/+44
	* Set missing STENCIL_CONFIG_EXT2 bits
	* Swap stencil sides when rendering CCW

	Fixes following deqp tests (which were 99% failing): dEQP-GLES2.functional.fragment_ops.depth_stencil.*

	Note: deqp tests require --deqp-gl-config-name=rgba8888d24s8ms0

	Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
*	radv: fix loading 64-bit GS inputs	Samuel Pitoiset	2019-09-18	1	-0/+35
	We have to load two 32-bit integers and cast them correctly. This fixes crashes with gs-double-interpolator.vk_shader_test. Cc: 19.2 <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111734 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
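As a CPU-side analogy only (the driver builds shader IR, which is not shown here), assembling a 64-bit value from two 32-bit words and then bit-casting it could look like the sketch below. The function name and the assumption that the first word holds the low 32 bits are hypothetical, and the example assumes IEEE-754 doubles.

```c
#include <stdint.h>
#include <string.h>

/* Combine two 32-bit words into one 64-bit value and reinterpret the bits
 * as a double. Assumes 'lo' holds the low 32 bits and 'hi' the high 32 bits. */
static double load_f64_from_dwords(uint32_t lo, uint32_t hi)
{
   uint64_t bits = ((uint64_t)hi << 32) | lo;
   double value;

   /* Bit cast, not a numeric conversion: the 64 bits are copied unchanged. */
   memcpy(&value, &bits, sizeof(value));
   return value;
}
```

The point of the cast step is that a plain integer-to-float conversion would change the value, whereas the load needs the original bit pattern preserved.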
*	tu: Set up glsl types.	Bas Nieuwenhuizen	2019-09-18	1	-0/+3
	Addresses this assert: deqp-vk: ../mesa-freedreno-9999/src/compiler/glsl_types.cpp:1244: static const glsl_type *glsl_type::get_interface_instance(const glsl_struct_field *, unsigned int, enum glsl_interface_packing, bool, const char *): Assertion `glsl_type_users > 0' failed. running dEQP-VK.api.smoke.triangle. Fixes: 624789e3708 "compiler/glsl: handle case where we have multiple users for types" Reviewed-by: Lionel Landwerlin <[email protected]>
*	radv: fix writing depth/stencil clear values to image	Samuel Pitoiset	2019-09-18	1	-3/+4
	Use the fastest way only if both aspects are used. Oops. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111728 Fixes: 218ce34962c ("radv: add mipmap support for the clear depth/stencil values") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	i965: support AYUV/XYUV for external import only	Haihao Xiang	2019-09-18	1	-0/+2
	Fixes: 89785e2d56e7fa ("i965: add support for sampling from AYUV") Fixes: 7cab8d3661f243 ("i965: Add support for sampling from XYUV images") Cc: Vivek Kasireddy <[email protected]> Cc: Lionel Landwerlin <[email protected]> Signed-off-by: Haihao Xiang <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
*	panfrost: Allocate tiler and scratchpad BOs per-batch	Boris Brezillon	2019-09-18	4	-41/+68
	If we want to execute several batches in parallel they need to have their own tiler and scratchpad BOs. Let's move those objects to panfrost_batch and allocate them on a per-batch basis. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: Add FBO BOs to batch->bos earlier	Boris Brezillon	2019-09-18	4	-3/+17
	If we want the batch dependency tracking to work correctly we must make sure all BOs are added to the batch->bos set early enough. Adding FBO BOs when generating the fragment job is clearly too late. Add a panfrost_batch_add_fbo_bos helper and call it in the clear/draw path. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: Add the panfrost_batch_create_bo() helper	Boris Brezillon	2019-09-18	4	-25/+28
	This helper automates the panfrost_bo_create()+panfrost_batch_add_bo()+panfrost_bo_unreference() sequence that's done for all per-batch BOs. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
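The create, add, unreference sequence can be illustrated with a toy reference-counted object. This is a sketch under assumptions: every type and function below is a hypothetical stand-in written for this example (the real batch tracks its BOs differently), and only the ownership pattern is the point.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_BATCH_BOS 16

/* Toy stand-ins for a buffer object and a batch; not the driver's types. */
struct bo { int refcnt; size_t size; };
struct batch { struct bo *bos[MAX_BATCH_BOS]; int num_bos; };

/* Allocate a BO holding one reference owned by the caller. */
static struct bo *bo_create(size_t size)
{
   struct bo *bo = calloc(1, sizeof(*bo));
   if (!bo)
      return NULL;
   bo->refcnt = 1;
   bo->size = size;
   return bo;
}

/* Drop one reference and free the BO when the last one goes away. */
static void bo_unreference(struct bo *bo)
{
   if (bo && --bo->refcnt == 0)
      free(bo);
}

/* The batch takes its own reference on every BO it tracks. */
static void batch_add_bo(struct batch *batch, struct bo *bo)
{
   assert(batch->num_bos < MAX_BATCH_BOS);
   bo->refcnt++;
   batch->bos[batch->num_bos++] = bo;
}

/* Helper wrapping the three-step sequence: the returned BO stays alive
 * only because the batch holds the remaining reference. */
static struct bo *batch_create_bo(struct batch *batch, size_t size)
{
   struct bo *bo = bo_create(size);
   if (!bo)
      return NULL;
   batch_add_bo(batch, bo);
   bo_unreference(bo); /* drop the creation reference */
   return bo;
}

/* Release every BO the batch tracks. */
static void batch_release(struct batch *batch)
{
   for (int i = 0; i < batch->num_bos; i++)
      bo_unreference(batch->bos[i]);
   batch->num_bos = 0;
}
```

With this shape, callers never hold a reference of their own, so a per-batch BO cannot outlive its batch by accident.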
*	panfrost: Don't return imported/exported BOs to the cache	Boris Brezillon	2019-09-18	2	-0/+9
	We don't know who else is using the BO in that case, and thus shouldn't re-use it for something else. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: Add panfrost_bo_{alloc,free}()	Boris Brezillon	2019-09-18	1	-76/+68
	Thanks to that we avoid the recursive call into panfrost_bo_create() and we can get rid of panfrost_bo_release() by inlining the code in panfrost_bo_unreference(). Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: Stop using panfrost_bo_release() outside of pan_bo.c	Boris Brezillon	2019-09-18	4	-7/+8
	panfrost_bo_unreference() should be used instead. The only difference caused by this change is that the scratchpad, tiler_heap and tiler_dummy BOs are now returned to the cache instead of being freed when a context is destroyed. This is only a problem if we care about context isolation, which apparently is not the case since transient BOs are already returned to the per-FD cache (and all contexts share the same address space anyway, so enforcing context isolation is almost impossible). Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: Stop passing screen around for BO operations	Boris Brezillon	2019-09-18	7	-37/+37
	Store a screen pointer in panfrost_bo so we don't have to pass a screen object to all functions manipulating the BO. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: Don't check if BO is mmaped before calling panfrost_bo_mmap()	Boris Brezillon	2019-09-18	1	-5/+1
	panfrost_bo_mmap() already takes care of that. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: Stop exposing panfrost_bo_cache_{fetch,put}()	Boris Brezillon	2019-09-18	2	-8/+2
	They are not expected to be called directly; users should use panfrost_bo_{create,release}() instead. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: Move the BO API to its own header	Boris Brezillon	2019-09-18	16	-74/+112
	Right now, the BO API is spread over pan_{allocate,resource,screen}.h. Let's move all BO related definitions to a separate header file. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: s/PAN_ALLOCATE_/PAN_BO_/	Boris Brezillon	2019-09-18	7	-19/+19
	Change the prefix for BO allocation flags to make it consistent with the rest of the BO API. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: Move panfrost_bo_{reference,unreference}() to pan_bo.c	Boris Brezillon	2019-09-18	2	-19/+20
	This way we have all BO related functions placed in the same source file. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: Get rid of pan_drm.c	Boris Brezillon	2019-09-18	12	-444/+382
	pan_drm.c was only meaningful when we were supporting 2 kernel drivers (mali_kbase, and the drm one). Now that there's no kernel-driver abstraction we're better off moving those functions where they belong:

	* BO related functions in pan_bo.c
	* fence related functions + query_gpu_version() in pan_screen.c
	* submit related functions in pan_job.c

	While at it, we rename the functions according to the place they're being moved to.

	Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: Stop passing has_draws to panfrost_drm_submit_vs_fs_batch()	Boris Brezillon	2019-09-18	3	-5/+4
	has_draws can be inferred directly from the batch->last_job value, no need to pass it around. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: Kill a useless memset(0) in panfrost_create_context()	Boris Brezillon	2019-09-18	1	-1/+0
	ctx is allocated with rzalloc() which takes care of zero-ing the memory region. No need to call memset(0) on top. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: Add polygon_list to the batch BO set at allocation time	Boris Brezillon	2019-09-18	2	-4/+7
	That's what we do for other per-batch BOs, and we'll soon add a helper to automate this create_bo()+add_bo()+bo_unreference() sequence, so let's prepare the code to ease this transition. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: Add missing panfrost_batch_add_bo() calls	Boris Brezillon	2019-09-18	1	-1/+4
	Some BOs are used by batches but never explicitly added to the BO set. This is currently not a problem because we wait for the execution of a batch to be finished before releasing a BO, but we will soon relax this rule. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: Use the correct type for the bo_handle array	Boris Brezillon	2019-09-18	1	-1/+2
	The DRM driver expects an array of u32, so let's use the correct type, even if using an int works in practice because it's still a 32-bit integer. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	panfrost: Stop exposing internal panfrost_*_batch() functions	Boris Brezillon	2019-09-18	2	-14/+3
	panfrost_{create,free,get}_batch() are only called inside pan_job.c. Let's make them static. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
*	etnaviv: disable ARB_shadow	Christian Gmeiner	2019-09-18	1	-0/+2
	Looks like only HALT2 GPUs have support for it, but that is not yet implemented, so disable ARB_shadow for now. Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Jonathan Marek <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	Revert "gallium: remove PIPE_CAP_TEXTURE_SHADOW_MAP"	Christian Gmeiner	2019-09-18	4	-2/+9
	There are GPUs that do not support this feature. This reverts commit e871abe452ad40efcccb0bab6b88fc31d0551e29 Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
*	virgl: Remove wrong EAGAIN handling for drmIoctl	Lepton Wu	2019-09-18	1	-3/+3
	drmIoctl handles EAGAIN itself, and it always returns -1 on errors. Remove the wrong handling of its return value. Also, print a warning when it fails. v2: - use _debug_printf instead of fprintf (Gurchetan Singh) Signed-off-by: Lepton Wu <[email protected]> Reviewed-by: Eric Anholt <[email protected]> (v1)
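The return-value convention described above can be sketched with a retry wrapper in the style of libdrm's drmIoctl(), which loops while the call fails with EINTR or EAGAIN. To keep the example self-contained, the ioctl is passed in as a function pointer and replaced by a stub; retrying_ioctl, stub_ioctl, and stub_calls are hypothetical names introduced only for this illustration.

```c
#include <errno.h>

typedef int (*ioctl_fn)(int fd, unsigned long request, void *arg);

/* Retry while the call is interrupted or asked to try again. Any other
 * failure is reported as -1 with the reason left in errno, so a caller
 * comparing the return value against -EAGAIN would never match. */
static int retrying_ioctl(ioctl_fn do_ioctl, int fd,
                          unsigned long request, void *arg)
{
   int ret;

   do {
      ret = do_ioctl(fd, request, arg);
   } while (ret == -1 && (errno == EINTR || errno == EAGAIN));

   return ret;
}

/* Hypothetical stub: fails twice with EAGAIN, then fails with EINVAL. */
static int stub_calls;

static int stub_ioctl(int fd, unsigned long request, void *arg)
{
   (void)fd;
   (void)request;
   (void)arg;
   errno = (++stub_calls <= 2) ? EAGAIN : EINVAL;
   return -1;
}
```

A caller of such a wrapper only needs to test for a negative return and then consult errno, which is the kind of handling the commit moves to.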
*	iris: Skip allocating a null surface when there are 0 color regions.	Kenneth Graunke	2019-09-17	2	-2/+9
	The compiler now sets the "Null Render Target" bit in the RT write extended message descriptor, causing it to write to an implicit null surface without us needing to set one up in the binding table. Together with the last patch, this improves performance in Car Chase on an Icelake 8x8 (locked to 700MHz) by 0.0445526% +/- 0.0132736% (n=832). Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	intel/compiler: Set "Null Render Target" ex_desc bit on Gen11	Kenneth Graunke	2019-09-17	1	-0/+3
	When there are no color regions (i.e. a depth only pass), we can set the "Null Render Target" bit in the Gen11 RT write extended message descriptor to indicate that it should behave as if it's writing to a null render target, without the need for a binding table entry. This lets drivers avoid setting up that null RT binding table entry, but more importantly means the HW doesn't actually have to bother looking up the surface state. Together with the next patch, this improves performance in Car Chase on an Icelake 8x8 (locked to 700MHz) by 0.0445526% +/- 0.0132736% (n=832). Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	anv: enable VK_KHR_shader_float_controls and SPV_KHR_float_controls	Samuel Iglesias Gonsálvez	2019-09-17	3	-0/+33
	This adds support for VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FLOAT_CONTROLS_PROPERTIES_KHR and enables the Vulkan and SPIR-V extensions. Also, notice that this includes the updates applied to the VkPhysicalDeviceFloatControlsPropertiesKHR structure in the extension VK_KHR_shader_float_controls v4 and Vulkan 1.1.116. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Signed-off-by: Andres Gomez <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>