mesa.git

	Commit message	Author	Age	Files	Lines
*	gallium/u_blitter: enable MSAA when blitting to MSAA surfaces	Brian Paul	2019-07-03	1	-22/+34
\| \| \| \| \| \| \| \| \| \| \| \|	If we're doing a Z -> Z MSAA blit (for example) we need to enable msaa rasterization when drawing the quads so that we can properly write the per-sample values. This fixes a number of Piglit ext_framebuffer_multisample blit tests such as ext_framebuffer_multisample/no-color 2 depth combined with the VMware driver. Signed-off-by: Marek Olšák <[email protected]>
*	virgl: Clear the valid buffer range when possible	Alexandros Frantzis	2019-07-03	2	-0/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If we are discarding the whole resource, we don't care about previous contents, and the resource storage is now unused, either because we have created new resource storage, or because we have waited for the existing resource storage to become unused, or because the transfer is unsynchronized. In the last two cases this commit marks the storage as uninitialized, but only if the resource is not host writable (in which case we can't clear the valid range, since that would result in missed readbacks in future transfers). In the first case, when the whole resource discard involves a reallocation, the reallocation and subsequent rebinding already update the valid buffer range appropriately. Signed-off-by: Alexandros Frantzis <[email protected]> Reviewed-by: Chia-I Wu <[email protected]>
*	swr/swr: Enable ARB_viewport_array	Jan Zielinski	2019-07-03	7	-48/+60
	The rasterizer core supported ARB_viewport_array, but the swr layer connecting the core to the Gallium state tracker only allowed one viewport. We add support for multiple viewports to the swr layer. Reviewed-by: Alok Hota <[email protected]>
*	radv: Support VK_EXT_queue_family_foreign.	Bas Nieuwenhuizen	2019-07-03	3	-3/+7
\| \| \| \| \| \| \| \| \|	Basically same as external for now. Reviewed-by: Samuel Pitoiset <[email protected]> Only case we might need to handle differently in the near future is Raven's case of displayable DCC which is not renderable. But we don't support that yet.
*	radv: Fix interactions between variable descriptor count and inline uniform blocks.	Bas Nieuwenhuizen	2019-07-03	1	-1/+5
	Fixes: d7e6541cc72 "radv: Only allocate supplied number of descriptors when variable." Reviewed-by: Samuel Pitoiset <[email protected]>
*	winsys/amdgpu: Make KMS handles valid for original DRM file descriptor	Michel Dänzer	2019-07-03	6	-11/+24
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Getting a DMA-buf fd and converting that to a handle using our duplicate of that file descriptor (getting at which requires passing a radeon_winsys pointer to the buffer_get_handle hook) makes sure of this, since duplicated file descriptors reference the same file description and therefore the same GEM handle namespace. This is necessary because libdrm_amdgpu may use a different DRM file descriptor with a separate handle namespace internally, e.g. because it always reuses any existing amdgpu_device_handle for the same device. amdgpu_bo_export returns a handle which is valid for that internal file descriptor. Bugzilla: https://bugs.freedesktop.org/110903 Reviewed-by: Marek Olšák <[email protected]> Tested-by: Pierre-Eric Pelloux-Prayer <[email protected]>
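	The commit above relies on the fact that duplicated file descriptors share one open file description, while a descriptor obtained by a separate open does not. The following is a minimal Python sketch of that POSIX behaviour only. It uses the file offset of an ordinary temporary file as a stand-in for per-description state; it does not touch DRM, GEM handles, or libdrm_amdgpu, and the function names are hypothetical.

```python
import os
import tempfile


def offsets_after_dup():
    """Write through one descriptor, then read the offset via it and its dup."""
    fd, path = tempfile.mkstemp()
    try:
        dup_fd = os.dup(fd)  # new descriptor, same open file description
        try:
            os.write(fd, b"mesa")  # advances the shared offset by 4 bytes
            return (os.lseek(fd, 0, os.SEEK_CUR),
                    os.lseek(dup_fd, 0, os.SEEK_CUR))
        finally:
            os.close(dup_fd)
    finally:
        os.close(fd)
        os.unlink(path)


def offsets_after_reopen():
    """Same experiment, but the second descriptor comes from a separate open."""
    fd, path = tempfile.mkstemp()
    try:
        other_fd = os.open(path, os.O_RDONLY)  # independent file description
        try:
            os.write(fd, b"mesa")
            return (os.lseek(fd, 0, os.SEEK_CUR),
                    os.lseek(other_fd, 0, os.SEEK_CUR))
        finally:
            os.close(other_fd)
    finally:
        os.close(fd)
        os.unlink(path)
```

	In this model the dup'ed pair sees the same state and the separately opened pair does not, which is the analogue of a handle being valid only within the namespace of the file description it was created on.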
*	winsys/amdgpu: Add amdgpu_screen_winsys	Michel Dänzer	2019-07-03	7	-142/+183
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	It extends pipe_screen / radeon_winsys and references amdgpu_winsys. Multiple amdgpu_screen_winsys instances may reference the same amdgpu_winsys instance, which corresponds to an amdgpu_device_handle. The purpose of amdgpu_screen_winsys is to keep a duplicate of the DRM file descriptor passed to amdgpu_winsys_create, which will be needed in the next change. v2: * Add comment in amdgpu_winsys_unref explaining why it always returns true (Marek Olšák) Reviewed-by: Marek Olšák <[email protected]> Tested-by: Pierre-Eric Pelloux-Prayer <[email protected]>
*	winsys/amdgpu: Use amdgpu_winsys helper instead of open-coded casts	Michel Dänzer	2019-07-03	3	-8/+8
\| \| \| \| \| \| \| \|	Cleanup to prevent breakage with the next change, no functional change intended in this one. Reviewed-by: Marek Olšák <[email protected]> Tested-by: Pierre-Eric Pelloux-Prayer <[email protected]>
*	intel: fix wrong format usage	Juan A. Suarez Romero	2019-07-03	1	-1/+1
\| \| \| \| \| \| \| \| \| \| \|	Do not use the view format when filling the surface state. Fixes dEQP-VK.image.texel_view_compatible.compute.extended.texture.* Fixes: fb1350c76f1 ("intel: Add and use helpers for level0 extent") Reviewed-by: Nanley Chery <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
*	radv: only allocate a 32-bit value for the TC-compat range metadata	Samuel Pitoiset	2019-07-03	1	-2/+2
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radv: remove unused code in radv_update_tc_compat_zrange_metadata()	Samuel Pitoiset	2019-07-03	1	-2/+0
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radv: add radv_get_depth_pipeline() helper	Samuel Pitoiset	2019-07-03	1	-25/+41
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	iris: assert isl_surf_init success in resource_from_handle	Mike Blumenkrantz	2019-07-02	1	-14/+15
\| \| \| \| \| \| \| \|	this can fail unexpectedly due to bugs, so it's good to provide feedback when this occurs Reviewed-by: Sagar Ghuge <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
*	anv: Advertise a more accurate minTexelBufferOffsetAlignment	Jason Ekstrand	2019-07-02	1	-1/+4
\| \| \| \|	Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	anv: Implement VK_EXT_texel_buffer_alignment	Jason Ekstrand	2019-07-02	2	-0/+38
\| \| \| \|	Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	vulkan: Update the XML and headers to 1.1.113	Jason Ekstrand	2019-07-02	1	-8/+63
\| \| \| \|	Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
*	spirv: Ignore ArrayStride in OpPtrAccessChain for Workgroup	Caio Marcelo de Oliveira Filho	2019-07-02	1	-4/+6
	From the OpPtrAccessChain description in the SPIR-V spec (1.4 rev 1): "For objects in the Uniform, StorageBuffer, or PushConstant storage classes, the element’s address or location is calculated using a stride, which will be the Base-type’s Array Stride when the Base type is decorated with ArrayStride. For all other objects, the implementation will calculate the element’s address or location." For non-CL shaders the driver should lay out the Workgroup storage class, so override any explicitly set ArrayStride in the shader. This currently fixes only the lower_workgroup_access_to_offsets case, which is used by anv. Reviewed-by: Juan A. Suarez <[email protected]>
*	nouveau: handle new CAPS	Karol Herbst	2019-07-02	2	-0/+26
\| \| \| \| \|	Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
*	intel/fs: Use nir_lower_interpolation on gen11+	Jason Ekstrand	2019-07-02	4	-48/+3
	On gen11, they removed the PLN instruction so we have to emit a pile of MAD to emulate it. We may as well do that in NIR so we can optimize and later schedule it.
	Shader-db results on Ice Lake:
	total instructions in shared programs: 17145644 -> 16556440 (-3.44%)
	instructions in affected programs: 11507454 -> 10918250 (-5.12%)
	helped: 35763
	HURT: 42085
	helped stats (abs) min: 1 max: 140 x̄: 19.09 x̃: 18
	helped stats (rel) min: 0.04% max: 37.93% x̄: 15.40% x̃: 14.49%
	HURT stats (abs) min: 1 max: 248 x̄: 2.22 x̃: 2
	HURT stats (rel) min: 0.05% max: 50.00% x̄: 5.00% x̃: 2.47%
	95% mean confidence interval for instructions value: -7.67 -7.47
	95% mean confidence interval for instructions %-change: -4.46% -4.29%
	Instructions are helped.
	total loops in shared programs: 4370 -> 4370 (0.00%)
	loops in affected programs: 0 -> 0
	helped: 0
	HURT: 0
	total cycles in shared programs: 360624645 -> 368220857 (2.11%)
	cycles in affected programs: 269631244 -> 277227456 (2.82%)
	helped: 15583
	HURT: 65874
	helped stats (abs) min: 1 max: 28561 x̄: 78.45 x̃: 32
	helped stats (rel) min: <.01% max: 67.81% x̄: 5.38% x̃: 2.44%
	HURT stats (abs) min: 1 max: 238638 x̄: 133.87 x̃: 20
	HURT stats (rel) min: <.01% max: 306.25% x̄: 5.81% x̃: 3.97%
	95% mean confidence interval for cycles value: 67.42 119.09
	95% mean confidence interval for cycles %-change: 3.61% 3.73%
	Cycles are HURT.
	total spills in shared programs: 8943 -> 8981 (0.42%)
	spills in affected programs: 1925 -> 1963 (1.97%)
	helped: 44
	HURT: 14
	total fills in shared programs: 21815 -> 21925 (0.50%)
	fills in affected programs: 3511 -> 3621 (3.13%)
	helped: 41
	HURT: 18
	LOST: 70
	GAINED: 14
	Reviewed-by: Matt Turner <[email protected]>
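	The percentages in the shader-db summary above are plain relative changes between the before and after totals. A small Python sketch of that arithmetic, using only totals quoted in the commit message, follows; the helper name percent_change is hypothetical and this is not shader-db's own code.

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# (before, after) totals copied from the commit message above.
TOTALS = {
    "instructions (shared)": (17145644, 16556440),
    "instructions (affected)": (11507454, 10918250),
    "cycles (shared)": (360624645, 368220857),
    "cycles (affected)": (269631244, 277227456),
    "spills (shared)": (8943, 8981),
    "fills (shared)": (21815, 21925),
}


def summarize(totals):
    """Return each metric's percent change rounded to two decimals."""
    return {name: round(percent_change(before, after), 2)
            for name, (before, after) in totals.items()}
```

	A negative value means the count went down (fewer instructions), a positive one that it went up (more cycles, spills and fills), which matches the "helped" and "HURT" verdicts in the summary.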
*	intel/fs: Implement nir_intrinsic_load_fs_input_interp_deltas	Jason Ekstrand	2019-07-02	2	-1/+14
\| \| \| \|	Reviewed-by: Matt Turner <[email protected]>
*	intel/fs: Actually implement the load_barycentric intrinsics	Jason Ekstrand	2019-07-02	2	-12/+93
\| \| \| \| \| \| \| \| \| \|	If they never get used, dead code should clean them up. Also, we rework the at_offset and at_sample intrinsics so they return a proper vec2 instead of returning things in PLN layout. Fortunately, copy-prop is pretty good at cleaning this up and it doesn't result in any actual extra MOVs. Reviewed-by: Matt Turner <[email protected]>
*	nir: add pass to lower load_interpolated_input	Rob Clark	2019-07-02	6	-0/+193
\| \| \| \| \| \|	Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	panfrost: Pass referenced BOs to the SUBMIT ioctls	Boris Brezillon	2019-07-02	1	-19/+27
\| \| \| \| \| \| \| \| \| \| \|	Instead of manually adding the BOs from the various SLAB pools plus the one backing the color FB, we insert them in the BO set attached to the job and let panfrost_drm_submit_job() pass all BOs from this set to the SUBMIT ioctl. This means we are now passing all referenced BOs and let the scheduler wait on referenced BO fences if needed. Signed-off-by: Boris Brezillon <[email protected]>
*	panfrost: Make SLAB pool creation rely on BO helpers	Boris Brezillon	2019-07-02	7	-110/+56
\| \| \| \| \| \| \|	There's no point duplicating the code, and it will help us simplify the bo_handles[] filling logic in panfrost_drm_submit_job(). Signed-off-by: Boris Brezillon <[email protected]>
*	panfrost: Add the panfrost_drm_{create,release}_bo() helpers	Boris Brezillon	2019-07-02	3	-29/+70
\| \| \| \| \| \| \|	To avoid the panfrost_memory <-> panfrost_bo dance done in panfrost_resource_create_bo() and panfrost_bo_unreference(). Signed-off-by: Boris Brezillon <[email protected]>
*	panfrost: Move the mmap BO logic out of panfrost_drm_import_bo()	Boris Brezillon	2019-07-02	1	-21/+30
\| \| \| \| \| \| \|	So we can re-use it for the panfrost_drm_create_bo() function we are about to introduce. Signed-off-by: Boris Brezillon <[email protected]>
*	panfrost: Avoid passing winsys handles to import/export BO funcs	Boris Brezillon	2019-07-02	3	-19/+20
	Let's keep a clear split between ioctl wrappers and the rest of the driver. All the import BO function needs is a dmabuf FD and the screen object, and the export one should only take care of generating a dmabuf FD out of a BO object. Winsys handle manipulation should stay in the resource.c file. Signed-off-by: Boris Brezillon <[email protected]>
*	panfrost: Move BO meta-data out of panfrost_bo	Boris Brezillon	2019-07-02	6	-94/+98
	That's what most (all?) implementations seem to do, and my understanding is that a BO is just a bunch of memory that can be used for anything GPU related, not only texture/FB resources. Let's move that meta-data into panfrost_resource so we can use panfrost_bo for all kinds of memory allocation and make BO allocation more consistent. Signed-off-by: Boris Brezillon <[email protected]>
*	panfrost: Stop exposing internal panfrost_drm_*() functions	Boris Brezillon	2019-07-02	2	-7/+2
\| \| \| \| \| \| \|	panfrost_drm_submit_job() and panfrost_fence_create() are not used outside of pan_drm.c. Signed-off-by: Boris Brezillon <[email protected]>
*	panfrost: Get rid of the "free imported BO" logic	Boris Brezillon	2019-07-02	4	-37/+8
	bo->imported was never set to true which means this path was never taken. Moreover, panfrost_drm_free_imported_bo() is missing the munmap() call, which seems wrong because the import BO function calls mmap(). Let's just kill this function along with the ->imported field. Signed-off-by: Boris Brezillon <[email protected]>
*	panfrost: Get rid of the panfrost_driver abstraction leftovers	Boris Brezillon	2019-07-02	3	-35/+0
\| \| \| \| \| \| \|	Commit 5f81669d880b ("panfrost: Remove the panfrost_driver abstraction") left a few things behind, remove them now. Signed-off-by: Boris Brezillon <[email protected]>
*	panfrost: Move scanout res creation out of panfrost_resource_create()	Boris Brezillon	2019-07-02	1	-32/+41
	Which improves readability and helps us avoid a memory leak. Signed-off-by: Boris Brezillon <[email protected]>
*	panfrost: Add the sampled texture BO to the job	Boris Brezillon	2019-07-02	1	-0/+4
\| \| \| \| \| \| \| \| \|	Otherwise we get random use-after-{free,unmap} errors. Signed-off-by: Boris Brezillon <[email protected]> --- Changes in v2: - Move the panfrost_job_add_bo() call out of the loop
*	radv: enable DCC for layers on GFX8	Samuel Pitoiset	2019-07-02	1	-9/+23
\| \| \| \| \| \| \| \| \| \| \| \|	It's currently only enabled if dcc_slice_size is equal to dcc_slice_fast_clear_size because the driver assumes that portions of multiple layers are contiguous but it's not always true. Still not supported on GFX9. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radv: do not enable DCC for mipmapped arrays because performance is worse	Samuel Pitoiset	2019-07-02	1	-0/+4
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radv: implement clearing DCC layers on GFX8	Samuel Pitoiset	2019-07-02	2	-4/+7
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radv: merge radv_dcc_clear_level() into radv_clear_dcc()	Samuel Pitoiset	2019-07-02	1	-30/+22
\| \| \| \| \| \| \| \|	This will help for clearing DCC arrays because we need to know the subresource range. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	radv: add support for decompressing DCC layers with compute	Samuel Pitoiset	2019-07-02	1	-51/+53
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac: compute the DCC fast clear size per slice on GFX8	Samuel Pitoiset	2019-07-02	2	-0/+28
\| \| \| \| \|	Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	ac: compute the size of one DCC slice on GFX8	Samuel Pitoiset	2019-07-02	2	-0/+7
\| \| \| \| \| \| \| \|	Addrlib doesn't provide this info. Because DCC is linear, at least on GFX8, it's easy to compute the size of one slice. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
*	iris: Defer closing and freeing VMA until buffers are idle.	Kenneth Graunke	2019-07-02	1	-10/+51
\| \| \| \| \| \| \| \| \| \| \| \|	There will unfortunately be circumstances where we cannot re-use a virtual memory address until it's no longer active on the GPU. To facilitate this, we instead move BOs to a "dead" list, and defer closing them and returning their VMA until they are idle. We periodically sweep these away in cleanup_bo_cache, which triggers every time a new object's refcount hits zero. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Tested-by: Jordan Justen <[email protected]>
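	The deferred-free scheme described above can be modelled in a few lines. This is a toy Python sketch under stated assumptions, not the iris C code: the class names, the busy flag, and the free_vma list are all hypothetical, and "busy" stands in for whatever check the driver uses to tell whether the GPU is still using a buffer.

```python
class Bo:
    """Toy buffer object: a name, a virtual address, and a GPU-busy flag."""

    def __init__(self, name, vma):
        self.name = name
        self.vma = vma
        self.busy = True  # assume the GPU may still be using it


class BoCache:
    def __init__(self):
        self.dead = []      # refcount hit zero, but possibly still active
        self.free_vma = []  # addresses that are safe to hand out again

    def cleanup(self):
        """Sweep the dead list, releasing the VMA of every idle BO."""
        still_busy = []
        for bo in self.dead:
            if bo.busy:
                still_busy.append(bo)
            else:
                self.free_vma.append(bo.vma)
        self.dead = still_busy

    def unreference_last(self, bo):
        """Called when a BO's refcount hits zero: defer, then sweep."""
        self.dead.append(bo)
        self.cleanup()
```

	The point of the design is that an address never returns to the free list while its buffer is still marked busy; the sweep piggybacks on later frees rather than needing its own timer.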
*	iris: Add an explicit alignment parameter to iris_bo_alloc_tiled().	Kenneth Graunke	2019-07-02	3	-12/+19
\| \| \| \| \| \| \| \| \| \| \| \|	In the future, some images will need to be aligned to a larger value than 4096. Most buffers, however, don't have any such requirement, so for now we only add the parameter to iris_bo_alloc_tiled() and leave the others with the simpler interface. v2: Fix missing alignment in vma_alloc, caught by Caio! Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Tested-by: Jordan Justen <[email protected]>
*	v3d: do not flush jobs that are synced with 'Wait for transform feedback'	Iago Toral Quiroga	2019-07-02	5	-20/+61
\| \| \| \| \| \| \| \| \| \| \| \| \|	Generally, we achieve this by skipping the flush on calls to v3d_flush_jobs_writing_resource() when we detect that the resource is written in the current job from a transform feedback write. The exception to this is the case where the caller is about to map the resource, in which case we need to flush immediately since we can only emit 'Wait for transform feedback' commands on rendering jobs. We add a parameter to the function so the caller can identify that scenario. Reviewed-by: Eric Anholt <[email protected]>
*	v3d: emit 'Wait for transform feedback' commands when needed	Iago Toral Quiroga	2019-07-02	1	-0/+120
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	The hardware can flush transform feedback writes before reads in the same job by inserting this command. This patch detects when the rendering state for the current draw call reads resources that had been previously written by transform feedback in the same job and inserts the 'Wait for transform feedback' command before emitting the new draw. v2 (Eric): - this was intended to look at job->tf_write_prscs for TF jobs. - clear job->tf_write_prscs after we emit the TF flush. - can skip flushes for fragment shader reads from TF. v3 (Eric): - all resources in job->tf_write_prscs are resources written by TF so we don't need to check if they are bound to PIPE_BIND_STREAM_OUTPUT. - documented optimization opportunity for geometry stages. Reviewed-by: Eric Anholt <[email protected]>
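	The detection logic described above (track what transform feedback wrote in the job, and insert a wait before a draw that reads any of it) can be sketched as follows. This is a simplified Python model, not the v3d C code: the Job class, its methods, and the command strings are hypothetical, only the field name tf_write_prscs comes from the commit message, and the fragment-shader optimization mentioned in v2 is left out.

```python
class Job:
    """Toy rendering job that records an ordered list of commands."""

    def __init__(self):
        self.tf_write_prscs = set()  # resources written by TF in this job
        self.commands = []

    def record_tf_write(self, resource):
        """Remember that transform feedback wrote this resource."""
        self.tf_write_prscs.add(resource)

    def emit_draw(self, read_resources):
        """Emit a draw, preceded by a TF wait if it reads TF-written data."""
        if self.tf_write_prscs & set(read_resources):
            self.commands.append("WAIT_FOR_TRANSFORM_FEEDBACK")
            # Pending writes are now synced, so stop tracking them.
            self.tf_write_prscs.clear()
        self.commands.append("DRAW")
```

	Clearing the set after the wait mirrors the v2 note in the commit message, so that later draws in the same job do not emit redundant waits.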
*	v3d: keep track of resources written by transform feedback	Iago Toral Quiroga	2019-07-02	3	-2/+15
\| \| \| \| \| \| \| \| \| \| \| \| \|	The hardware provides a feature to sync reads from previous transform feedback writes in the same job so if we use this mechanism we no longer have to flush the job. In order to identify this scenario we need a mechanism to identify resources that are written by transform feedback. v2: use _mesa_pointer_set_create (Eric) Reviewed-by: Eric Anholt <[email protected]>
*	st/dri: fix typo in format table for GR1616 format	Mike Blumenkrantz	2019-07-01	1	-1/+1
\| \| \| \| \| \| \|	the dri image format here should match the fourcc format Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	st/dri: pass dri2_format_mapping directly to dri2_create_image_from_winsys	Mike Blumenkrantz	2019-07-01	1	-4/+5
\| \| \| \| \| \| \|	this makes the entire struct available for use here Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	mesa/st: simplify format usage in st_bind_egl_image	Mike Blumenkrantz	2019-07-01	1	-15/+13
	the formats handled in the switch statement will always return an unknown mesa format, so process them directly and leave the default case for other/unknown formats. No functional changes. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
*	iris: Use MI_COPY_MEM_MEM for tiny resource_copy_region calls.	Kenneth Graunke	2019-07-01	1	-0/+27
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	If our resource_copy_region size is a small number of DWords, then instead of firing up BLORP, we can simply use MI_COPY_MEM_MEM (after a CS stall). We also try and select the optimal batch. Improves performance in Shadow of Mordor on Low settings at 1920x1080 on Skylake GT4e by 0.689096% +/- 0.473968% (n=4). It tries to copy 4 bytes of data to a buffer which was most recently used as a writable compute shader SSBO. Previously we were switching from compute to the render pipeline, then firing up all of blorp_buffer_copy...for 4 bytes. I arbitrarily decided to support 4/8/12/16 bytes. Jason thinks this is about the right threshold where it's cheaper to use MI_COPY_MEM_MEM.
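	The size threshold described above amounts to a simple dispatch on the copy size. The Python sketch below illustrates only that decision. The function names are hypothetical, the real driver may apply further conditions that the commit message does not list, and the per-DWord command count assumes that one MI_COPY_MEM_MEM moves a single DWord.

```python
# Sizes (in bytes) the commit message says are handled by the fast path.
TINY_COPY_SIZES = (4, 8, 12, 16)


def choose_copy_path(size_bytes):
    """Pick the copy mechanism for a resource_copy_region of this size."""
    if size_bytes in TINY_COPY_SIZES:
        return "MI_COPY_MEM_MEM"
    return "BLORP"


def copy_commands_needed(size_bytes):
    """Number of DWord copies for a tiny copy (assumes 4 bytes per command)."""
    if size_bytes not in TINY_COPY_SIZES:
        raise ValueError("not a tiny copy")
    return size_bytes // 4
```

	For the 4-byte copy mentioned in the commit message this selects the fast path with a single command; anything larger or not a multiple of four falls back to the general path.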
*	radv: Only allocate supplied number of descriptors when variable.	Bas Nieuwenhuizen	2019-07-01	1	-1/+7
\| \| \| \| \| \|	Fixes: b5e04e9217b "radv: Support allocating variable size descriptor sets." Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111019 Reviewed-by: Samuel Pitoiset <[email protected]>