Commit message | Author | Age | Files | Lines
* amd/addrlib: add gfx10 support | Marek Olšák | 2019-07-03 | 19 | -40/+12176
| | | | Acked-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: make emit_streamout_output externally accessible | Nicolai Hähnle | 2019-07-03 | 2 | -7/+12
| | | | Acked-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: pass the context to query destroy functions | Nicolai Hähnle | 2019-07-03 | 3 | -11/+10
| | | | | | We'll need this in the future. Acked-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: make si_restore_qbo_state externally available | Nicolai Hähnle | 2019-07-03 | 3 | -14/+14
| | | | Acked-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: make get_primitive_id externally visible | Nicolai Hähnle | 2019-07-03 | 2 | -5/+7
| | | | Acked-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: make si_llvm_export_vs externally available | Nicolai Hähnle | 2019-07-03 | 2 | -12/+17
| | | | Acked-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: various si_translate_*format functions only apply to pre-gfx10 | Nicolai Hähnle | 2019-07-03 | 1 | -0/+6
| | | | Acked-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: use a fragment shader blit instead of DB->CB copy for ZS CPU mappings | Marek Olšák | 2019-07-03 | 4 | -154/+52
| | | | | | | | | This mainly removes and simplifies code that is no longer needed. There were some issues with the DB->CB stencil copy on gfx10, so let's just use a fragment shader blit for all ZS mappings. It's more reliable. Tested-by: Dieter Nützel <[email protected]>
* gallium/u_blitter: implement copying from ZS to color and vice versa | Marek Olšák | 2019-07-03 | 5 | -35/+314
| | | | | | | | | This is for drivers that can't map depth and stencil and need to blit them to a color texture for CPU access. This is also useful for drivers using separate depth and stencil. Tested-by: Dieter Nützel <[email protected]>
* gallium/util: rewrite depth-stencil blit shaders | Marek Olšák | 2019-07-03 | 3 | -183/+46
| | | | | | | | | - merge all 3 functions (Z, S, ZS) - don't write the color output - read the value from texel.x, then write it to position.z or stencil.y (don't use the value from texel.y or texel.z) Tested-by: Dieter Nützel <[email protected]>
* st/mesa: accelerate glCopyPixels(STENCIL) | Marek Olšák | 2019-07-03 | 1 | -20/+38
| | | | Tested-by: Dieter Nützel
* glsl/standalone: meson test for --dump-builder | Yevhenii Kolesnikov | 2019-07-03 | 2 | -0/+23
| | | | | | | | | | Added meson test for standalone compiler with --dump-builder option on builtin texture* functions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107767 Signed-off-by: Yevhenii Kolesnikov <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* glsl/standalone: exit on unsupported texture functions | Sergii Romantsov | 2019-07-03 | 1 | -1/+14
| | | | | | | | | | | glsl/standalone with --dump-builder will exit when unsupported texture functions are encountered. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107767 Signed-off-by: Sergii Romantsov <[email protected]> Signed-off-by: Yevhenii Kolesnikov <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* radeonsi: make gl_SampleMaskIn = 0x1 when MSAA is disabled | Pierre-Eric Pelloux-Prayer | 2019-07-03 | 1 | -1/+2
| | | | | | | | | | | | | | | gl_SampleMaskIn is 1 when R_028BE0_PA_SC_AA_CONFIG is 0, so this commit reworks the conditions controlling this register. Before it was set if the sctx->framebuffer had a sample count > 1. Now we still require this condition, but we also need either: - GL_MULTISAMPLE to be enabled - to be executing an operation that doesn't depend on GL state using u_blitter. This fixes the arb_sample_shading/sample_mask piglit tests on radeonsi. Signed-off-by: Marek Olšák <[email protected]>
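The reworked condition described above can be sketched in C. This is only an illustration under stated assumptions: the struct, its field names, and the function name are hypothetical and are not taken from the radeonsi source. The three inputs stand for the framebuffer sample count, the GL_MULTISAMPLE state, and whether a GL-state-independent u_blitter operation is in progress.

```c
#include <stdbool.h>

/* Hypothetical inputs; a real driver would read these from its context. */
struct msaa_inputs {
    unsigned fb_nr_samples;   /* sample count of the bound framebuffer */
    bool multisample_enabled; /* GL_MULTISAMPLE state */
    bool in_blitter_op;       /* u_blitter operation independent of GL state */
};

/* Returns true when the AA config register should be programmed for MSAA.
 * The sample-count check is still required; one of the two extra
 * conditions must hold in addition. */
static bool msaa_config_enabled(const struct msaa_inputs *in)
{
    return in->fb_nr_samples > 1 &&
           (in->multisample_enabled || in->in_blitter_op);
}
```

With a 4-sample framebuffer and GL_MULTISAMPLE disabled outside of a blitter operation, this sketch returns false, which corresponds to the case where gl_SampleMaskIn should read as 0x1.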
* gallium/u_blitter: enable MSAA when blitting to MSAA surfaces | Brian Paul | 2019-07-03 | 1 | -22/+34
| | | | | | | | | | | | If we're doing a Z -> Z MSAA blit (for example) we need to enable msaa rasterization when drawing the quads so that we can properly write the per-sample values. This fixes a number of Piglit ext_framebuffer_multisample blit tests such as ext_framebuffer_multisample/no-color 2 depth combined with the VMware driver. Signed-off-by: Marek Olšák <[email protected]>
* virgl: Clear the valid buffer range when possible | Alexandros Frantzis | 2019-07-03 | 2 | -0/+24
| | | | | | | | | | | | | | | | | | If we are discarding the whole resource, we don't care about previous contents, and the resource storage is now unused, either because we have created new resource storage, or because we have waited for the existing resource storage to become unused, or because the transfer is unsynchronized. In the last two cases this commit marks the storage as uninitialized, but only if the resource is not host writable (in which case we can't clear the valid range, since that would result in missed readbacks in future transfers). In the first case, when the whole resource discard involves a reallocation, the reallocation and subsequent rebinding already update the valid buffer range appropriately. Signed-off-by: Alexandros Frantzis <[email protected]> Reviewed-by: Chia-I Wu <[email protected]>
* swr/swr: Enable ARB_viewport_array | Jan Zielinski | 2019-07-03 | 8 | -49/+61
| | | | | | | | | | The rasterizer core supported ARB_viewport_array, but the swr layer connecting core to Gallium state tracker only allowed one viewport. We add support for multiple viewports to swr layer. Reviewed-by: Alok Hota <[email protected]>
* radv: Support VK_EXT_queue_family_foreign. | Bas Nieuwenhuizen | 2019-07-03 | 4 | -3/+8
| | | | | | | | | Basically same as external for now. Reviewed-by: Samuel Pitoiset <[email protected]> Only case we might need to handle differently in the near future is Raven's case of displayable DCC which is not renderable. But we don't support that yet.
* radv: Fix interactions between variable descriptor count and inline uniform blocks. | Bas Nieuwenhuizen | 2019-07-03 | 1 | -1/+5
| | | | | | | Fixes: d7e6541cc72 "radv: Only allocate supplied number of descriptors when variable." Reviewed-by: Samuel Pitoiset <[email protected]>
* winsys/amdgpu: Make KMS handles valid for original DRM file descriptor | Michel Dänzer | 2019-07-03 | 6 | -11/+24
| | | | | | | | | | | | | | | | | | Getting a DMA-buf fd and converting that to a handle using our duplicate of that file descriptor (getting at which requires passing a radeon_winsys pointer to the buffer_get_handle hook) makes sure of this, since duplicated file descriptors reference the same file description and therefore the same GEM handle namespace. This is necessary because libdrm_amdgpu may use a different DRM file descriptor with a separate handle namespace internally, e.g. because it always reuses any existing amdgpu_device_handle for the same device. amdgpu_bo_export returns a handle which is valid for that internal file descriptor. Bugzilla: https://bugs.freedesktop.org/110903 Reviewed-by: Marek Olšák <[email protected]> Tested-by: Pierre-Eric Pelloux-Prayer <[email protected]>
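The property this change relies on, that duplicated file descriptors refer to the same open file description, can be observed with plain POSIX calls. The sketch below is not amdgpu code and does not touch GEM handles; it only shows shared per-description state (the file offset) on a temporary file. The function name is made up for this example.

```c
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Returns true if a seek done through a dup()ed descriptor is visible
 * through the original descriptor, i.e. both share one file description. */
static bool dup_shares_file_description(void)
{
    FILE *tmp = tmpfile();
    if (!tmp)
        return false;

    int fd = fileno(tmp);
    int fd2 = dup(fd);
    if (fd2 < 0) {
        fclose(tmp);
        return false;
    }

    /* Move the offset via the duplicate, then query it via the original. */
    bool shared = lseek(fd2, 42, SEEK_SET) == 42 &&
                  lseek(fd, 0, SEEK_CUR) == 42;

    close(fd2);
    fclose(tmp);
    return shared;
}
```

A descriptor obtained by opening the same device node a second time would get its own file description instead, which is the situation the commit message describes for the descriptor libdrm_amdgpu may use internally.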
* winsys/amdgpu: Add amdgpu_screen_winsys | Michel Dänzer | 2019-07-03 | 7 | -142/+183
| | | | | | | | | | | | | | | | | It extends pipe_screen / radeon_winsys and references amdgpu_winsys. Multiple amdgpu_screen_winsys instances may reference the same amdgpu_winsys instance, which corresponds to an amdgpu_device_handle. The purpose of amdgpu_screen_winsys is to keep a duplicate of the DRM file descriptor passed to amdgpu_winsys_create, which will be needed in the next change. v2: * Add comment in amdgpu_winsys_unref explaining why it always returns true (Marek Olšák) Reviewed-by: Marek Olšák <[email protected]> Tested-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* winsys/amdgpu: Use amdgpu_winsys helper instead of open-coded casts | Michel Dänzer | 2019-07-03 | 3 | -8/+8
| | | | | | | | Cleanup to prevent breakage with the next change, no functional change intended in this one. Reviewed-by: Marek Olšák <[email protected]> Tested-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* intel: fix wrong format usage | Juan A. Suarez Romero | 2019-07-03 | 1 | -1/+1
| | | | | | | | | | | Do not use the view format when filling the surface state. Fixes dEQP-VK.image.texel_view_compatible.compute.extended.texture.* Fixes: fb1350c76f1 ("intel: Add and use helpers for level0 extent") Reviewed-by: Nanley Chery <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* radv: only allocate a 32-bit value for the TC-compat range metadata | Samuel Pitoiset | 2019-07-03 | 1 | -2/+2
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove unused code in radv_update_tc_compat_zrange_metadata() | Samuel Pitoiset | 2019-07-03 | 1 | -2/+0
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add radv_get_depth_pipeline() helper | Samuel Pitoiset | 2019-07-03 | 1 | -25/+41
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* iris: assert isl_surf_init success in resource_from_handle | Mike Blumenkrantz | 2019-07-02 | 1 | -14/+15
| | | | | | | | this can fail unexpectedly due to bugs, so it's good to provide feedback when this occurs Reviewed-by: Sagar Ghuge <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* anv: Advertise a more accurate minTexelBufferOffsetAlignment | Jason Ekstrand | 2019-07-02 | 1 | -1/+4
| | | | Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: Implement VK_EXT_texel_buffer_alignment | Jason Ekstrand | 2019-07-02 | 2 | -0/+38
| | | | Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* vulkan: Update the XML and headers to 1.1.113 | Jason Ekstrand | 2019-07-02 | 2 | -12/+102
| | | | Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* spirv: Ignore ArrayStride in OpPtrAccessChain for Workgroup | Caio Marcelo de Oliveira Filho | 2019-07-02 | 1 | -4/+6
| | | | | | | | | | | | | | | | | From OpPtrAccessChain description in the SPIR-V spec (1.4 rev 1): For objects in the Uniform, StorageBuffer, or PushConstant storage classes, the element’s address or location is calculated using a stride, which will be the Base-type’s Array Stride when the Base type is decorated with ArrayStride. For all other objects, the implementation will calculate the element’s address or location. For non-CL shaders the driver should lay out the Workgroup storage class, so override any explicitly set ArrayStride in the shader. This currently fixes only the lower_workgroup_access_to_offsets case, which is used by anv. Reviewed-by: Juan A. Suarez <[email protected]>
* nouveau: handle new CAPS | Karol Herbst | 2019-07-02 | 2 | -0/+26
| | | | | Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* intel/fs: Use nir_lower_interpolation on gen11+ | Jason Ekstrand | 2019-07-02 | 4 | -48/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On gen11, they removed the PLN instruction so we have to emit a pile of MAD to emulate it. We may as well do that in NIR so we can optimize and later schedule it. Shader-db results on Ice Lake: total instructions in shared programs: 17145644 -> 16556440 (-3.44%) instructions in affected programs: 11507454 -> 10918250 (-5.12%) helped: 35763 HURT: 42085 helped stats (abs) min: 1 max: 140 x̄: 19.09 x̃: 18 helped stats (rel) min: 0.04% max: 37.93% x̄: 15.40% x̃: 14.49% HURT stats (abs) min: 1 max: 248 x̄: 2.22 x̃: 2 HURT stats (rel) min: 0.05% max: 50.00% x̄: 5.00% x̃: 2.47% 95% mean confidence interval for instructions value: -7.67 -7.47 95% mean confidence interval for instructions %-change: -4.46% -4.29% Instructions are helped. total loops in shared programs: 4370 -> 4370 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total cycles in shared programs: 360624645 -> 368220857 (2.11%) cycles in affected programs: 269631244 -> 277227456 (2.82%) helped: 15583 HURT: 65874 helped stats (abs) min: 1 max: 28561 x̄: 78.45 x̃: 32 helped stats (rel) min: <.01% max: 67.81% x̄: 5.38% x̃: 2.44% HURT stats (abs) min: 1 max: 238638 x̄: 133.87 x̃: 20 HURT stats (rel) min: <.01% max: 306.25% x̄: 5.81% x̃: 3.97% 95% mean confidence interval for cycles value: 67.42 119.09 95% mean confidence interval for cycles %-change: 3.61% 3.73% Cycles are HURT. total spills in shared programs: 8943 -> 8981 (0.42%) spills in affected programs: 1925 -> 1963 (1.97%) helped: 44 HURT: 14 total fills in shared programs: 21815 -> 21925 (0.50%) fills in affected programs: 3511 -> 3621 (3.13%) helped: 41 HURT: 18 LOST: 70 GAINED: 14 Reviewed-by: Matt Turner <[email protected]>
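As a rough illustration of what replacing a single plane-evaluation instruction with multiply-adds looks like, the sketch below evaluates a*x + b*y + c with two chained MADs. The coefficient layout and every name here are assumptions made for the example; the actual PLN operand layout and the code emitted by nir_lower_interpolation are not shown in this log.

```c
/* One multiply-add: a * b + c. */
static inline float mad(float a, float b, float c)
{
    return a * b + c;
}

/* Hypothetical plane evaluation: a and b are the per-axis deltas of an
 * attribute, c is its base value, (x, y) are the interpolation
 * coordinates. The result is built from two chained MADs. */
static float plane_eval(float a, float b, float c, float x, float y)
{
    float t = mad(b, y, c); /* b*y + c */
    return mad(a, x, t);    /* a*x + (b*y + c) */
}
```

Because each attribute channel now costs two ordinary ALU operations instead of one dedicated instruction, expressing them at the NIR level lets the usual optimization and scheduling passes work on them, which is the motivation given in the commit message.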
* intel/fs: Implement nir_intrinsic_load_fs_input_interp_deltas | Jason Ekstrand | 2019-07-02 | 2 | -1/+14
| | | | Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Actually implement the load_barycentric intrinsics | Jason Ekstrand | 2019-07-02 | 2 | -12/+93
| | | | | | | | | | If they never get used, dead code should clean them up. Also, we rework the at_offset and at_sample intrinsics so they return a proper vec2 instead of returning things in PLN layout. Fortunately, copy-prop is pretty good at cleaning this up and it doesn't result in any actual extra MOVs. Reviewed-by: Matt Turner <[email protected]>
* nir: add pass to lower load_interpolated_input | Rob Clark | 2019-07-02 | 6 | -0/+193
| | | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* panfrost: Pass referenced BOs to the SUBMIT ioctls | Boris Brezillon | 2019-07-02 | 1 | -19/+27
| | | | | | | | | | | Instead of manually adding the BOs from the various SLAB pools plus the one backing the color FB, we insert them in the BO set attached to the job and let panfrost_drm_submit_job() pass all BOs from this set to the SUBMIT ioctl. This means we are now passing all referenced BOs and let the scheduler wait on referenced BO fences if needed. Signed-off-by: Boris Brezillon <[email protected]>
* panfrost: Make SLAB pool creation rely on BO helpers | Boris Brezillon | 2019-07-02 | 7 | -110/+56
| | | | | | | There's no point duplicating the code, and it will help us simplify the bo_handles[] filling logic in panfrost_drm_submit_job(). Signed-off-by: Boris Brezillon <[email protected]>
* panfrost: Add the panfrost_drm_{create,release}_bo() helpers | Boris Brezillon | 2019-07-02 | 3 | -29/+70
| | | | | | | To avoid the panfrost_memory <-> panfrost_bo dance done in panfrost_resource_create_bo() and panfrost_bo_unreference(). Signed-off-by: Boris Brezillon <[email protected]>
* panfrost: Move the mmap BO logic out of panfrost_drm_import_bo() | Boris Brezillon | 2019-07-02 | 1 | -21/+30
| | | | | | | So we can re-use it for the panfrost_drm_create_bo() function we are about to introduce. Signed-off-by: Boris Brezillon <[email protected]>
* panfrost: Avoid passing winsys handles to import/export BO funcs | Boris Brezillon | 2019-07-02 | 3 | -19/+20
| | | | | | | | Let's keep a clear split between ioctl wrappers and the rest of the driver. All the import BO function needs is a dmabuf FD and the screen object, and the export one should only take care of generating a dmabuf FD out of a BO object. Winsys handle manipulation should stay in the resource.c file. Signed-off-by: Boris Brezillon <[email protected]>
* panfrost: Move BO meta-data out of panfrost_bo | Boris Brezillon | 2019-07-02 | 6 | -94/+98
| | | | | | | | | | That's what most (all?) implementations seem to do, and my understanding is that a BO is just a bunch of memory that can be used for anything GPU related, not only texture/FB resources. Let's move this metadata to panfrost_resource so we can use panfrost_bo for all kinds of memory allocation and make BO allocation more consistent. Signed-off-by: Boris Brezillon <[email protected]>
* panfrost: Stop exposing internal panfrost_drm_*() functions | Boris Brezillon | 2019-07-02 | 2 | -7/+2
| | | | | | | panfrost_drm_submit_job() and panfrost_fence_create() are not used outside of pan_drm.c. Signed-off-by: Boris Brezillon <[email protected]>
* panfrost: Get rid of the "free imported BO" logic | Boris Brezillon | 2019-07-02 | 4 | -37/+8
| | | | | | | | bo->imported was never set to true which means this path was never taken. Moreover, panfrost_drm_free_imported_bo() is missing the munmap() call which seems wrong because the import BO function calls mmap(). Let's just kill this function along with the ->imported field. Signed-off-by: Boris Brezillon <[email protected]>
* panfrost: Get rid of the panfrost_driver abstraction leftovers | Boris Brezillon | 2019-07-02 | 3 | -35/+0
| | | | | | | Commit 5f81669d880b ("panfrost: Remove the panfrost_driver abstraction") left a few things behind, remove them now. Signed-off-by: Boris Brezillon <[email protected]>
* panfrost: Move scanout res creation out of panfrost_resource_create() | Boris Brezillon | 2019-07-02 | 1 | -32/+41
| | | | | | Which improves readability and helps us avoid a memory leak. Signed-off-by: Boris Brezillon <[email protected]>
* panfrost: Add the sampled texture BO to the job | Boris Brezillon | 2019-07-02 | 1 | -0/+4
| | | | | | | | | Otherwise we get random use-after-{free,unmap} errors. Signed-off-by: Boris Brezillon <[email protected]> --- Changes in v2: - Move the panfrost_job_add_bo() call out of the loop
* radv: enable DCC for layers on GFX8 | Samuel Pitoiset | 2019-07-02 | 1 | -9/+23
| | | | | | | | | | | | It's currently only enabled if dcc_slice_size is equal to dcc_slice_fast_clear_size because the driver assumes that portions of multiple layers are contiguous but it's not always true. Still not supported on GFX9. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not enable DCC for mipmapped arrays because performance is worse | Samuel Pitoiset | 2019-07-02 | 1 | -0/+4
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: implement clearing DCC layers on GFX8 | Samuel Pitoiset | 2019-07-02 | 2 | -4/+7
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>