summaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers
Commit message (Collapse)AuthorAgeFilesLines
* i965: use gl_shader_program_data::spirvAlejandro Piñeiro2018-06-211-1/+1
| | | | Reviewed-by: Timothy Arceri <[email protected]>
* mesa: remove struct gl_extensions::ATI_separate_stencilEmil Velikov2018-06-212-2/+0
| | | | | | | | | | | | Virtually every driver that supports ATI_separate_stencil also supports EXT_stencil_two_side. Use the latter boolean for both extension. With that in mind we can drop the explicit true from the drivers and the nasty comment in compute_version(). Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* meson: fix i965/anv/isl genX static lib namesEric Engestrom2018-06-181-1/+1
| | | | | | | | | | | Shouldn't make any functional difference, just that `liblibanv_gen90.a` will now be called `libanv_gen90.a`. Fixes: 3218056e0eb375eeda470 "meson: Build i965 and dri stack" Fixes: d1992255bb29054fa5176 "meson: Add build Intel "anv" vulkan driver" Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* drivers/dri/i965: add missing #includeRoss Burton2018-06-121-0/+2
| | | | | | brw_bufmgr.h uses time_t without include time.h, so the build fails under musl. Reviewed-by: Eric Engestrom <[email protected]>
* i965: fix resource leakEric Engestrom2018-06-111-1/+3
| | | | | | | | | | v2: intel_miptree_release() already takes care of the planes, no need to hand-code the loop (Lionel) Coverity ID: 1436909 Fixes: 3352f2d746d3959b22ca4 "i965: Create multiple miptrees for planar YUV images" Reviewed-by: Lionel Landwerlin <[email protected]> Signed-off-by: Eric Engestrom <[email protected]>
* i965/screen: Sanity check that all formats we advertise are useableJason Ekstrand2018-06-071-4/+20
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* i965/screen: Use RGBA non-sRGB formats for imagesJason Ekstrand2018-06-071-0/+9
| | | | | | | | | | Not all of the MESA_FORMAT and ISL_FORMAT helpers we use can properly handle RGBX formats. Also, we don't want to make decisions based on those in the first place because we can't render to RGBA and we use the non-sRGB version to determine whether or not to allow CCS_E. Cc: [email protected] Reviewed-by: Lionel Landwerlin <[email protected]>
* i965/screen: Return false for unsupported formats in query_modifiersJason Ekstrand2018-06-071-2/+14
| | | | | Cc: [email protected] Reviewed-by: Lionel Landwerlin <[email protected]>
* i965/screen: Refactor query_dma_buf_formatsJason Ekstrand2018-06-071-12/+13
| | | | | | | | | This reworks it to work like query_dma_buf_modifiers and, in particular, makes it more flexible so that we can disallow a non-static set of formats. Cc: [email protected] Reviewed-by: Lionel Landwerlin <[email protected]>
* i965: Require softpin support for Cannonlake and later.Kenneth Graunke2018-06-061-0/+10
| | | | | | | | | | | This isn't strictly necessary, but anyone running Cannonlake will already have Kernel 4.5 or later, so there's no reason to support the relocation model on Gen10+. This will let us avoid dealing with them for new features. Reviewed-by: Scott D Phillips <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Allocate VMA in userspace for full-PPGTT systems.Kenneth Graunke2018-06-061-1/+1
| | | | | | | | | | | | | | | | This patch enables soft-pinning of all buffers, allowing us to skip relocation processing entirely. All systems with full PPGTT and > 4GB of VMA should gain these benefits. This should be most Gen8+. Unfortunately, this excludes a few systems: - Cherryview (only has 32-bit addressing, despite 48-bit pointers) - Broadwell with a 32-bit kernel - Anybody running pre-4.5 kernel. We may enable it for Cherryview in the future, but it would require some tweaks to the memory zone. Reviewed-by: Jordan Justen <[email protected]>
* intel/blorp: Emit VF cache invalidates for 48-bit bugs with softpin.Kenneth Graunke2018-06-061-0/+29
| | | | | | | | | | | | | | | | | | commit 92f01fc5f914fd500497d0c3aed75f3ac8dc054d made i965 start emitting VF cache invalidates when the high bits of vertex buffers change. But we were not tracking vertex buffers emitted by BLORP. This was papered over by a mistake where I emitted VF cache invalidates all the time, which Chris fixed in commit 3ac5fbadfd8644d30fce9ff267cb811ad157996a. This patch adds a new hook which allows the driver to track addresses and request a VF cache invalidate as appropriate. v2: Make the driver do the PIPE_CONTROL so it can apply workarounds (caught by Jason Ekstrand). Rebase on anv bug fix. v3: Don't screw up the boolean (caught by Jason Ekstrand). Fixes: 92f01fc5f914 ("i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin.") Reviewed-by: Jason Ekstrand <[email protected]>
* dri: add missing 16bits formats mappingLionel Landwerlin2018-06-071-0/+16
| | | | | | | | | | | | | | | | | | i965 advertises the 16-bit R and RG formats through eglQueryDmaBufFormatsEXT but falls over when a client tries to use or asks more information about such a format because driImageFormatToGLFormat returns MESA_FORMAT_NONE. Found by Eero Tamminen. v2: Add G16R16 formats (Lionel) v3: Fix G16R16 mapping to mesa format (Jason) Signed-off-by: Lionel Landwerlin <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106642 Reviewed-by: Plamena Manolova <[email protected]> (v2) Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Prepare batchbuffer module for softpin support.Kenneth Graunke2018-06-042-3/+39
| | | | | | | | | | | | | | | | | | | | If EXEC_OBJECT_PINNED is set, we don't want to emit any relocations. We simply want to add the BO to the validation list, and possibly mark it as writeable. The new brw_use_pinned_bo() interface does just that. To avoid having to make every caller consider both the relocation and softpin cases, we make emit_reloc() call brw_use_pinned_bo() when given a softpinned buffer. We also can't grow buffers that are softpinned - the mechanism places a larger BO at the same offset as the original, which requires moving BOs around in the VMA. With softpin, we only allocate enough VMA for the original size of the BO. v2: Assert that BOs aren't pinned if the kernel says we should move them (feedback from Chris Wilson) Reviewed-by: Scott D Phillips <[email protected]>
* i965: Add virtual memory allocator infrastructure to brw_bufmgr.Kenneth Graunke2018-06-042-1/+286
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This introduces a new fast virtual memory allocator integrated with our BO cache bucketing. For larger objects, it falls back to the simple free-list allocator (util_vma). This puts the allocators in place but doesn't enable softpin yet. v2: (feedback from Chris Wilson) - Check (bo->kflags & EXEC_OBJECT_PINNED) instead of a global flag - Avoid vma_free(0ull) on the err_free path. - Only enable if the kernel says we have full PPGTT support - Make bucketing allocators more resistant to failing to grow arrays (feedback from Scott Phillips) - Don't use node after popping it from the list. - Avoid undefined behavior in canonicalization by reusing new helper - Comment updates (feedback from myself) - Avoid __vma_alloc vs. vma_alloc by making a zero_high_bits helper to return a non-canonical address with the high bits zeroed. - Don't shadow loop variable 'i' when destroying things (ugly; worked) v3: - Replace zero_high_bits with new common gen_48b_address helper. Reviewed-by: Scott D Phillips <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Disable internal CCS for shadows of multi-sampled windowsJason Ekstrand2018-06-041-1/+10
| | | | | | | | | | | | | | If window system supports Y-tiling but not CCS_E, we currently create an internal CCS for any window system buffers and then resolve right before handing it off to X or Wayland. In the case of the single-sampled shadow of a multi-sampled window system buffer, this is pointless because the only thing we do with it is use it as a MSAA resolve target so we do MSAA resolve -> CCS resolve -> hand to the window system. Instead, just disable CCS for the shadow and then the MSAA resolve will write uncompressed directly into it. If the window system supports CCS_E, we will still use CCS_E, we just won't do internal CCS. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/miptree: Rename a parameter to create_for_dri_imageJason Ekstrand2018-06-042-4/+4
| | | | | | | Instead of having it be a general "is this a winsys image" boolean, make it more specific to the actual purpose. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix batch-last mode to properly swap BOs.Kenneth Graunke2018-06-041-0/+5
| | | | | | | | | | | | | | | On pre-4.13 kernels, which don't support I915_EXEC_BATCH_FIRST, we move the validation list entry to the end...but incorrectly left the exec_bo array alone, causing a mismatch where exec_bos[0] no longer corresponded with validation_list[0] (and similarly for the last entry). One example of resulting breakage is that we'd update bo->gtt_offset based on the wrong buffer. This wreaked total havoc when trying to use softpin, and likely caused unnecessary relocations in the normal case. Fixes: 29ba502a4e28471f67e4e904ae503157087efd20 (i965: Use I915_EXEC_BATCH_FIRST when available.) Reviewed-by: Chris Wilson <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* i965: Add ARB_fragment_shader_interlock support.Plamena Manolova2018-06-011-0/+1
| | | | | | | | | Adds suppport for ARB_fragment_shader_interlock. We achieve the interlock and fragment ordering by issuing a memory fence via sendc. Signed-off-by: Plamena Manolova <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* mesa: expose ARB_tessellation_shader in the compatibility profileMarek Olšák2018-05-291-1/+2
| | | | | | | Gallium drivers don't expose this yet due to: "st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY" Reviewed-by: Timothy Arceri <[email protected]>
* mesa: expose AMD_vertex_shader_layer in the compatibility profileMarek Olšák2018-05-291-1/+2
| | | | | | | | | | This requires layered FBOs from GL 3.2. Gallium drivers don't expose this yet due to: "st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY" Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* mesa: expose ARB_gpu_shader5 in the compatibility profileMarek Olšák2018-05-291-2/+4
| | | | | | | Gallium drivers don't expose this yet due to: "st/mesa: use PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITY" Reviewed-by: Timothy Arceri <[email protected]>
* i965: Only emit VF cache invalidations when the high bits changesChris Wilson2018-05-291-1/+1
| | | | | | | | | | | | Commit 92f01fc5f914 ("i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin.") tried to only emit the VF invalidate if the high bits changed, but it accidentally always set need_invalidate to true; causing it to emit unconditionally emit the pipe control before every primitive. Fixes: 92f01fc5f914 ("i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106708 Reviewed-by: Kenneth Graunke <[email protected]>
* dri: replace two-way switch case with a table lookupEric Engestrom2018-05-291-74/+84
| | | | | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Marek Olšák <[email protected]> --- v2: rebased on top of 432df741e0b85c021da0 "dri_util: Add R10G10B10{A,X}2 translation between DRI and mesa_format."
* dri: fix error value returned by driGLFormatToImageFormat()Eric Engestrom2018-05-293-3/+3
| | | | | | | | | | | | 0 is not a valid value for the __DRI_IMAGE_FORMAT_* enum. It is, however, the value of MESA_FORMAT_NONE, which two of the callers (i915 & i965) checked for. The other callers (that check for errors, ie. st/dri) already check for __DRI_IMAGE_FORMAT_NONE. Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* i965: Revert recent tiled memcpy changes.Kenneth Graunke2018-05-265-186/+9
| | | | | | | | | This reverts commit 79fe00efb474b3f3f0ba4c88826ff67c53a02aef. This reverts commit f5e8b13f78a085bc95a1c0895e4a38ff6b87b375. This reverts commit d21c086d819d78fb3f6abcbb14aa492970f442aa. They broke the Android build and I'd rather not leave it broken for the long holiday weekend.
* i965/miptree: Use cpu tiling/detiling when mappingScott D Phillips2018-05-251-4/+98
| | | | | | | | | | | | | | | | | | | | | | | | | Rename the (un)map_gtt functions to (un)map_map (map by returning a map) and add new functions (un)map_tiled_memcpy that return a shadow buffer populated with the intel_tiled_memcpy functions. Tiling/detiling with the cpu will be the only way to handle Yf/Ys tiling, when support is added for those formats. v2: Compute extents properly in the x|y-rounded-down case (Chris Wilson) v3: Add units to parameter names of tile_extents (Nanley Chery) Use _mesa_align_malloc for the shadow copy (Nanley) Continue using gtt maps on gen4 (Nanley) v4: Use streaming_load_memcpy when detiling v5: (edited by Ken) Move map_tiled_memcpy above map_movntdqa, so it takes precedence. Add intel_miptree_access_raw, needed after rebasing on commit b499b85b0f2cc0c82b7c9af91502c2814fdc8e67. Reviewed-by: Chris Wilson <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i915: Fix streaming loads for intel_tiled_memcpyChris Wilson2018-05-251-5/+5
| | | | | | | | We stream from a tiled and aligned source into an unaligned user buffer, so we need to use _mm_storeu_si128. Fixes: d21c086d819d78fb3f6abcbb14aa492970f442aa (i965/tiled_memcpy: inline movntdqa loads in tiled_to_linear) Reviewed-by: Kenneth Graunke <[email protected]>
* intel/blorp: Support blits and clears on surfaces with offsetsJason Ekstrand2018-05-251-0/+2
| | | | | | | | | | | | | For certain EGLImage cases, we represent a single slice or LOD of an image with a byte offset to a tile and X/Y intratile offsets to the given slice. Most of i965 is fine with this but it breaks blorp. This is a terrible way to represent slices of a surface in EGL and we should stop some day but that's a very scary and thorny path. This gets blorp to start working with those surfaces and fixes some dEQP EGL test bugs. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106629 Cc: [email protected] Reviewed-by: Kenneth Graunke <[email protected]>
* i965/tiled_memcpy: inline movntdqa loads in tiled_to_linearScott D Phillips2018-05-254-5/+88
| | | | | | | | | | | | | | | | | | | | The reference for MOVNTDQA says: For WC memory type, the nontemporal hint may be implemented by loading a temporary internal buffer with the equivalent of an aligned cache line without filling this data to the cache. [...] Subsequent MOVNTDQA reads to unread portions of the WC cache line will receive data from the temporary internal buffer if data is available. This hidden cache line sized temporary buffer can improve the read performance from wc maps. v2: Add mfence at start of tiled_to_linear for streaming loads (Chris) Reviewed-by: Chris Wilson <[email protected]> Reviewed-by: Matt Turner <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* i965: enable OES_texture_view for gen8+Tapani Pälli2018-05-241-1/+2
| | | | | Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Use intel_bufferobj_buffer() wrapper in image surface state setup.Francisco Jerez2018-05-231-3/+5
| | | | | | | | | | | | | | | Instead of directly using intel_obj->buffer. Among other things intel_bufferobj_buffer() will update intel_buffer_object:: gpu_active_start/end, which are used by glBufferSubData() to decide which path to take. Fixes a failure in the Piglit ARB_shader_image_load_store-host-mem-barrier Buffer Update/WaW tests, which could be reproduced with a non-standard glGetTexSubImage implementation (see bug report). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105351 Reported-by: Nanley Chery <[email protected]> Cc: [email protected] Reviewed-by: Nanley Chery <[email protected]>
* i965: Handle non-zero texture buffer offsets in buffer object range calculation.Francisco Jerez2018-05-231-1/+3
| | | | | | | | | | Otherwise the specified surface state will allow the GPU to access memory up to BufferOffset bytes past the end of the buffer. Found by inspection. v2: Protect against out-of-range BufferOffset (Nanley). Cc: [email protected] Reviewed-by: Nanley Chery <[email protected]>
* i965: Move buffer texture size calculation into a common helper function.Francisco Jerez2018-05-231-23/+32
| | | | | | | | | | | | | The buffer texture size calculations (should be easy enough, right?) are repeated in three different places, each of them subtly broken in a different way. E.g. the image load/store path was never fixed to clamp to MaxTextureBufferSize, and none of them are taking into account the buffer offset correctly. It's easier to fix it all in one place. Cc: [email protected] Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106481 Reviewed-by: Nanley Chery <[email protected]>
* i965: add {X,A}BGR2101010 to 'intel_image_formats'Miguel Casas2018-05-231-0/+6
| | | | | | | | | This patch adds {X,A}BGR2101010 entries to the list of supported 'intel_image_formats'. Bug: https://crbug.com/776093 Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* dri_util: Add R10G10B10{A,X}2 translation between DRI and mesa_format.Miguel Casas2018-05-231-0/+8
| | | | | | | | | Add R10G10B10{A,X}2 translation between mesa_format and DRI format to driGLFormatToImageFormat() and driImageFormatToGLFormat(). Bug: https://crbug.com/776093 Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* i965: Remove ring switching entirelyJason Ekstrand2018-05-2211-105/+61
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/miptree: Move the access_raw call to the individual map functionsJason Ekstrand2018-05-221-3/+13
| | | | | | | | | | | The only function that doesn't need to call access_raw is map_blit. If it takes the blitter path, it will happen as part of intel_miptree_copy. If map_blit takes the blorp path, brw_blorp_copy_miptrees will handle doing whatever resolves are needed. This should save us resolves in quite a few cases and will probably help performance a bit. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove support for the BLT ringJason Ekstrand2018-05-221-9/+3
| | | | | | | We still support the blitter on gen4-5 but it's on the same ring as 3D. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/miptree: Use blorp for blit maps on gen6+Jason Ekstrand2018-05-221-11/+25
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/miptree: Use blorp for validation tex copies on gen6+Jason Ekstrand2018-05-221-11/+29
| | | | | | | | It's faster than the blitter and can handle things like stencil properly so it doesn't require software fallbacks. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Delete the blitter path for CopyTexSubImageJason Ekstrand2018-05-221-58/+0
| | | | | | | | The blorp path (called first) can do anything the blitter path can do so it's just dead code. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Don't fall back to the blitter in BlitFramebufferJason Ekstrand2018-05-221-8/+0
| | | | | | | | | On gen4-5, we try the blitter before we even try blorp. On newer platforms, blorp can do everything the blitter can so there's no point in even having the blitter fall-back path. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove some unused includes of intel_blit.hJason Ekstrand2018-05-224-4/+0
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/blit: Delete intel_emit_linear_blitJason Ekstrand2018-05-222-62/+0
| | | | | | | This function is no longer used. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Use meta for pixel ops on gen6+Jason Ekstrand2018-05-223-4/+10
| | | | | | | | | | | | | Using meta for anything is fairly aweful and definitely has more CPU overhead. However, it also uses the 3D pipe and is therefore likely faster in terms of GPU time than the blitter. Also, the blitter code has so many early returns that it's probably not buying us that much. We may as well just use meta all the time instead of working over-time to find the tiny case where we can use the blitter. We keep gen4-5 using the old blit paths to avoid perturbing old hardware too much. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin.Kenneth Graunke2018-05-222-0/+69
| | | | | | | | | | | | | | | | | | | | We'd like to start using soft-pin to assign BO addresses up front, and never move them again. Our previous plan for dealing with 48-bit VF cache bugs was to relocate vertex buffers to the low 4GB, so we'd never have addresses that alias in the low 32 bits. But that requires moving buffers dynamically. This patch tracks the last seen BO address for each vertex/index buffer, and emits a VF cache invalidate if the high bits change. (Ideally, we won't hit this case very often.) This should work for the soft-pin case, but unfortunately won't work in the relocation case, as we don't actually know the addresses. So, we have to use both methods. v2: Mention that the cache uses a <VertexBufferIndex, Address> tuple more explicitly (suggested by Scott). Mention "single batch" too (suggested by Chris). Reviewed-by: Scott D Phillips <[email protected]>
* i965: Introduce a "memory zone" concept on BO allocation.Kenneth Graunke2018-05-2216-38/+107
| | | | | | | | | | | | | | | | | We're planning to start managing the PPGTT in userspace in the near future, rather than relying on the kernel to assign addresses. While most buffers can go anywhere, some need to be restricted to within 4GB of a base address. This commit adds a "memory zone" parameter to the BO allocation functions, which lets the caller specify which base address the BO will be associated with, or BRW_MEMZONE_OTHER for the full 48-bit VMA. Eventually, I hope to create a 4GB memory zone corresponding to each state base address. Reviewed-by: Scott D Phillips <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: isl: Move the MCS gen7+ assertion into ISLNanley Chery2018-05-181-2/+0
| | | | | | | This is useful for every user of ISL. Drop the comment along the way to match similar functions in ISL. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965/miptree: Remove format assertion in alloc_auxNanley Chery2018-05-181-5/+0
| | | | | | | | | intel_miptree_supports_{ccs,mcs,hiz} ensures the format is valid for the color or depth miptree before the miptree is assigned an aux_usage. alloc_aux switches on the aux_usage so don't assert that the format is valid. Reviewed-by: Topi Pohjolainen <[email protected]>