path: root/src
Commit message | Author | Age | Files | Lines
* i965: enable OES_texture_view for gen8+ | Tapani Pälli | 2018-05-24 | 1 | -1/+2
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: changes to expose OES_texture_view extension | Tapani Pälli | 2018-05-24 | 6 | -6/+32
  The functionality is already covered by ARB_texture_view; the patch also adds the missing 'gles guard' for enums (added in f1563e6392).
  Tested via the arb_texture_view.*_gles3 tests and an individual app using texture views with ETC2.
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* radv: call nir_lower_io_to_temporaries for VS, GS, TES and FS | Samuel Pitoiset | 2018-05-24 | 1 | -0/+10
  Do not lower FS inputs because this moves all load_var instructions to the beginning of shaders and because interp_var_at_sample (and friends) seem broken. That might eventually be enabled later on if we really want to preload all FS inputs at the beginning.
  Polaris10:
  Totals from affected shaders:
  SGPRS: 54072 -> 54264 (0.36 %)
  VGPRS: 38580 -> 38124 (-1.18 %)
  Spilled SGPRs: 652 -> 652 (0.00 %)
  Spilled VGPRs: 0 -> 0 (0.00 %)
  Code Size: 2128116 -> 2127380 (-0.03 %) bytes
  Max Waves: 8048 -> 8086 (0.47 %)
  Vega10:
  Totals from affected shaders:
  SGPRS: 52616 -> 52656 (0.08 %)
  VGPRS: 37536 -> 37116 (-1.12 %)
  Spilled SGPRs: 828 -> 828 (0.00 %)
  Code Size: 2043756 -> 2042672 (-0.05 %) bytes
  Max Waves: 9176 -> 9254 (0.85 %)
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: call nir_split_var_copies() before nir_lower_var_copies() | Samuel Pitoiset | 2018-05-24 | 1 | -0/+3
  This does nothing special currently because we don't create any copy_var instructions, but it is needed for the next patch.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* i965: Use intel_bufferobj_buffer() wrapper in image surface state setup. | Francisco Jerez | 2018-05-23 | 1 | -3/+5
  Instead of directly using intel_obj->buffer. Among other things intel_bufferobj_buffer() will update intel_buffer_object::gpu_active_start/end, which are used by glBufferSubData() to decide which path to take.
  Fixes a failure in the Piglit ARB_shader_image_load_store-host-mem-barrier Buffer Update/WaW tests, which could be reproduced with a non-standard glGetTexSubImage implementation (see bug report).
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105351
  Reported-by: Nanley Chery <[email protected]>
  Cc: [email protected]
  Reviewed-by: Nanley Chery <[email protected]>
* i965: Handle non-zero texture buffer offsets in buffer object range calculation. | Francisco Jerez | 2018-05-23 | 1 | -1/+3
  Otherwise the specified surface state will allow the GPU to access memory up to BufferOffset bytes past the end of the buffer. Found by inspection.
  v2: Protect against out-of-range BufferOffset (Nanley).
  Cc: [email protected]
  Reviewed-by: Nanley Chery <[email protected]>
* i965: Move buffer texture size calculation into a common helper function. | Francisco Jerez | 2018-05-23 | 1 | -23/+32
  The buffer texture size calculations (should be easy enough, right?) are repeated in three different places, each of them subtly broken in a different way. E.g. the image load/store path was never fixed to clamp to MaxTextureBufferSize, and none of them takes the buffer offset into account correctly. It's easier to fix it all in one place.
  Cc: [email protected]
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106481
  Reviewed-by: Nanley Chery <[email protected]>
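  The calculation described in the two commits above can be sketched roughly as follows. This is a minimal illustration, not the helper the patch adds: the function name, parameter names and the "0 means rest of the buffer" convention are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not Mesa's actual function): number of texels a
 * buffer texture may expose.
 *   bo_size    - size of the backing buffer object in bytes
 *   offset     - BufferOffset of the bound range in bytes
 *   range_size - requested range in bytes, 0 meaning "rest of the buffer"
 *                (an assumed convention for this sketch)
 *   texel_size - bytes per texel, must be non-zero
 *   max_texels - implementation limit (MaxTextureBufferSize)
 */
static uint32_t
buffer_texture_texels(uint64_t bo_size, uint64_t offset, uint64_t range_size,
                      uint32_t texel_size, uint32_t max_texels)
{
   assert(texel_size > 0);

   /* Guard against an out-of-range offset before subtracting. */
   if (offset >= bo_size)
      return 0;

   /* Never expose bytes past the end of the buffer object. */
   uint64_t avail = bo_size - offset;
   uint64_t bytes = (range_size != 0 && range_size < avail) ? range_size : avail;

   /* Convert to whole texels and clamp to the implementation limit. */
   uint64_t texels = bytes / texel_size;
   return texels < max_texels ? (uint32_t)texels : max_texels;
}
```

  Subtracting the offset before dividing is what keeps the surface from extending BufferOffset bytes past the end of the buffer.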
* Revert "mesa: simplify _mesa_is_image_unit_valid for buffers" | Francisco Jerez | 2018-05-23 | 1 | -13/+12
  This reverts commit c0ed52f6146c7e24e1275451773bd47c1eda3145.
  It was preventing the image format validation from being done on buffer textures. That validation is required to ensure that the application doesn't attempt to bind a buffer texture with an internal format incompatible with the image unit format (e.g. of different texel size). The spec does not allow this for *any* texture target, whether or not there is spec wording restricting this behavior specifically for buffer textures, and it will cause the driver to calculate texel bounds incorrectly and potentially crash instead of the expected behavior.
  Cc: [email protected]
  Reviewed-by: Marek Olšák <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106465
  Reviewed-by: Nanley Chery <[email protected]>
* ac: Use DPP for build_ddxy where possible. | Bas Nieuwenhuizen | 2018-05-23 | 1 | -1/+15
  WQM is pretty reliable now on LLVM 7, so let us just use DPP + WQM.
  This gives approximately a 1.5% performance increase on the vrcompositor built-in benchmark.
  v2: Use ac_build_quad_swizzle.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* i965: add {X,A}BGR2101010 to 'intel_image_formats' | Miguel Casas | 2018-05-23 | 1 | -0/+6
  This patch adds {X,A}BGR2101010 entries to the list of supported 'intel_image_formats'.
  Bug: https://crbug.com/776093
  Reviewed-by: Chad Versace <[email protected]>
  Reviewed-by: Tapani Pälli <[email protected]>
* dri_util: Add R10G10B10{A,X}2 translation between DRI and mesa_format. | Miguel Casas | 2018-05-23 | 1 | -0/+8
  Add R10G10B10{A,X}2 translation between mesa_format and DRI format to driGLFormatToImageFormat() and driImageFormatToGLFormat().
  Bug: https://crbug.com/776093
  Reviewed-by: Chad Versace <[email protected]>
  Reviewed-by: Tapani Pälli <[email protected]>
* tgsi/scan: add hw atomic to the list of memory accessing files | Dave Airlie | 2018-05-23 | 1 | -1/+2
  This fixes 4 out of 5 cases in: arb_framebuffer_no_attachments-atomic on cayman.
  Reviewed-by: Marek Olšák <[email protected]>
  Cc: "18.0 18.1" <[email protected]>
* llvmpipe: improve rasterization discard logic | Roland Scheidegger | 2018-05-23 | 15 | -89/+118
  This unifies the explicit rasterization discard and the implicit rasterization disabled logic (which we need for another state tracker), which really should do the exact same thing. We'll now toss out the prims early on in setup with (implicit or explicit) discard, rather than do setup and binning with them, which was entirely pointless. (We should eventually get rid of implicit discard, which should also enable us to discard stuff already in draw, hence draw would be able to skip the pointless clip and fallback stages in this case.)
  We still need separate logic for a null ps only, since this is not the same as rasterization discard. But simplify the logic there and simply don't count primitives when there's an empty fs, regardless of depth/stencil tests, which seems perfectly acceptable by d3d10.
  While here, also fix statistics for primitives if face culling is enabled.
  No piglit changes.
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
* ac/surface/gfx6: Don't force a tile index for fmask. | Bas Nieuwenhuizen | 2018-05-23 | 1 | -1/+1
  The bpe of the fmask often differs from the bpe of the main surface. On SI that means it has to get a different tile index. addrlib is capable of figuring this out itself, so just pass -1 instead to let it know that it is not preset.
  Fixes: 9bf3570fed0 "ac/surface/gfx6: compute FMASK together with the color surface"
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106511
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106499
  Reviewed-by: Marek Olšák <[email protected]>
* i965: Remove ring switching entirely | Jason Ekstrand | 2018-05-22 | 11 | -105/+61
  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/miptree: Move the access_raw call to the individual map functions | Jason Ekstrand | 2018-05-22 | 1 | -3/+13
  The only function that doesn't need to call access_raw is map_blit. If it takes the blitter path, it will happen as part of intel_miptree_copy. If map_blit takes the blorp path, brw_blorp_copy_miptrees will handle doing whatever resolves are needed. This should save us resolves in quite a few cases and will probably help performance a bit.
  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove support for the BLT ring | Jason Ekstrand | 2018-05-22 | 1 | -9/+3
  We still support the blitter on gen4-5 but it's on the same ring as 3D.
  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/miptree: Use blorp for blit maps on gen6+ | Jason Ekstrand | 2018-05-22 | 1 | -11/+25
  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/miptree: Use blorp for validation tex copies on gen6+ | Jason Ekstrand | 2018-05-22 | 1 | -11/+29
  It's faster than the blitter and can handle things like stencil properly, so it doesn't require software fallbacks.
  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Delete the blitter path for CopyTexSubImage | Jason Ekstrand | 2018-05-22 | 1 | -58/+0
  The blorp path (called first) can do anything the blitter path can do, so the blitter path is just dead code.
  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Don't fall back to the blitter in BlitFramebuffer | Jason Ekstrand | 2018-05-22 | 1 | -8/+0
  On gen4-5, we try the blitter before we even try blorp. On newer platforms, blorp can do everything the blitter can so there's no point in even having the blitter fall-back path.
  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove some unused includes of intel_blit.h | Jason Ekstrand | 2018-05-22 | 4 | -4/+0
  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/blit: Delete intel_emit_linear_blit | Jason Ekstrand | 2018-05-22 | 2 | -62/+0
  This function is no longer used.
  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Use meta for pixel ops on gen6+ | Jason Ekstrand | 2018-05-22 | 3 | -4/+10
  Using meta for anything is fairly awful and definitely has more CPU overhead. However, it also uses the 3D pipe and is therefore likely faster in terms of GPU time than the blitter. Also, the blitter code has so many early returns that it's probably not buying us that much. We may as well just use meta all the time instead of working overtime to find the tiny case where we can use the blitter. We keep gen4-5 using the old blit paths to avoid perturbing old hardware too much.
  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin. | Kenneth Graunke | 2018-05-22 | 2 | -0/+69
  We'd like to start using soft-pin to assign BO addresses up front, and never move them again. Our previous plan for dealing with 48-bit VF cache bugs was to relocate vertex buffers to the low 4GB, so we'd never have addresses that alias in the low 32 bits. But that requires moving buffers dynamically.
  This patch tracks the last seen BO address for each vertex/index buffer, and emits a VF cache invalidate if the high bits change. (Ideally, we won't hit this case very often.) This should work for the soft-pin case, but unfortunately won't work in the relocation case, as we don't actually know the addresses. So, we have to use both methods.
  v2: Mention that the cache uses a <VertexBufferIndex, Address> tuple more explicitly (suggested by Scott). Mention "single batch" too (suggested by Chris).
  Reviewed-by: Scott D Phillips <[email protected]>
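  The tracking idea in this commit message can be illustrated with a small sketch. It is not the driver's code: the struct, the function name and the slot count are hypothetical, and the sketch only assumes what the message states, namely that addresses are 48 bits wide and that a change in the bits above the low 32 is what requires an invalidate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical number of tracked vertex/index buffer slots. */
#define SKETCH_MAX_BUFFERS 34

/* Remembers bits 47:32 of the last address seen for each buffer slot. */
struct sketch_vf_tracker {
   uint16_t last_high_bits[SKETCH_MAX_BUFFERS];
};

/* Records the address for a slot and reports whether its high bits differ
 * from the previously recorded ones, i.e. whether a VF cache invalidate
 * would be needed before the next draw. */
static bool
sketch_vf_needs_invalidate(struct sketch_vf_tracker *t, unsigned slot,
                           uint64_t address)
{
   assert(slot < SKETCH_MAX_BUFFERS);

   uint16_t high = (uint16_t)(address >> 32);
   bool changed = t->last_high_bits[slot] != high;
   t->last_high_bits[slot] = high;
   return changed;
}
```

  Two addresses that differ only in the low 32 bits never trigger an invalidate in this sketch, which matches the aliasing condition the message describes.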
* i965: Introduce a "memory zone" concept on BO allocation. | Kenneth Graunke | 2018-05-22 | 16 | -38/+107
  We're planning to start managing the PPGTT in userspace in the near future, rather than relying on the kernel to assign addresses. While most buffers can go anywhere, some need to be restricted to within 4GB of a base address.
  This commit adds a "memory zone" parameter to the BO allocation functions, which lets the caller specify which base address the BO will be associated with, or BRW_MEMZONE_OTHER for the full 48-bit VMA.
  Eventually, I hope to create a 4GB memory zone corresponding to each state base address.
  Reviewed-by: Scott D Phillips <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
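  As a rough illustration of the constraint a memory zone expresses, the sketch below checks whether a buffer placement satisfies a zone. Everything here is hypothetical except the two rules taken from the message: a restricted buffer must lie within 4GB of its base address, and the unrestricted zone spans the full 48-bit address space. The enum does not reproduce the driver's actual zone list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical zones for illustration only. */
enum sketch_memzone {
   SKETCH_MEMZONE_BASE_RELATIVE, /* must sit within 4GB of a base address */
   SKETCH_MEMZONE_OTHER,         /* anywhere in the 48-bit address space */
};

#define SKETCH_4GB   (1ull << 32)
#define SKETCH_48BIT (1ull << 48)

/* Returns true if [address, address + size) lies inside the zone.
 * 'base' is ignored for SKETCH_MEMZONE_OTHER. */
static bool
sketch_bo_fits_zone(enum sketch_memzone zone, uint64_t base,
                    uint64_t address, uint64_t size)
{
   uint64_t start = 0, end = SKETCH_48BIT;

   if (zone == SKETCH_MEMZONE_BASE_RELATIVE) {
      start = base;
      end = base + SKETCH_4GB;
   }

   /* Written so that address + size is never computed and cannot overflow. */
   return address >= start && address <= end && size <= end - address;
}
```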
* intel/eu: Set EXECUTE_1 when setting the rounding mode in cr0 | Jason Ekstrand | 2018-05-22 | 1 | -0/+2
  Fixes: d6cd14f2131a5b "i965/fs: Define new shader opcode to..."
  Reviewed-by: Jose Maria Casanova Crespo <[email protected]>
* dri3: Stricter SBC wraparound handling | Michel Dänzer | 2018-05-22 | 1 | -3/+11
  Prevents corrupting the upper 32 bits of draw->recv_sbc when draw->send_sbc resets to 0 (which currently happens when the window is unbound from a context and bound to one again), which in turn caused loader_dri3_swap_buffers_msc to calculate target_msc with corrupted upper 32 bits.
  This resulted in hangs with the Xorg modesetting driver as of xserver 1.20 (older versions and other drivers ignored the upper 32 bits of the target MSC, which is why this wasn't noticed earlier).
  Cc: [email protected]
  Bugzilla: https://bugs.freedesktop.org/106351
  Tested-by: Mike Lothian <[email protected]>
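  For background, the general problem of widening a 32-bit counter against a 64-bit reference can be sketched as below. This shows the generic technique only and is not the logic of this patch; the function name is hypothetical, and the sketch assumes the received value can never be newer than the last one sent.

```c
#include <assert.h>
#include <stdint.h>

/* Widens a 32-bit received counter to 64 bits using the 64-bit count of the
 * last request sent. Assumption: received <= sent always holds. */
static uint64_t
sketch_widen_sbc(uint64_t send_sbc, uint32_t received_low)
{
   /* Start from the upper half of the sender's counter. */
   uint64_t candidate = (send_sbc & 0xffffffff00000000ull) | received_low;

   /* If that lands in the future, the low word of send_sbc wrapped after
    * the received request was sent, so step back one 2^32 epoch. */
   if (candidate > send_sbc && candidate >= (1ull << 32))
      candidate -= 1ull << 32;

   return candidate;
}
```

  A sketch like this cannot tell a wraparound from the sender's counter being reset to 0, which is the case the commit message describes and the reason stricter handling is needed.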
* radv: fix computation of user sgprs for 32-bit pointers | Samuel Pitoiset | 2018-05-22 | 1 | -1/+3
  With 32-bit pointers we only need one user SGPR per desc set.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
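  The arithmetic behind this fix is simple enough to sketch. An SGPR holds 32 bits, so a 64-bit descriptor set pointer occupies two of them and a 32-bit pointer occupies one. The function name and the set limit below are hypothetical and are not taken from the radv source.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical upper bound on descriptor sets, for the sanity check only. */
#define SKETCH_MAX_SETS 32

/* Number of user SGPRs needed to pass one pointer per descriptor set. */
static unsigned
sketch_desc_set_user_sgprs(unsigned num_desc_sets, bool use_32bit_pointers)
{
   assert(num_desc_sets <= SKETCH_MAX_SETS);

   /* One 32-bit SGPR per 32-bit pointer, two per 64-bit pointer. */
   unsigned sgprs_per_pointer = use_32bit_pointers ? 1 : 2;
   return num_desc_sets * sgprs_per_pointer;
}
```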
* radv: drop user_sgpr_info::sgpr_count | Samuel Pitoiset | 2018-05-22 | 1 | -13/+11
  It's only used inside allocate_user_sgprs().
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add support for 32-bit pointers in user data SGPRs | Samuel Pitoiset | 2018-05-22 | 4 | -21/+40
  We still use 64-bit GPU pointers for all ring buffers because llvm.amdgcn.implicit.buffer.ptr doesn't seem to support 32-bit GPU pointers for now. This can be improved later anyways.
  Vega10:
  Totals from affected shaders:
  SGPRS: 1008722 -> 1026710 (1.78 %)
  VGPRS: 706580 -> 707136 (0.08 %)
  Spilled SGPRs: 22555 -> 22209 (-1.53 %)
  Spilled VGPRs: 75 -> 75 (0.00 %)
  Code Size: 34819208 -> 35202140 (1.10 %) bytes
  Max Waves: 175423 -> 175086 (-0.19 %)
  Polaris10:
  Totals from affected shaders:
  SGPRS: 1029849 -> 1036517 (0.65 %)
  VGPRS: 709984 -> 708872 (-0.16 %)
  Spilled SGPRs: 22672 -> 22309 (-1.60 %)
  Spilled VGPRs: 82 -> 66 (-19.51 %)
  Scratch size: 76 -> 60 (-21.05 %) dwords per thread
  Code Size: 34915336 -> 35309752 (1.13 %) bytes
  Max Waves: 151221 -> 151677 (0.30 %)
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add set_loc_shader_ptr() helper | Samuel Pitoiset | 2018-05-22 | 1 | -7/+13
  This helper will help with switching to 32-bit GPU pointers.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: allocate descriptor BOs in the 32-bit addr space | Samuel Pitoiset | 2018-05-22 | 1 | -1/+2
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: allocate the upload BO in the 32-bit addr space | Samuel Pitoiset | 2018-05-22 | 1 | -1/+2
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: set amdgpu-32bit-address-high-bits LLVM attribute | Samuel Pitoiset | 2018-05-22 | 3 | -0/+8
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/winsys: allow to allocate BOs in the 32-bit addr space | Samuel Pitoiset | 2018-05-22 | 2 | -1/+3
  This introduces a new flag called RADEON_FLAG_32BIT.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/winsys: request high address | Samuel Pitoiset | 2018-05-22 | 1 | -4/+6
  This is needed for 32-bit GPU pointers. Ported from RadeonSI.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* i965/glk: Add l3 banks count for 2x6 configuration | Anuj Phogat | 2018-05-21 | 1 | -1/+1
  The 2x6 configuration with pci-id 0x3185 has the same number of banks (2) as the 3x6 configuration (pci-id 0x3184).
  Reported-by: Clayton Craft <[email protected]>
  Signed-off-by: Anuj Phogat <[email protected]>
  Tested-by: Clayton Craft <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Fixes: eb23be1d97da "i965: Add and initialize l3_banks field for gen7+"
  Cc: Francisco Jerez <[email protected]>
* v3d: Include v3d_drm.h path. | Vinson Lee | 2018-05-21 | 1 | -0/+1
  Fix build error.
  CC v3d_blit.lo
  In file included from v3d_blit.c:27:0:
  v3d_context.h:39:10: fatal error: v3d_drm.h: No such file or directory
   #include "v3d_drm.h"
            ^~~~~~~~~~~
  Fixes: 8a793d42f1cc ("v3d: Switch the vc5 driver to using the finalized V3D UABI.")
  Signed-off-by: Vinson Lee <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* radv: fix centroid interpolation | Samuel Pitoiset | 2018-05-21 | 1 | -3/+0
  It's legal to set the centroid and sample interpolation modes when MSAA is disabled. So, we have to initialize the centroid inputs because the hardware doesn't.
  This fixes rendering issues with DXVK and The Witness, World of Warcraft, Trackmania and probably more games.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106315
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102390
  CC: 18.0 18.1 <[email protected]>
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Cleanup unused prime blit path. | Bas Nieuwenhuizen | 2018-05-21 | 2 | -25/+0
  Since we have the common WSI code, we use vkCmdCopyImageToBuffer instead.
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* radv: Fix SRGB compute copies. | Bas Nieuwenhuizen | 2018-05-21 | 2 | -0/+42
  SRGB stores are broken. We had compensation code in the resolve path but none in the copy path. Since we don't want any conversion and it does not matter for DCC, just make everything UNORM instead.
  This happened to cause wrong colors for the PRIME path, as that uses image->buffer copies which always use the compute path.
  CC: 18.0 18.1 <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106587
  Reviewed-by: Dave Airlie <[email protected]>
* android: enable VK_ANDROID_native_buffer | Tapani Pälli | 2018-05-21 | 1 | -3/+0
  This patch changes the entrypoints generator to not skip this extension even though it is set as disabled in the xml. We also need the compilation flag VK_USE_PLATFORM_ANDROID_KHR to be enabled. It looks like this extension got disabled in commit 69f447553c.
  v2: just remove the whole 'supported' attrib check + remove vk_icd.h compilation fix (fix in VulkanHeaders instead)
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* virgl: set texture buffer offset alignment to disable ARB_texture_buffer_range. | Dave Airlie | 2018-05-21 | 1 | -1/+1
  The host side hasn't got support for this feature yet, so don't enable it unless we get the caps from the host.
  This makes the texture buffer range piglit tests skip now.
  Fixes: fe0647df5a7 (virgl: add offset alignment values to to v2 caps struct)
  Reviewed-by: Gurchetan Singh <[email protected]>
* mesa: stop hiding query parameters from OpenGL compat | Timothy Arceri | 2018-05-21 | 1 | -14/+7
  Just let the extension detection do its job, as we will be adding compat profile support in the future. We also want these to work with compat profile version overrides.
  Reviewed-by: Marek Olšák <[email protected]>
* radv: fix VK_EXT_descriptor_indexing | Christoph Haag | 2018-05-20 | 1 | -1/+1
  GetPhysicalDeviceProperties2KHR() was crashing because features was null.
  Fixes: 0e10790558b "radv: Enable VK_EXT_descriptor_indexing."
  CC: 18.1 <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/surface: Only align linear power of two fmt textures. | Bas Nieuwenhuizen | 2018-05-20 | 1 | -2/+2
  We're not sharing 32_32_32 formats between different GPUs, so we do not have to align for vega on pre-vega cards.
  Fixes: e361970ed73 "radv: Add support for IMG_DATA_FORMAT_32_32_32."
  Reviewed-by: Marek Olšák <[email protected]>
* amd/addrlib: Use defines in autotools build. | Bas Nieuwenhuizen | 2018-05-20 | 1 | -0/+1
  Otherwise stuff like NDEBUG would not be passed through.
  CC: <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106479
  Reviewed-by: Marek Olšák <[email protected]>
* r600/compute: Mark several functions as static | Aaron Watry | 2018-05-19 | 2 | -30/+29
  They're not used anywhere else, so keep them private.
  Signed-off-by: Aaron Watry <[email protected]>
  Reviewed-by: Jan Vesely <[email protected]>
* r600/compute: Remove unused compute_memory_pool functions | Aaron Watry | 2018-05-19 | 2 | -103/+0
  Signed-off-by: Aaron Watry <[email protected]>
  Reviewed-by: Jan Vesely <[email protected]>