summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* nv50/ir: make a copy of tex src if it's referenced multiple timesIlia Mirkin2018-04-221-37/+49
| | | | | | | | | | | | | | | | For nv50 we coalesce the srcs and defs into a single node. As such, we can end up with impossible constraints if the source is referenced after the tex operation (which, due to the coalescing of values, will have overwritten it). This logic already exists for inserting moves for MERGE/UNION sources. It's the exact same idea here, so leverage that code, which also includes a few optimizations around not extending live ranges unnecessarily. Fixes tests/spec/glsl-1.30/execution/fs-textureSize-components.shader_test Signed-off-by: Ilia Mirkin <[email protected]>
* virgl: disable virgl when no 3D for virtio gpu.Lepton Wu2018-04-231-0/+11
| | | | | | | | | | | | If users are running mesa under old version of qemu or have turned off GL at runtime, virtio gpu driver actually doesn't work. Adds a detection here so mesa can fall back to software rendering. v2: - move detection from loader to virgl (Ilia, Emil) Signed-off-by: Lepton Wu <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: mark const structs as extern in header file to avoid lto damageDave Airlie2018-04-231-3/+3
| | | | | | | | | | | | | The copr repo from che was using LTO and he reported radv broke recently with it. When testing with lto builds here I noticed that we weren't seeing any instance extensions reported. It appears LTO was treating the const without extern as an empty struct, this is possibly a gcc bug, but we can work around it just by marking these with extern. Acked-by: Samuel Pitoiset <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* gallium/tests/trivial: fix viewport depth transformIlia Mirkin2018-04-212-6/+7
| | | | | | | | | | | These were getting mapped off into outer space, which would cause nv50 and nvc0 to clip the primitives (as depth_clip was enabled). These drivers are configured to clip everything outside the [0, 1] range, even though the hardware supports other view settings. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* trace: allow image resource to be nullIlia Mirkin2018-04-211-1/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nv50/ir/ra: prefer def == src2 for fma with immediates on nvc0Karol Herbst2018-04-211-10/+29
| | | | | | | | | | | | | | | | | | | | This helps with the PostRALoadPropagation pass moving long immediates into FMA/MAD instructions. changes in shader-db: total instructions in shared programs : 5894114 -> 5886074 (-0.14%) total gprs used in shared programs : 666558 -> 666563 (0.00%) total shared used in shared programs : 520416 -> 520416 (0.00%) total local used in shared programs : 53524 -> 53524 (0.00%) total bytes used in shared programs : 54006744 -> 53932472 (-0.14%) local shared gpr inst bytes helped 0 0 2 4192 4192 hurt 0 0 7 9 9 Signed-off-by: Karol Herbst <[email protected]> [imirkin: minor edits to separate nv50 and nvc0+ cases] Reviewed-by: Ilia Mirkin <[email protected]>
* autotools: Include new meson files18.1-branchpointDylan Baker2018-04-205-1/+10
| | | | | Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* autotools: Add passes.h to sources so it will be included in the tarballDylan Baker2018-04-201-0/+1
| | | | | | | | | | | This was introduced in commit 8f848ada8a42d9aaa8136afa1bafe32281a0fb48 but not added to the sources list, which is necessary for it to be included in release tarballs. Fixes: 8f848ada8a42d9aaa8136afa1bafe32281a0fb48 ("swr/rast: Start refactoring of builder/packetizer.") Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nvc0: fix line width on GM20x+Rhys Perry2018-04-201-1/+4
| | | | | | | This has the side-effect of fixing polygon-offset piglit test failures. Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* i965/miptree: Delete an unused functionNanley Chery2018-04-202-17/+0
| | | | | | | | We're going to combine ::mcs_buf and ::hiz_buf in later commits. Once that happens, this function no longer make sense. Reviewed-by: Rafael Antognolli <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/miptree: Don't leak the clear_color_boNanley Chery2018-04-201-2/+1
| | | | | | | | Free the clear_color_bo in addition to freeing the intel_miptree_aux_buffer which holds the reference to it. Reviewed-by: Rafael Antognolli <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/blorp: Do the gen11 BTI flushJason Ekstrand2018-04-201-0/+14
| | | | | Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* anv/blorp: Do the gen11 BTI flushJason Ekstrand2018-04-201-0/+14
| | | | | Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* etnaviv: fix texture_format_needs_swizLucas Stach2018-04-201-1/+1
| | | | | | | | | | | | | memcmp returns 0 when both swizzles are the same, which means we don't need any hardware swizzling. texture_format_needs_swiz should return true when the return value of the memcmp is non-zero. Fixes: 751ae6afbefd ("etnaviv: add support for swizzled texture formats") Cc: [email protected] Signed-off-by: Lucas Stach <[email protected]> Tested-by: Marek Vasut <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]> Reviewed-by: Wladimir J. van der Laan <[email protected]>
* ac/nir: fix image dimension for subpass attachmentsSamuel Pitoiset2018-04-201-3/+15
| | | | | | | | | | | For subpass attachments we need one more coordinate with the layer, so make them array types. This fixes a bunch of CTS fails with RADV. Fixes: 24fb3e6aa1 ("ac/nir: use ac_build_image_opcode for image intrinsics") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Mark GTT memory as device local for APUs.Bas Nieuwenhuizen2018-04-201-3/+5
| | | | | | | | Otherwise a lot of games complain about not having enough memory, and it is sort of local so this seems reasonable to me. CC: 18.0 <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/winsys: allow to submit up to 4 IBs for chips without chainingSamuel Pitoiset2018-04-201-50/+168
| | | | | | | | | | | | | | | | | | | The SI family doesn't support chaining which means the maximum size in dwords per CS is limited. When that limit was reached we failed to submit the CS and the application crashed. This patch allows to submit up to 4 IBs which is currently the limit, but recent amdgpu supports more than that. Please note that we can reach the limit of 4 IBs per submit but currently we can't improve that. The only solution is to upgrade libdrm. That will be improved later but for now this should fix crashes on SI or when using RADV_DEBUG=noibs. Fixes: 36cb5508e89 ("radv/winsys: Fail early on overgrown cs.") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105775 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* gallium/util: Android backtrace supportStefan Schake2018-04-203-1/+114
| | | | | | | | | | | | | | We can't use any of the existing implementations in u_debug_stack. Android technically has libunwind, but it's been modified to the point where it no longer compiles with the Mesa usage. The library is also not meant to be referenced by vendor libraries. The officially sanctioned way of obtaining backtraces is through the Android own libbacktrace, a C++ library. Access it through a separate C++ source file on Android only. Signed-off-by: Stefan Schake <[email protected]> Acked-by: Eric Engestrom <[email protected]> Reviewed-by: Rob Herring <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* gallium/util: Don't stub u_debug_stack on AndroidStefan Schake2018-04-201-1/+2
| | | | | | | | | The fallback path for no libunwind ends up being stubs for Android. Don't compile them in so we can provide our own implementation. Signed-off-by: Stefan Schake <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* ac/nir: handle nir_intrinsic_load_first_vertex like base_vertexSamuel Pitoiset2018-04-201-2/+2
| | | | | | | | This fixes a ton of CTS crashes. Fixes: c366f422f0 ("nir: Offset vertex_id by first_vertex instead of base_vertex") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/winsys: allow local BOs on APUsSamuel Pitoiset2018-04-201-1/+2
| | | | | | | | | Ported from RadeonSI. Local BOs ignore BO priorities, and we don't need those on APUs. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: use a global BO list only for VK_EXT_descriptor_indexingSamuel Pitoiset2018-04-203-9/+34
| | | | | | | | | | | | Maintaining two different paths is annoying but this gets rid of the performance regression introduced by the global BO list. We might find a better solution in the future, but for now just keeps two paths. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* Revert "radv: Don't store buffer references in the descriptor set."Samuel Pitoiset2018-04-205-13/+82
| | | | | | | | | | | | | | | | In order to reduce a performance regression introduced by 4b13fe55a4 ("radv: Keep a global BO list for VkMemory."), we are going to maintain two different paths. One when VK_EXT_descriptor_indexing is enabled by the application because we need to have a global BO list, and one (the old one) when it's not enabled. With Talos on Polaris, the global BO list reduces performance by 10% which is too much for me. This reverts commit ab6cadd3ecc7fbdd9079808b407674e0b19c52f0. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* i965/fs: retype offset_reg to UD at load_ssboJose Maria Casanova Crespo2018-04-201-1/+1
| | | | | | | | | | | | | | All operations with offset_reg at do_vector_read are done with UD type. So copy propagation was not working through the generated MOVs: mov(8) vgrf9:UD, vgrf7:D This change allows removing the MOV generated for reading the first components for 16-bit and 64-bit ssbo reads with non-constant offsets. Reviewed-by: Iago Toral Quiroga <[email protected]>
* ac/nir: use ac_build_image_opcode for image intrinsicsNicolai Hähnle2018-04-203-140/+78
| | | | | | So that we'll use the dimension-aware intrinsics in the future. Acked-by: Marek Olšák <[email protected]>
* radeonsi: generate image load/store/atomic ops using ac_build_image_opcodeNicolai Hähnle2018-04-204-164/+210
| | | | | | In preparation of dimension-aware LLVM image intrinsics. Acked-by: Marek Olšák <[email protected]>
* amd/common: pass address components individually to ac_build_image_intrinsicNicolai Hähnle2018-04-205-409/+295
| | | | | | This is in preparation for the new image intrinsics. Acked-by: Marek Olšák <[email protected]>
* amd/common: pass new enum ac_image_dim to ac_build_image_opcodeNicolai Hähnle2018-04-204-13/+114
| | | | | | | This is in preparation for the new, dimension-aware LLVM image intrinsics. Acked-by: Marek Olšák <[email protected]>
* radeonsi/nir: fix crash in test involving the sample maskNicolai Hähnle2018-04-201-1/+2
| | | | Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi/nir: set FS properties only when scanning a fragment shaderNicolai Hähnle2018-04-201-1/+2
| | | | Reviewed-by: Timothy Arceri <[email protected]>
* ac/nir: fix atomic compare-and-swapNicolai Hähnle2018-04-201-0/+1
| | | | | | | | | The LLVM instruction returns { i32, i1 }, where the i1 indicates success. We're only interested in the first part, which is the loaded value. Fixes dEQP-GLES31.functional.compute.shared_var.atomic.compswap.* Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi: fix error paths of si_texture_transfer_mapNicolai Hähnle2018-04-201-13/+12
| | | | | | | trans is zero-initialized, but trans->resource is setup immediately so needs to be dereferenced. Reviewed-by: Timothy Arceri <[email protected]>
* glsl: prevent spurious Valgrind errors when serializing NIRNicolai Hähnle2018-04-201-2/+4
| | | | | | | | It looks as if the structure fields array is fully initialized below, but in fact at least gcc in debug builds will not actually overwrite the unused bits of bit fields. Reviewed-by: Timothy Arceri <[email protected]>
* clover: Fix host access validation for sub-buffer creationAaron Watry2018-04-191-2/+7
| | | | | | | | | | | | | | | | From CL 1.2 Section 5.2.1: CL_INVALID_VALUE if buffer was created with CL_MEM_HOST_WRITE_ONLY and flags specify CL_MEM_HOST_READ_ONLY , or if buffer was created with CL_MEM_HOST_READ_ONLY and flags specify CL_MEM_HOST_WRITE_ONLY , or if buffer was created with CL_MEM_HOST_NO_ACCESS and flags specify CL_MEM_HOST_READ_ONLY or CL_MEM_HOST_WRITE_ONLY . Fixes CL 1.2 CTS test/api get_buffer_info v2: Correct host_access_flags check (Francisco) Signed-off-by: Aaron Watry <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* nir: Offset vertex_id by first_vertex instead of base_vertexNeil Roberts2018-04-196-13/+6
| | | | | | | | | | | | | | | | | | base_vertex will be zero for non-indexed calls and in that case we need vertex_id to be offset by the ‘first’ parameter instead. That is what we get with first_vertex. This is true for both GL and Vulkan. The freedreno driver is also setting vertex_id_zero_based on nir_options. In order to avoid breakage this patch switches the relevant code to handle SYSTEM_VALUE_FIRST_VERTEX so that it can retain the same behavior. v2: change a3xx/fd3_emit.c and a4xx/fd4_emit.c from SYSTEM_VALUE_BASE_VERTEX to SYSTEM_VALUE_FIRST_VERTEX (Kenneth). Reviewed-by: Ian Romanick <[email protected]> Cc: Rob Clark <[email protected]> Acked-by: Marek Olšák <[email protected]>
* spirv: Lower BaseVertex to FIRST_VERTEX instead of BASE_VERTEXNeil Roberts2018-04-193-5/+18
| | | | | | | | | | | | | The base vertex in Vulkan is different from GL in that for non-indexed primitives the value is taken from the firstVertex parameter instead of being set to zero. This coincides with the new SYSTEM_VALUE_FIRST_VERTEX instead of BASE_VERTEX. v2 (idr): Add comment describing why SYSTEM_VALUE_FIRST_VERTEX is used for SpvBuiltInBaseVertex. Suggested by Jason. Reviewed-by: Ian Romanick <[email protected]> [v1] Reviewed-by: Jason Ekstrand <[email protected]>
* intel: Handle firstvertex in an identical way to BaseVertexAntia Puentes2018-04-197-13/+35
| | | | | | | | | | | | Until we set gl_BaseVertex to zero for non-indexed draw calls both have an identical value. The Vertex Elements are kept like that: * VE 1: <BaseVertex/firstvertex, BaseInstance, VertexID, InstanceID> * VE 2: <Draw ID, 0, 0, 0> v2 (idr): Mark nir_intrinsic_load_first_vertex as "unreachable" in emit_system_values_block and fs_visitor::nir_emit_vs_intrinsic.
* intel/compiler: Add a uses_firstvertex flagNeil Roberts2018-04-192-0/+5
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* compiler: Add SYSTEM_VALUE_FIRST_VERTEX and instrinsicsAntia Puentes2018-04-195-0/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This VS system value will contain the value passed as <basevertex> for indexed draw calls or the value passed as <first> for non-indexed draw calls. It can be used to calculate the gl_VertexID as SYSTEM_VALUE_VERTEX_ID_ZERO_BASE plus SYSTEM_VALUE_FIRST_VERTEX. From the OpenGL 4.6 spec, 10.4 "Drawing Commands Using Vertex Arrays": - Page 352: "The index of any element transferred to the GL by DrawArraysOneInstance is referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID. The vertex ID of the ith element transferred is first + i." - Page 355: "The index of any element transferred to the GL by DrawElementsOneInstance is referred to as its vertex ID, and may be read by a vertex shader as gl_VertexID. The vertex ID of the ith element transferred is the sum of basevertex and the value stored in the currently bound element array buffer at offset indices + i." Currently the gl_VertexID calculation uses SYSTEM_VALUE_BASE_VERTEX but this will have to change when the value of gl_BaseVertex is fixed. Currently its value is broken for non-indexed draw calls because it must be zero but we are setting it to <first>. v2: use SYSTEM_VALUE_FIRST_VERTEX as name for the value, instead of SYSTEM_VALUE_BASE_VERTEX_ID (Kenneth). v3 (idr): Rebase on Rob Clark converting nir_intrinsics.h to be generated. Reformat commit message to 72 columns. Reviewed-by: Neil Roberts <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* meson: Build st_tests_common with gtestMike Lothian2018-04-191-1/+1
| | | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106131 Fixes: 34cb4d0ebc1 ("meson: build tests for gallium mesa state tracker") Signed-off-by: Mike Lothian <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* radv: Add Vega M support.Bas Nieuwenhuizen2018-04-194-2/+11
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add bound checking workaround for dynamic buffers.Bas Nieuwenhuizen2018-04-193-1/+5
| | | | | | | I have seen a few applications and games do the dynamic buffer bounds incorrectly, this make it easier to work around, e.g. for debugging. Reviewed-by: Samuel Pitoiset <[email protected]>
* svga: Fix incorrect advertizing of EGL_KHR_gl_colorspaceThomas Hellstrom2018-04-191-1/+1
| | | | | | | | | | | | | | | | | | | When advertizing this extension, egl_dri2 uses the DRI2_RENDERER_QUERY extension to query whether an sRGB format is supported. That extension will query our driver with the BIND flag PIPE_BIND_RENDER_TARGET rather than PIPE_BIND_DISPLAY_TARGET which is used when building the configs. We only return the correct value for PIPE_BIND_DISPLAY_TARGET. The inconsistency causes EGL to crash at surface initialization if sRGB is not supported. Fix this by supporting both bind flags. Testing done: piglit egl_gl_colorspace srgb Cc: <[email protected]> Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* swr: Fix include for createPromoteMemoryToRegisterPassMike Lothian2018-04-191-0/+3
| | | | | | | | | | | | Include llvm/Transforms/Utils.h with the newest LLVM 7 v2: Include with " " rather than < > (Vinson Lee) v3: Use LLVM_VERSION_MAJOR rather than HAVE_LLVM (George Kyriazis) Signed-of-by: Mike Lothian <[email protected]> Tested-by: Vinson Lee <[email protected]> Reviewed-By: George Kyriazis <[email protected]>
* radv: enable DCC for MSAA 2x textures on VI under an optionSamuel Pitoiset2018-04-194-1/+13
| | | | | | | | | | | | | | | | This can be enabled with RADV_PERFTEST=dccmsaa. DCC for MSAA textures is actually not as easy to implement. It looks like there is some corner cases. I will improve support incrementally. Vega support, as well as Polaris improvements, will be added later. No CTS changes on Polaris using RADV_DEBUG=zerovram and RADV_PERFTEST=dccmsaa. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: decompress DCC for multisampled source images before resolvingSamuel Pitoiset2018-04-194-4/+18
| | | | | | | | | Multisampled source images (ie. color attachments) can be now DCC compressed, so the driver needs to perform a DCC decompression pass before resolving Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add a workaround for fast clears with DCC and MSAA texturesSamuel Pitoiset2018-04-191-0/+9
| | | | | | | | This should be fixed at some point in order to improve performance. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: allocate CMASK for DCC fast clear with MSAASamuel Pitoiset2018-04-191-0/+7
| | | | | | | | CMASK is required because it should be cleared to 0xCCCCCCCC for MSAA textures. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: implement fast color clear for DCC with MSAASamuel Pitoiset2018-04-191-1/+16
| | | | | | | | When DCC is enabled with MSAA textures, CMASK should be cleared to 0xCCCCCCCC. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: make sure to sync after resolving using the compute pathSamuel Pitoiset2018-04-191-0/+3
| | | | | | | | | | | | | | | This fixes some random CTS failures: dEQP-VK.renderpass.multisample.*. Performing a fast-clear eliminate is still useless, but it seems that we need to sync. Found while running CTS with RADV_DEBUG=zerovram. Fixes: 56a171a499c ("radv: don't fast-clear eliminate after resolving a subpass with compute") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>