path: root/src/gallium
Commit message | Author | Age | Files | Lines
* broadcom/vc4: Expand width of dst surface | Zhaowei Yuan | 2019-09-03 | 1 | -1/+1
  Four bytes of src_surf are packed into one 32-bit value and stored into dst_surf, and dst_surf is read in Z-order, so its width must be aligned to a multiple of 8 (4x2) before being divided by 2.

  Signed-off-by: Zhaowei Yuan <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111266
  Reviewed-by: Alejandro Piñeiro <[email protected]>
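The width arithmetic described in the message above can be sketched as follows. This is only an illustration under stated assumptions: `align_pot` and `dst_surface_width` are hypothetical names, not functions from the vc4 driver, and the sketch assumes the alignment is a power of two.

```c
#include <stdint.h>

/* Round v up to the next multiple of a. Assumes a is a power of two. */
static inline uint32_t align_pot(uint32_t v, uint32_t a)
{
   return (v + a - 1) & ~(a - 1);
}

/* Hypothetical helper: align the width to a multiple of 8 first,
 * then halve it, in the order the commit message describes. */
static inline uint32_t dst_surface_width(uint32_t width)
{
   return align_pot(width, 8) / 2;
}
```

Halving before aligning would give a different result for widths that are not already a multiple of 8, which is why the order matters here.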
* swr: Fix make_unique build error. | Vinson Lee | 2019-09-02 | 1 | -3/+3
  swr_shader.cpp: In function ‘void (* swr_compile_gs(swr_context*, swr_jit_gs_key&))(HANDLE, HANDLE, SWR_GS_CONTEXT*)’:
  swr_shader.cpp:732:44: error: ‘make_unique’ was not declared in this scope
  ctx->gs->map.insert(std::make_pair(key, make_unique<VariantGS>(builder.gallivm, func)));
                                          ^~~~~~~~~~~

  Signed-off-by: Vinson Lee <[email protected]>
  Reviewed-by: Jan Zielinski <[email protected]>
* iris: Lessen texture cache hack flush for blits/copies on Icelake. | Kenneth Graunke | 2019-08-31 | 1 | -16/+34
  Lionel found actual documentation for this at long last. Apparently it actually is a sampler cache limitation that was mostly fixed on Icelake. Unfortunately, it seems there are still issues with ASTC and non-ASTC sampler views. Still, we can lessen the flush condition from "format mismatch" to "ASTC mismatch", which eliminates most of the flushing here.

  We also update the documentation to refer to the workaround name.
* gallium/auxiliary/indices: consistently apply start only to input | Erik Faye-Lund | 2019-08-31 | 1 | -10/+10
  The majority of these only apply the start argument to the input, but a few of them also apply it to the output array. util_primconvert, the only user of this argument, passes a non-zero start argument and does not expect it to be applied to the output; if it is, it will write outside of allocated memory, leading to VRAM corruption.

  The reason this doesn't seem to have been noticed before is that no driver currently uses util_primconvert to convert a primitive type to itself, which is the case where this was broken. But for Zink, this will no longer be true, because we need to eliminate the use of 8-bit index buffers.

  Signed-off-by: Erik Faye-Lund <[email protected]>
  Fixes: 28f3f8d413f ("gallium/auxiliary/indices: add start param")
  Reviewed-by: Rob Clark <[email protected]>
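To illustrate the convention this commit settles on, here is a minimal sketch of an index translation in which `start` offsets only the input. The function name and signature are hypothetical and are not the generated u_indices code.

```c
#include <stdint.h>

/* Hypothetical translate function: widen `count` 8-bit indices, beginning
 * at in[start], into 16-bit indices written to out[0..count). */
static void translate_ubyte_to_ushort(const uint8_t *in, unsigned start,
                                      unsigned count, uint16_t *out)
{
   for (unsigned i = 0; i < count; i++) {
      /* start is applied to the input only; the output is written from
       * index 0, so it never exceeds a buffer sized for `count` entries. */
      out[i] = in[start + i];
   }
}
```

If the loop wrote to `out[start + i]` instead, a caller that allocated exactly `count` output entries would see writes past the end of its buffer, which is the failure the message describes.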
* swr: Fix build with llvm-9.0 again. | Vinson Lee | 2019-08-31 | 3 | -0/+28
  Commit 6f7306c029a7 ("swr/rast: Refactor memory API between rasterizer core and swr") unintentionally removed changes for llvm-9.0.

  Fixes: 6f7306c029a7 ("swr/rast: Refactor memory API between rasterizer core and swr")
  Fixes: 5dd9ad157005 ("swr/rasterizer: Better implementation of scatter")
  Signed-off-by: Vinson Lee <[email protected]>
  Reviewed-by: Jan Zielinski <[email protected]>
* pan/midgard: Use shared psiz clamp pass | Alyssa Rosenzweig | 2019-08-30 | 2 | -76/+0
  We already had a perfectly cromulent pass for this, but one landed in common NIR code so let's switch and lighten our tree.

  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add transient BOs to job batches | Boris Brezillon | 2019-08-30 | 2 | -1/+2
  Memory allocated through panfrost_allocate_transient() is likely to come from the transient pool. Let's add the BO backing the allocated memory region to the job batch so the kernel can retain this BO while jobs are executed.

  In practice that has never been a problem because the transient pool is never shrunk, and even if it was, we still control the lifetime of the job, so there's no reason for this BO to be freed before the GPU is done executing the batch. But it still makes sense to add the BO for debugging purposes.

  Signed-off-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: protect access to shared bo cache and transient pool | Rohan Garg | 2019-08-30 | 5 | -5/+23
  Both the BO cache and the transient pool are shared across contexts. Protect access to these with mutexes.

  Signed-off-by: Rohan Garg <[email protected]>
  Reviewed-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
  Signed-off-by: Boris Brezillon <[email protected]>
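The general pattern of guarding a structure shared across contexts with a mutex might look like the sketch below. It uses plain POSIX threads and a made-up `bo_cache` type; the struct layout and function names are assumptions for illustration and do not reflect panfrost's actual data structures.

```c
#include <pthread.h>
#include <stddef.h>

#define CACHE_SLOTS 4

/* Hypothetical cache shared by several contexts. */
struct bo_cache {
   pthread_mutex_t lock;
   void *slots[CACHE_SLOTS];
   unsigned count;
};

/* Store a BO in the cache; returns 1 on success, 0 if the cache is full. */
static int bo_cache_put(struct bo_cache *c, void *bo)
{
   int stored = 0;

   pthread_mutex_lock(&c->lock);
   if (c->count < CACHE_SLOTS) {
      c->slots[c->count++] = bo;
      stored = 1;
   }
   pthread_mutex_unlock(&c->lock);
   return stored;
}

/* Take the most recently cached BO, or NULL if the cache is empty. */
static void *bo_cache_get(struct bo_cache *c)
{
   void *bo = NULL;

   pthread_mutex_lock(&c->lock);
   if (c->count > 0)
      bo = c->slots[--c->count];
   pthread_mutex_unlock(&c->lock);
   return bo;
}
```

Every read and write of `slots` and `count` happens with the lock held, so two contexts calling these functions concurrently cannot observe a half-updated cache.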
* panfrost: Jobs must be per context, not per screen | Rohan Garg | 2019-08-30 | 5 | -17/+14
  Jobs _must_ only be shared within the same context. Tracking last_job in the screen causes use-after-free issues and memory corruption.

  Signed-off-by: Rohan Garg <[email protected]>
  Reviewed-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
  Signed-off-by: Boris Brezillon <[email protected]>
* freedreno/a3xx: fix sysmem <-> gmem tiles transfer | Khaled Emara | 2019-08-30 | 2 | -2/+3
  Tiling mode was missing from fd3_emit_gmem_restore_tex(). emit_gmem2mem_surf() used LINEAR exclusively.

  Reviewed-by: Rob Clark <[email protected]>
* freedreno/a3xx: fix texture tiling parameters | Khaled Emara | 2019-08-30 | 1 | -10/+21
  * Fix 2D/2DArray/3D tiling parameters: there is a bottom threshold for width and height.
  * Re-enable tiling for Cubemap, after setting the right parameters.

  Reviewed-by: Rob Clark <[email protected]>
* broadcom/v3d: Allow importing linear BOs with arbitrary offset/stride. | Dave Stevenson | 2019-08-30 | 1 | -8/+23
  Equivalent of 0c1dd9dee "broadcom/vc4: Allow importing linear BOs with arbitrary offset/stride." for v3d.

  Allows YUV buffers with a single buffer and plane offsets to be passed in.

  Signed-off-by: Dave Stevenson <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* swr/rasterizer: Fix GS attributes processing | Jan Zielinski | 2019-08-30 | 3 | -24/+10
  Input to GS is just a set of attributes, so remove explicit setup of 'position' which is meaningless for GS input processing.

  Reviewed-by: Alok Hota <[email protected]>
* ac: drop now useless lookup_interp_param from ABI | Samuel Pitoiset | 2019-08-30 | 1 | -1/+0
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* ac: import linear/perspective PS input parameters from radv/radeonsi | Samuel Pitoiset | 2019-08-30 | 2 | -17/+19
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* gallivm: disable accurate cube corner for integer textures. | Dave Airlie | 2019-08-30 | 1 | -1/+6
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111511
  Reviewed-by: Roland Scheidegger <[email protected]>
* radeonsi: add JPEG decode support for VCN 2.0 devices | Thong Thai | 2019-08-29 | 1 | -3/+1
  Signed-off-by: Thong Thai <[email protected]>
  Reviewed-by: Boyuan Zhang <[email protected]>
* Revert "radeonsi: don't emit PKT3_CONTEXT_CONTROL on amdgpu" | Thong Thai | 2019-08-29 | 1 | -7/+4
  This reverts commit 5a2e65be89d97ed5d7263f0296ea69ae8517187b.

  Even though CONTEXT_CONTROL is emitted by the kernel, CONTEXT_CONTROL still needs to be emitted by the UMD, or else the driver will hang.

  Cc: 19.2 <[email protected]>
  Signed-off-by: Thong Thai <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* iris: Fix partial fast clear checks to account for miplevel. | Kenneth Graunke | 2019-08-29 | 1 | -2/+2
  We enabled fast clears at level > 0, but didn't minify the dimensions when comparing the box size, so we always thought it was a partial clear and as a result never actually enabled any.

  This eliminates some slow clears in Civilization VI, but they are mostly during initialization and not the main rendering.

  Thanks to Dan Walsh for noticing we had too many slow clears.

  Fixes: 393f659ed83 ("iris: Enable fast clears on other miplevels and layers than 0.")
  Reviewed-by: Rafael Antognolli <[email protected]>
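The check described above can be sketched as follows: a clear box covers a miplevel only when it matches that level's minified dimensions, not the base level's. The helper names here are hypothetical, and the sketch assumes `level` is smaller than 32; Mesa has its own minify helper, which this does not reproduce.

```c
#include <stdbool.h>
#include <stdint.h>

/* Dimension of a miplevel: halve per level, never dropping below 1.
 * Assumes level < 32. */
static inline uint32_t level_dim(uint32_t base, unsigned level)
{
   uint32_t d = base >> level;
   return d ? d : 1;
}

/* Hypothetical check: does the box cover the whole of `level`? */
static bool box_covers_level(uint32_t width0, uint32_t height0, unsigned level,
                             uint32_t x, uint32_t y, uint32_t w, uint32_t h)
{
   return x == 0 && y == 0 &&
          w == level_dim(width0, level) &&
          h == level_dim(height0, level);
}
```

For a 256x256 texture, a 64x64 box at the origin is a full clear of level 2, but comparing it against the unminified 256x256 size would classify it as partial.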
* panfrost: Remove unused argument from panfrost_drm_submit_vs_fs_job() | Rohan Garg | 2019-08-29 | 3 | -5/+3
  is_scanout is not used anywhere and can be inferred within panfrost_drm_submit_vs_fs_job() if required.

  Signed-off-by: Rohan Garg <[email protected]>
  Reviewed-by: Boris Brezillon <[email protected]>
  Signed-off-by: Boris Brezillon <[email protected]>
* iris: Actually describe bo_reuse driconf option | Kenneth Graunke | 2019-08-29 | 1 | -0/+10
  Otherwise it doesn't exist and can't be parsed, so everything dies at screen init time.

  Fixes: 6dc4ddc5f81 ("iris: use driconf for 'bo_reuse' parameter")
* panfrost/ci: Print only regressions | Tomeu Vizoso | 2019-08-29 | 2 | -4/+7
  Some functionality has been added to deqp-volt to only print regressions, so update our version of it and use the new options.

  Signed-off-by: Tomeu Vizoso <[email protected]>
  Acked-by: Alyssa Rosenzweig <[email protected]>
* gallivm: use fallback code for mul_hi with llvm >= 7.0 | Roland Scheidegger | 2019-08-29 | 1 | -1/+6
  LLVM 7.0 ditched the pmulu intrinsics. This is only a trivial patch to use the fallback code instead. It'll likely produce atrocious code since the pattern doesn't match what llvm itself uses in its autoupgrade paths, hence the pattern won't be recognized.

  Should fix https://bugs.freedesktop.org/show_bug.cgi?id=111496

  Reviewed-by: Jose Fonseca <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
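For readers unfamiliar with the operation, mul_hi returns the upper half of a full-width product. One common scalar way to express it is widen, multiply, shift, as in the sketch below. This shows only the arithmetic; the gallivm fallback emits LLVM IR for vectors and is not reproduced here, and `umul_hi` is a hypothetical name.

```c
#include <stdint.h>

/* High 32 bits of the 64-bit product of two unsigned 32-bit values. */
static inline uint32_t umul_hi(uint32_t a, uint32_t b)
{
   return (uint32_t)(((uint64_t)a * b) >> 32);
}
```

The cast to `uint64_t` before the multiply is what keeps the upper bits; multiplying two `uint32_t` values directly would discard them.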
* swr/rasterizer: Enable ARB_fragment_layer_viewport | Jan Zielinski | 2019-08-29 | 3 | -1/+21
  Added loading gl_Layer and gl_ViewportIndex variables to Pixel Shader context.

  Reviewed-by: Alok Hota <[email protected]>
* iris: use driconf for 'bo_reuse' parameter | Tapani Pälli | 2019-08-29 | 4 | -6/+20
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Don't auto-flush/dirty on transfer unmap for coherent buffers | Kenneth Graunke | 2019-08-28 | 1 | -1/+2
  When u_upload_mgr fills up a buffer, it unmaps and destroys it. Our unmap function was automatically performing the equivalent of a FlushMappedBufferRange call in this case.

  Because the buffer mapping is persistent and coherent, we don't actually do any flushing when we do the rest of the writes to the buffer - we were just doing one final one at the end. But we would be using the uploaded contents on the GPU the whole time.

  This certainly shouldn't be necessary for streaming buffers, and if such flushing and dirtying is necessary for coherent buffers, this is wildly insufficient.

  Drops a small number of constant packets and PIPE_CONTROL flushes from most benchmarks that I've looked at. Doesn't seem to make much of an impact on performance, however.

  Thanks to Felix Degrood for noticing that we were emitting more 3DSTATE_CONSTANT_* packets than we needed to.
* st/nine: Properly initialize GLSL types for NIR shaders. | Timur Kristóf | 2019-08-28 | 1 | -0/+5
  NIR shaders use GLSL types (note: these live outside libglsl), and nine needs to properly initialize these just like the other state trackers.

  This fixes an assertion failure when TTN is used.

  Signed-off-by: Timur Kristóf <[email protected]>
* iris: build android libmesa_iris for gen12 | Tapani Pälli | 2019-08-28 | 1 | -1/+21
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Build for gen12 | Jordan Justen | 2019-08-28 | 3 | -1/+7
  Signed-off-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* freedreno/a6xx: Fix non-mipmap filtering selection. | Eric Anholt | 2019-08-28 | 1 | -6/+6
  We were clamping the LOD to force non-mipmap filtering, but that means that the HW doesn't get to select between the min and mag filters. Setting MIPFILTER_LINEAR_FAR appears to force non-mipmap filtering.

  Fixes all failures in dEQP-GLES2.functional.texture.filtering.2d.*

  Reviewed-by: Rob Clark <[email protected]>
* gallium: Don't emit identical endian-dependent pack/unpack code. | Eric Anholt | 2019-08-28 | 1 | -5/+11
  Reduces the size of the u_format_table.c file by 140k (out of 1.64M) and makes me less confused about endianness in gallium.

  Reviewed-by: Roland Scheidegger <[email protected]>
  Acked-by: Adam Jackson <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* gallium: Fix big-endian addressing of non-bitmask array formats. | Eric Anholt | 2019-08-28 | 1 | -6/+17
  The formats affected are:
  - LA x (16_FLOAT, 32_FLOAT, 32_UINT, 32_SINT)
  - R8G8B8 x (UNORM, SNORM, SRGB, USCALED, SSCALED, UINT, SINT)
  - RG/RGB/RGBA x (64_FLOAT, 32_FLOAT, 16_FLOAT, 32_UNORM, 32_SNORM, 32_USCALED, 32_SSCALED, 32_FIXED, 32_UINT, 32_SINT)
  - RGB/RGBA x (16_UNORM, 16_SNORM, 16_USCALED, 16_SSCALED, 16_UINT, 16_SINT)
  - RGBx16 x (UNORM, SNORM, FLOAT, UINT, SINT)
  - RGBx32 x (FLOAT, UINT, SINT)
  - RA x (16_FLOAT, 32_FLOAT, 32_UINT, 32_SINT)

  The updated st_formats.c unit test checks that the formats affected by this change are all array formats in the equivalent Mesa format (if any). Mesa's array format definition is clear: the value stored is an array (increasing memory address) of values of the channel's type. It's also the only thing that makes sense for the RGB types, or very large types like RGBA64_FLOAT (A should not move to the low address because the cpu is BE).

  Acked-by: Roland Scheidegger <[email protected]>
  Acked-by: Adam Jackson <[email protected]>
  Tested-by: Matt Turner <[email protected]> (unit tests on BE)
  Reviewed-by: Marek Olšák <[email protected]>
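The array-format rule quoted above (channels laid out at increasing memory addresses, each in the channel's own type) can be illustrated with a small sketch. The function is hypothetical and reads a 16-bit-per-channel pixel; it is not taken from the generated u_format code.

```c
#include <stdint.h>
#include <string.h>

/* Read channel `chan` of a pixel stored as an array of 16-bit channels.
 * Channel 0 sits at the lowest address on both little- and big-endian
 * CPUs; only the byte order inside each channel follows the CPU. */
static uint16_t read_array_channel_u16(const void *pixel, unsigned chan)
{
   uint16_t v;

   memcpy(&v, (const uint8_t *)pixel + chan * sizeof(v), sizeof(v));
   return v;
}
```

Because the channel position is computed from the address rather than from shifts within a larger packed word, the same code selects the same channel regardless of the host's endianness.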
* gallium: Drop a bit of dead code from the pack/unpack python. | Eric Anholt | 2019-08-28 | 1 | -2/+0
  Nothing used this var.

  Reviewed-by: Roland Scheidegger <[email protected]>
  Acked-by: Adam Jackson <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* gallium: Drop the useless union wrapper on pack/unpack. | Eric Anholt | 2019-08-28 | 1 | -28/+22
  Nothing accessed the .value field, just the .chan. Unwrap all the code from the union, for clarity (and 13k less generated code).

  Reviewed-by: Roland Scheidegger <[email protected]>
  Acked-by: Adam Jackson <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* gallium: Skip generating the pack/unpack union if we don't use it. | Eric Anholt | 2019-08-28 | 1 | -1/+1
  Shaves 30k off of the 1.6M .c file, and makes for less noise for me trying to understand how gallium formats actually work.

  Reviewed-by: Roland Scheidegger <[email protected]>
  Acked-by: Adam Jackson <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* panfrost: Reset the damage area on imported resources | Boris Brezillon | 2019-08-28 | 1 | -11/+12
  Reset the damage area in the resource_from_handle() path (as done in panfrost_resource_create()).

  Signed-off-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* lima: fix texture descriptor issues | Vasily Khoruzhick | 2019-08-28 | 2 | -17/+13
  Looks like the initial RE was wrong and some fields have a different purpose. I.e. there's no "disable_mipmap" field; it's actually part of another field that selects mipmap filtering.

  Also fix layout position.

  Reviewed-by: Qiang Yu <[email protected]>
  Signed-off-by: Vasily Khoruzhick <[email protected]>
* iris: Drop swizzling parameter from s8_offset. | Kenneth Graunke | 2019-08-27 | 1 | -19/+3
  This is always false on Gen8+, so there is no need for dead code and parameters.
* radeonsi: fix scratch buffer WAVESIZE setting leading to corruption | Marek Olšák | 2019-08-27 | 3 | -31/+39
  Cc: 19.2 19.1 <[email protected]>
  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: unbind blend/DSA/rasterizer state correctly in delete functions | Marek Olšák | 2019-08-27 | 1 | -1/+9
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111414
  Fixes: b758eed9c37 ("radeonsi: make sure that blend state != NULL and remove all NULL checking")
  Cc: 19.2 <[email protected]>
  Tested-by: Edmondo Tommasina <[email protected]>
  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: align scratch and ring buffer allocations for faster memory access | Marek Olšák | 2019-08-27 | 3 | -7/+11
  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: consolidate determining VGPR_COMP_CNT for API VS | Marek Olšák | 2019-08-27 | 1 | -44/+32
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi/gfx10: set PA_CL_VS_OUT_CNTL with CONTEXT_REG_RMW to fix edge flags | Marek Olšák | 2019-08-27 | 5 | -18/+59
  We need two different values of the register, one for NGG and one for legacy, in order to fix edge flags for the legacy pipeline.

  Passing the ngg flag to emit_clip_regs would be too complicated, so CONTEXT_REG_RMW is used for partial register updates.

  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
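A partial register update of the kind mentioned above is conventionally a masked read-modify-write. The sketch below shows that arithmetic on a plain integer, on the assumption that the packet combines a mask and a value this way; `reg_rmw` is a hypothetical name and no hardware packet encoding is shown.

```c
#include <stdint.h>

/* Replace only the bits selected by `mask` with the corresponding bits
 * of `value`, leaving every other bit of `old` untouched. */
static inline uint32_t reg_rmw(uint32_t old, uint32_t mask, uint32_t value)
{
   return (old & ~mask) | (value & mask);
}
```

With this form, two code paths can each own a disjoint set of bits in the same register without needing to know the other's current value.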
* radeonsi/gfx10: remove incorrect ngg/pos_writes_edgeflag variables | Marek Olšák | 2019-08-27 | 4 | -21/+14
  It varies depending on si_shader_key::as_ngg.

  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: add PKT3_CONTEXT_REG_RMW | Marek Olšák | 2019-08-27 | 1 | -0/+30
  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* winsys/amdgpu+radeon: process AMD_DEBUG in addition to R600_DEBUG | Marek Olšák | 2019-08-27 | 2 | -4/+8
  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
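As a rough illustration of honoring two environment variables for the same set of debug flags, the sketch below checks a comma-separated list from each and treats a flag as set if either contains it. This is an assumption about the behavior, not Mesa's option parser; both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Return true if `flag` appears as a whole token in the comma-separated
 * `list`. A NULL list (unset variable) contains no flags. */
static bool list_has_flag(const char *list, const char *flag)
{
   size_t n = strlen(flag);

   if (!list)
      return false;

   while (*list) {
      size_t len = strcspn(list, ",");   /* length of the current token */

      if (len == n && strncmp(list, flag, n) == 0)
         return true;
      list += len;
      if (*list == ',')
         list++;
   }
   return false;
}

/* Hypothetical lookup: a flag is enabled if either variable names it. */
static bool debug_flag_enabled(const char *flag)
{
   return list_has_flag(getenv("R600_DEBUG"), flag) ||
          list_has_flag(getenv("AMD_DEBUG"), flag);
}
```

Under this sketch, running with `AMD_DEBUG=nongg` or `R600_DEBUG=nongg` would both make `debug_flag_enabled("nongg")` return true.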
* radeonsi/gfx10: add AMD_DEBUG=nongg | Marek Olšák | 2019-08-27 | 2 | -1/+4
  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi/gfx10: finish up Navi14, add PCI ID | Marek Olšák | 2019-08-27 | 1 | -1/+2
  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi/gfx10: always use the legacy pipeline for streamout | Marek Olšák | 2019-08-27 | 1 | -1/+1
  The best way to prevent GDS hangs is not to use GDS.

  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi/gfx10: don't initialize VGT_INSTANCE_STEP_RATE_0 | Marek Olšák | 2019-08-27 | 1 | -1/+2
  Only gfx9 and older use it to get InstanceID in VGPR1.

  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>