path: root/src/gallium/drivers
Commit message | Author | Age | Files | Lines
* llvmpipe: add initial compute state structs | Dave Airlie | 2019-09-04 | 3 | -0/+40
  These mirror the fragment shader structs; this is just a framework.
  Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: introduce compute shader context | Dave Airlie | 2019-09-04 | 6 | -0/+98
  The compute shader will need its own context like the frag shader has; this just introduces the framework struct and allocates/frees it in the right places.
  Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: add compute threadpool + mutex | Dave Airlie | 2019-09-04 | 6 | -2/+256
  In order to efficiently run a number of compute blocks, use a threadpool that just allows for jobs with unique sequential ids to be dispatched.
  Reviewed-by: Roland Scheidegger <[email protected]>
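The "unique sequential ids" idea can be sketched as follows. This is only an illustration: it swaps in a C11 atomic counter where the commit title mentions a mutex, and every name in it (`job_batch`, `run_jobs`, `mark_id`) is hypothetical rather than taken from llvmpipe.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical batch: call fn(data, id) once for each id in 0 .. num_jobs-1. */
struct job_batch {
    atomic_uint next_id;                  /* next unclaimed job id */
    unsigned num_jobs;                    /* total number of jobs in the batch */
    void (*fn)(void *data, unsigned id);  /* work to run for one id */
    void *data;                           /* opaque argument passed to fn */
};

/*
 * Loop a worker thread would run: claim ids until none remain.
 * atomic_fetch_add guarantees each id is handed to exactly one caller.
 * Returns how many jobs this caller executed.
 */
static unsigned run_jobs(struct job_batch *b)
{
    unsigned done = 0;
    for (;;) {
        unsigned id = atomic_fetch_add(&b->next_id, 1);
        if (id >= b->num_jobs)
            break;
        b->fn(b->data, id);
        done++;
    }
    return done;
}

/* Example callback (hypothetical): count how often each id was run. */
static void mark_id(void *data, unsigned id)
{
    ((unsigned char *)data)[id]++;
}
```

Several threads could call `run_jobs` on the same batch concurrently; each id would still be executed once, because claiming an id and advancing the counter happen in a single atomic step.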
* llvmpipe: reogranise jit pointer ordering | Dave Airlie | 2019-09-04 | 2 | -31/+31
  In order to share the texture/image/sampler code with compute shaders, we need to reorganise them so they sit at the front of the context, the same way draw does for vs/gs sharing.
  Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: enable fb no attach | Dave Airlie | 2019-09-04 | 1 | -1/+2
  Reviewed-by: Roland Scheidegger <[email protected]>
* iris: Report correct number of planes for planar images | Kenneth Graunke | 2019-09-03 | 1 | -1/+8
  We were only handling the modifiers case and not counting the number of planes in actual planar images.
  Fixes Piglit's ext_image_dma_buf_import-export.
  Fixes: fc12fd05f56 ("iris: Implement pipe_screen::resource_get_param")
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111509
  Reviewed-by: Jordan Justen <[email protected]>
* lima: Return fence unconditionally | Roman Stratiienko | 2019-09-04 | 1 | -4/+2
  Based on the vc4 implementation.
  Fixes Android RenderEngine::flush() routine: android.googlesource.com/platform/frameworks/native/+/refs/tags/android-o-mr1-iot-release-smart-clock-fcs/services/surfaceflinger/RenderEngine/RenderEngine.cpp#225
  Signed-off-by: Roman Stratiienko <[email protected]>
  Reviewed-by: Qiang Yu <[email protected]>
* lima/ppir: clone uniforms and load_coords into each successor | Vasily Khoruzhick | 2019-09-04 | 4 | -41/+155
  Try a more aggressive approach to cloning uniform and coord loads.
  A uniform load can be inserted into any instruction, so let's do that. ARM's site claims that the penalty for a cache miss is one clock, so we don't lose anything if we merge the load into the instruction that uses the result. As a side effect, we can also pipeline it and thus decrease register pressure.
  Do the same for varyings that hold texture coords, but for a different reason: it looks like there's a special path for coords that increases precision if the varying that holds them is pipelined. If we don't pipeline it and instead load coords from a register, the precision is fp16 and thus only 10 bits, which is not enough to accurately sample textures of size 1024 or larger.
  Since an instruction can hold only one uniform load and one varying load, node_to_instr now creates a move, using the helper introduced in the previous commit, if the slot is already taken.
  As a side effect of this change, we can also try to pipeline texture loads and create a move if the attempt fails.
  Reviewed-by: Erico Nunes <[email protected]>
  Signed-off-by: Vasily Khoruzhick <[email protected]>
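The fp16 precision argument can be checked with a little arithmetic. The following is a minimal sketch, assuming normalized coordinates in [0.5, 1.0) and the standard IEEE half-precision layout with 10 stored mantissa bits; the helper name is hypothetical and nothing here is taken from the lima source.

```c
#include <assert.h>

/*
 * fp16 has 10 stored mantissa bits, so the interval [0.5, 1.0) contains
 * 2^10 = 1024 representable values. For a texture `width` texels wide,
 * that interval spans width / 2 texels, leaving 2048 / width distinct
 * coordinate values per texel (rounded down).
 */
static unsigned fp16_coords_per_texel(unsigned width)
{
    assert(width > 0);
    return 2048u / width;
}
```

Under these assumptions a 256-texel texture gets 8 distinct coordinates per texel, while a 1024-texel texture gets only 2, which leaves almost no sub-texel resolution for filtering.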
* lima/ppir: don't assume that load coords gets value from register | Vasily Khoruzhick | 2019-09-04 | 3 | -9/+13
  It can load the value from a varying directly as well.
  Also, load_regs is the only op that has a source, so add a src_num field to the load node and set it accordingly.
  Reviewed-by: Erico Nunes <[email protected]>
  Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima/ppir: add common helper for creating movs | Vasily Khoruzhick | 2019-09-04 | 3 | -49/+41
  Introduce a common helper for creating movs to avoid code duplication.
  Reviewed-by: Erico Nunes <[email protected]>
  Signed-off-by: Vasily Khoruzhick <[email protected]>
* freedreno: Fix the type of single-component scaled vertex attrs. | Eric Anholt | 2019-09-03 | 4 | -12/+12
  This looks like clear copy-and-pasteos, and fixes:
  dEQP-GLES2.functional.draw.random.40
  (on A307 and A630, both tested in the new CI farm)
  Reviewed-by: Rob Clark <[email protected]>
* radeonsi/nir: Remove uniform variable scanning | Connor Abbott | 2019-09-03 | 1 | -84/+7
  We can get all the information we need from NIR. It's slightly less accurate, but radeonsi doesn't use the extra information. The old code also overcounted atomic counters, which led to problems when everything was used at once.
  Fixes KHR-GL45.compute_shader.resources-max.
  Reviewed-by: Marek Olšák <[email protected]>
* nir: Fix num_ssbos when lowering atomic counters | Connor Abbott | 2019-09-03 | 1 | -4/+1
  Otherwise it's impossible to know the maximum SSBO index for both internal TGSI shaders from TTN (which don't have any notion of atomic counters and no offset) as well as shaders from GLSL.
  I fixed everything I could find while grepping for num_ssbos and num_abos, which hopefully is everything (iris was the only user I could find that uses it in a meaningful way).
  Reviewed-by: Marek Olšák <[email protected]>
* panfrost: Remove panfrost_upload | Alyssa Rosenzweig | 2019-09-03 | 2 | -26/+0
  This routine was made obsolete over a series of reworks of memory allocation; Tomeu's changes to shader memory allocation finally made it unused, as cppcheck noted.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Tomeu Vizoso <[email protected]>
* panfrost: Fix misc. issues flagged by cppcheck | Alyssa Rosenzweig | 2019-09-03 | 3 | -10/+7
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Tomeu Vizoso <[email protected]>
* panfrost: Mark (1 << 31) as unsigned | Alyssa Rosenzweig | 2019-09-03 | 1 | -3/+3
  I was not aware this incurred undefined behaviour; thank you cppcheck.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Tomeu Vizoso <[email protected]>
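For context, with a 32-bit `int` the expression `1 << 31` shifts a one into the sign bit of a signed integer, which the C standard leaves undefined, whereas `1u << 31` is an ordinary unsigned shift. A minimal sketch of the unsigned form follows; the macro and function names are hypothetical and not taken from panfrost.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag for illustration only. The `u` suffix makes the
 * literal unsigned, so shifting into bit 31 is well defined. */
#define FLAG_TOP_BIT (1u << 31)

/* Compile-time check that the unsigned shift yields 2^31. */
static_assert(FLAG_TOP_BIT == 2147483648u, "bit 31 must equal 2^31");

/* Returns nonzero when the top bit of a 32-bit word is set. */
static int top_bit_set(uint32_t word)
{
    return (word & FLAG_TOP_BIT) != 0;
}
```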
* broadcom/vc4: Expand width of dst surface | Zhaowei Yuan | 2019-09-03 | 1 | -1/+1
  Four bytes of src_surf are packed into one 32-bit value and stored into dst_surf, and dst_surf is read in z-order, so its width must be aligned to a multiple of 8 (4x2) before being divided by 2.
  Signed-off-by: Zhaowei Yuan <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111266
  Reviewed-by: Alejandro Piñeiro <[email protected]>
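One reading of the alignment rule described in this commit message, written out as arithmetic. This is a sketch under the assumption that "aligned" means rounding up; the helper names are hypothetical and the actual vc4 code may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next multiple of a; a must be a power of two. */
static uint32_t align_up(uint32_t v, uint32_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (v + a - 1) & ~(a - 1);
}

/* Hypothetical helper: align the width to a multiple of 8 first,
 * then halve it. Halving first could leave the result unaligned. */
static uint32_t dst_width_for(uint32_t src_width)
{
    return align_up(src_width, 8) / 2;
}
```

For example, a width of 9 is first rounded up to 16 and then halved to 8, while halving before aligning would start from 4 and lose the extra column.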
* swr: Fix make_unique build error. | Vinson Lee | 2019-09-02 | 1 | -3/+3
  swr_shader.cpp: In function ‘void (* swr_compile_gs(swr_context*, swr_jit_gs_key&))(HANDLE, HANDLE, SWR_GS_CONTEXT*)’:
  swr_shader.cpp:732:44: error: ‘make_unique’ was not declared in this scope
  ctx->gs->map.insert(std::make_pair(key, make_unique<VariantGS>(builder.gallivm, func)));
  ^~~~~~~~~~~
  Signed-off-by: Vinson Lee <[email protected]>
  Reviewed-by: Jan Zielinski <[email protected]>
* iris: Lessen texture cache hack flush for blits/copies on Icelake. | Kenneth Graunke | 2019-08-31 | 1 | -16/+34
  Lionel found actual documentation for this at long last. Apparently it actually is a sampler cache limitation that was mostly fixed on Icelake. Unfortunately, it seems there are still issues with ASTC and non-ASTC sampler views. Still, we can lessen the flush condition from "format mismatch" to "ASTC mismatch", which eliminates most of the flushing here.
  We also update the documentation to refer to the workaround name.
* swr: Fix build with llvm-9.0 again. | Vinson Lee | 2019-08-31 | 3 | -0/+28
  Commit 6f7306c029a7 ("swr/rast: Refactor memory API between rasterizer core and swr") unintentionally removed changes for llvm-9.0.
  Fixes: 6f7306c029a7 ("swr/rast: Refactor memory API between rasterizer core and swr")
  Fixes: 5dd9ad157005 ("swr/rasterizer: Better implementation of scatter")
  Signed-off-by: Vinson Lee <[email protected]>
  Reviewed-by: Jan Zielinski <[email protected]>
* pan/midgard: Use shared psiz clamp pass | Alyssa Rosenzweig | 2019-08-30 | 2 | -76/+0
  We already had a perfectly cromulent pass for this, but one landed in common NIR code so let's switch and lighten our tree.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add transient BOs to job batches | Boris Brezillon | 2019-08-30 | 2 | -1/+2
  Memory allocated through panfrost_allocate_transient() is likely to come from the transient pool. Let's add the BO backing the allocated memory region to the job batch so the kernel can retain this BO while jobs are executed.
  In practice that has never been a problem because the transient pool is never shrunk, and even if it were, we still control the lifetime of the job, so there's no reason for this BO to be freed before the GPU is done executing the batch. But it still makes sense to add the BO for debugging purposes.
  Signed-off-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: protect access to shared bo cache and transient pool | Rohan Garg | 2019-08-30 | 5 | -5/+23
  Both the BO cache and the transient pool are shared across contexts. Protect access to these with mutexes.
  Signed-off-by: Rohan Garg <[email protected]>
  Reviewed-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
  Signed-off-by: Boris Brezillon <[email protected]>
* panfrost: Jobs must be per context, not per screen | Rohan Garg | 2019-08-30 | 5 | -17/+14
  Jobs _must_ only be shared within the same context; tracking last_job in the screen causes use-after-free issues and memory corruption.
  Signed-off-by: Rohan Garg <[email protected]>
  Reviewed-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
  Signed-off-by: Boris Brezillon <[email protected]>
* freedreno/a3xx: fix sysmem <-> gmem tiles transfer | Khaled Emara | 2019-08-30 | 2 | -2/+3
  Tiling mode was missing from fd3_emit_gmem_restore_tex(). emit_gmem2mem_surf() used LINEAR exclusively.
  Reviewed-by: Rob Clark <[email protected]>
* freedreno/a3xx: fix texture tiling parameters | Khaled Emara | 2019-08-30 | 1 | -10/+21
  - Fix 2D/2DArray/3D tiling parameters: there is a bottom threshold for width and height.
  - Re-enable tiling for Cubemap, after setting the right parameters.
  Reviewed-by: Rob Clark <[email protected]>
* broadcom/v3d: Allow importing linear BOs with arbitrary offset/stride. | Dave Stevenson | 2019-08-30 | 1 | -8/+23
  Equivalent of 0c1dd9dee "broadcom/vc4: Allow importing linear BOs with arbitrary offset/stride." for v3d.
  Allows YUV buffers with a single buffer and plane offsets to be passed in.
  Signed-off-by: Dave Stevenson <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* swr/rasterizer: Fix GS attributes processing | Jan Zielinski | 2019-08-30 | 3 | -24/+10
  Input to GS is just a set of attributes, so remove explicit setup of 'position' which is meaningless for GS input processing.
  Reviewed-by: Alok Hota <[email protected]>
* ac: drop now useless lookup_interp_param from ABI | Samuel Pitoiset | 2019-08-30 | 1 | -1/+0
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* ac: import linear/perspective PS input parameters from radv/radeonsi | Samuel Pitoiset | 2019-08-30 | 2 | -17/+19
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add JPEG decode support for VCN 2.0 devices | Thong Thai | 2019-08-29 | 1 | -3/+1
  Signed-off-by: Thong Thai <[email protected]>
  Reviewed-by: Boyuan Zhang <[email protected]>
* Revert "radeonsi: don't emit PKT3_CONTEXT_CONTROL on amdgpu" | Thong Thai | 2019-08-29 | 1 | -7/+4
  This reverts commit 5a2e65be89d97ed5d7263f0296ea69ae8517187b.
  Even though CONTEXT_CONTROL is emitted by the kernel, CONTEXT_CONTROL still needs to be emitted by the UMD, or else the driver will hang.
  Cc: 19.2 <[email protected]>
  Signed-off-by: Thong Thai <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* iris: Fix partial fast clear checks to account for miplevel. | Kenneth Graunke | 2019-08-29 | 1 | -2/+2
  We enabled fast clears at level > 0, but didn't minify the dimensions when comparing the box size, so we always thought it was a partial clear and as a result never actually enabled any.
  This eliminates some slow clears in Civilization VI, but they are mostly during initialization and not the main rendering.
  Thanks to Dan Walsh for noticing we had too many slow clears.
  Fixes: 393f659ed83 ("iris: Enable fast clears on other miplevels and layers than 0.")
  Reviewed-by: Rafael Antognolli <[email protected]>
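The mismatch described in this commit can be illustrated with a small sketch: the clear box has to be compared against the dimensions of the selected miplevel, not the base level. The function names below are hypothetical and the sketch does not reproduce the iris code; it only shows the usual "halve per level, never below 1" minification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Size of a dimension at the given miplevel: halve per level, minimum 1. */
static uint32_t minify(uint32_t size, unsigned level)
{
    assert(level < 32);
    uint32_t s = size >> level;
    return s > 0 ? s : 1;
}

/* Hypothetical check: does a box of box_w x box_h cover the whole of
 * miplevel `level` of a texture whose base level is base_w x base_h? */
static bool box_covers_level(uint32_t box_w, uint32_t box_h,
                             uint32_t base_w, uint32_t base_h,
                             unsigned level)
{
    return box_w >= minify(base_w, level) &&
           box_h >= minify(base_h, level);
}
```

With a 256x256 base level, a 64x64 box covers all of level 2, but the same box compared against the unminified base size would look like a partial clear.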
* panfrost: Remove unused argument from panfrost_drm_submit_vs_fs_job() | Rohan Garg | 2019-08-29 | 3 | -5/+3
  is_scanout is not used anywhere and can be inferred within panfrost_drm_submit_vs_fs_job() if required.
  Signed-off-by: Rohan Garg <[email protected]>
  Reviewed-by: Boris Brezillon <[email protected]>
  Signed-off-by: Boris Brezillon <[email protected]>
* iris: Actually describe bo_reuse driconf option | Kenneth Graunke | 2019-08-29 | 1 | -0/+10
  Otherwise it doesn't exist and can't be parsed, so everything dies at screen init time.
  Fixes: 6dc4ddc5f81 ("iris: use driconf for 'bo_reuse' parameter")
* panfrost/ci: Print only regressions | Tomeu Vizoso | 2019-08-29 | 2 | -4/+7
  Some functionality has been added to deqp-volt to only print regressions, so update our version of it and use the new options.
  Signed-off-by: Tomeu Vizoso <[email protected]>
  Acked-by: Alyssa Rosenzweig <[email protected]>
* swr/rasterizer: Enable ARB_fragment_layer_viewport | Jan Zielinski | 2019-08-29 | 3 | -1/+21
  Added loading of the gl_Layer and gl_ViewportIndex variables to the Pixel Shader context.
  Reviewed-by: Alok Hota <[email protected]>
* iris: use driconf for 'bo_reuse' parameter | Tapani Pälli | 2019-08-29 | 4 | -6/+20
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Don't auto-flush/dirty on transfer unmap for coherent buffers | Kenneth Graunke | 2019-08-28 | 1 | -1/+2
  When u_upload_mgr fills up a buffer, it unmaps and destroys it. Our unmap function was automatically performing the equivalent of a FlushMappedBufferRange call in this case.
  Because the buffer mapping is persistent and coherent, we don't actually do any flushing when we do the rest of the writes to the buffer - we were just doing one final one at the end. But we would be using the uploaded contents on the GPU the whole time.
  This certainly shouldn't be necessary for streaming buffers, and if such flushing and dirtying is necessary for coherent buffers, this is wildly insufficient.
  Drops a small number of constant packets and PIPE_CONTROL flushes from most benchmarks that I've looked at. Doesn't seem to make much of an impact on performance, however.
  Thanks to Felix Degrood for noticing that we were emitting more 3DSTATE_CONSTANT_* packets than we needed to.
* iris: build android libmesa_iris for gen12 | Tapani Pälli | 2019-08-28 | 1 | -1/+21
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Build for gen12 | Jordan Justen | 2019-08-28 | 3 | -1/+7
  Signed-off-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* freedreno/a6xx: Fix non-mipmap filtering selection. | Eric Anholt | 2019-08-28 | 1 | -6/+6
  We were clamping the LOD to force non-mipmap filtering, but that means that the HW doesn't get to select between the min and mag filters. Setting MIPFILTER_LINEAR_FAR appears to force non-mipmap filtering.
  Fixes all failures in dEQP-GLES2.functional.texture.filtering.2d.*
  Reviewed-by: Rob Clark <[email protected]>
* panfrost: Reset the damage area on imported resources | Boris Brezillon | 2019-08-28 | 1 | -11/+12
  Reset the damage area in the resource_from_handle() path (as done in panfrost_resource_create()).
  Signed-off-by: Boris Brezillon <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
* lima: fix texture descriptor issues | Vasily Khoruzhick | 2019-08-28 | 2 | -17/+13
  Looks like the initial RE was wrong and some fields have a different purpose. For example, there's no "disable_mipmap" field; it's actually part of another field that selects mipmap filtering.
  Also fix layout position.
  Reviewed-by: Qiang Yu <[email protected]>
  Signed-off-by: Vasily Khoruzhick <[email protected]>
* iris: Drop swizzling parameter from s8_offset. | Kenneth Graunke | 2019-08-27 | 1 | -19/+3
  This is always false on Gen8+, so there is no need for dead code and parameters.
* radeonsi: fix scratch buffer WAVESIZE setting leading to corruption | Marek Olšák | 2019-08-27 | 3 | -31/+39
  Cc: 19.2 19.1 <[email protected]>
  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: unbind blend/DSA/rasterizer state correctly in delete functions | Marek Olšák | 2019-08-27 | 1 | -1/+9
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111414
  Fixes: b758eed9c37 ("radeonsi: make sure that blend state != NULL and remove all NULL checking")
  Cc: 19.2 <[email protected]>
  Tested-by: Edmondo Tommasina <[email protected]>
  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: align scratch and ring buffer allocations for faster memory access | Marek Olšák | 2019-08-27 | 3 | -7/+11
  Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: consolidate determining VGPR_COMP_CNT for API VS | Marek Olšák | 2019-08-27 | 1 | -44/+32
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi/gfx10: set PA_CL_VS_OUT_CNTL with CONTEXT_REG_RMW to fix edge flags | Marek Olšák | 2019-08-27 | 5 | -18/+59
  We need two different values of the register, one for NGG and one for legacy, in order to fix edge flags for the legacy pipeline.
  Passing the ngg flag to emit_clip_regs would be too complicated, so CONTEXT_REG_RMW is used for partial register updates.
  Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
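The partial-update idea can be sketched generically as a masked read-modify-write. The helper name and signature below are hypothetical, and the sketch models only the bit arithmetic, not the actual packet encoding or the radeonsi code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Masked read-modify-write: bits selected by `mask` are taken from
 * `value`, all other bits keep what `old` already held.
 */
static uint32_t reg_rmw(uint32_t old, uint32_t mask, uint32_t value)
{
    return (old & ~mask) | (value & mask);
}
```

Two callers that own disjoint masks can each update their own fields of one register this way without needing to know the other's current value.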