path: root/src/gallium
Commit message, Author, Age, Files, Lines
* lima: run opt_algebraic between int_to_float and bool_to_float for vsVasily Khoruzhick2019-09-091-4/+5
| | | | | | | | | int_to_float emits ftrunc and ftrunc lowering generates bool ops. Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Erico Nunes <[email protected]> Reviewed-by: Andreas Baierl <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima/gpir: fix warning in gpir disassemblerVasily Khoruzhick2019-09-091-1/+1
| | | | | | | | | | | | | Fixes the following warning: ../src/gallium/drivers/lima/ir/gp/disasm.c: In function ‘print_src’: ../src/gallium/drivers/lima/ir/gp/disasm.c:241:20: warning: array subscript 28 is above array bounds of ‘char[5]’ [-Warray-bounds] 241 | "xyzw"[src - gpir_codegen_src_attrib_x]); Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Erico Nunes <[email protected]> Reviewed-by: Andreas Baierl <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
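The warning above arises when an enum difference is used directly as an index into the string literal "xyzw". A minimal sketch of a bounds-checked lookup follows; the enum, its values, and the function name are hypothetical, and this is not necessarily the change the commit applied.

```c
/* Hypothetical source encoding: only four consecutive values name attribute
 * components; every other value must not index the "xyzw" table. */
enum example_src {
   EXAMPLE_SRC_ATTRIB_X = 0,
   EXAMPLE_SRC_ATTRIB_Y,
   EXAMPLE_SRC_ATTRIB_Z,
   EXAMPLE_SRC_ATTRIB_W,
   EXAMPLE_SRC_REGISTER, /* some unrelated source kind */
   EXAMPLE_SRC_UNUSED = 31
};

/* Return the component letter for an attribute source, or '?' when the
 * source lies outside the attribute range. */
static char example_attrib_component(enum example_src src)
{
   static const char names[] = "xyzw";
   unsigned index = (unsigned)src - (unsigned)EXAMPLE_SRC_ATTRIB_X;

   /* sizeof(names) counts the terminating NUL, so valid indices are 0..3. */
   if (index >= sizeof(names) - 1)
      return '?';
   return names[index];
}
```

With the range check in place the compiler can see that the subscript never exceeds the literal's bounds.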
* lima/gpir: lower fceilVasily Khoruzhick2019-09-091-0/+1
| | | | | | | | | GP doesn't support fceil so we need to lower it. Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Erico Nunes <[email protected]> Reviewed-by: Andreas Baierl <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
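One common way to lower a ceiling operation on hardware that only has floor is the identity ceil(x) = -floor(-x). The sketch below illustrates that identity only; the function names are hypothetical and example_ffloor is a stand-in for a native floor operation, not lima code.

```c
/* Stand-in for a native floor operation. Assumption: x fits in an int,
 * which is enough to illustrate the identity. */
static float example_ffloor(float x)
{
   float truncated = (float)(int)x; /* rounds toward zero */
   return truncated > x ? truncated - 1.0f : truncated;
}

/* ceil(x) expressed with floor and two negations. */
static float example_lowered_fceil(float x)
{
   return -example_ffloor(-x);
}
```

The lowered form costs two negations in addition to the floor.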
* lima/gpir: Disallow moves for schedule_first nodesConnor Abbott2019-09-091-1/+5
| | | | | | | | | | | | The entire point of schedule_first is that the node has to be scheduled as soon as possible without any moves because it doesn't produce a proper floating-point value, or its value changes depending on where you read it. We were still introducing a move for preexp2 in some cases though, even if it got scheduled as soon as possible, which broke some exp() tests. Fix that. Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]>
* lima/gpir: Fix fake dep handling for schedule_first nodesConnor Abbott2019-09-092-10/+30
| | | | | | | | | | | | | The whole point of schedule_first nodes is that they need to be scheduled as soon as possible, so if a schedule_first node is the successor in a fake dependency that prevents it from being scheduled after its parent, that can cause problems. We need to add these fake dependencies to the parent as well, and we need to guarantee that the pre-RA scheduler puts schedule_first nodes right before their parents in order to prevent this from adding cycles to the dependency graph. Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]>
* lima/gpir: Fix schedule_first insertion logicConnor Abbott2019-09-091-2/+3
| | | | | | | | | | | The idea was to make sure schedule_first nodes were always first in the ready list. I made sure they were inserted first, but not that other nodes wouldn't later be scheduled ahead of them. Fixes [email protected]@execution@built-in-functions@vs-exp-float and probably others. Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]>
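The ordering requirement described above can be sketched as a sorted insertion whose comparison always ranks schedule_first nodes ahead of ordinary ones, so that later insertions cannot overtake them. The struct, the priority field, and the function names are hypothetical and do not mirror the gpir scheduler's data structures.

```c
#include <stdbool.h>
#include <stddef.h>

struct example_node {
   int priority;        /* hypothetical: larger values are scheduled earlier */
   bool schedule_first; /* must stay ahead of every ordinary node */
   struct example_node *next;
};

/* Ordering rule: schedule_first wins outright; priority only breaks ties
 * between nodes of the same kind. */
static bool example_goes_before(const struct example_node *a,
                                const struct example_node *b)
{
   if (a->schedule_first != b->schedule_first)
      return a->schedule_first;
   return a->priority > b->priority;
}

/* Insert node into the ready list while keeping the list sorted. */
static void example_insert_ready(struct example_node **head,
                                 struct example_node *node)
{
   struct example_node **link = head;

   while (*link && !example_goes_before(node, *link))
      link = &(*link)->next;
   node->next = *link;
   *link = node;
}
```

Because the rule lives in the comparison rather than in the insertion position, a high-priority node added later still lands behind any schedule_first node.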
* lima/gpir: Ignore unscheduled successors in can_use_complex()Connor Abbott2019-09-091-1/+2
| | | | | | | | | | The point of the function is to avoid creating a complex move which is used by certain slots in the next instruction, but unscheduled successors will never be in the next instruction. Found while debugging a crash that the previous commit fixed. Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]>
* lima/gpir: Do all lowerings before rschedConnor Abbott2019-09-093-23/+2
| | | | | | | | | | | | | | | | | | | | The scheduler assumes that load nodes are always duplicated so that they can always be scheduled eventually and therefore they never need to be spilled. But some lowerings were running after the pre-RA scheduler, whereas duplication has to happen before then since it's needed for the scheduler to do a better job reducing register pressure. This meant that lowerings were introducing multiple uses of a load instruction, which broke the scheduler's expectation and resulted in infinite loops in situations where the only nodes available to spill were load nodes. Spilling load nodes would be silly, so we want to fix the lowerings rather than the scheduler. Just do all lowerings before the pre-RA scheduler, which also helps with reducing pressure since the scheduler can more accurately compute the pressure. Fixes lima/mesa#104. Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]>
* panfrost: Rename pan_bo_cache.c into pan_bo.cBoris Brezillon2019-09-082-1/+1
| | | | | | | | So we can move all the BO logic into this file instead of having it spread over pan_resource.c, pan_drm.c and pan_bo_cache.c. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Get rid of the now unused SLAB allocatorBoris Brezillon2019-09-083-47/+0
| | | | | | | | | The last users have been converted to use plain BOs. Let's get rid of this abstraction. We can always consider adding it back if we need it at some point. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Get rid of unused panfrost_context fieldsBoris Brezillon2019-09-081-4/+0
| | | | | | | | Some fields in panfrost_context are unused (probably leftovers from previous refactor). Let's get rid of them. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Convert ctx->{scratchpad, tiler_heap, tiler_dummy} to plain BOsBoris Brezillon2019-09-083-18/+21
| | | | | | | | | ctx->{scratchpad,tiler_heap,tiler_dummy} are allocated using panfrost_drm_allocate_slab() but they never use any of the SLAB-based allocation logic. Let's convert those fields to plain BOs. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Make transient allocation rely on the BO cacheBoris Brezillon2019-09-085-104/+16
| | | | | | | | | Right now, the transient memory allocator implements its own BO caching mechanism, which is not really needed since we already have a generic BO cache. Let's simplify things a bit. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
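The idea of a generic BO cache can be sketched as follows: freed buffers are parked in the cache, and an allocation request first looks for a parked buffer that is large enough. Every name here is hypothetical, and the fixed-size slot array is a simplification rather than panfrost's actual cache layout.

```c
#include <stddef.h>

#define EXAMPLE_CACHE_SLOTS 4

struct example_bo {
   size_t size; /* size of the buffer object in bytes */
};

struct example_bo_cache {
   struct example_bo *slots[EXAMPLE_CACHE_SLOTS]; /* NULL means empty */
};

/* Park a no-longer-used BO; returns 0 when the cache is full and the
 * caller has to release the BO for real. */
static int example_cache_put(struct example_bo_cache *cache,
                             struct example_bo *bo)
{
   for (int i = 0; i < EXAMPLE_CACHE_SLOTS; i++) {
      if (!cache->slots[i]) {
         cache->slots[i] = bo;
         return 1;
      }
   }
   return 0;
}

/* Reuse a parked BO of at least `size` bytes, or return NULL so the
 * caller falls back to a fresh allocation. */
static struct example_bo *example_cache_fetch(struct example_bo_cache *cache,
                                              size_t size)
{
   for (int i = 0; i < EXAMPLE_CACHE_SLOTS; i++) {
      struct example_bo *bo = cache->slots[i];
      if (bo && bo->size >= size) {
         cache->slots[i] = NULL;
         return bo;
      }
   }
   return NULL;
}
```

A transient allocator built on such a cache only has to request and return BOs; it needs no reuse logic of its own.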
* panfrost: Stop passing a ctx to functions being passed a batchBoris Brezillon2019-09-084-21/+23
| | | | | | | | The context can be retrieved from batch->ctx. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Daniel Stone <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Pass a batch to panfrost_drm_submit_vs_fs_batch()Boris Brezillon2019-09-083-9/+9
| | | | | | | | | Given the function name it makes more sense to pass it a job batch directly. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Daniel Stone <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: s/job/batch/Boris Brezillon2019-09-0816-259/+264
| | | | | | | | | | | | What we currently call a job is actually a batch containing several jobs all attached to a rendering operation targeting a specific FBO. Let's rename structs, functions, variables and fields to reflect this fact. Suggested-by: Alyssa Rosenzweig <[email protected]> Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* android: fix linking issues with liblogTapani Pälli2019-09-071-1/+2
| | | | | | | | Fixes Android build errors observed in Intel CI. Fixes: f9f7cbc1aa3 "util: android logging support" Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* iris: Support the disable_throttling=true driconf option.Kenneth Graunke2019-09-063-0/+6
|
* amd: move adaptive sync to performance section, as it is defined in xmlpoolEric Engestrom2019-09-061-4/+1
| | | | | | | | Fixes: 3844ed8d44677588bc29 ("radv: Add adaptive_sync driconfig option and enable it by default.") Fixes: e260493f2ab2483e5a55 ("radeonsi: Enable adaptive_sync by default for radeon") Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* gallivm: drop LLVM<3.3 code paths as no build system allows thatEric Engestrom2019-09-065-42/+36
| | | | | Suggested-by: Michel Dänzer <[email protected]> Signed-off-by: Eric Engestrom <[email protected]>
* llvmpipe: replace more complex 3.x version check with LLVM_VERSION_MAJOR/MINOREric Engestrom2019-09-061-2/+3
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Acked-by: Michel Dänzer <[email protected]>
* clover: replace more complex 3.x version check with LLVM_VERSION_MAJOR/MINOREric Engestrom2019-09-061-2/+3
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Acked-by: Michel Dänzer <[email protected]>
* gallivm: replace more complex 3.x version check with LLVM_VERSION_MAJOR/MINOREric Engestrom2019-09-067-71/+74
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Acked-by: Michel Dänzer <[email protected]>
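The difference between a packed hexadecimal version check and a check on separate major/minor macros can be sketched as below. The fallback definitions exist only so the sketch compiles without LLVM headers; EXAMPLE_PACKED_VERSION and example_llvm_at_least are hypothetical names, and the packed layout (0x0305 for 3.5) is an assumption about the older scheme.

```c
/* Fallback values so this sketch builds without LLVM installed. */
#ifndef LLVM_VERSION_MAJOR
#define LLVM_VERSION_MAJOR 8
#define LLVM_VERSION_MINOR 0
#endif

/* Assumed older style: major and minor packed into one hex number. */
#define EXAMPLE_PACKED_VERSION ((LLVM_VERSION_MAJOR << 8) | LLVM_VERSION_MINOR)

/* Explicit style: compare major first, then minor. */
static int example_llvm_at_least(int major, int minor)
{
   return LLVM_VERSION_MAJOR > major ||
          (LLVM_VERSION_MAJOR == major && LLVM_VERSION_MINOR >= minor);
}

/* In preprocessor conditionals the two styles look like this: */
#if EXAMPLE_PACKED_VERSION >= 0x0305
/* packed check: the reader has to decode the hex constant */
#endif
#if LLVM_VERSION_MAJOR > 3 || (LLVM_VERSION_MAJOR == 3 && LLVM_VERSION_MINOR >= 5)
/* explicit check: the version reads directly from the condition */
#endif
```

When only the major version matters, the explicit form collapses to a single comparison such as `LLVM_VERSION_MAJOR >= 7`.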
* clover: replace major llvm version checks with LLVM_VERSION_MAJOREric Engestrom2019-09-062-17/+19
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Acked-by: Michel Dänzer <[email protected]>
* gallivm: replace major llvm version checks with LLVM_VERSION_MAJOREric Engestrom2019-09-068-26/+34
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Acked-by: Michel Dänzer <[email protected]>
* swr: replace major llvm version checks with LLVM_VERSION_MAJOREric Engestrom2019-09-061-3/+4
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Acked-by: Michel Dänzer <[email protected]>
* amd: replace major llvm version checks with LLVM_VERSION_MAJOREric Engestrom2019-09-064-6/+13
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Acked-by: Michel Dänzer <[email protected]>
* svga: replace binary HAVE_LLVM checks with LLVM_AVAILABLEEric Engestrom2019-09-061-1/+1
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Acked-by: Michel Dänzer <[email protected]>
* r600: replace binary HAVE_LLVM checks with LLVM_AVAILABLEEric Engestrom2019-09-061-6/+2
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Acked-by: Michel Dänzer <[email protected]>
* aux/draw: replace binary HAVE_LLVM checks with LLVM_AVAILABLEEric Engestrom2019-09-068-26/+26
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Acked-by: Michel Dänzer <[email protected]>
* gallivm: replace `0x` version print with actual version stringEric Engestrom2019-09-061-2/+3
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Acked-by: Michel Dänzer <[email protected]>
* anv,iris: L3ALLOC register replaces L3CNTLREG for gen12Jordan Justen2019-09-061-2/+13
| | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Stop redirecting state cache to command streamer cache sectionKenneth Graunke2019-09-061-6/+0
| | | | | | | | | | | | | | | | | | This bit redirects the state cache from the unified/RO sections of the L3 cache to the "CS command buffer" section of the cache, which would be set up via TCCNTLREG. The documentation says: "Additionaly, this redirection should be enabled only if there is a non-zero allocation for the CS command buffer section." We don't allocate any cache to the CS command buffer section, so enabling this redirection effectively disabled the state cache. The Windows driver only sets up that section when using POSH, which we do not currently use. So, leave it unallocated and disable the redirection to get a functional state cache again. Improves performance in Civilization VI by 18%, Manhattan 3.0 by 6%, and Car Chase by 2%.
* iris: Invalidate state/texture/constant caches after STATE_BASE_ADDRESSKenneth Graunke2019-09-061-4/+55
| | | | | | | | Jason pointed out that the caches likely refer to offsets from dynamic and surface state base addresses, so when we change those, we need to invalidate the caches. Comment borrowed from src/intel/vulkan/genX_cmd_buffer.c.
* freedreno/a6xx: Implement primitive count queries on GPUKristian H. Kristensen2019-09-0613-18/+123
| | | | | | | | | | The driver can't determine PIPE_QUERY_PRIMITIVES_GENERATED or PIPE_QUERY_PRIMITIVES_EMITTED once we support geometry or tessellation, since these stages add primitives at runtime. Use the WRITE_PRIMITIVE_COUNTS event to write back the primitive counts and implement a hw query for this. Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Let the GPU track streamout offsetsKristian H. Kristensen2019-09-064-19/+36
| | | | | | | | | | The GPU writes out streamout offsets as it goes to the FLUSH_BASE pointer. We use that value with CP_MEM_TO_REG when appending to the stream so that we don't have to track the offsets with the CPU in the driver. This ensures that streamout continues to work once we enable geometry and tessellation shader stages that add geometry. Reviewed-by: Rob Clark <[email protected]>
* llvmpipe: fix CALLOC vs. free mismatchesRoland Scheidegger2019-09-062-4/+5
| | | | | | | Should fix some issues we're seeing. And use REALLOC instead of realloc. Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
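Why allocator wrappers have to be paired can be shown with a small sketch: when a wrapper prepends bookkeeping to each block, the pointer it returns is not the pointer the C library handed out, so passing it to plain free() would be invalid. The wrapper names and the header layout are hypothetical and are not Mesa's CALLOC/FREE implementation.

```c
#include <stddef.h>
#include <stdlib.h>

/* Bookkeeping placed in front of every block. The union keeps the user
 * pointer aligned for any object type. */
union example_header {
   size_t size;
   max_align_t align;
};

static size_t example_live_bytes; /* bytes currently allocated via the wrapper */

/* Zero-initialized allocation that records its size. */
static void *example_calloc(size_t size)
{
   union example_header *header = calloc(1, sizeof(*header) + size);

   if (!header)
      return NULL;
   header->size = size;
   example_live_bytes += size;
   return header + 1; /* the user pointer starts after the header */
}

/* Matching release: step back to the header before calling free(). */
static void example_free(void *ptr)
{
   union example_header *header;

   if (!ptr)
      return;
   header = (union example_header *)ptr - 1;
   example_live_bytes -= header->size;
   free(header);
}
```

The same reasoning applies to reallocation: a block obtained through the wrapper has to be resized through the wrapper's own counterpart.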
* panfrost/ci: Increase timeoutsTomeu Vizoso2019-09-061-2/+2
| | | | | | | | Sometimes LAVA jobs will time out due to transient issues, and the GitLab job will fail in that case. Increase the timeouts to reduce the likelihood of that happening and reduce false positives. Signed-off-by: Tomeu Vizoso <[email protected]>
* panfrost/ci: Use special runner for LAVA jobsTomeu Vizoso2019-09-061-9/+1
| | | | | | | So repositories don't need to be specially configured with a token to access LAVA, store this token in a bind volume for a special runner. Signed-off-by: Tomeu Vizoso <[email protected]>
* panfrost/ci: Re-add support for armhfTomeu Vizoso2019-09-064-28/+39
| | | | | | | Now that Volt supports armhf, build images again and submit them to LAVA for RK3288. Signed-off-by: Tomeu Vizoso <[email protected]>
* radeon: Fix mjpeg issue for ARCTURUSZhu, James2019-09-061-0/+1
| | | | | | | ARCTURUS mjpeg is using direct register access. Signed-off-by: James Zhu <[email protected]> Reviewed-by: Boyuan Zhang <[email protected]>
* radeon/vcn: add RENOIR VCN decode supportLeo Liu2019-09-061-4/+4
| | | | | | | It has the same VCN2.x block as Navi1x. Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Boyuan Zhang <[email protected]>
* tgsi_to_nir: Remove dependency on libglsl.Timur Kristóf2019-09-062-14/+18
| | | | | | | | | This commit removes the GLSL dependency in TTN by manually recording the textures used and calling nir_lower_samplers instead of its GL counterpart. Signed-off-by: Timur Kristóf <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* radeonsi: Release storage for smda_uploads when the context is destroyedGert Wollny2019-09-061-0/+1
| | | | | | | | | | | | | This fixes a memory leak in the flush code: Direct leak of 128 byte(s) in 1 object(s) allocated from: #0 in __interceptor_realloc .../gcc-8.3.0/libsanitizer/asan/asan_malloc_linux.cc:105 #1 in si_buffer_do_flush_region src/gallium/drivers/radeonsi/si_buffer.c:573 #2 in si_buffer_flush_region src/gallium/drivers/radeonsi/si_buffer.c:608 #3 in si_buffer_flush_region src/gallium/drivers/radeonsi/si_buffer.c:597 Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* lima/ppir: don't lower phis to scalarVasily Khoruzhick2019-09-051-1/+0
| | | | | | | | Utgard PP is a vec4 architecture, so lowering phis to scalars increases instruction count and potentially interferes with spilling. Tested-by: Andreas Baierl <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
* freedreno/a2xx: formats updateJonathan Marek2019-09-064-247/+103
| | | | | | | | | | | | | | For render formats, update fd2_pipe2color to only work with HW supported render formats, and remove the format whitelist in is_format_supported. This patch enables float render formats (which work). For vertex/texture formats, use a generic function which translates using the bitsize of the channels. Since we fake support for some vertex formats, check for these in is_format_supported to avoid enabling them as sampler formats. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/a2xx: fix depth gmem restoreJonathan Marek2019-09-061-15/+12
| | | | | | | | | | | | | | Use fd_gmem_restore_format() to avoid trying to use unsupported Z24S8/Z16 render formats for gmem restore. Also apply this change to gmem2mem so it doesn't depend on fd2_pipe2color working with depth formats. gmem2mem/mem2gmem also doesn't need to use the swap/swizzle, since dst/src formats are the same. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/a2xx: implement polygon offsetJonathan Marek2019-09-061-0/+12
| | | | | | | | | Fixes failures in the following deqp tests: dEQP-GLES2.functional.polygon_offset.* Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
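For reference, the polygon offset value in OpenGL ES 2.0 is computed as o = m * factor + r * units, where m is the maximum depth slope of the polygon and r is the smallest resolvable depth difference. The sketch below only evaluates that formula; the function name is hypothetical, and how a2xx hardware registers consume the result is not shown.

```c
/* Depth offset added to each fragment of a polygon.
 *   m:      maximum depth slope of the polygon
 *   r:      implementation-dependent minimum resolvable depth difference
 *   factor: first argument of glPolygonOffset
 *   units:  second argument of glPolygonOffset */
static float example_polygon_offset(float m, float r, float factor, float units)
{
   return m * factor + r * units;
}
```

The value of r depends on the depth buffer format, so the caller has to supply it.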
* freedreno/a2xx: fix SRC_ALPHA_SATURATE for alpha blend functionJonathan Marek2019-09-061-1/+6
| | | | | | | | | Fixes failures in the following deqp tests: dEQP-GLES2.functional.fragment_ops.*src_alpha_saturate* Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
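In OpenGL ES 2.0, the SRC_ALPHA_SATURATE blend factor is min(As, 1 - Ad) for the color channels but 1 for the alpha channel, which is why the alpha blend function needs separate handling. A minimal sketch of that rule, with hypothetical type and function names:

```c
struct example_blend_factor {
   float rgb;   /* factor applied to the red, green and blue channels */
   float alpha; /* factor applied to the alpha channel */
};

static float example_min(float a, float b)
{
   return a < b ? a : b;
}

/* SRC_ALPHA_SATURATE: (f, f, f, 1) with f = min(As, 1 - Ad). */
static struct example_blend_factor
example_src_alpha_saturate(float src_alpha, float dst_alpha)
{
   struct example_blend_factor factor;

   factor.rgb = example_min(src_alpha, 1.0f - dst_alpha);
   factor.alpha = 1.0f;
   return factor;
}
```

A driver that programs color and alpha blend factors separately can therefore map the alpha side of this mode to the hardware's ONE factor.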
* freedreno/a2xx: ir2: update register state in scalar insertJonathan Marek2019-09-061-0/+6
| | | | | Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Rob Clark <[email protected]>