summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/freedreno
Commit message (Collapse)AuthorAgeFilesLines
* freedreno/a6xx: fix MSAA resolve hangsRob Clark2019-07-291-11/+4
| | | | | | | | | Seems like RB_BLIT_SCISSOR needs to be aligned to (minimum?) tile size. Fixes intermittent GPU hangs triggered by some of the three.js samples on https://threejs.org/ Signed-off-by: Rob Clark <[email protected]>
* freedreno: Fix helgrind complaint on shader-db key setup.Eric Anholt2019-07-291-2/+1
| | | | | | | If the variable's going to be static, we shouldn't be memsetting it from every thread and instead just have it in the data section. Reviewed-by: Rob Clark <[email protected]>
* gallium: switch boolean -> bool at the interface definitionsIlia Mirkin2019-07-2215-38/+38
| | | | | | | | | | | | | | | | | | This is a relatively minimal change to adjust all the gallium interfaces to use bool instead of boolean. I tried to avoid making unrelated changes inside of drivers to flip boolean -> bool to reduce the risk of regressions (the compiler will much more easily allow "dirty" values inside a char-based boolean than a C99 _Bool). This has been build-tested on amd64 with: Gallium drivers: nouveau r300 r600 radeonsi freedreno swrast etnaviv v3d vc4 i915 svga virgl swr panfrost iris lima kmsro Gallium st: mesa xa xvmc xvmc vdpau va Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Acked-by: Alyssa Rosenzweig <[email protected]>
* util: use standard name for snprintf()Eric Engestrom2019-07-191-1/+1
| | | | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* util: use standard name for sprintf()Eric Engestrom2019-07-191-1/+1
| | | | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/a6xx: Drop the WFI in the program update stateobj.Eric Anholt2019-07-171-2/+0
| | | | | | | | | | | | | Rob Clark thinks this was likely a workaround for our const buffer update bugs, and now that it's passing tests, we should be able to drop it. renderdoc-traces results: traces/android/clashofclans.rdc: +6.1% +/- 1.1% traces/android/candycrush.rdc: +5.2% +/- 1.6% Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Drop the WFI in constant uploads.Eric Anholt2019-07-171-2/+0
| | | | | | | | | | | | | | Now that the bin vs render constlen is fixed, we can skip these waits. Improves webgl aquarium performance at 10k fish from 27fps to 33. Some highlights from renderdoc-traces: traces/android/minecraft.rdc: +17.1% +/- 3.4% traces/glmark2/ideas-speed=duration.rdc: +11.6% +/- 2.4% traces/android/candycrush.rdc: +5.4% +/- 1.1% traces/android/clashofclans.rdc: +4.4% +/- 1.3% Reviewed-by: Rob Clark <[email protected]>
* freedreno: Assert that we don't exceed constlen.Eric Anholt2019-07-171-10/+24
| | | | | | | | | We actually could go up to vs->constlen in the binning shader on a6xx, but for sanity let's make sure that we're always under constlen. This would have caught the bug fixed in 572c76fd8826 ("freedreno: Clamp UBO uploads to the constlen decided by the shader.") Reviewed-by: Rob Clark <[email protected]>
* freedreno: Fix more constlen overflows.Eric Anholt2019-07-171-2/+5
| | | | | | | | | Fixes constlen overflow in dEQP-GLES31.functional.shaders.builtin_var.compute.num_work_groups and dEQP-GLES31.functional.image_load_store.buffer.image_size.readonly_32 and probably others. Reviewed-by: Rob Clark <[email protected]>
* freedreno: Drop stale comment about skipping uploads.Eric Anholt2019-07-171-1/+0
| | | | | | | We already skip the upload if it's unused, due to the constlen > offset check. Reviewed-by: Rob Clark <[email protected]>
* freedreno: Generate headers from xml filesKristian H. Kristensen2019-07-105-12/+12
| | | | | Reviewed-by: Eric Engestrom <[email protected]> Acked-by: Rob Clark <[email protected]>
* gallium: get rid of PIPE_CAP_SM3Erik Faye-Lund2019-07-101-1/+3
| | | | | | | | | | | | | | | | | | | | | PIPE_CAP_SM3 has always been an odd one out of all our caps. While most other caps are fine-grained and single-purpose, this cap encode several features in one. And since OpenGL cares more about single features, it'd be nice to get rid of this one. As it turns, this is now relatively simple. We only really care about three features using this cap, and those already got their own caps. So we can remove it, and make sure all current drivers just give the same response to all of them. The only place we *really* care about SM3 is in nine, and there we can instead just re-construct the information based on the finer-grained caps. This avoids DX9 semantics from needlessly leaking into all of the drivers, most of who doesn't care a whole lot about DX9 specifically. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Acked-by: Alyssa Rosenzweig <[email protected]>
* nir: Add lower_rotate flag and set to true in all driversSagar Ghuge2019-07-011-0/+1
| | | | | | Signed-off-by: Sagar Ghuge <[email protected]> Suggested-by: Matt Turner <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* freedreno/a6xx: wire up dither stateRob Clark2019-06-283-2/+14
| | | | | | | | | | | | | | | Fixes: dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.rebind_rbo_rgba4 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4 dEQP-GLES2.functional.fbo.render.recreate_colorbuffer.no_rebind_rbo_rgba4_stencil_index8 dEQP-GLES2.functional.fbo.render.recreate_depthbuffer.rebind_rbo_rgba4_depth_component16 dEQP-GLES2.functional.fbo.render.recreate_depthbuffer.no_rebind_rbo_rgba4_depth_component16 dEQP-GLES2.functional.fbo.render.recreate_stencilbuffer.rebind_rbo_rgba4_stencil_index8 dEQP-GLES2.functional.fbo.render.recreate_stencilbuffer.no_rebind_rbo_rgba4_stencil_index8 Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* nir: remove fnot/fxor/fand/for opcodesJonathan Marek2019-06-262-6/+2
| | | | | | | | | | There doesn't seem to be any reason to keep these opcodes around: * fnot/fxor are not used at all. * fand/for are only used in lower_alu_to_scalar, but easily replaced Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* freedreno: correct batch_depends_on() logicRob Clark2019-06-261-1/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: drop unused arg from fd_batch_flush()Rob Clark2019-06-2612-23/+23
| | | | | | | The `force` arg has been unused for a while.. but apparently I forgot to garbage collect it. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: fix batch leak in fd5 blitter pathRob Clark2019-06-241-0/+1
| | | | | Fixes: 3d198926a48 freedreno: use fd_bc_alloc_batch instead of fd_batch_create. Signed-off-by: Rob Clark <[email protected]>
* freedreno: Stop treating UBO 0 specially in UBO uploading.Eric Anholt2019-06-241-33/+1
| | | | | | | | | ir3_nir_analyze_ubo_ranges() has already told us how much of cb0 we need to upload (all of it, since it will lower indirect UBO 0 accesses from load_ubo back to indirection on the constant buffer). Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno: Clamp UBO uploads to the constlen decided by the shader.Rob Clark2019-06-241-0/+11
| | | | | | | | | | | | If the NIR-level analysis decided to move UBO loads to the constant file, but the backend decided not to load those constants, we could upload past the end of constlen. This is particularly relevant for pre-a6xx, where we emit a different constlen between bin and render variants. (Fix by Rob, commit message by anholt) Reviewed-by: Eric Anholt <[email protected]>
* freedreno: Remove silly return from ir3_optimize_nir().Eric Anholt2019-06-211-1/+3
| | | | | | | | We only ever return the shader we were passed in (but internally modified). Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: Stop reporting max_const in shader-db.Eric Anholt2019-06-211-2/+1
| | | | | | | We end up uploading constlen regardless, so max_const would only get you slightly improved granularity in const usage in comparison. Reviewed-by: Rob Clark <[email protected]>
* freedreno: Include binning shaders in shader-db.Eric Anholt2019-06-211-3/+8
| | | | | | | We want to see if we've improved our binning VS output, as well as the render VS. Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: disallow UBWC for x24s8Rob Clark2019-06-171-4/+15
| | | | | | | | | Fixes: dEQP-GLES31.functional.stencil_texturing.format.depth24_stencil8_2d dEQP-GLES31.functional.stencil_texturing.format.stencil_index8_2d dEQP-GLES31.functional.stencil_texturing.misc.compare_mode_effect Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: un-swap X24S8_UINTRob Clark2019-06-172-5/+6
| | | | | | | | | | | | | | | | | | | The stencil is actually in the .w component, but we used to use SWAP to remap the channels. This doesn't work when tiled/ubwc. Fixes: dEQP-GLES31.functional.stencil_texturing.format.depth24_stencil8_2d_array dEQP-GLES31.functional.stencil_texturing.format.depth24_stencil8_cube dEQP-GLES31.functional.stencil_texturing.format.stencil_index8_2d_array dEQP-GLES31.functional.stencil_texturing.format.stencil_index8_cube dEQP-GLES31.functional.stencil_texturing.misc.base_level dEQP-GLES31.functional.texture.border_clamp.formats.stencil_index8.nearest_size_pot dEQP-GLES31.functional.texture.border_clamp.formats.stencil_index8.nearest_size_npot dEQP-GLES31.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_pot dEQP-GLES31.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_npot dEQP-GLES31.functional.texture.border_clamp.sampler.uint_stencil Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: re-enable UBWC for depth/stencilRob Clark2019-06-151-0/+2
| | | | | | | | Now that we can blit depth/stencil in a way that plays nicely with UBWC, re-enable it. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: handle z24s8/z24x8 blits with u_blitterRob Clark2019-06-152-25/+11
| | | | | | | | | | Now that it can turn these blits into rendering to RB6_Z24_UNORM_S8_UINT it can properly handle cases where only one of depth+stencil is being blit. And this avoids lying about he format, which completely doesn't work when UBWC is used. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: handle fallback for rewritten blits ourselfRob Clark2019-06-151-11/+37
| | | | | | | | | For re-written z/s blits, we want to use the re-written `pipe_blit_info` even if we have to fallback to 3d pipe (`u_blitter`). So handle that fallback ourself. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: rename variableRob Clark2019-06-151-39/+39
| | | | | | | | | The name 'separate' doesn't make a while lot of sense, as only one of the cases is the blit actually split. But split out from previous patch in an attempt to reduce the noise. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: consolidate z/s blit handlingRob Clark2019-06-151-67/+46
| | | | | | | This will get even simpler with the next patch Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: fix MAX_INDICESRob Clark2019-06-131-2/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/blitter: remove dead codeRob Clark2019-06-131-7/+0
| | | | | | | The src/dst format is overriden from the pipe_blit_info, so this just logic just serves to confuse the reader. Signed-off-by: Rob Clark <[email protected]>
* freedreno: turn staging cube into 2d-arrayRob Clark2019-06-131-0/+2
| | | | | | | Since we could only need a subset of the layers, and otherwise we trigger an assert in util_max_layer() Signed-off-by: Rob Clark <[email protected]>
* freedreno: use util_dynarray_clear instead of util_dynarray_resize(_, 0)Nicolai Hähnle2019-06-125-12/+12
| | | | | | | | | This is more expressive and simplifies a subsequent change. v2: - fix one more call-site after rebase Reviewed-by: Marek Olšák <[email protected]>
* freedreno/a5xx: enable a540Rob Clark2019-06-112-2/+14
| | | | | Tested-by: Jeffrey Hugo <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: enable UBWC by defaultRob Clark2019-06-113-18/+3
| | | | | | | | Flip the FD_MESA_DEBUG flag to a disable rather than enable, drop the obsolete comment (and bonus, drop unused softpin debug flag) Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: disallow UBWC for z24s8Rob Clark2019-06-111-1/+0
| | | | | | | | | | | | | | This is slightly annoying because it *mostly* works.. but we have some issues to sort out about how to blit z24s8/x24s8/z24x8 with UBWC before we can enable UBWC by default. For now it is a step forward to at least enable it for non-z/s while we figure out how to blit z24s8+UBWC. (The basic issue is that pretending z24s8 is an equivalently sized rgba format for the purpose of blitting falls apart when UBWC is in the picture.) Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: use correct UBWC reg buildersRob Clark2019-06-112-11/+11
| | | | | | | | No functional change, the registers have the same layout as MRT flags pitch reg. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: disable UBWC for some formatsRob Clark2019-06-111-2/+0
| | | | | | | | | | | | An older blob claims to support UBWC w/ r32ui an r32i, but not r32f. Results from deqp indicate that it doesn't work with r32ui and r32i. This *could* also just mean that use as "IBO" (image) is more limited than as texture, although blob also doesn't seem to bother to try to use UBWC with images at all, so hard to know for sure. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: handle non-UWC-compatible image viewsRob Clark2019-06-115-1/+45
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: handle non-UBWC-compatible texture viewsRob Clark2019-06-113-0/+23
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: add helper to uncompress UBWC resourceRob Clark2019-06-112-0/+37
| | | | | | | | | | | | | | | | | | | | | We'll need this for a few edge cases, like image/sampler view that uses a format that UBWC does not support with a resource originally created in a format that UBWC does support. NOTE we *could* in some cases do an in-place uncompress. But that has a couple potential sharp edges: 1) the uncompressed buffer could have different layout, ie. a5xx with meta and pixel data of layers/levels interleaved. 2) if it comes mid-batch, it would force flush, or somehow fixing up cmdstream for draws already emitted. But with the resource shadowing approach we can rely on batch re-ordering to avoid splitting things.. older draws see the older compressed version, newer draws see the new uncompressed version of the rsc. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: handle images in rebind_resource()Rob Clark2019-06-111-0/+9
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: allow null discard box in shadow pathRob Clark2019-06-111-4/+10
| | | | | | | When uncompressing a UBWC buffer, we don't want to discard anything. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: swap UBWC state in shadow pathRob Clark2019-06-111-0/+4
| | | | | | | | | It doesn't come up yet, as so far we only hit this path with linear buffers. But it will when we start re-using the shadow path for uncompressing UBWC buffers. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: add modifier param to fd_try_shadow_resource()Rob Clark2019-06-111-3/+5
| | | | | | | | To uncompress UBWC, I want to re-use the shadow path, but we'll need a way to request that the new buffer is not compressed. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: correct modifier for UBWC buffersRob Clark2019-06-111-0/+3
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/a5xx: Fix indirect draw max_indices calculationEduardo Lima Mitev2019-06-111-2/+1
| | | | | | | | | | | | | | | | | The number of elements to draw should not be affected by the offset. A similar fix was submitted for a6xx at 79180a05. Fixes these dEQP tests on a5xx: dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_separate_grid_500x500_drawcount_8 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_separate_grid_500x500_drawcount_2500 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawarrays_separate_grid_500x500_drawcount_2500 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawarrays_combined_grid_500x500_drawcount_2500 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_500x500_drawcount_8 dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_500x500_drawcount_2500 Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: re-arrange program stageobj/groupRob Clark2019-06-074-30/+58
| | | | | | | | | | | | | | Split out a separate program config state group to run early before the other groups. This seems to help w/ intermittent "missed tiles" (although I had assumed that was a mem2gmem issue), or at least I can't reproduce that issue with this patch, but can without. It has the benefit of HLSQ_VS_CNTL.CONSTLEN matching for VS and BS. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: fix hangs with newer sqe fwRob Clark2019-06-071-32/+81
| | | | | | | | | | | | | | | | | | | With the newer (v1.76) fw, we were getting hangs (compared to older v1.66 fw). Re-work the GMEM code to structure things a bit closer to the blob. This moves some PKT7 packets from IB2 to IB1, which I think is what was confusing SQE and causing it to get stuck in an infinite loop. But in general structuring things at least closer to the same way blob does makes it easier to compare cmdstream. Note: this is a bit on the large side for what I'd normally consider for stable.. but right now it is looking like it is the newer fw that is headed for linux-firmware. This should defn have some soak time on master, but probably a good idea for this patch to end up in distro mesa builds by the time a630_sqe.fw hits linux-firmware. Cc: [email protected] Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>