path: root/src/gallium/drivers/freedreno
Commit message | Author | Age | Files | Lines
* freedreno: Stop scattered remapping of SSBOs/images to IBOs.Eric Anholt2020-01-216-32/+37
| | | | | | | | | | | | | | | | | | | | | | | Just make it be all SSBOs then all storage images. The remapping table was there to make it so that the big gap present from gallium's atomic lowering would get cleaned up, but that's no longer the case. The table has made it very hard to support Vulkan storage images, so it's time for it to go. This does mean that an SSBO/IBO that is only loaded (or size-queried) will now occupy a slot in the table where it wouldn't before. This seems like a minor cost compared to being able to drop this much logic. With the remapping table gone, SSBO array handling for turnip just falls out. Fixes many array cases of dEQP-VK.binding_model.shader_access.primary_cmd_buf.storage_buffer.* Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Jonathan Marek <[email protected]> (turnip) Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>
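The "all SSBOs, then all storage images" layout described above can be sketched as two index helpers. This is only an illustration of the ordering, not the driver's code; `ibo_layout`, `ssbo_to_ibo`, and `image_to_ibo` are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical description of a shader's storage resources. */
struct ibo_layout {
    uint32_t num_ssbos;   /* SSBO slots declared by the shader */
    uint32_t num_images;  /* storage image slots declared by the shader */
};

/* SSBOs occupy IBO slots [0, num_ssbos), so the mapping is the identity. */
static inline uint32_t ssbo_to_ibo(const struct ibo_layout *l, uint32_t ssbo)
{
    (void)l;
    return ssbo;
}

/* Storage images follow directly after the last SSBO slot. */
static inline uint32_t image_to_ibo(const struct ibo_layout *l, uint32_t image)
{
    return l->num_ssbos + image;
}
```

With no per-resource table, a slot is reserved for every declared SSBO and image, including ones that are only loaded or size-queried, which is the cost the message mentions.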
* nir: Drop the ssbo_offset to atomic lowering.Eric Anholt2020-01-211-1/+1
| | | | | | | | | | | | | The arguments passed in were: - prog->info.num_ssbos - prog->nir->info.num_ssbos - arbitrary values for standalone compilers The num_ssbos should match between the prog's info and prog->nir's info until this lowering happens. Reviewed-by: Marek Olšák <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3240>
* freedreno/a6xx: add PROG_FB_RAST stateobjRob Clark2020-01-172-0/+6
| | | | | | | | | | For the handful of registers that depend on the union of program/ framebuffer/rasterizer state. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
* freedreno/a6xx: move dynamic program state to streaming stateobjRob Clark2020-01-174-44/+61
| | | | | | | | | Move the program state which we can't pre-bake to a streaming state object, rather than emitting directly in the draw cmdstream. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
* freedreno/a6xx: drop a few more per-draw registersRob Clark2020-01-172-8/+23
| | | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
* freedreno/a6xx: separate rast stateobj for prim restartRob Clark2020-01-177-37/+67
| | | | | | | | | | | | | | This lets us move PC_PRIMITIVE_CNTL into the rasterizer stateobj, rather than unconditionally emitting it directly in the cmdstream on every draw. This also starts adding some tracking about previous draw state, so that following patches can limit some of the register writes we currently emit on every draw. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
* freedreno/a6xx: cleanup rasterizer stateRob Clark2020-01-172-89/+54
| | | | | | | | | | | | All but one of the reg values is only used in the stateobj, so we can inline the register value setup and stateobj construction. While we are at it, switch over to the new register builders. Prep work for next patch. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
* freedreno/a6xx: limit scratch/debug markers to debug buildsRob Clark2020-01-171-2/+10
| | | | | | | | | | | | | | | | | The overhead does seem to matter when you have a high enough # of draw calls that affect few bins/pixels, because these writes would happen unconditionally (ie. not part of a state-group). Possibly we could keep these if we moved them into a state-group so the register writes would be no-ops on bins with no geometry. OTOH I usually end up adding in a WFI when using the scratch reg values to track down a crash. (So add a WFI to mitigate the annoyance of needing to use a debug build to get scratch regs to locate the position of a crash/hang in the cmdstream.) Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
* freedreno/registers: document vertex/instance id offset bitsJonathan Marek2019-12-191-1/+1
| | | | | | Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3162>
* freedreno/a6xx: Set up multisample sysmem MRTs correctlyKristian H. Kristensen2019-12-191-3/+1
| | | | | | | | | We had an extra factor of num_samples in the stride. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
* freedreno/a6xx: Rewrite compressed blits in a helper functionKristian H. Kristensen2019-12-191-33/+63
| | | | | | | | Similar to how we handle zs blits. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
* freedreno/a6xx: Move handle_rgba_blit() upKristian H. Kristensen2019-12-191-53/+51
| | | | | | | | If we move this function up, we don't have to forward declare it. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
* freedreno/a6xx: Handle srgb blits on the blitterKristian H. Kristensen2019-12-191-10/+20
| | | | | | Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
* freedreno/a6xx: Use A6XX_SP_2D_SRC_FORMAT_MASK macroKristian H. Kristensen2019-12-191-1/+1
| | | | | | Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
* freedreno/a6xx: RB6_R8G8B8 is actually 32 bit RGBXKristian H. Kristensen2019-12-193-10/+10
| | | | | | Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
* freedreno/a6xx: Use blitter for resolve blitsKristian H. Kristensen2019-12-191-24/+5
| | | | | | | | | We have a SAMPLES_AVERAGE bit that does what we need for resolving multisample buffers - let's use it. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
* freedreno/a6xx: Add fd_resource_swap() helperKristian H. Kristensen2019-12-194-5/+12
| | | | | | Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
* freedreno/a6xx: Pick blitter swap based on resource tilingKristian H. Kristensen2019-12-191-2/+5
| | | | | | | | | | The linear levels in a tiled resource are stored in the canonical swap, WZYX. We need to pick the swap based on whether or not the resource is tiled, not whether the level in question is tiled. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
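The rule stated above can be sketched as a one-line selection keyed on the resource rather than the mip level. The enum and the `pick_swap` helper are hypothetical stand-ins for illustration; the assumption is that a non-tiled resource simply uses the swap implied by its format.

```c
#include <stdbool.h>

/* Hypothetical component-swap enum; WZYX is the canonical order. */
enum swap { SWAP_WZYX, SWAP_WXYZ, SWAP_ZYXW, SWAP_XYZW };

/* Choose the swap from the resource's tiling, not the level's.
 * A tiled resource stores even its linear levels in WZYX. */
static inline enum swap pick_swap(bool resource_is_tiled, enum swap format_swap)
{
    return resource_is_tiled ? SWAP_WZYX : format_swap;
}
```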
* freedreno/a6xx: Program sampler swap based on resource tilingKristian H. Kristensen2019-12-191-16/+4
| | | | | | | | It doesn't matter whether or not the level in question is linear. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
* freedreno: Add debug flag for forcing linear layoutsKristian H. Kristensen2019-12-193-25/+33
| | | | | | Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
* freedreno/a6xx: Make DEBUG_BLIT_FALLBACK only dump fallbacksKristian H. Kristensen2019-12-191-3/+5
| | | | | | | | Use new macro, DEBUG_BLIT, for dumping all blits. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2848>
* freedreno: Fix CP_MEM_TO_REG flag definitionsConnor Abbott2019-12-182-3/+3
| | | | | | | | | | | These actually mean something completely different, at least on A5xx and A6xx. The only other usage of the old flags on something older than A6xx was a typo, so I don't know if it was always this way, but at the same time it means that we don't have to worry too much about that. Reviewed-by: Eric Anholt <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116>
* freedreno: Use new macros for CP_WAIT_REG_MEM and CP_WAIT_MEM_GTEConnor Abbott2019-12-182-12/+14
| | | | | | | | | | Similar to the existing usage for CP_COND_WRITE5, this makes it clear what each of the magic parameters are for. Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116>
* a6xx: Add more CP packetsConnor Abbott2019-12-185-12/+12
| | | | | | | | | | | | | | | | | | | | And add fields uncovered by looking at the firmware. I think this covers all the memory, register, and scratch manipulation opcodes that exist on A6xx, plus one additional nice find for Vulkan and describing a previously unknown opcode and documenting CP_WAIT_REG_MEM. Note that the bits for the CP_REG_TO_MEM count, as well as the formula for computing the actual count for both CP_REG_TO_MEM and CP_MEM_TO_REG, are changed because the A630 SQE firmware actually does something different. I haven't investigated older microcodes to see whether this extends back to A5xx and A4xx, but the only non-A6xx uses of this field result in the same bit-pattern when using the A6xx bit range and formula, so it should be safe to change the definition universally. Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3116>
* freedreno/a6xx: Document the CP_SET_DRAW_STATE enable bitsKristian H. Kristensen2019-12-172-26/+32
| | | | | | | | | | There are bits for binning, gmem and sysmem. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3131> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3131>
* freedreno/a6xx: fix OUT_REG() vs growable cmdstreamRob Clark2019-12-141-1/+1
| | | | | | | | | | BEGIN_RING() could decide we can't fit the next packet in the current cmdstream segment, and grow a new segment. So we need to grab ring->cur *after* BEGIN_RING(), otherwise we are writing cmdstream past the end of the previous segment. Fixes: bdd98b892f3 ("freedreno: New struct packing macros") Signed-off-by: Rob Clark <[email protected]>
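The ordering problem described above can be sketched with a toy growable ring. Everything here (`struct ring`, `begin_ring`, `emit_pair`, the 4-dword segment size) is a hypothetical simplification, not the real `BEGIN_RING()`/`OUT_REG()` macros; the point is only that the write pointer is read after space has been reserved.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified ring: one active segment at a time. */
struct ring {
    uint32_t *start, *cur, *end;
};

enum { SEGMENT_DWORDS = 4 };

/* Stand-in for BEGIN_RING(): make sure ndwords fit, switching to a fresh
 * segment if they do not. The old segment is deliberately not freed, to
 * keep the sketch short. */
static void begin_ring(struct ring *r, uint32_t ndwords)
{
    if (r->start == NULL || (uint32_t)(r->end - r->cur) < ndwords) {
        uint32_t n = ndwords > SEGMENT_DWORDS ? ndwords : SEGMENT_DWORDS;
        uint32_t *seg = malloc(n * sizeof(*seg));
        if (seg == NULL)
            abort();
        r->start = seg;
        r->cur = seg;
        r->end = seg + n;
    }
}

/* Correct order: reserve first, then read r->cur. Reading r->cur before
 * begin_ring() would leave a pointer into the previous segment. */
static void emit_pair(struct ring *r, uint32_t reg, uint32_t value)
{
    begin_ring(r, 2);
    uint32_t *p = r->cur;
    p[0] = reg;
    p[1] = value;
    r->cur = p + 2;
}
```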
* gallium/util: add alignment parameter to util_upload_index_bufferErico Nunes2019-12-141-1/+1
| | | | | | | | | | At least on Mali Utgard, index buffers need to be aligned on 0x40. To avoid duplicating this, add an alignment parameter. Keep the previous default for the other existing users. Signed-off-by: Erico Nunes <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2445>
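The alignment requirement mentioned above (0x40 on Mali Utgard) comes down to rounding an upload offset up to a power-of-two boundary. The helper below is a hypothetical sketch of that arithmetic only; it does not reproduce the signature of `util_upload_index_buffer`.

```c
#include <stdint.h>

/* Round offset up to the next multiple of alignment.
 * Assumes alignment is a nonzero power of two. */
static inline uint32_t align_up(uint32_t offset, uint32_t alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}
```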
* freedreno/ir3: add last-baryf shaderdb statRob Clark2019-12-131-1/+2
| | | | | | | | | Sometimes sched changes that are a win in terms of instruction count and/or register pressure, are worse in real life, due to keeping varying storage locked for too long. Add a shader-db stat to give this more visibility. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: Convert some tile setup to OUT_REG()Kristian H. Kristensen2019-12-111-25/+15
| | | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Convert gmem blits to OUT_REG()Kristian H. Kristensen2019-12-111-33/+13
| | | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Convert VSC pipe setup to OUT_REG()Kristian H. Kristensen2019-12-111-16/+13
| | | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Convert emit_zs() to OUT_REG()Kristian H. Kristensen2019-12-111-29/+24
| | | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Convert emit_mrt() to OUT_REG()Kristian H. Kristensen2019-12-111-42/+37
| | | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Include fd6_pack.h in a few filesKristian H. Kristensen2019-12-112-8/+10
| | | | | | | | | Including non-functional changes to get the value from the fd_reg_pair in places. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: Drop stale includeKristian H. Kristensen2019-12-111-3/+0
| | | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* freedreno: New struct packing macrosKristian H. Kristensen2019-12-111-0/+109
| | | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Rob Clark <[email protected]> Signed-off-by: Kristian H. Kristensen <[email protected]>
* tu: Move UBWC layout into fdl6_layout() and use that function.Eric Anholt2019-12-111-1/+1
| | | | | | | | This gets us shared non-UBWC layout code between gallium and turnip. Until I fix up the rest of gallium to handle UBWC mipmapping, we do the single-level UBWC setup in gallium as a fixup after layout. Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: Move a6xx's setup_slices() to a shareable helper function.Eric Anholt2019-12-113-146/+19
| | | | | | | We pass in all the parameters for setting up the layout, though freedreno still sets a few of them up early (since it uses layout helpers in making some decisions about the layout setup parameters that will be cleaned up once krh's blitter work lands).
* freedreno: Move UBWC layout into a slices array like the non-UBWC slices.Eric Anholt2019-12-115-9/+11
| | | | | | This is a little refactor in preparation for UBWC mipmapping support. Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: Refactor the UBWC flags registers emission.Eric Anholt2019-12-113-41/+34
| | | | | | | It's the same logic for each of these being emitted, and I was about to change the rsc->layout.* for UBWC. Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: Drop the extra offset field for mipmap slices.Eric Anholt2019-12-112-2/+7
| | | | | | | | We can just bake the UBWC-goes-first delta into the slices at setup time. I did have to fix up the resource shadowing swap path to swap the slice fields, as it was missing and regressed the format reinterprets otherwise. Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: enable LRZ by defaultRob Clark2019-12-103-2/+4
| | | | | | Now that dEQP should be happy, let's flip the switch. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: fix LRZ logicRob Clark2019-12-104-28/+49
| | | | | | | | | | | | In particular, we need to invalidate the LRZ state when we cannot be confident in what the Z state would be during rendering: 1) depth test modes not supported by LRZ 2) stencil test, which would require full rasterization and stencil test in the binning pass (whereas LRZ normally just needs to determine the min and max z value in an 8x8 quad) Signed-off-by: Rob Clark <[email protected]>
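The two invalidation conditions listed above can be sketched as a sticky flag. The structs and `lrz_update` are hypothetical; which depth test modes LRZ supports is not stated in the message, so that decision is modeled as a precomputed boolean.

```c
#include <stdbool.h>

/* Hypothetical depth/stencil state for one draw. */
struct zs_state {
    bool depth_func_lrz_compatible; /* assumption: decided elsewhere */
    bool stencil_test_enabled;
};

/* Hypothetical per-batch LRZ tracking. */
struct lrz_track {
    bool valid;
};

/* Once the Z state can no longer be trusted, LRZ stays invalid for the
 * remaining draws; a later "compatible" draw does not re-validate it. */
static void lrz_update(struct lrz_track *lrz, const struct zs_state *zs)
{
    if (!zs->depth_func_lrz_compatible || zs->stencil_test_enabled)
        lrz->valid = false;
}
```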
* freedreno/a6xx: fix LRZ layoutRob Clark2019-12-101-7/+8
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx+a6xx: split LRZ layout to per-genRob Clark2019-12-104-45/+70
| | | | | | Seems to be a bit different for a6xx, so let's split this out. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: disable LRZ when blendingRob Clark2019-12-103-2/+8
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: Track the set of UBOs to be uploaded in UBO analysis.Eric Anholt2019-12-091-19/+21
| | | | | | | | | | | We were iterating over the entire 32-entry array each time, when we can just use a bitset to know that we're only uploading from the first entry normally. Knocks ir3_emit_user_consts down from ~.5% of CPU to .1% on WebGL fishtank. Reviewed-by: Rob Clark <[email protected]>
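The bitset approach described above can be sketched as follows: the analysis sets one bit per UBO that needs uploading, and the emit path visits only the set bits instead of scanning all 32 entries. `ubo_state`, `ubo_mark`, and `ubo_pop_lowest` are hypothetical names for illustration.

```c
#include <stdint.h>

/* Hypothetical analysis result: bit i is set when UBO i must be uploaded. */
struct ubo_state {
    uint32_t enabled;
};

static inline void ubo_mark(struct ubo_state *s, unsigned idx)
{
    s->enabled |= UINT32_C(1) << idx;
}

/* Remove and return the lowest set bit's index, or -1 when mask is empty. */
static inline int ubo_pop_lowest(uint32_t *mask)
{
    if (*mask == 0)
        return -1;
    int idx = 0;
    while (!(*mask & (UINT32_C(1) << idx)))
        idx++;
    *mask &= *mask - 1; /* clear the lowest set bit */
    return idx;
}
```

A caller would copy `enabled` into a local mask and loop until `ubo_pop_lowest` returns -1, so the common case of a single UBO costs one iteration.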
* freedreno: Stop forcing ALLOW_MAPPED_BUFFERS_DURING_EXEC off.Eric Anholt2019-12-091-3/+0
| | | | | | | | | The default is to not throw GL errors when drawing with mapped buffers, but we were forcing it on for unclear reasons. Internally we keep all our buffers mapped anyway, so it should be a no-op other than reducing CPU overhead (.23% in a perf report for WebGL fishtank). Reviewed-by: Rob Clark <[email protected]>
* freedreno: Enable texture upload memory throttling.Eric Anholt2019-12-061-0/+3
| | | | | | | Fixes oom-killer during streaming-texture-upload, which I found while trying to enable piglit in CI. Reviewed-by: Rob Clark <[email protected]>
* freedreno: reorder format checkFritz Koenig2019-12-063-6/+6
| | | | | | | | | | | | | | With the addition of the planar formats helper, the planar formats no longer have a valid block.bits field. Calling util_format_get_blocksize therefore asserts. Reorder the check to see if the format is supported before doing the query to get the blocksize. Fixes: 20f132e5eff2d ("gallium/util: add planar format layouts and helpers") Signed-off-by: Fritz Koenig <[email protected]> Reviewed-by: Rob Clark <[email protected]>
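The reordering described above relies on short-circuit evaluation: the support check runs first, so the block-size query is never reached for a format whose block size is undefined. The sketch below uses hypothetical stand-ins (`fmt_desc`, `get_blocksize`, `format_usable`) rather than the real gallium helpers, and the size limit is an invented condition for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical format description; planar formats have block_bits == 0. */
struct fmt_desc {
    bool supported;
    uint32_t block_bits;
};

static int blocksize_queries; /* counts calls, for illustration only */

/* Stand-in for a block-size query that asserts on formats without a
 * valid block size. */
static uint32_t get_blocksize(const struct fmt_desc *f)
{
    assert(f->block_bits != 0);
    blocksize_queries++;
    return f->block_bits / 8;
}

/* Check support first; && skips the query for unsupported formats. */
static bool format_usable(const struct fmt_desc *f, uint32_t max_blocksize)
{
    return f->supported && get_blocksize(f) <= max_blocksize;
}
```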