diff options
author | Bas Nieuwenhuizen <[email protected]> | 2016-05-10 00:49:39 +0200 |
---|---|---|
committer | Bas Nieuwenhuizen <[email protected]> | 2016-05-26 22:07:04 +0200 |
commit | 7846fa876820fde373b4402e5d1cf3d24f06d11f (patch) | |
tree | e4f26b31ddd08e953c79bf253bf9e8b51b36132d /scripts | |
parent | c49e68dc4bcc14cac529d1e3be5fe0090ed4d146 (diff) |
radeonsi: Add offchip buffer address calculation.
Instead of creating a memory area per patch and per vertex, we put
the same attribute of every vertex & patch together. Most loads
and stores access the same attribute across all lanes, only for
different patches and vertices.
For the TCS this results in tightly packed data for 4-component
stores.
For the TES this is not the case as within a patch the loads
often also access the same vertex. However if there are < 4
vertices/patch, this still results in a reduction of the number
of cache lines. In the LDS situation we only do better than worst
case if the data per patch < 64 bytes, which due to the
tessellation factors is pretty much never.
We do not use hardware swizzling for this. It would slightly reduce
the number of executed VALU instructions, but I had issues with
increased wait times that I haven't been able to solve yet.
Furthermore, the tbuffer_store intrinsic does not support both
VGPR offset and an index, so we have a problem storing
indirectly indexed outputs. This can be solved by temporarily
storing arrays in LDS and then copying them, but I don't think
that is worth the effort. The difference in VALU cycles
hardware swizzling gives is about 0.2% of total busy cycles.
That is without handling the array case.
I chose for attributes instead of components as they are often
accessed together, and the software swizzling takes VALU cycles
for calculating offsets.
v2: - Rename functions to get_tcs_tes_buffer_address.
- multiply by 16 as late as possible.
- Use tgsi_full_src_register_from_dst.
- Remove some bad comments.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Diffstat (limited to 'scripts')
0 files changed, 0 insertions, 0 deletions