diff options
author | Brian Paul <[email protected]> | 2010-04-16 09:10:54 -0600 |
---|---|---|
committer | Brian Paul <[email protected]> | 2010-04-16 09:25:44 -0600 |
commit | 0639765b2850739af1678f10fc0c5706d5827776 (patch) | |
tree | dd83a7abaeae23f39f780dad327840b0e12e297b /src/gallium/drivers/llvmpipe/lp_tile_image.c | |
parent | 97831efdb02ad68c70602a5b3a68c024e49e5715 (diff) |
Merge the lp-surface-tiling branch into master.
This branch implemented dual representations of texture/drawing surfaces:
one in the conventional linear layout and the other the tiled layout which
is used by the fragment shader pipe. Per-tile flags indicate the layout
of each image tile. In many situations this lets us avoid converting
image data between the two layouts.
Squashed commit of the following:
commit 563a7e3cc552fdcfcaf9ac0d4b1683c3ba2ae732
Author: Brian Paul <[email protected]>
Date: Thu Apr 8 14:48:21 2010 -0600
llvmpipe: convert points/lines to triangles with draw module
This isn't the most efficient way to render points/lines but it allows us
to run more tests.
commit a8aa763e8a717533f2b13bb6ea53cbccbede68c9
Author: Brian Paul <[email protected]>
Date: Thu Apr 8 14:47:28 2010 -0600
llvmpipe: call llvmpipe_get_texture_tile() for depth/stencil
The returned pointer isn't used, but the tile status/layout info
gets updated. Helps to fix glReadPixels(DEPTH / STENCIL).
commit 463bc64af266194acbea71cd52e26a79b8c8a260
Author: Brian Paul <[email protected]>
Date: Thu Apr 8 10:58:48 2010 -0600
llvmpipe: add store_color to debug cmd_names list
commit 784cc73fb334a9d7b7c93cbd8a1445cdf742ff58
Author: Brian Paul <[email protected]>
Date: Thu Apr 8 10:57:43 2010 -0600
llvmpipe: fix debug build
commit 792c93171ec075664f55720ffed397ac2834a4fc
Author: Brian Paul <[email protected]>
Date: Thu Apr 8 10:49:01 2010 -0600
llvmpipe: fix cube mapping
commit 882b1035db88c3dd8aebe28dc971ac30a9ee39e3
Author: Brian Paul <[email protected]>
Date: Thu Apr 8 09:53:30 2010 -0600
llvmpipe: remove some older/unused code
commit b807d32b23145301e8842824664d9f06b9c5502e
Author: Brian Paul <[email protected]>
Date: Thu Apr 8 09:29:50 2010 -0600
llvmpipe: silence warning
commit 7b337e64fec92836ccdf9d96216289dd58418e35
Author: Brian Paul <[email protected]>
Date: Wed Apr 7 17:06:08 2010 -0600
llvmpipe: clean-up, comments in lp_surface_copy()
commit c52fa36f249cc652fa8d5fdd94d6574127c08c41
Author: Brian Paul <[email protected]>
Date: Wed Apr 7 16:51:42 2010 -0600
llvmpipe: overhaul tiled/linear memory management
Now we keep per-tile layout info (linear vs. tiled (or neither or both)
and convert from one layout to the other on demand.
commit 4a50ccfd470547c9be0704005818a87014e9c0e9
Author: Brian Paul <[email protected]>
Date: Wed Apr 7 16:51:27 2010 -0600
llvmpipe: added tile read/write counters
commit b7d0ea9c687ac8773b083791623826fa604adf21
Author: Brian Paul <[email protected]>
Date: Mon Apr 5 14:54:04 2010 -0600
llvmpipe: rename some functions
commit ee45c6e5b95cbd3c8cccc9aa4d45d8aef11e20c4
Author: Brian Paul <[email protected]>
Date: Mon Apr 5 14:42:15 2010 -0600
llvmpipe: re-org some get block/tile pointer code
commit 26ce97c16c0b6520ff1538803baa772d8c3b1280
Author: Brian Paul <[email protected]>
Date: Mon Apr 5 14:34:13 2010 -0600
llvmpipe: disable bad assertions
commit 5c670481248c4d46f87f13bf3af5655925e7002d
Author: Brian Paul <[email protected]>
Date: Fri Apr 2 16:36:11 2010 -0600
llvmpipe: add a special-case optimization to lp_surface_copy()
Be more efficient when copying tiled image to linear image.
Before, the fallback path was always converting the whole source image
to linear. Now we can convert just a sub region.
commit faa684645e64d6024b3a11e4e08da825e8220b2e
Author: Brian Paul <[email protected]>
Date: Fri Apr 2 16:15:16 2010 -0600
llvmpipe: assorted texture and tile/line conversion code change s
The tiled/linear conversion functions take x/y positions now to
allow converting only sub-regions.
More texture-related helper functions.
commit baad81ec5318d44bfac1e37c7643afc0836607bb
Author: Brian Paul <[email protected]>
Date: Tue Mar 30 13:18:40 2010 -0600
llvmpipe: convert tiled->linear upon PIPE_FLUSH_SWAPBUFFERS
If we know we're about to do a swapbuffers we should immediately
convert the tiled color tiles to linear instead of later in
llvmpipe_texture_unmap() since we can take advantage of threading/
parallelism here.
commit 928dd41256811daeddb7506a49a34dbad04beaf8
Author: Brian Paul <[email protected]>
Date: Tue Mar 30 09:16:58 2010 -0600
llvmpipe: polish-up the llvmpipe_flush() code
commit dd6014abcf86c517d159b8175e0eaeb167ea2ef6
Author: Brian Paul <[email protected]>
Date: Tue Mar 30 09:15:17 2010 -0600
llvmpipe: SETUP_x enum clean-up
commit 0b1ce6da2b28a41f3389685ab93e10b43c950f5d
Author: Brian Paul <[email protected]>
Date: Fri Mar 26 10:43:37 2010 -0600
llvmpipe: remove unused vars
commit 4562663480f88162ed4452cb05569eecb67f9f39
Author: Brian Paul <[email protected]>
Date: Fri Mar 26 10:31:55 2010 -0600
llvmpipe: cope with non-existant color/depth buffers
The fragment jit functions always grab these pointers, even if they're
not used.
commit df4329edbaf204ed501f1eac0698b8198178f9af
Author: Brian Paul <[email protected]>
Date: Thu Mar 25 15:20:15 2010 -0600
llvmpipe: do all render target surface mapping/unmapping in the rast code
commit 3d0c25d5ba8b8f61e8366d4c97324e45d526ff41
Author: Brian Paul <[email protected]>
Date: Thu Mar 25 14:31:21 2010 -0600
llvmpipe: map z/stencil buffer on demand like color buffers
Plus lots of code clean-up and loose ends taken care of.
commit c3b6fddd788aef09b4b84b843b7b1272231151e8
Author: Brian Paul <[email protected]>
Date: Thu Mar 25 13:15:03 2010 -0600
llvmpipe: remove unused write_zstencil field
commit 63374d97836926a6357e9d6dd24a509a8e155c56
Author: Brian Paul <[email protected]>
Date: Thu Mar 25 09:45:59 2010 -0600
llvmpipe: add missing lp_rast_end() call
Fixes crash on window resize when LP_NUM_THREADS=0.
commit 92fe9952161cc06f6edc58778e9e5a8b9ea447dc
Author: Brian Paul <[email protected]>
Date: Wed Mar 24 10:15:19 2010 -0600
llvmpipe: add tiled/linear conversion for 16-bit Z images
commit 6605fa28c147f30df351da0e4413cab33e4db5da
Author: Brian Paul <[email protected]>
Date: Tue Mar 23 16:06:41 2010 -0600
llvmpipe: implement tiled/linear conversion for Z/stencil images
commit 804528d84ffa292ef9d49d3666cdd3fa099ff3ff
Author: Brian Paul <[email protected]>
Date: Tue Mar 23 16:05:45 2010 -0600
llvmpipe: added texture stride comment
commit 66a88c012edf670c4ac887a912f02dcff93266dd
Author: Brian Paul <[email protected]>
Date: Tue Mar 23 16:04:07 2010 -0600
llvmpipe: remove unused vars
commit e2ca8d1328316dc8b36d5f688c16d109e49a6870
Author: Brian Paul <[email protected]>
Date: Mon Mar 22 18:53:11 2010 -0600
llvmpipe: checkpoint WIP: overhaul texture/surface mapping
Conversion between tiled and linear surfaces is working everywhere now.
The LP_TEXTURE_READ/READ_WRITE/WRITE_ALL flags let us avoid unnecessary
image layout conversions.
Still some loose ends, temporary/debug code, etc.
Need to implement tiled/linear conversion for depth/stencil images.
commit f2730a03839ee8984c1f537b7cbebba24961397a
Author: Brian Paul <[email protected]>
Date: Mon Mar 22 14:41:58 2010 -0600
llvmpipe: rename/repurpose lp_rast_store_color()
commit e192a47552c5d20d2caef452ca7697e2cd852c9b
Author: Brian Paul <[email protected]>
Date: Mon Mar 22 14:38:51 2010 -0600
llvmpipe: remove lp_rast_load_color()
commit 3cff0bde4b4ab980e1c3e812700419091527c76b
Author: Brian Paul <[email protected]>
Date: Mon Mar 22 14:11:38 2010 -0600
llvmpipe: remove/consolidate texture image code
commit 3a2f08b6a550c69ef5e874f482be30252cbf8bfa
Author: Brian Paul <[email protected]>
Date: Fri Mar 19 17:03:14 2010 -0600
llvmpipe: checkpoint WIP: directly render to tiled texture buffers
We're now directly writing colors into the tiled texture image buffers.
This is a checkpoint commit with lots of dead code and temporary hacks.
Everything will get cleaned up eventually.
commit c5ca987e03870849514d4e3c99af143722a09695
Author: Brian Paul <[email protected]>
Date: Fri Mar 19 16:41:14 2010 -0600
llvmpipe: refactor code, create tile_pixel_offset()
commit 2133e8273e937cbac09cd7264d6ce53af9764ddb
Author: Brian Paul <[email protected]>
Date: Fri Mar 19 14:55:11 2010 -0600
llvmpipe: pass LP_TEXTURE_LINEAR/TILED flags around
commit b9b9d4b82b01f4588721fdc8444740f859b4a021
Author: Brian Paul <[email protected]>
Date: Fri Mar 19 14:51:05 2010 -0600
llvmpipe: checkpoint WIP: hanlde co-existing tiled/linear texture data
Cube maps are temporarily broken, maybe other things.
commit 4cd322e6889940b5f155fcb69041b685b9ef9273
Author: Brian Paul <[email protected]>
Date: Fri Mar 19 11:34:43 2010 -0600
progs/demos: add other modes/patterns to dissolve demo
Diffstat (limited to 'src/gallium/drivers/llvmpipe/lp_tile_image.c')
-rw-r--r-- | src/gallium/drivers/llvmpipe/lp_tile_image.c | 296 |
1 files changed, 250 insertions, 46 deletions
diff --git a/src/gallium/drivers/llvmpipe/lp_tile_image.c b/src/gallium/drivers/llvmpipe/lp_tile_image.c index c1980b316d5..0852150ba72 100644 --- a/src/gallium/drivers/llvmpipe/lp_tile_image.c +++ b/src/gallium/drivers/llvmpipe/lp_tile_image.c @@ -25,6 +25,14 @@ **************************************************************************/ +/** + * Code to convert images from tiled to linear and back. + * XXX there are quite a few assumptions about color and z/stencil being + * 32bpp. + */ + + +#include "util/u_format.h" #include "lp_tile_soa.h" #include "lp_tile_image.h" @@ -33,33 +41,172 @@ /** + * Untile a 4x4 block of 32-bit words (all contiguous) to linear layout + * at dst, with dst_stride words between rows. + */ +static void +untile_4_4_uint32(const uint32_t *src, uint32_t *dst, unsigned dst_stride) +{ + uint32_t *d0 = dst; + uint32_t *d1 = d0 + dst_stride; + uint32_t *d2 = d1 + dst_stride; + uint32_t *d3 = d2 + dst_stride; + + d0[0] = src[0]; d0[1] = src[1]; d0[2] = src[4]; d0[3] = src[5]; + d1[0] = src[2]; d1[1] = src[3]; d1[2] = src[6]; d1[3] = src[7]; + d2[0] = src[8]; d2[1] = src[9]; d2[2] = src[12]; d2[3] = src[13]; + d3[0] = src[10]; d3[1] = src[11]; d3[2] = src[14]; d3[3] = src[15]; +} + + + +/** + * Untile a 4x4 block of 16-bit words (all contiguous) to linear layout + * at dst, with dst_stride words between rows. + */ +static void +untile_4_4_uint16(const uint16_t *src, uint16_t *dst, unsigned dst_stride) +{ + uint16_t *d0 = dst; + uint16_t *d1 = d0 + dst_stride; + uint16_t *d2 = d1 + dst_stride; + uint16_t *d3 = d2 + dst_stride; + + d0[0] = src[0]; d0[1] = src[1]; d0[2] = src[4]; d0[3] = src[5]; + d1[0] = src[2]; d1[1] = src[3]; d1[2] = src[6]; d1[3] = src[7]; + d2[0] = src[8]; d2[1] = src[9]; d2[2] = src[12]; d2[3] = src[13]; + d3[0] = src[10]; d3[1] = src[11]; d3[2] = src[14]; d3[3] = src[15]; +} + + + +/** + * Convert a 4x4 rect of 32-bit words from a linear layout into tiled + * layout (in which all 16 words are contiguous). + */ +static void +tile_4_4_uint32(const uint32_t *src, uint32_t *dst, unsigned src_stride) +{ + const uint32_t *s0 = src; + const uint32_t *s1 = s0 + src_stride; + const uint32_t *s2 = s1 + src_stride; + const uint32_t *s3 = s2 + src_stride; + + dst[0] = s0[0]; dst[1] = s0[1]; dst[4] = s0[2]; dst[5] = s0[3]; + dst[2] = s1[0]; dst[3] = s1[1]; dst[6] = s1[2]; dst[7] = s1[3]; + dst[8] = s2[0]; dst[9] = s2[1]; dst[12] = s2[2]; dst[13] = s2[3]; + dst[10] = s3[0]; dst[11] = s3[1]; dst[14] = s3[2]; dst[15] = s3[3]; +} + + + +/** + * Convert a 4x4 rect of 16-bit words from a linear layout into tiled + * layout (in which all 16 words are contiguous). + */ +static void +tile_4_4_uint16(const uint16_t *src, uint16_t *dst, unsigned src_stride) +{ + const uint16_t *s0 = src; + const uint16_t *s1 = s0 + src_stride; + const uint16_t *s2 = s1 + src_stride; + const uint16_t *s3 = s2 + src_stride; + + dst[0] = s0[0]; dst[1] = s0[1]; dst[4] = s0[2]; dst[5] = s0[3]; + dst[2] = s1[0]; dst[3] = s1[1]; dst[6] = s1[2]; dst[7] = s1[3]; + dst[8] = s2[0]; dst[9] = s2[1]; dst[12] = s2[2]; dst[13] = s2[3]; + dst[10] = s3[0]; dst[11] = s3[1]; dst[14] = s3[2]; dst[15] = s3[3]; +} + + + +/** * Convert a tiled image into a linear image. * \param src_stride source row stride in bytes (bytes per row of tiles) * \param dst_stride dest row stride in bytes */ void -lp_tiled_to_linear(const uint8_t *src, - uint8_t *dst, +lp_tiled_to_linear(const void *src, void *dst, + unsigned x, unsigned y, unsigned width, unsigned height, - enum pipe_format format, - unsigned src_stride, - unsigned dst_stride) + enum pipe_format format, unsigned dst_stride) { - const unsigned tiles_per_row = src_stride / BYTES_PER_TILE; - unsigned i, j; - - for (j = 0; j < height; j += TILE_SIZE) { - for (i = 0; i < width; i += TILE_SIZE) { - unsigned tile_offset = - ((j / TILE_SIZE) * tiles_per_row + i / TILE_SIZE); - unsigned byte_offset = tile_offset * BYTES_PER_TILE; - const uint8_t *src_tile = src + byte_offset; - - lp_tile_write_4ub(format, - src_tile, - dst, - dst_stride, - i, j, TILE_SIZE, TILE_SIZE); + assert(x % TILE_SIZE == 0); + assert(y % TILE_SIZE == 0); + /*assert(width % TILE_SIZE == 0); + assert(height % TILE_SIZE == 0);*/ + + /* Note that Z/stencil surfaces use a different tiling size than + * color surfaces. + */ + if (util_format_is_depth_or_stencil(format)) { + const uint bpp = util_format_get_blocksize(format); + const uint src_stride = dst_stride * TILE_VECTOR_WIDTH; + const uint tile_w = TILE_VECTOR_WIDTH, tile_h = TILE_VECTOR_HEIGHT; + const uint tiles_per_row = src_stride / (tile_w * tile_h * bpp); + + dst_stride /= bpp; /* convert from bytes to words */ + + if (bpp == 4) { + const uint32_t *src32 = (const uint32_t *) src; + uint32_t *dst32 = (uint32_t *) dst; + uint i, j; + + for (j = 0; j < height; j += tile_h) { + for (i = 0; i < width; i += tile_w) { + /* compute offsets in 32-bit words */ + uint ii = i + x, jj = j + y; + uint src_offset = (jj / tile_h * tiles_per_row + ii / tile_w) + * (tile_w * tile_h); + uint dst_offset = jj * dst_stride + ii; + untile_4_4_uint32(src32 + src_offset, + dst32 + dst_offset, + dst_stride); + } + } + } + else { + const uint16_t *src16 = (const uint16_t *) src; + uint16_t *dst16 = (uint16_t *) dst; + uint i, j; + + assert(bpp == 2); + + for (j = 0; j < height; j += tile_h) { + for (i = 0; i < width; i += tile_w) { + /* compute offsets in 16-bit words */ + uint ii = i + x, jj = j + y; + uint src_offset = (jj / tile_h * tiles_per_row + ii / tile_w) + * (tile_w * tile_h); + uint dst_offset = jj * dst_stride + ii; + untile_4_4_uint16(src16 + src_offset, + dst16 + dst_offset, + dst_stride); + } + } + } + } + else { + /* color image */ + const uint bpp = 4; + const uint tile_w = TILE_SIZE, tile_h = TILE_SIZE; + const uint bytes_per_tile = tile_w * tile_h * bpp; + const uint src_stride = dst_stride * tile_w; + const uint tiles_per_row = src_stride / bytes_per_tile; + uint i, j; + + for (j = 0; j < height; j += tile_h) { + for (i = 0; i < width; i += tile_w) { + uint ii = i + x, jj = j + y; + uint tile_offset = ((jj / tile_h) * tiles_per_row + ii / tile_w); + uint byte_offset = tile_offset * bytes_per_tile; + const uint8_t *src_tile = (uint8_t *) src + byte_offset; + + lp_tile_write_4ub(format, + src_tile, + dst, dst_stride, + ii, jj, tile_w, tile_h); + } } } } @@ -71,28 +218,85 @@ lp_tiled_to_linear(const uint8_t *src, * \param dst_stride dest row stride in bytes (bytes per row of tiles) */ void -lp_linear_to_tiled(const uint8_t *src, - uint8_t *dst, +lp_linear_to_tiled(const void *src, void *dst, + unsigned x, unsigned y, unsigned width, unsigned height, - enum pipe_format format, - unsigned src_stride, - unsigned dst_stride) + enum pipe_format format, unsigned src_stride) { - const unsigned tiles_per_row = dst_stride / BYTES_PER_TILE; - unsigned i, j; - - for (j = 0; j < height; j += TILE_SIZE) { - for (i = 0; i < width; i += TILE_SIZE) { - unsigned tile_offset = - ((j / TILE_SIZE) * tiles_per_row + i / TILE_SIZE); - unsigned byte_offset = tile_offset * BYTES_PER_TILE; - uint8_t *dst_tile = dst + byte_offset; - - lp_tile_read_4ub(format, - dst_tile, - src, - src_stride, - i, j, TILE_SIZE, TILE_SIZE); + assert(x % TILE_SIZE == 0); + assert(y % TILE_SIZE == 0); + /* + assert(width % TILE_SIZE == 0); + assert(height % TILE_SIZE == 0); + */ + + if (util_format_is_depth_or_stencil(format)) { + const uint bpp = util_format_get_blocksize(format); + const uint dst_stride = src_stride * TILE_VECTOR_WIDTH; + const uint tile_w = TILE_VECTOR_WIDTH, tile_h = TILE_VECTOR_HEIGHT; + const uint tiles_per_row = dst_stride / (tile_w * tile_h * bpp); + + src_stride /= bpp; /* convert from bytes to words */ + + if (bpp == 4) { + const uint32_t *src32 = (const uint32_t *) src; + uint32_t *dst32 = (uint32_t *) dst; + uint i, j; + + for (j = 0; j < height; j += tile_h) { + for (i = 0; i < width; i += tile_w) { + /* compute offsets in 32-bit words */ + uint ii = i + x, jj = j + y; + uint src_offset = jj * src_stride + ii; + uint dst_offset = (jj / tile_h * tiles_per_row + ii / tile_w) + * (tile_w * tile_h); + tile_4_4_uint32(src32 + src_offset, + dst32 + dst_offset, + src_stride); + } + } + } + else { + const uint16_t *src16 = (const uint16_t *) src; + uint16_t *dst16 = (uint16_t *) dst; + uint i, j; + + assert(bpp == 2); + + for (j = 0; j < height; j += tile_h) { + for (i = 0; i < width; i += tile_w) { + /* compute offsets in 16-bit words */ + uint ii = i + x, jj = j + y; + uint src_offset = jj * src_stride + ii; + uint dst_offset = (jj / tile_h * tiles_per_row + ii / tile_w) + * (tile_w * tile_h); + tile_4_4_uint16(src16 + src_offset, + dst16 + dst_offset, + src_stride); + } + } + } + } + else { + const uint bpp = 4; + const uint tile_w = TILE_SIZE, tile_h = TILE_SIZE; + const uint bytes_per_tile = tile_w * tile_h * bpp; + const uint dst_stride = src_stride * tile_w; + const uint tiles_per_row = dst_stride / bytes_per_tile; + uint i, j; + + for (j = 0; j < height; j += TILE_SIZE) { + for (i = 0; i < width; i += TILE_SIZE) { + uint ii = i + x, jj = j + y; + uint tile_offset = ((jj / tile_h) * tiles_per_row + ii / tile_w); + uint byte_offset = tile_offset * bytes_per_tile; + uint8_t *dst_tile = (uint8_t *) dst + byte_offset; + + lp_tile_read_4ub(format, + dst_tile, + src, src_stride, + ii, jj, tile_w, tile_h); + } } } } @@ -102,7 +306,7 @@ lp_linear_to_tiled(const uint8_t *src, * For testing only. */ void -test_tiled_linear_conversion(uint8_t *data, +test_tiled_linear_conversion(void *data, enum pipe_format format, unsigned width, unsigned height, unsigned stride) @@ -113,13 +317,13 @@ test_tiled_linear_conversion(uint8_t *data, uint8_t *tiled = malloc(wt * ht * TILE_SIZE * TILE_SIZE * 4); - unsigned tiled_stride = wt * TILE_SIZE * TILE_SIZE * 4; + /*unsigned tiled_stride = wt * TILE_SIZE * TILE_SIZE * 4;*/ - lp_linear_to_tiled(data, tiled, width, height, format, - stride, tiled_stride); + lp_linear_to_tiled(data, tiled, 0, 0, width, height, format, + stride); - lp_tiled_to_linear(tiled, data, width, height, format, - tiled_stride, stride); + lp_tiled_to_linear(tiled, data, 0, 0, width, height, format, + stride); free(tiled); } |