| Commit message (Collapse) | Author | Age | Files | Lines |
| |
It is buggy (it was giving wrong results for some of the formats with
padding), and util_format_description::is_array already does precisely
what's intended.
Reviewed-by: Roland Scheidegger <[email protected]>
| |
formats.
This is what we want in practice.
The only change is in PIPE_FORMAT_R8SG8SB8UX8U_NORM, which is no longer
considered an array format.
Reviewed-by: Roland Scheidegger <[email protected]>
| |
This patch fixes various format manipulation for big-endian
architectures.
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
| |
This patch fixes various format manipulation for big-endian
architectures.
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
| |
This patch adds two more functions to the type conversions header:
* lp_build_bswap: construct a call to the llvm.bswap intrinsic for an
element
* lp_build_bswap_vec: byte swap every element in a vector based on the
input and output types.
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
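As a rough illustration of the per-element byte swap described above, here is a minimal sketch in plain C rather than LLVM IR builder calls. The names bswap32 and bswap32_vec are hypothetical and are not the gallivm functions; the sketch assumes 32-bit elements only.

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse the byte order of one 32-bit element. This is the value an
 * llvm.bswap.i32 call computes; the helper name is hypothetical. */
static uint32_t bswap32(uint32_t v)
{
    return (v >> 24) |
           ((v >> 8) & 0x0000ff00u) |
           ((v << 8) & 0x00ff0000u) |
           (v << 24);
}

/* Byte swap every element of an array in place (hypothetical helper,
 * standing in for a per-element vector swap). */
static void bswap32_vec(uint32_t *elems, size_t count)
{
    for (size_t i = 0; i < count; i++)
        elems[i] = bswap32(elems[i]);
}
```

Swapping twice returns the original value, so the same helper converts in both directions between little-endian and big-endian layouts.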
| |
This patch fixes the vector constant generation used for vector shuffle
for big-endian machines.
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
| |
This patch enforces clearing of the NJ bit in the Altivec VSCR register so
that denormal numbers are handled as expected by the IEEE standard.
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
| |
This patch adds Altivec intrinsics for float vector types. It changes
the SSE-specific definitions to platform-neutral ones and adds the calls
to the Altivec intrinsic builder.
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
| |
This patch adds the correct vector addition and subtraction intrinsics when
using Altivec on PPC. The current code uses the default path, and the LLVM
backend ends up issuing carry-out arithmetic instructions where saturated
ones are expected.
It also includes a fix for PowerPC, where char is unsigned by default,
resulting in bogus values for vector shifting.
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
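To make the distinction concrete, the following scalar sketch contrasts ordinary modular addition with a saturating add, and shows one way to avoid depending on the signedness of plain char. All function names are hypothetical; this is not the Altivec intrinsic path, only the arithmetic each kind of instruction is expected to produce per element.

```c
#include <stdint.h>

/* Modular add: the result wraps around at 256. */
static uint8_t add_u8_wrap(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Saturating add: the result clamps at the maximum value instead. */
static uint8_t add_u8_sat(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + (unsigned)b;
    return sum > UINT8_MAX ? UINT8_MAX : (uint8_t)sum;
}

/* The signedness of plain char is implementation-defined (unsigned on
 * PowerPC). Using int8_t makes a negative byte value portable. */
static int widen_signed_byte(int8_t v)
{
    return (int)v;
}
```

With these helpers, add_u8_wrap(200, 100) yields 44 while add_u8_sat(200, 100) yields 255.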
| |
This patch adds the PPC Altivec max/min intrinsics for the
supported Altivec vector types (16xi8, 8xi16, 4xi32, 4xf32).
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
| |
This patch adds PPC Altivec support for pack/unpack operations using the
Altivec-supported vector types (16xi8, 8xi16, 4xi32, 4xf32).
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
| |
Add a DEBUG_FREED_MEMORY option to help catch use-after-free errors.
Add debug_memory_check() function which can be periodically called to
check that all known blocks are good.
Reviewed-by: José Fonseca <[email protected]>
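One common way to catch use-after-free writes is to poison a block on free and keep it around for later checking. The sketch below illustrates that general idea only; it is an assumption about the technique, not the actual implementation behind DEBUG_FREED_MEMORY or debug_memory_check(). Every name and constant in it is hypothetical, and freed blocks are deliberately never released.

```c
#include <stdlib.h>
#include <string.h>

#define HEAD_MAGIC 0x12345678u /* hypothetical header guard value */
#define FREED_BYTE 0xCD        /* hypothetical poison pattern */

struct block {
    unsigned magic;     /* detects header corruption */
    int freed;          /* set once the block has been "freed" */
    size_t size;        /* payload size in bytes */
    struct block *next; /* list of all known blocks */
};

static struct block *blocks;

static void *sketch_malloc(size_t size)
{
    struct block *b = malloc(sizeof *b + size);
    if (!b)
        return NULL;
    b->magic = HEAD_MAGIC;
    b->freed = 0;
    b->size = size;
    b->next = blocks;
    blocks = b;
    return b + 1; /* payload follows the header */
}

/* Poison the payload instead of releasing it, so that a later write
 * through a stale pointer becomes detectable. */
static void sketch_free(void *ptr)
{
    struct block *b = (struct block *)ptr - 1;
    b->freed = 1;
    memset(ptr, FREED_BYTE, b->size);
}

/* Walk all known blocks and return how many look corrupted. */
static int sketch_memory_check(void)
{
    int bad = 0;
    for (struct block *b = blocks; b; b = b->next) {
        const unsigned char *p = (const unsigned char *)(b + 1);
        if (b->magic != HEAD_MAGIC) {
            bad++;
            continue;
        }
        if (!b->freed)
            continue;
        for (size_t i = 0; i < b->size; i++) {
            if (p[i] != FREED_BYTE) {
                bad++;
                break;
            }
        }
    }
    return bad;
}
```

Calling the check function periodically narrows down where a stale write happened, at the cost of holding on to freed memory while debugging.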
| |
Reviewed-by: Jose Fonseca <[email protected]>
| |
Also updated lp_build_const_mask_aos_swizzled to reflect this.
Reviewed-by: Jose Fonseca <[email protected]>
| |
Will allow formats with padding, e.g. RGBX.
Will now allow swizzled formats as long as the alpha is channel 3.
Reviewed-by: Jose Fonseca <[email protected]>
| |
Reviewed-by: Jose Fonseca <[email protected]>
| |
Removed the lp_type argument as it was unnecessary.
Reviewed-by: Jose Fonseca <[email protected]>
| |
Reviewed-by: Jose Fonseca <[email protected]>
| |
Reviewed-by: Jose Fonseca <[email protected]>
| |
broken since 529fe420ba6836479619ba42e53665724755fc1c,
I forgot some code, only added the comment...
Fixes bug 57644.
| |
Previously this function would assert if the format didn't fit an expected
4-channel format size.
It now works with any format type with any number of channels.
Signed-off-by: José Fonseca <[email protected]>
| |
Signed-off-by: José Fonseca <[email protected]>
| |
length 1.
Signed-off-by: José Fonseca <[email protected]>
| |
No longer have to split fetching into quads dynamically if mip levels
are not the same for all quads (aos sampling still always splits due
to performance reasons).
Instead, multiple mip levels are handled further down; minification etc.
takes this into account.
Reviewed-by: José Fonseca <[email protected]>
| |
This also adds some code to handle per-quad lods for more than 4-wide fetches,
because otherwise I'd have to integrate the texelFetch function into
the splitting stuff... (but it is not used yet outside texelFetch).
Passes piglit fs-texelFetch-2D, fails fs-texelFetchOffset-2D due to what I
believe is a test error (results are undefined for out-of-bounds fetches, we
return whatever is at offset 0, whereas the test expects [0,0,0,1]).
Texel offsets are only handled by texelFetch for now, though the interface
can handle it for everything.
Reviewed-by: José Fonseca <[email protected]>
| |
MinGW has snprintf.
The patch fixes these warnings with the MinGW SCons build.
src/gallium/auxiliary/util/u_snprintf.c:459:1: warning: no previous prototype for ‘util_vsnprintf’ [-Wmissing-prototypes]
src/gallium/auxiliary/util/u_snprintf.c:1436:1: warning: no previous prototype for ‘util_snprintf’ [-Wmissing-prototypes]
Signed-off-by: Vinson Lee <[email protected]>
Tested-by: Brian Paul <[email protected]>
| |
"#__symbol" doesn't work with nested macro expansions, at least not on gcc.
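The underlying C preprocessor rule is that the operand of # is not macro-expanded before it is stringified, so an extra level of macro indirection is needed. A minimal sketch follows; the macro and function names are hypothetical and not taken from the patch.

```c
/* Stringifies its argument exactly as written, without expanding it. */
#define STR_DIRECT(x) #x

/* The extra level lets the argument expand before it is stringified. */
#define STR_EXPANDED(x) STR_DIRECT(x)

/* Hypothetical alias macro standing in for a renamed symbol. */
#define SYMBOL_ALIAS real_name

static const char *direct_name(void)
{
    return STR_DIRECT(SYMBOL_ALIAS);   /* "SYMBOL_ALIAS" */
}

static const char *expanded_name(void)
{
    return STR_EXPANDED(SYMBOL_ALIAS); /* "real_name" */
}
```

The direct form produces the macro's own name, which is the failure mode the commit describes; the two-level form produces the name it expands to.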
| |
The passthrough pipeline needs to check index values (which might be passed
through) as they can be invalid (which causes crashes and various assertion
failures if the clip code runs). Obviously, rendering won't be well-defined,
but those bogus indices might come directly from apps.
There were already debug printfs which reported the out-of-bounds indices but
we really ought to not crash.
While checking at that point doesn't seem like the most efficient solution,
it seems there isn't really another appropriate function to do it.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
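The kind of check described above can be sketched as a simple scan over the index list before any per-vertex work runs. The function name and parameters are hypothetical, and what the real code does after detecting a bad index is not stated in the commit message, so the sketch only reports the result.

```c
#include <stdbool.h>
#include <stddef.h>

/* Return true only if every index refers to an existing vertex.
 * vertex_count is the number of fetched vertices (hypothetical name). */
static bool indices_in_bounds(const unsigned *indices, size_t count,
                              unsigned vertex_count)
{
    for (size_t i = 0; i < count; i++) {
        if (indices[i] >= vertex_count)
            return false;
    }
    return true;
}
```

A caller could skip or sanitize the draw when this returns false, instead of letting later stages dereference an invalid vertex.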
| |
This might have a slight overhead but handling mip offsets more like
the width (and image) strides should make some things easier (mip level
being just part of the offset calculation) later.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
| |
Pointed out by Marek on irc,
no committing after beer!
Signed-off-by: Dave Airlie <[email protected]>
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
| |
draw_delete_geometry_shader() seems to be the real one.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
| |
util_pack_z_stencil was being unconditionally invoked for all formats,
causing an assertion failure for Z32_FLOAT_S8X24_UINT.
NOTE: Candidate for the stable branches.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
| |
Alpha is also 1 for formats like R32G32_FLOAT.
NOTE: Candidate for the stable branches.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
| |
For drivers with native integer / SM4 support this is just a hindrance.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
| |
should fix http://bugs.freedesktop.org/56906
Signed-off-by: Dave Airlie <[email protected]>
| |
This adds cube array support to the blitter.
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
| |
This adds support to the softpipe texture sampler and tgsi exec.
In order to handle the extra input to the texture sampling,
I've had to expand the interfaces to take a c1 value for storing
the texture compare value for the TEX2 case.
v1.1: add comments (Brian)
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
| |
This just adds the texture target and capability along
with 3 new opcodes required to support this extension.
As this extension requires some texture opcodes with samp + 5 args,
we need to use another src register, this is only required
for TEX, TXL and TXB opcodes to implement this spec.
TEX2 is required for shadow cube map arrays
TXL2 is required for cube map array sampler + explicit lod
TXB2 is required for cube map array sampler + lod bias
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
| |
The decompression is done in-place and only the compressed tiles are
decompressed. Note: R6xx-R7xx can do that only with Z16 and Z32F.
The texture unit is programmed to use non-displayable tiling and depth
ordering of samples, so that it can fetch the texture in the native DB format.
The latest version of the libdrm surface allocator is required for stencil
texturing to work. The old one didn't create the mipmap tree correctly.
We need a separate mipmap tree for stencil, because the stencil mipmap
offsets are not really depth offsets/4.
There are still some known bugs, but this should save some memory and it also
improves performance a little bit in Lightsmark (especially with low
resolutions; tested with Radeon HD 5000).
The DB->CB copy is still used for transfers.
Reviewed-by: Jerome Glisse <[email protected]>
| |
This is a regression since b3921e1f53833420e0a0fd581f7417.
The array stores VS outputs, not FS inputs.
Now llvmpipe can do 32 varyings too.
NOTE: This is a candidate for the stable branches.
Reviewed-by: Brian Paul <[email protected]>
| |
This allows updating only a subrange of buffer bindings.
set_vertex_buffers(pipe, start_slot, count, NULL) unbinds buffers in that
range. Binding NULL resources unbinds buffers too (both buffer and user_buffer
must be NULL).
The meta ops are adapted to only save, change, and restore the single slot
they use. The cso_context can save and restore only one vertex buffer slot.
The clients can query which one it is using cso_get_aux_vertex_buffer_slot.
It's currently set to 0. (the Draw module breaks if it's set to non-zero)
It should decrease the CPU overhead when using a lot of meta ops, but
the drivers must be able to treat each vertex buffer slot as a separate
state (only r600g does so at the moment).
I can imagine this also being useful for optimizing some OpenGL use cases.
Reviewed-by: Brian Paul <[email protected]>
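The subrange semantics described above can be sketched with a small stand-alone model. The struct layout, MAX_SLOTS, and the function name are hypothetical simplifications rather than the real Gallium types; only the (start_slot, count, NULL) behavior follows the commit message.

```c
#include <stddef.h>

#define MAX_SLOTS 8 /* hypothetical limit for this sketch */

/* Simplified stand-in for a vertex buffer binding. */
struct vb_binding {
    const void *buffer;      /* driver resource, NULL if unbound */
    const void *user_buffer; /* user memory, NULL if unbound */
    unsigned stride;
};

struct vb_state {
    struct vb_binding slots[MAX_SLOTS];
};

/* Update only slots [start_slot, start_slot + count). Passing NULL for
 * buffers unbinds that range; slots outside it are left untouched. */
static void set_vertex_buffers_sketch(struct vb_state *st,
                                      unsigned start_slot, unsigned count,
                                      const struct vb_binding *buffers)
{
    if (start_slot >= MAX_SLOTS || count > MAX_SLOTS - start_slot)
        return; /* reject ranges that fall outside the slot array */

    for (unsigned i = 0; i < count; i++) {
        if (buffers)
            st->slots[start_slot + i] = buffers[i];
        else
            st->slots[start_slot + i] = (struct vb_binding){0};
    }
}
```

A meta op following this pattern would save one slot, bind its own buffer there, draw, and restore that slot, leaving every other binding as the application set it.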