| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
| |
This might have a slight overhead but handling mip offsets more like
the width (and image) strides should make some things easier (mip level
being just part of the offset calculation) later.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
|
| |
|
|
|
|
|
|
|
|
| |
Pointed out by Marek on irc,
no committing after beer!
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
draw_delete_geometry_shader() seems to be the real one.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
util_pack_z_stencil was being unconditionally invoked for all formats,
causing an assertion failure for Z32_FLOAT_S8X24_UINT.
NOTE: Candidate for the stable branches.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Alpha is also 1 for formats like R32G32_FLOAT.
NOTE: Candidate for the stable branches.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
For drivers with native integer / SM4 support this is just an hindrance.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
should fix http://bugs.freedesktop.org/56906
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This adds cube array support to the blitter.
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support to the softpipe texture sampler and tgsi exec.
In order to handle the extra input to the texture sampling,
I've had to expand the interfaces to take a c1 value for storing
the texture compare value for the TEX2 case.
v1.1: add comments (Brian)
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This just adds the texture target and capability along
with 3 new opcodes required to support this extension.
As this extension requires some texture opcodes with samp + 5 args,
we need to use another src register, this is only required
for TEX, TXL and TXB opcodes to implement this spec.
TEX2 is required for shadow cube map arrays
TXL2 is required for cube map array sampler + explicit lod
TXB2 is required for cube map array sampler + lod bias
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The decompression is done in-place and only the compressed tiles are
decompressed. Note: R6xx-R7xx can do that only with Z16 and Z32F.
The texture unit is programmed to use non-displayable tiling and depth
ordering of samples, so that it can fetch the texture in the native DB format.
The latest version of the libdrm surface allocator is required for stencil
texturing to work. The old one didn't create the mipmap tree correctly.
We need a separate mipmap tree for stencil, because the stencil mipmap
offsets are not really depth offsets/4.
There are still some known bugs, but this should save some memory and it also
improves performance a little bit in Lightsmark (especially with low
resolutions; tested with Radeon HD 5000).
The DB->CB copy is still used for transfers.
Reviewed-by: Jerome Glisse <[email protected]>
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This is a regression since b3921e1f53833420e0a0fd581f7417.
The array stores VS outputs, not FS inputs.
Now llvmpipe can do 32 varyings too.
NOTE: This is a candidate for the stable branches.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows updating only a subrange of buffer bindings.
set_vertex_buffers(pipe, start_slot, count, NULL) unbinds buffers in that
range. Binding NULL resources unbinds buffers too (both buffer and user_buffer
must be NULL).
The meta ops are adapted to only save, change, and restore the single slot
they use. The cso_context can save and restore only one vertex buffer slot.
The clients can query which one it is using cso_get_aux_vertex_buffer_slot.
It's currently set to 0. (the Draw module breaks if it's set to non-zero)
It should decrease the CPU overhead when using a lot of meta ops, but
the drivers must be able to treat each vertex buffer slot as a separate
state (only r600g does so at the moment).
I can imagine this also being useful for optimizing some OpenGL use cases.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The 2x and 4x MSAA cases are completely broken. The lfdptr instruction returns
garbage there.
The 8x MSAA case is broken on Cayman, though at least the result looks somewhat
correct.
Only the 8x MSAA case works on Evergreen and is enabled.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We're starting to get apps utilizing more than 16 varyings and
most current hardware supports 32 anyway.
Tested with r600g.
swrast, softpipe and llvmpipe still advertise 16 varyings.
This fixes a WebGL crash after launching this demo:
https://developer.mozilla.org/en-US/demos/detail/falling-cubes
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54402
NOTE: This is a candidate for the stable branches.
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LLVM 3.1+ haven't more "extern unsigned llvm::StackAlignmentOverride"
and friends for configuring code generation options, like stack
alignment.
So I restrict assiging of lvm::StackAlignmentOverride and other
variables to LLVM 3.0 only, and wrote similiar code using
TargetOptions.
This patch fix segfaulting of WINE using llvmpipe built with LLVM 3.1
Signed-off-by: Alexander V. Nikolaev <[email protected]>
Signed-off-by: José Fonseca <[email protected]>
|
|
|
|
|
| |
Use the PRIx64 and PRIu64 format macros from inttypes.h. We made a
similar change in prog_print.c in df2d81ea59993a77bd1f1ef96c5cf19ac692d5f7.
|
|
|
|
| |
Signed-off-by: Christian König <[email protected]>
|
|
|
|
|
|
| |
unsupported by LLVM.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Per commentary and direction in the LLVM community, support for ppc64 is
going into MCJIT rather than the old JIT. There is no existing support
in prior llvm versions, so no need to specify LLVM version numbers.
Signed-off-by: Will Schmidt <[email protected]>
Signed-off-by: José Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
| |
Fix signed/unsigned comparison warnings and float/int assignment warnings.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Thanks to Brian for pointing this out.
|
|
|
|
| |
This reverts commit bf2edc776b02a2a63862bf69a23adf666ecfcc57.
|
| |
|
|
|
|
|
| |
Note: This is a candidate for the 9.0 branch.
Signed-off-by: Brian Paul <[email protected]>
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
| |
Could happen when CPU supports AVX, but LLVM doesn't.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For example:
VERT
DCL IN[0]
DCL OUT[0], POSITION
DCL OUT[1], GENERIC[12]
DCL CONST[0..4]
DCL TEMP[0], LOCAL
DCL TEMP[1], LOCAL
IMM[0] UINT32 {4294967295, 0, 0, 0}
IMM[1] FLT32 { 0.0000, 1.0000, 0.0000, 0.0000}
0: SEQ TEMP[0].x, CONST[3].xxxx, IMM[0].xxxx
1: F2I TEMP[0].x, -TEMP[0]
2: SEQ TEMP[1].x, CONST[4].xxxx, IMM[0].xxxx
3: F2I TEMP[1].x, -TEMP[1]
4: AND TEMP[0].x, TEMP[0].xxxx, TEMP[1].xxxx
5: IF TEMP[0].xxxx :0
6: MOV TEMP[0], IMM[1].xyxy
7: ELSE :0
8: MOV TEMP[0], IMM[1].yxxy
9: ENDIF
10: MOV OUT[1], TEMP[0]
11: MOV OUT[0], IN[0]
12: END
instead of
VERT
DCL IN[0]
DCL OUT[0], POSITION
DCL OUT[1], GENERIC[12]
DCL CONST[0..4]
DCL TEMP[0], LOCAL
DCL TEMP[1], LOCAL
IMM UINT32 {4294967295, 0, 0, 0}
IMM FLT32 { 0.0000, 1.0000, 0.0000, 0.0000}
0: SEQ TEMP[0].x, CONST[3].xxxx, IMM[0].xxxx
1: F2I TEMP[0].x, -TEMP[0]
2: SEQ TEMP[1].x, CONST[4].xxxx, IMM[0].xxxx
3: F2I TEMP[1].x, -TEMP[1]
4: AND TEMP[0].x, TEMP[0].xxxx, TEMP[1].xxxx
5: IF TEMP[0].xxxx :0
6: MOV TEMP[0], IMM[1].xyxy
7: ELSE :0
8: MOV TEMP[0], IMM[1].yxxy
9: ENDIF
10: MOV OUT[1], TEMP[0]
11: MOV OUT[0], IN[0]
12: END
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
lp_build_rsqrt initially did not do any newton-raphson step. This meant that
precision was only ~11 bits, but this handled both input 0.0 and +infinity
correctly. It did not however handle input 1.0 accurately, and denormals
always generated infinity result.
Doing a newton-raphson step increased precision significantly (but notably
input 1.0 still doesn't give output 1.0), however this fails for inputs
0.0 and infinity (both result in NaNs).
Try to fix this up by using cmp/select but since this is all quite fishy
(and still doesn't handle denormals) disable for now. Note that even with
workarounds it should still have been faster since the fallback uses sqrt/div
(which both use the usually unpipelined and slow divider hw).
Also add some more test values to lp_test_arit and test lp_build_rcp() too while
there.
v2: based on José's feedback, avoid hacky infinity definition which doesn't
work with msvc (unfortunately using INFINITY won't cut it neither on non-c99
compilers) in lp_build_rsqrt, and while here fix up the input infinity case
too (it's disabled anyway). Only test infinity input case if we have c99,
and use float cast for calculating reference rsqrt value so we really get
what we expect.
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
| |
by changing the format to NORM.
|
|
|
|
| |
Fix breakage from commit 369e468.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"get_transfer + transfer_map" becomes "transfer_map".
"transfer_unmap + transfer_destroy" becomes "transfer_unmap".
transfer_map must create and return the transfer object and transfer_unmap
must destroy it.
transfer_map is successful if the returned buffer pointer is not NULL.
If transfer_map fails, the pointer to the transfer object remains unchanged
(i.e. doesn't have to be NULL).
Acked-by: Brian Paul <[email protected]>
|
| |
|
|
|
|
|
|
|
| |
No idea why this is #ifdef'd. Trace and Noop are definitely useful no matter
how Mesa is built.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Tested-by: Michel Dänzer <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Tested-by: Michel Dänzer <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|