| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
For conditional rendering this makes it possible to skip rendering
if either the predicate is true or false, as supported by d3d10
(in fact previously it was sort of implied skip rendering if predicate
is false for occlusion predicate, and true for so_overflow predicate).
There's no cap bit for this as presumably all drivers could do it trivially
(but this patch does not implement it for the drivers using true
hw predicates, nvxx, r600, radeonsi, no change is expected for OpenGL
functionality).
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
| |
byteswap.h and bswap_32 aren't portable, replace them with calls to
gallium's util_bswap32 as suggested by Mark Kettenis. Lets these files
build on OpenBSD.
Signed-off-by: Jonathan Gray <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We did downsample (=resolve) MSAA resources to make ReadPixels work with MSAA
GLX visuals, which was enough for read-only color-only transfers.
This commit makes write color transfers and depth-stencil transfers work
in a similar manner. It does downsampling in transfer_map and upsampling
in transfer_unmap.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
There isn't any difference between 32_FLOAT and 32_*INT in vertex fetching.
Both of them don't do any format conversion.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
We can use the fragment shader TGSI property WRITES_ALL_CBUFS.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
| |
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Gallium supported only a single viewport/scissor combination. This
commit changes the interface to allow us to add support for multiple
viewports/scissors.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: José Fonseca<[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
| |
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
| |
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
| |
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
|
| |
This patch improves handling of unconditional KILL instructions inside
the conditional blocks, uncovering more opportunities for if-conversion.
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
|
| |
Fixes incorrect condition that prevented optimization for
PRED_SETE/PRED_SETE_INT.
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PRED_SET instructions that update exec mask should be scheduled immediately
prior to the "if-then-else" block, because any instruction that is
inserted after alu clause with PRED_SET and before conditional block is
also conditionally executed by hw (exec mask is already updated at that
moment).
Propbably it's better to make PRED_SET a part of conditional
"if-then-else" block in the IR to handle this more cleanly,
but for now this temporary solution should prevent the problem.
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
For compute shaders we need to let the backend know that
GPRs 0 and 1 are preloaded with some compute-specific input
values, otherwise any use of these regs without previous
definition is considered as undefined value and usually
is simply replaced with 0.
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
| |
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
|
|
| |
This allows GVN rewrite pass to propagate non-const (register)
values to FETCH source operands, helping to eliminate unnecessary
copies in some cases.
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have to assume that all GPRs in compute shader can be indirectly
addressed because LLVM backend doesn't provide any indirect array info.
That's why for compute shaders GPR array is created that covers all used
GPRs (0..r600_bytecode::ngpr-1), but this seriously restricts register
allocation in sb.
This patch checks for actual use of indirect access in the shader and
if it's not used then GPR array is not created, so that regalloc is not
unnecessarily restricted.
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
| |
Fixes segfault with bfgminer and R600_DEBUG=sbcl.
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
| |
Fixes segfault during bytecode dump with bfgminer kernel
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
| |
|
| |
|
| |
|
|
|
|
| |
Signed-off-by: Niels Ole Salscheider <[email protected]>
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes and enables texturing with compressed MSAA colorbuffers
on Evergreen and Cayman. For the first time, multisample textures work
on Cayman.
This requires the libdrm flag RADEON_SURF_FMASK.
v2: require libdrm_radeon 2.4.45
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
| |
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Static initialization of internal libstdc++ data related to iostream
causes segfaults with some apps.
This patch replaces all uses of std::ostream and std::ostringstream in sb
with custom lightweight classes.
Prevents segfaults with ut2004demo and probably some other old apps.
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Parsing and ir construction is required for optimization only,
it's unnecessary if we only need to print shader dump.
This should make new disassembler more tolerant to any new
features in the bytecode.
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
| |
v2: fix typo 65535 -> 65536
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
and add assertions to prevent buffer overflow. This fixes corruption
of the r600_shader struct.
NOTE: This is a candidate for the stable branches.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can replace CNDxx with MOV (and possibly eliminate after
propagation) in following cases:
If src1 is equal to src2 in CNDxx instruction then the result doesn't
depend on condition and we can replace the instruction with
"MOV dst, src1".
If src0 is const then we can evaluate the condition at compile time and
also replace it with MOV.
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
| |
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Use the same limit for kcache constants in alu group on r6xx as on other
chips (two const pairs). Relaxing this will require additional checks to
make sure that all 4 consts in the group come from 2 kcache sets (clause
limit), probably without noticeable improvements of shader performance.
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
| |
Rather than relying on a predetermined order for the config values.
|
|
|
|
|
| |
The LLVM backend emits raw ISA now, so we can just its output
unmodified.
|
|
|
|
| |
The LLVM backend takes care of this now.
|
|
|
|
|
|
|
|
|
| |
This library is very small, so there is not much to gain from building
it as a shared library. Also, when linking statically with LLVM, a
shared libradeonllvm exports LLVM symbols and creates problems when
used with other shared objects that also link statically to LLVM.
Reviewed-by: [email protected]
|
|
|
|
|
| |
New processors were added to the backend to distinguish between
GPUs with and without vertex caches.
|
|
|
|
|
| |
This is a port of "r600g:mask unused source components for SAMPLE"
patch from Vadim Girlin.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It should be unsigned, not enum pipe_flush_flags.
Fixed a build error:
src/gallium/state_trackers/egl/android/native_android.cpp:426:29: error:
invalid conversion from 'int' to 'pipe_flush_flags' [-fpermissive]
v2: replace all occurrences of enum pipe_flush_flags by unsigned
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
[olv: document the parameter now that the type is unsigned]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Assigning a struct only copies the members - any padding is left as is.
Thus this code:
struct foo_t foo;
foo = bar;
leaves the padding of foo intact, ie uninitialized random garbage.
This patch fixes constant shader recompiles by initializing the struct
to zero. For completeness, memcpy is used to copy the key to the shader
struct.
NOTE: This is a candidate for the stable branches.
Signed-off-by: Lauri Kasanen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Andreas Boll <[email protected]>
|
|
|
|
|
|
|
|
| |
One build system for linux/unix only drivers should be enough.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48694
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It shouldn't be needed since the FLUSH_AND_INV_EVENT has already
made sure the destination caches are flushed. Additionally,
we didn't previously emit the surface_sync until this commit:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=e5e4c07e7964a3258ed02b530bcdc24c0650204b
Emitting them together causes hangs in compute on cayman/TN
and hangs in Heaven on evergreen.
Note: this patch is a candidate for the 9.1 branch, but requires:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=156bcca62c9f4e79e78929f72bc085757f36a65a
as well.
Reviewed-by: Tom Stellard <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
|
|
| |
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
| |
Fixes the bug that prevented propagation of literals in some cases.
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
| |
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
| |
Signed-off-by: Vadim Girlin <[email protected]>
|