| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes the issue when dst and src is the same reg and operation on one
channel overwrites the source for other channels, e.g.:
UMUL TEMP[2].xyz, TEMP[0].xyzz, TEMP[2].xxxx
In this example the result of the operation on channel x is written in
TEMP[2].x and then used as a second source operand for channels y and z
instead of original value in TEMP[2].x.
This patch stores the results in temp reg and moves them to
dst after performing operation on all channels.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=70327
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
|
| |
All texture instructions can use offsets, not just TXF. Offsets into
the literals array were wrong, too.
Signed-off-by: Marek Olšák <[email protected]>
|
| |
|
| |
|
|
|
|
| |
Also slightly optimize r600_buffer_map_sync_with_rings.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This streamout state code will be used by radeonsi.
There are new structures r600_common_context and r600_common_screen.
What is inherited by what is shown here:
pipe_context -> r600_common_context -> r600_context
pipe_screen -> r600_common_screen -> r600_screen
The common structures reside in drivers/radeon. Currently they only contain
enough functionality to be able to handle streamout. Eventually I'd like
the whole pipe_screen implementation to be shared and some of the context
stuff too.
This is quite big, but most changes are because of the new structures and
the fact r600_write_value is replaced by radeon_emit.
Thanks to Tom Stellard for fixing the build for r600g/compute.
Reviewed-by: Michel Dänzer <[email protected]>
Reviewed-by: Christian König <[email protected]>
Tested-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
|
| |
Signed-off-by: Vadim Girlin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Tom Stellard <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
|
|
| |
We need to export at least one color if the shader writes it,
even when nr_cbufs==0.
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
|
|
| |
I assume this should have been part of commit
7727fbb7c5d64348994bce6682e681d6181a91e9. This (obviously) fixes a lot tests.
Signed-off-by: Henri Verbeet <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Also use ordered comparisons for old cmp instructions.
Tested-by: Michel Dänzer <[email protected]>
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
| |
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TGSI_OPCODE_KIL and KILP had confusing names. The former was conditional
kill (if any src component < 0). The later was unconditional kill.
At one time KILP was supposed to work with NV-style condition
codes/predicates but we never had that in TGSI.
This patch renames both opcodes:
TGSI_OPCODE_KIL -> KILL_IF (kill if src.xyzw < 0)
TGSI_OPCODE_KILP -> KILL (unconditional kill)
Note: I didn't just transpose the opcode names to help ensure that I
didn't miss updating any code anywhere.
I believe I've updated all the relevant code and comments but I'm
not 100% sure that some drivers had this right in the first place.
For example, the radeon driver might have llvm.AMDGPU.kill and
llvm.AMDGPU.kilp mixed up. Driver authors should review their code.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Tested-by: Aaron Watry <[email protected]>
|
|
|
|
|
|
|
|
|
| |
byteswap.h and bswap_32 aren't portable, replace them with calls to
gallium's util_bswap32 as suggested by Mark Kettenis. Lets these files
build on OpenBSD.
Signed-off-by: Jonathan Gray <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
| |
|
|
|
|
| |
Signed-off-by: Niels Ole Salscheider <[email protected]>
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
| |
and add assertions to prevent buffer overflow. This fixes corruption
of the r600_shader struct.
NOTE: This is a candidate for the stable branches.
|
|
|
|
|
| |
The LLVM backend emits raw ISA now, so we can just its output
unmodified.
|
|
|
|
| |
The LLVM backend takes care of this now.
|
|
|
|
|
|
| |
This should fix build issues with GCC < 4.3
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
New disassembler is not completely isolated yet from further processing
in r600g/sb that is not required for printing the dump, so it has higher
probability to fail in case of any unexpected features in the bytecode.
This patch adds "sbdisasm" flag for R600_DEBUG that allows to use new
disassembler in r600g/sb for shader dumps when shader optimization
is not enabled.
If shader optimization is enabled, new disassembler is used by default.
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
| |
Optimization is enabled with "R600_DEBUG=sb".
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
|
|
| |
This results in more clean shader code and may improve the quality of
optimized code produced by r600-sb due to eliminated false dependencies
in some cases.
Signed-off-by: Vadim Girlin <[email protected]>
|
| |
|
| |
|
|
|
|
|
|
| |
This way we don't need to update the function signature everytime we
emit a new config value. This also fixes the build with
--enable-opencl.
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TGSI_OPCODE_IF condition had two possible interpretations:
- src.x != 0.0f
- Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was false either for
vertex and fragment shaders
- gallivm/llvmpipe
- postprocess
- vl state tracker
- vega state tracker
- most old drivers
- old internal state trackers
- many graw examples
- src.x != 0U
- Mesa statetracker when PIPE_SHADER_CAP_INTEGERS was true for both
vertex and fragment shaders
- tgsi_exec/softpipe
- r600
- radeonsi
- nv50
And drivers that use draw module also were a mess (because Mesa would
emit float IFs, but draw module supports native integers so it would
interpret IF arg as integers...)
This sort of works if the source argument is limited to float +0.0f or
+1.0f, integer 0, but would fail if source is float -0.0f, or integer in
the float NaN range. It could also fail if source is integer 1, and
hardware flushes denormalized numbers to zero.
But with this change there are now two opcodes, IF and UIF, with clear
meaning.
Drivers that do not support native integers do not need to worry about
UIF. However, for backwards compatibility with old state trackers and
examples, it is advisable that native integer capable drivers also
support the float IF opcode.
I tried to implement this for r600 and radeonsi based on the surrounding
code. I couldn't do this for nouveau, so I just shunted IF/UIF
together, which matches the current behavior.
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
v2:
- Incorporate Roland's feedback.
- Fix r600_shader.c merge conflict.
- Fix typo in radeon, spotted by Michel Dänzer.
- Incorporte Christoph Bumiller's patch to handle TGSI_OPCODE_IF(float)
properly in nv50/ir.
|
|
|
|
|
|
| |
Never used or implemented.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is a hardware bug on Cayman where a BREAK/CONTINUE followed by
LOOP_STARTxxx for nested loops may put the branch stack into a state
such that ALU_PUSH_BEFORE doesn't work as expected. Workaround this
by replacing the ALU_PUSH_BEFORE with a PUSH + ALU
Fixes piglit tests EXT_transform_feedback/order*
v2: Use existing loop count and improve comment
v3: [Vadim Girlin] Set jump address for PUSH instructions
NOTE: This is a candidate for the 9.1 branch
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
I've no idea when sample_chan would ever be 4 here, but 4 is most
definitely wrong, array textures have it as 3 as well.
Also the cayman code though unused is obviously wrong.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The multiplication part of tgsi_umad did not work on Cayman, because it did
not populate the correct vector slots.
This fixed hardlocks in the EXT_transform_feedback/order tests.
NOTE: This is a candidate for the stable branches.
(might not be easy to cherry-pick though)
Signed-off-by: Marek Olšák <[email protected]>
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Reduced stack size allows to run more threads in some cases,
improving performance for the shaders that use stack (that is, for the
shaders with control flow instructions). E.g. with unigine-based apps.
v4: implement exact computation taking into account wavefront size
v5: add cases for RV620, RS880
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
| |
Signed-off-by: Vadim Girlin <[email protected]>
|
| |
|
| |
|
| |
|
| |
|
|
|
|
| |
Reviewed-by: Jerome Glisse <[email protected]>
|
|
|
|
|
|
| |
also change names of other functions, so that they make sense
Reviewed-by: Jerome Glisse <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Only the disassembler is used to dump shaders. Here's a few examples
how to use R600_DEBUG.
Log compute info:
R600_DEBUG=compute
Dump all shaders:
R600_DEBUG=fs,vs,gs,ps,cs
Dump pixel shaders only:
R600_DEBUG=ps
Disable Hyper-Z:
R600_DEBUG=nohyperz
Disable the LLVM backend:
R600_DEBUG=nollvm
Or use any combination of the above, or print all options:
R600_DEBUG=help
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
| |
Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
|