| Commit message (Collapse) | Author | Age | Files | Lines |
| |
Since both r600 and radeonsi use code from libamd_common they need to
static link it. At the same time, adding a common library to LIB_DEPS is
fragile (it can lead to multiple symbol definitions) and non-obvious: I
had to do a double-take to work out how things currently fit together.
So follow the libradeon.la approach and put common libraries in
TARGET_RADEON_COMMON.
Fixes: 936f5407a7d ("gallium/radeon: Add libamd_common.a to TARGET_LIB_DEPS also for r600")
Cc: Timothy Arceri <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Marek Olšák <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
Tested-by: Michel Dänzer <[email protected]>
| |
Fixes build failure with --enable-opencl --enable-xvmc:
make[4]: Entering directory '/home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/targets/xvmc'
CXXLD libXvMCgallium.la
../../../../src/gallium/drivers/r600/.libs/libr600.a(evergreen_compute.o): In function `evergreen_create_compute_state':
/home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/drivers/r600/../../../../../src/gallium/drivers/r600/evergreen_compute.c:254: undefined reference to `ac_elf_read'
../../../../src/gallium/drivers/r600/.libs/libr600.a(evergreen_compute.o): In function `r600_shader_binary_read_config':
/home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/drivers/r600/../../../../../src/gallium/drivers/r600/evergreen_compute.c:189: undefined reference to `ac_shader_binary_config_start'
/home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/drivers/r600/../../../../../src/gallium/drivers/r600/evergreen_compute.c:189: undefined reference to `ac_shader_binary_config_start'
collect2: error: ld returned 1 exit status
Makefile:760: recipe for target 'libXvMCgallium.la' failed
Fixes: dc4c551a345d ("radeon/ac: switch from radeon_elf_read() to ac_elf_read()")
Acked-by: Timothy Arceri <[email protected]>
Tested-by: Timothy Arceri <[email protected]>
| |
We now use the shared code in AMD common instead.
Reviewed-by: Marek Olšák <[email protected]>
| |
For radeonsi we could probably switch to
ac_shader_binary_read_config(). However the functions have
diverged so just share this helper for now.
Reviewed-by: Marek Olšák <[email protected]>
| |
Reviewed-by: Marek Olšák <[email protected]>
| |
Reviewed-by: Marek Olšák <[email protected]>
| |
all drivers support it
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Tested-by: Brian Paul <[email protected]> (VMware driver only)
| |
It's OK for r300g (because r300g can't write to buffers via the GPU), but
not later hardware. This issue was spotted randomly.
Cc: [email protected]
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
Not used and not widely supported. Use MIN+MAX instead.
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
also remove the BIND flags
Reviewed-by: Nicolai Hähnle <[email protected]>
Tested-by: Edmondo Tommasina <[email protected]>
Tested-by: Charmaine Lee <[email protected]>
| |
Nouveau does not currently have logic to implement this as a library
function. Even though such a library could be written, there's no big
advantage to do it that way for now given that int64 is a very uncommon
use-case. Allow a driver to expose INT64 without supporting division and
modulo operations.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
Signed-off-by: Marek Olšák <[email protected]>
| |
Make the cap consistent with PIPE_CAP_INT64.
Aside from the hypothetical case of using draw for vertex shaders (and
actually caring about doubles...), every implementation supports doubles
either nowhere or everywhere.
Also, st/mesa didn't even check the cap correctly in all supported
shader stages.
While at it, add a missing LLVM version check for 64-bit integers in
radeonsi. This is conservative: judging by the log, LLVM 3.8 might be
sufficient, but there are probably bugs that have been fixed since then.
v2: fix clover (Marek)
Reviewed-by: Marek Olšák <[email protected]>
| |
Should be r600_common_screen instead of r600_screen.
Fixes: 80157a2c20 ("gallium/radeon: clean up r600_query_init_backend_mask")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
to simplify things in draw_vbo a little
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
Commit 7b5878ee0491e7a93914389a8369cd6752b9757d increased the number of
outputs to 64, but left the output array intact. This caused a stack
overflow when the number of outputs is greater than 32. Found by ASAN.
Cc: "12.0 13.0 17.0" <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
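The class of bug fixed here can be avoided by sizing the array from the
same constant that bounds the count. The following is a minimal sketch
only; TOY_MAX_OUTPUTS, struct toy_shader and toy_add_output() are
hypothetical names for illustration and are not taken from the r600 code.

```c
#include <stddef.h>

/* Hypothetical limit; the commit above raised the real one to 64. */
#define TOY_MAX_OUTPUTS 64

struct toy_output {
    unsigned reg;
};

/* The array is sized from the same constant that bounds noutput, so
 * raising the limit cannot leave the storage behind. */
struct toy_shader {
    unsigned noutput;
    struct toy_output output[TOY_MAX_OUTPUTS];
};

/* Returns 0 on success, -1 if the shader already holds
 * TOY_MAX_OUTPUTS outputs. */
static int toy_add_output(struct toy_shader *s, unsigned reg)
{
    if (s->noutput >= TOY_MAX_OUTPUTS)
        return -1;
    s->output[s->noutput++].reg = reg;
    return 0;
}
```

The explicit bounds check turns a silent overflow into an error the
caller can handle.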
| |
This just needs to be done for r600g in the screen.
We don't need an IB submission for every new context created for GCN.
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
This matches the behavior of most other drivers, including nouveau,
radeonsi, and i965.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
v1.1: move to using a normal CAP. (Marek)
v2: fill in the cap everywhere
Signed-off-by: Dave Airlie <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Axel Davy <[email protected]>
| |
Tested-by: Glenn Kennard <[email protected]>
Tested-by: James Harvey <[email protected]>
Cc: 17.0 <[email protected]>
| |
We will use it for DDIV.
Tested-by: Glenn Kennard <[email protected]>
Tested-by: James Harvey <[email protected]>
Cc: 17.0 <[email protected]>
| |
It seems clear that trying to multiply two pairs of doubles would result
in the temporary register getting overwritten by the second pair. So
make the code more explicit.
Tested-by: Glenn Kennard <[email protected]>
Tested-by: James Harvey <[email protected]>
Cc: 17.0 <[email protected]>
| |
This is so that we can differentiate between flushing any framebuffer
reading caches from regular sampler caches.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
Call r600_dma_emit_wait_idle only when there is a possibility of
a read-after-write hazard. Buffers not yet used by the SDMA IB don't
have to wait.
Reviewed-by: Nicolai Hähnle <[email protected]>
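The idea of waiting only on a possible hazard can be sketched as follows.
This is an illustrative model, not the driver code: struct toy_dma_ib,
toy_ib_uses(), toy_ib_track() and toy_dma_prepare_copy() are hypothetical
names, and the num_waits counter stands in for emitting the wait-idle
packet.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_TRACKED 16

/* Hypothetical model of an SDMA IB: remembers which buffers it has
 * already touched and counts how many wait-idle packets were emitted. */
struct toy_dma_ib {
    const void *used[TOY_MAX_TRACKED];
    size_t num_used;
    unsigned num_waits;
};

static bool toy_ib_uses(const struct toy_dma_ib *ib, const void *buf)
{
    for (size_t i = 0; i < ib->num_used; i++)
        if (ib->used[i] == buf)
            return true;
    return false;
}

static void toy_ib_track(struct toy_dma_ib *ib, const void *buf)
{
    if (!toy_ib_uses(ib, buf) && ib->num_used < TOY_MAX_TRACKED)
        ib->used[ib->num_used++] = buf;
}

/* Wait only if the IB already references one of the buffers (a possible
 * read-after-write hazard). If the tracking list is full, wait
 * conservatively because a hazard can no longer be ruled out. */
static void toy_dma_prepare_copy(struct toy_dma_ib *ib,
                                 const void *dst, const void *src)
{
    bool full = ib->num_used == TOY_MAX_TRACKED;

    if (full || toy_ib_uses(ib, dst) || toy_ib_uses(ib, src))
        ib->num_waits++;

    toy_ib_track(ib, dst);
    toy_ib_track(ib, src);
}
```

A copy between two buffers the IB has never seen emits no wait, while a
later copy that reads from an earlier destination does.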
| |
It's redundant with the source modifier.
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
It's redundant with the source modifier.
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
Drivers with good compilers don't need aggressive optimizations before TGSI.
Reviewed-by: Eric Anholt <[email protected]>
| |
Make sure unused ops and their references are removed, prior to entering
the GCM (global code motion) pass, to stop GCM from breaking the loop
logic and thus hanging the GPU.
It turns out that sb has problems with loops and node optimizations
regarding associative folding:
- the global code motion (GCM) pass moves ops up a loop level/basic block
until they've fulfilled their total usage count
- if there are ops folded into others, the usage count won't be
fulfilled and thus the op is moved all the way up to the top
- within GCM the op is visited and its deps are moved
alongside it, to fulfill the src constraints
- in a loop, an unused op is moved out of the loop and GCM moves
the src value ops up as well
- now the problem arises: if the loop counter is one of the src
values, it gets moved up as well, the loop break condition is
never hit, and the shader turns into an endless loop, resulting in the
GPU hanging and being reset
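The approach named at the top of this message, removing unused ops and
their references before code motion runs, can be illustrated with a
generic use-count sweep. This is a sketch on a toy IR, not the actual sb
patch: struct toy_op and toy_remove_dead_ops() are hypothetical, and the
has_side_effect flag is an assumed stand-in for ops that must always be
kept (exports, loop control).

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_SRC 2

/* Hypothetical toy IR op; not the real sb node type. */
struct toy_op {
    int src[MAX_SRC];     /* indices of source ops, -1 if unused */
    int uses;             /* number of live ops reading this result */
    bool has_side_effect; /* must be kept even with zero uses */
    bool dead;
};

/* Mark ops without remaining uses as dead and drop the references they
 * hold on their sources. Repeats until nothing changes, so chains of ops
 * that only fed a dead op are removed as well.
 * Returns the number of ops removed. */
static int toy_remove_dead_ops(struct toy_op *ops, size_t count)
{
    int removed = 0;
    bool changed = true;

    while (changed) {
        changed = false;
        for (size_t i = 0; i < count; i++) {
            struct toy_op *op = &ops[i];

            if (op->dead || op->has_side_effect || op->uses > 0)
                continue;

            op->dead = true;
            removed++;
            changed = true;

            for (int s = 0; s < MAX_SRC; s++)
                if (op->src[s] >= 0)
                    ops[op->src[s]].uses--;
        }
    }
    return removed;
}
```

After such a sweep, every remaining op has a usage count that a later
code motion pass can actually fulfill, which is the property the bullet
list above shows GCM depends on.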
A reduced (albeit nonsense) piglit example would be:
[require]
GLSL >= 1.20
[fragment shader]
uniform int SIZE;
uniform vec4 lights[512];
void main()
{
float x = 0;
for(int i = 0; i < SIZE; i++)
x += lights[2*i+1].x;
}
[test]
uniform int SIZE 1
draw rect -1 -1 2 2
Which gets optimized to:
===== SHADER #12 OPT ================================== PS/BARTS/EVERGREEN =====
===== 42 dw ===== 1 gprs ===== 2 stack =========================================
ALU 3 @24
1 y: MOV R0.y, 0
t: MULLO_UINT R0.w, [0x00000002 2.8026e-45].x, R0.z
LOOP_START_DX10 @22
PUSH @6
ALU 1 @30 KC0[CB0:0-15]
2 M x: PRED_SETGE_INT __.x, R0.z, KC0[0].x
JUMP @14 POP:1
LOOP_BREAK @20
POP @14 POP:1
ALU 2 @32
3 x: ADD_INT R0.x, R0.w, [0x00000002 2.8026e-45].x
TEX 1 @36
VFETCH R0.x___, R0.x, RID:0 MFC:16 UCF:0 FMT[..]
ALU 1 @40
4 y: ADD R0.y, R0.y, R0.x
LOOP_END @4
EXPORT_DONE PIXEL 0 R0.____ EOP
===== SHADER_END ===============================================================
Note that R0.z, the register holding the loop counter and read by the
break condition, is never incremented. Also, some of the loop content
has been moved out of the loop to fulfill the requirements of the one
unused op.
With a debug build of mesa this would produce an error like
error at : PRED_SETGE_INT __, __, EM.2, R1.x.2||[email protected], C0.x
: operand value R1.x.2||[email protected] was not previously written to its gpr
and the compilation would fail due to this. On a release build it gets
passed to the GPU.
When using this patch, the loop remains intact:
===== SHADER #12 OPT ================================== PS/BARTS/EVERGREEN =====
===== 48 dw ===== 1 gprs ===== 2 stack =========================================
ALU 2 @24
1 y: MOV R0.y, 0
z: MOV R0.z, 0
LOOP_START_DX10 @22
PUSH @6
ALU 1 @28 KC0[CB0:0-15]
2 M x: PRED_SETGE_INT __.x, R0.z, KC0[0].x
JUMP @14 POP:1
LOOP_BREAK @20
POP @14 POP:1
ALU 4 @30
3 t: MULLO_UINT T0.x, [0x00000002 2.8026e-45].x, R0.z
4 x: ADD_INT R0.x, T0.x, [0x00000002 2.8026e-45].x
TEX 1 @40
VFETCH R0.x___, R0.x, RID:0 MFC:16 UCF:0 FMT[..]
ALU 2 @44
5 y: ADD R0.y, R0.y, R0.x
z: ADD_INT R0.z, R0.z, 1
LOOP_END @4
EXPORT_DONE PIXEL 0 R0.____ EOP
===== SHADER_END ===============================================================
Piglit: ./piglit summary console -d results/*_gpu_noglx
name: unpatched_gpu_noglx patched_gpu_noglx
---- ------------------- -----------------
pass: 18016 18021
fail: 748 743
crash: 7 7
skip: 1124 1124
timeout: 0 0
warn: 13 13
incomplete: 0 0
dmesg-warn: 0 0
dmesg-fail: 0 0
changes: 0 5
fixes: 0 5
regressions: 0 0
total: 19908 19908
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94900
Tested-by: Heiko Przybyl <[email protected]>
Tested-on: Barts PRO HD6850
Signed-off-by: Heiko Przybyl <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
| |
This enables gallium support for EGL_ANDROID_native_fence_sync, for
drivers which support PIPE_CAP_NATIVE_FENCE_FD.
Signed-off-by: Rob Clark <[email protected]>
| |
Drivers that support this benefit by saving one lowering pass in the
GLSL-to-TGSI conversion.
radeonsi already supports this because all outputs are stored in temporary
variables before the export (except for TCS outputs, which have always
been readable in TGSI anyway due to their special semantics).
Reviewed-by: Marek Olšák <[email protected]>
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
it has no effect whatsoever
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
fmask implies that cmask is present too.
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
| |
This allows the driver to signal that it can't handle random
interleaving of attributes across buffers. This is required for
ARB_transform_feedback3, and it's initialized to whatever the previous
value of PIPE_CAP_STREAM_OUTPUT_PAUSE_RESUME was except for nv50 where
it is disabled. Note that the proprietary drivers never expose
ARB_transform_feedback3 on any GT21x's (where nouveau previously did),
and after some effort I was unable to get it to work.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
| |
Acked-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Acked-by: Edward O'Callaghan <[email protected]>
| |
This is a screen cap because drivers are expected to support it either
for all shader types or for none of them.
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
| |
Check for device reset on flush. It would be nicer if the kernel just
reported this as an error on the submit ioctl (and similarly for fences),
but this will do for now.
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
| |
There are driver-specific context flags for barriers that are not covered
by the Gallium barrier interfaces.
The R600 settings of these flags may not be optimal, but we're not going
to use them yet anyway.
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
| |
Based off of Ilia's original patch, but with output values replicated so
that it matches the TGSI semantics.
Signed-off-by: Glenn Kennard <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
| |
A couple of forward-declarations were causing warnings in clang:
'value' defined as a class here but previously declared as a struct
[-Wmismatched-tags]
Signed-off-by: Martina Kollarova <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| |
Android porting of the following commits:
f1f1ba3 "radeonsi: move sid.h/r600d_common.h to a common place."
69fca64 "amd/addrlib: move addrlib from amdgpu winsys to common code"
This patch fixes Android build errors.
Reviewed-by: Dave Airlie <[email protected]>