| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This just builds on the image support. Evergreen only has ssbo
for fragment and compute no other stages.
v2: handle images and ssbo in the same shader properly (Ilia)
v3: fix RESQ on buffers,
fix missing atom emit
fix first element offset
use R32 format
write separate buffer rat store path.
(from running deqp gles3.1 tests)
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
On cayman it appears the cmp component is now in Z.
Fixes:
arb_shader_image_load_store-dead-fragments on cayman.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
These didn't handle the TGSI at all properly, this fixes
them to use the common path for 64->32 then adds the 32->int
on at the end.
Fixes:
generated_tests/spec/arb_gpu_shader_fp64/execution/conversion/*
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The idea is ported from RadeonSI, but using 512x512 instead of
256x256 seems slightly better. This improves dota2 performance
by +2%.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
| |
The state no longer exists on GFX9.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Another typo, the glext.h header was being install instead.
Reported-by: Marc Dietrich <[email protected]>
Signed-off-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
| |
This is a typo, gl32.h is installed twice.
Reported-by: Marc Dietrich <[email protected]>
Signed-off-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
| |
The same code for VI doesn't check for scanout either.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
VI has 11 dwords at least. GFX9 has 10 dwords.
Cc: 17.2 17.3 <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Jon Turney <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Jon Turney <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Jon Turney <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Jon Turney <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Jon Turney <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jon Turney <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix incomplete check of input params in blorp_surf_convert_to_uncompressed()
which can lead to NULL pointer dereferencing.
Fixes: 5ae8043fed2 ("intel/blorp: Add an entrypoint for doing
bit-for-bit copies")
Fixes: f395d0abc83 ("intel/blorp: Internally expose
surf_convert_to_uncompressed")
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Andres Gomez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes issues seen with certain versions of Unreal Engine 4 editor
and games built with that using GLSL 4.30.
v2: add driinfo_gallium change (Emil Velikov)
Signed-off-by: Tapani Pälli <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97852
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103801
Acked-by: Andres Gomez <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103909
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
64-bit pull loads are implemented by emitting 2 separate
32-bit pull load messages, where the second message loads from
an offset at +16B.
That addition of 16B to the original offset should not alter the
original offset register used as source for the pull load instruction
though, since the compiler might use that same offset register in other
instructions (for example, for other pull loads in the shader code
that take that same offset as reference).
If the pull load is 32-bit then we only need to emit one message and
we don't need to do offset calculations, but in that case the optimizer
should be able to drop the redundant MOV.
Fixes the following test on Haswell:
KHR-GL45.gpu_shader_fp64.fp64.max_uniform_components
Reviewed-by: Matt Turner <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103007
|
|
|
|
|
|
|
|
|
|
| |
Prepare for two texture handling paths, the descriptor-based
path will be added in a future commit. These are structured
so that the texture implementation handles its own state
emission.
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
| |
This needs to be shared between texture_plain and texture_desc.
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
| |
This will be shared with the texture descriptor path.
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
| |
Need this to efficiently emit texture descriptor invalidations.
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
|
| |
Track varying component offset of the point size output, as well as
provide the offset of the point coord input.
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
|
| |
Update state objects to add new state, and emit function to emit new
state.
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
|
| |
- This core must load shaders from memory (AFAIK)
- Yet another new location for UNIFORMS
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
|
| |
Update context reset for HALTI3..HALTI5, sorting states for the HALTI
version that has them.
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
|
| |
RS align is not necessary and might even be harmful when using the BLT
engine for blitting.
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Add an implemenation of key clear_blit functions using the BLT engine
that replaced the RS on GC7000.
Also set level->size correctly for imported resources. This is important
for the BLT resolve-in-place path to work for them.
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Prepare for BLT-based blitting path by moving RS-based
blitting to the RS implementation file, making this
self-contained.
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
|
| |
Want to be able to emit state from the texture implementation,
and the blitter implementation.
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
|
| |
When the BLT is involved as source or target, add an extra BLT
enable/disable sequence around the sync sequence.
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The blob does this, as DRAW_INSTANCED can replace fully all the other
draw commands. It is also required to handle integer vertex formats.
The other path is only there for compatibility and might go away (or at
least rot to become buggy due to dis-use) in newer hardware.
As a by-effect this changes the behavior for GC3000-, by no longer using
the index offset for DRAW_INDEXED but instead adding it to INDEX_ADDR.
This should make no difference.
Preparation for GC7000 support.
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This is used by HALTI2+ (GC3000+) when drawing with DRAW_INSTANCED.
It is also necessary when switching between integer and floating point
vertex element formats.
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We're about to add more of them, and need to pass the whole lot of them
around together when growing them. Putting them in a struct makes this
much easier.
brw->batch.batch.bo is a bit of a mouthful, but it's nice to have things
labeled 'batch' and 'state' now that we have multiple buffers.
Fixes: 2dfc119f22f257082ab0 "i965: Grow the batch/state buffers if we need space and can't flush."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103101
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Once we reach the intended size of the buffer (BATCH_SZ or STATE_SZ), we
try and flush. If we're not allowed to flush, we resort to growing the
buffer so that there's space for the data we need to emit.
We accidentally got the threshold wrong. The first non-wrappable call
beyond (e.g.) STATE_SZ would grow the buffer to floor(1.5 * STATE_SZ),
The next call would see we were beyond STATE_SZ and think we needed to
grow a second time - when the buffer was already large enough.
We still want to flush when we hit STATE_SZ, but for growing, we should
use the actual size of the buffer as the threshold. This way, we only
grow when actually necessary.
v2: Simplify the control flow (suggested by Jordan)
Fixes: 2dfc119f22f257082ab0 "i965: Grow the batch/state buffers if we need space and can't flush."
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The original state buffer was marked with EXEC_OBJECT_CAPTURE. When
growing it, we want to preserve that flag so we continue to capture it
in GPU hang reports.
Fixes: 2dfc119f22f257082ab0 "i965: Grow the batch/state buffers if we need space and can't flush."
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The intention here is make the new BO use the same alignment as the old
BO. This isn't strictly necessary, but we would have to update the
'alignment' field in the validation list when swapping it out, and we
don't bother today.
The batch and state buffers use an alignment of 4096, so this should be
equivalent - it's just clearer than cut and pasting a magic constant.
Fixes: 2dfc119f22f257082ab0 "i965: Grow the batch/state buffers if we need space and can't flush."
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
| |
Compute setup gets emitted into the normal gfx state buffer,
so no need to reinit the basics.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
This just makes it easier to bypass for TGSI later.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
This just lets us see packets marked for compute.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
This fixes overlaps where src/dst are the same.
Fixes a bunch of the deqp bitfield tests.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Reviewed-and-tested-by: Nicolai Hähnle <[email protected]>
Signed-off-by: Adam Jackson <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
STATE_BASE_ADDRESS specifies a maximum size of the dynamic state
section, beyond which data supposedly reads back as 0. On Gen8+,
we were programming it to the size of the buffer. This worked fine
until we started growing the state buffer in commit 2dfc119f22f25708.
When the state buffer grows, the value in STATE_BASE_ADDRESS becomes
too small, and our state beyond STATE_SZ bytes would read back as 0.
To avoid having to update the value, we program it to MAX_STATE_SIZE.
We used to program the upper bound to the maximum on older hardware
anyway, so programming it too large isn't a big deal.
Bogus SURFACE_STATE can easily lead to GPU hangs and misrendering.
DiRT Rally was hitting the statebuffer growth path, and suffered from
bad texture corruption and GPU hangs (usually around the same time).
This patch fixes both issues.
Fixes: 2dfc119f22f257082ab0 "i965: Grow the batch/state buffers if we need space and can't flush."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103101
Tested-by: Jordan Justen <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
| |
and handle PIPE_FLUSH_HINT_FINISH in r300.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Most files in gallium/radeon now include si_pipe.h.
chip_class and family are now here:
sscreen->info.family
sscreen->info.chip_class
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|