| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It looks like the original indirect mask was probably copied from
ANV.
Sascha Willems demo results:
tessellation ~4000 -> ~4200 fps
V2: continue lowering local indirects due to llvm deficiencies.
Tested-by: Alex Smith <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
We already set it above in the NIR compilation loop.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
No OC_LDS_EN for HS, and the included LS vgpr_comp_cnt is at
a different offset.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
We need different regs to end up in s0/s1.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
To prevent VS/TCS collisions in merged shaders.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
Needed for GFX9 merged shaders.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
Architecture benefits from having more threads/work outstanding.
Patch by Jan Zielinski.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
| |
Allow draws in flight to be overridden via SWR_CREATECONTEXT_INFO.
Patch by Jan Zielinski.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
Refactored the gather operation to process 16 elements at a time via
paired SIMD8 operations.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Meta's GenerateMipmap implementation binds the same image for both
sampling and rendering - but it samples from one miplevel while
rendering the next. This is a false self-dependency, and there's
no need to disable auxiliary buffers in this case. In fact, we really
want to leave it enabled so the new miplevels gain color compression.
Thankfully, the texture object's _MaxLevel is always one shy of the
miplevel being rendered. So we can simply check whether irb->mt_level
overlaps with the texture's defined levels. If not, there's no self-
dependency and we can leave the auxiliary buffers enabled.
Fixes a performance regression in GFXBench4 Car Chase, which apparently
calls glGenerateMipmap() on every frame.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103247
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
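The overlap test described in this message can be sketched as follows. This is a minimal illustration, not the driver's code: the function names and the plain integer parameters are hypothetical stand-ins for the texture's base level, its _MaxLevel, and irb->mt_level.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true if the miplevel being rendered lies inside
 * the range of levels the texture can sample from. */
static bool render_level_overlaps_texture(unsigned base_level,
                                          unsigned max_level,
                                          unsigned render_level)
{
    assert(base_level <= max_level);
    return render_level >= base_level && render_level <= max_level;
}

/* Hypothetical helper: auxiliary buffers only need to be disabled when
 * the rendered level is also a level that may be sampled. */
static bool can_keep_aux_enabled(unsigned base_level,
                                 unsigned max_level,
                                 unsigned render_level)
{
    return !render_level_overlaps_texture(base_level, max_level,
                                          render_level);
}
```

In the GenerateMipmap pattern the rendered level is one past the highest sampled level, so the check reports no overlap and compression stays enabled.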
|
|
|
|
|
|
|
|
| |
Now that intel_miptree_prepare_texture takes levels and layers, there's
not much use in this anymore.
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
This should avoid unnecessary resolves when working with texture views.
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This effectively exports intel_miptree_prepare_texture_slices() as
intel_miptree_prepare_texture(). The hope is to avoid resolves
when using texture views that access a subset of the levels/layers.
For now, we pass the same arguments to separate the mechanical change
from the one that actually modifies our behavior.
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A number of double/int64 operations don't have matching
read and write usage masks, which the fallthrough case of
tgsi_util_get_inst_usage_mask assumes for componentwise
tagged instructions.
No regressions in llvmpipe piglit; fixes a large number of
swr regressions.
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The restriction is supposed to apply if the width *field* is >= 8192,
meaning the actual width *value* is >= 8193.
The code also incorrectly used == for some reason.
Reviewed-by: Juan A. Suarez Romero <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
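The off-by-one between the width field and the width value can be illustrated with a small sketch. It assumes, as the message implies, that the field stores the width minus one. The macro and function names are hypothetical, and the restriction itself is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Threshold taken from the commit message; the name is hypothetical. */
#define WIDTH_FIELD_LIMIT 8192u

/* Assumption: the hardware field encodes (width - 1). */
static uint32_t width_to_field(uint32_t width)
{
    assert(width >= 1);
    return width - 1;
}

/* The restriction applies when the *field* is >= 8192, i.e. when the
 * actual width is >= 8193. Comparing with == would miss every larger
 * width. */
static bool restriction_applies(uint32_t width)
{
    return width_to_field(width) >= WIDTH_FIELD_LIMIT;
}
```

A width of exactly 8192 therefore stays outside the restriction, while 8193 and anything larger falls under it.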
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit a73116ecc60414ade89802150b tried to make add_barrier_deps()
walk to the next barrier, and stop. To accomplish that, it added an
is_barrier flag. Unfortunately, this only works half of the time.
The issue is that add_barrier_deps() walks both backward (to the
previous barrier), and forward (to the next barrier). It also sets
is_barrier. Assuming that we're processing instructions in forward
order, this means that is_barrier will be set for previous instructions,
but not future ones. So we'll never see it, and walk further than we
need to.
dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23
now compiles its shaders in 3.6 seconds instead of 3.3 minutes.
Reviewed-by: Matt Turner <[email protected]>
Tested-by: Pallavi G <[email protected]>
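The walking behaviour described above can be sketched as follows. This is an illustration under stated assumptions, not the scheduler's code: the instruction struct, the predicate, and barrier_window() are hypothetical. The point is that barrier status is evaluated from the instruction itself, so it is known for instructions ahead of the current one as well as behind it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical instruction record for the sketch. */
struct instr {
    bool has_side_effects;
};

/* Hypothetical predicate: decided from the instruction itself rather
 * than from a flag that is only set once the instruction is processed. */
static bool is_scheduling_barrier(const struct instr *inst)
{
    return inst->has_side_effects;
}

/* Compute the inclusive range [*first, *last] around `from` that a
 * barrier would need dependencies on: back to the previous barrier and
 * forward to the next one, or to the ends of the block if none exists. */
static void barrier_window(const struct instr *insts, size_t n,
                           size_t from, size_t *first, size_t *last)
{
    assert(n > 0 && from < n);

    size_t lo = from;
    while (lo > 0) {
        lo--;
        if (is_scheduling_barrier(&insts[lo]))
            break; /* stop at the previous barrier */
    }

    size_t hi = from;
    while (hi + 1 < n) {
        hi++;
        if (is_scheduling_barrier(&insts[hi]))
            break; /* stop at the next barrier */
    }

    *first = lo;
    *last = hi;
}
```

With both walks stopping at the nearest barrier, the work per barrier is bounded by the distance to its neighbours instead of the length of the whole block.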
|
|
|
|
|
|
|
|
|
| |
This eliminates a layer of wrapping, and makes a backend_instruction
sufficient. The downside is that it exposes 'eot' to the vec4 backend,
which it doesn't need, but can basically happily ignore.
Reviewed-by: Matt Turner <[email protected]>
Tested-by: Pallavi G <[email protected]>
|
|
|
|
|
|
|
|
| |
The logic for handling shadow coords was completely broken.
Fixes be3ab867bd444594f9d9e0f8e59d305d15769afd.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103265
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We only need to add a check to validate output locations here. For
inputs with invalid locations we will fail to link when we can't
find a matching output in the same (invalid) location.
v2: compute location slots properly depending on shader stage and
variable type / direction
Fixes:
KHR-GL45.enhanced_layouts.varying_location_limit
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we have up to 16 FS inputs, the SF unit will reorder our inputs
to be consecutive. However, when we have more than 16, we need
to read our inputs from the URB exactly as they have been
output from the previous stage. This means that for SSO we have to
consider if we have URB padding due to unused input locations.
Specifically, this affects gen9 active components programming, since
for things to work in scenarios with over 16 inputs that have padded
regions we need to ensure that we program active components for the
padded regions too. If we don't do this the hardware won't read
the URB properly for inputs located after padded regions.
Found empirically.
Fixes (these also require a patch in CTS):
KHR-GL45.enhanced_layouts.varying_locations
KHR-GL45.enhanced_layouts.varying_array_locations
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We only want to scare the user away from causing a GPU stall by mapping
a busy bo. The time taken to instantiate the set of pages for a buffer
and to mmap them is unavoidable, and flagging an idle bo as busy is
"crying wolf".
Reported-by: Tvrtko Ursulin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
... which does not break C's aliasing rules.
|
|
|
|
| |
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
“Saints Row: Gat out of Hell” benefits from this on slower CPUs in that
usage spikes on individual cores are avoided, which in turn makes it harder
to hit a bug which causes broken audio and the game to hang on exit.
“Saints Row IV” appears to be fine either way, but also exhibits the audio
breakage bug: glthread is therefore being enabled on the grounds that it should
make it a little harder to hit that bug.
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
Move it to radv_cmd_buffer_flush_state() because if
rasterizerDiscardEnable is true, the flags are not cleared.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It can only be changed when CmdBindIndexBuffer() is called
or when a secondary buffer is used. It does not always change in
those cases, but let's re-emit the packets in these situations for now.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
This saves a few CPU cycles when CmdDrawIndexed() is used a lot.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
Otherwise the flag is borderline useless.
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
Fixes: 7d45d22fdd2e ("radv: switch to using radv_create_shaders()")
Signed-off-by: Alex Smith <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Fixes a crash while generating a hang report.
Fixes: 7d45d22fdd2e ("radv: switch to using radv_create_shaders()")
Signed-off-by: Alex Smith <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
This reverts commit 8cb84c8477a57ed05d703669fee1770f31b76ae6.
This fixes crashing shader-db/run.
|
|
|
|
|
|
| |
This reverts commit 6414d6bd8d2897f4ba643357fe3037f3acd60879.
This is needed to apply the next revert.
|
|
|
|
|
|
|
|
|
| |
This fixes an assertion failure introduced by 30a2f0dfd46de.
Fixes: 30a2f0dfd46 ("radeonsi: add an assertion that only
Signed-off-by: Miklós Máté <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|