| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
For multi-pass rendering, it is common to keep the same depth buffer
from previous pass, to discard geometry that would be hidden by later
draws. In the later passes with depth-test enabled, but depth-write
disabled, there is no reason to do gmem2mem resolve.
TODO probably do something similar for stencil.. although stencil
buffer isn't used as commonly these days
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Before, I had per-stage entryoints with some helpers shared between them.
As I extended for compute shaders and shader-db, it turned out that the
other common code in the middle wanted to be shared too.
|
|
|
|
|
| |
Looking at some assembly dumps for an optimization, we were clearly
missing important parts of the shader!
|
|
|
|
|
|
|
| |
We'll still fail at draw time, but this avoids a regression in shader-db
execution once I enable TLB writes in precompiles.
Fixes: b38e4d313fc2 ("v3d: Create a state uploader for packing our shaders together.")
|
|
|
|
|
|
| |
Team Fortress 2 32-bit version runs out of the CPU address space.
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
| |
It seems to be the same, but this doesn't use integer division with
a variable divisor.
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This will help the new opt introduced in the following patches
allowing us to remove extra duplicate varyings.
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
There's no way to tell the 3D engine about swizzling on such textures.
While rendering to NPOT ones may be possible, there's no great way to
expose that in gallium, nor would there be any practical benefit.
Fixes the non-compressed-format "copyteximage 3D" failures. Something
odd going on with the compressed formats.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
s3tc layouts are a bit finicky - they're packed, but not swizzled.
Adjust logic to allow for that case:
- Don't set a uniform pitch for POT-sized compressed textures
- Adjust define_rect API to be less confused about block sizes
- Only mark a texture as linear if it has a uniform pitch set
This has been tested to fix xonotic (as well as the s3tc-* piglits)
on nv3x and keeps it working on nv4x.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
This doesn't matter since all compressed formats supported by this
hardware use square blocks, but best to use the correct helper.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
This logic mirrors what we do on nv50. The relatively new
texture_subdata callback can cause this to happen with 3D textures,
which is triggered at least by xonotic, and probably many piglits.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the last-active context gets deleted, the pushbuf doesn't have a
bufctx to reference. Then there could be a sequence of binds which would
trigger a reset on that bin before validation was done. Instead we just
pass in the bufctx in question directly.
All other instances of PUSH_RESET happen strictly after a validation is
run.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102349
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The whole user_priv thing is a mess, but as long as it's there, it
basically has to map 1:1 to the cur_ctx. Unfortunately we were setting
user_priv to some context, then that context could get deleted without
any draws/validations in it, leading user_priv to become NULL, with
cur_ctx still pointing at some old context. Then we wouldn't run the
switch logic, which in turn led to a NULL bufctx being dereferenced.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102349
Signed-off-by: Ilia Mirkin <[email protected]>
|
| |
|
|
|
|
|
|
|
| |
This allows the original shader-db project's run.c runner to parse things
easily, and is probably a good thing to have for GL_ARB_debug_output in
general. I formatted it more like Intel's so I can mostly reuse their
report script.
|
|
|
|
|
|
|
|
|
| |
I've been using my apitrace-based shader-db so far, but it's slow
(apitrace decompression), intrusive (apitrace windows spamming the
screen), and doesn't have much coverage. The original shader-db provides
a lot more coverage and compiles faster, at the expense of not having the
actual runtime variant key. As v3d has a lot less runtime variation than
vc4 did, this tradeoff makes more sense.
|
|
|
|
|
|
|
|
|
| |
It's better to let most applications make use of adaptive sync
by default. Problematic applications can be placed on the blacklist
or the user can manually disable the feature.
Reviewed-by: Michel Dänzer <[email protected]>
Signed-off-by: Nicholas Kazlauskas <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were leaking surfaces because the references taken in
etna_set_framebuffer_state weren't being released on context destroy.
Instead of just directly releasing those references in
etna_context_destroy, use the util_copy_framebuffer_state helper.
Take the chance to remove the duplicated buffer references in
compiled_framebuffer_state to avoid confusion.
The leak can be reproduced with a client that continuously creates and
destroys contexts.
Signed-off-by: Tomeu Vizoso <[email protected]>
Reported-by: Sjoerd Simons <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
Maybe not 100% perfect, but seems to be a pretty good approximation of
that.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
This logic can be re-written as the two cases for 3d (ie. before/after
the miplevel sizes start reducing) vs everything else. I think it is
easier to read this way.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
We really only need this logic in one place.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This intrinsic disppeared with llvm 6.0, using it ends up in segfaults
(due to llvm issuing call to NULL address in the jited shaders).
Add code doing the same thing as the autoupgrade code in llvm so it
can be matched and replaced back with a pavgb.
While here, also improve lp_test_format, so it tests both with and without
cache (as it was, it tested the cache versions only, whereas cache is
actually disabled in llvmpipe, and in any case even with it enabled
vertex and geometry shaders wouldn't use it). (Although at least for
the unorm8 uncached fetch, the code is still quite different to what
llvmpipe is using, since that would use unorm8x16 type, whereas
the test code is using unorm8x4 type, hence disabling some intrinsic
paths.)
Fixes: 6f4083143bb8 ("gallivm: use llvm jit code for decoding s3tc")
Reviewed-by: Jose Fonseca <[email protected]>
Tested-by: Michel Dänzer <[email protected]>
|
|
|
|
|
| |
This is the right channel to report these things, so that end-users don't
need to know each driver's custom debug options.
|
|
|
|
|
|
|
| |
This lets the driver use pipe_debug_message() for GL_ARB_debug_output.
Signed-off-by: Rhys Kidd <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
This is the right channel to report these things, so that end-users don't
need to know each driver's custom debug options.
|
|
|
|
|
|
|
| |
This lets the driver use pipe_debug_message() for GL_ARB_debug_output.
Signed-off-by: Rhys Kidd <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
The shadow state is now in the sampler.
|
|
|
|
|
|
| |
i915 render nodes refuse the dumb ioctls, so the simulator would crash on
the original non-apitrace shader-db. Replace them with direct i915 calls
if we detect that we're on one of their gem fds.
|
|
|
|
|
|
| |
This calls the expensive uif offset function once per utile, but it still
gets us a 212.218% +/- 2.41216% (n=10) win on 1024x1024 glTexImage over
calling it on each pixel.
|
|
|
|
|
| |
These implementations of whole-utile load/stores would be the same for
v3d, though the layouts of blocks of utiles has changed.
|
|
|
|
|
|
|
| |
This lets us store the non-PBO glTexImage data directly into the tiled
image without making an extra untiled memcpy for the gallium transfer.
Improves 1024x1024 TexImage perf by ~19%, mostly from not thrashing around
in the kernel mapping and unmapping the transfer's temporary area.
|
| |
|
|
|
|
|
|
|
| |
They're raster order anyway, so we'd assertion fail along with wasting
bandwidth.
Fixes: 6ad9e8690d14 ("v3d: Add support for texturing from linear.")
|
|
|
|
|
|
|
|
|
|
| |
We're waiting for the jobs-completed count to increment (with wrapping),
not to reach its starting state. This mostly ended up working out because
the next v3d_hw_tick() for a submit CL would end up doing the TFU
operation first, but it did fail when a blit was used for glReadPixels()
at the end of a test.
Fixes: ee0549ff9ab3 ("v3d: Add the V3D TFU submit interface to the simulator.")
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the UAPI, the first BO is the destination, and the one the kernel
should do an exclusive reservation on. Currently we only do exclusive
reservations, anyway. However, in the simulator path I was only copying
back the "destination" BO (actually src in this case), and this caused
regressions once I fixed the simulator to actually complete TFU before
returning (since otherwise, the TFU op would happen at the start of the
next CL submit and the draw would get the right contents).
Fixes: 976ea90bdca2 ("v3d: Add support for using the TFU to do some blits.")
|
|
|
|
|
|
| |
We can remove some duplicated code.
Reviewed-by: Elie Tournier <[email protected]>
|
|
|
|
|
|
| |
A resource is just a buffer with some metadata.
Reviewed-by: Elie Tournier <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we ignored the the glUnmap(..) operation and
flushed before we flush the cbuf. Now, let's just flush
the data when we unmap.
Neither method is optimal, for example:
glMapBufferRange(.., 0, 100, GL_MAP_FLUSH_EXPLICIT_BIT)
glFlushMappedBufferRange(.., 25, 30)
glFlushMappedBufferRange(.., 65, 70)
We'll end up flushing 25 --> 70. Maybe we can fix this later.
v2: Add fixme comment in the code (Elie)
Reviewed-by: Elie Tournier <[email protected]>
|
|
|
|
|
|
| |
We can reuse the helpers we created.
Reviewed-by: Elie Tournier <[email protected]>
|