| Commit message | Author | Age | Files | Lines |
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
This logic can be rewritten as two cases: 3D (i.e. before and after
the miplevel sizes start reducing) versus everything else. I think it is
easier to read this way.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
We really only need this logic in one place.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Whenever llvm removes an intrinsic that we're using, we hit segfaults
because llvm emits calls to address 0 in the jitted code instead.
However, Jose figured out we can actually detect this with
LLVMGetIntrinsicID(), so use this to abort, so we don't have to wonder
what got broken. (Of course, someone still needs to fix the code to
no longer use this intrinsic.)
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This intrinsic disappeared with llvm 6.0; using it ends up in segfaults
(due to llvm issuing a call to a NULL address in the jitted shaders).
Add code doing the same thing as the autoupgrade code in llvm so it
can be matched and replaced back with a pavgb.
While here, also improve lp_test_format, so it tests both with and without
cache (as it was, it tested the cache versions only, whereas cache is
actually disabled in llvmpipe, and in any case even with it enabled
vertex and geometry shaders wouldn't use it). (Although at least for
the unorm8 uncached fetch, the code is still quite different to what
llvmpipe is using, since that would use unorm8x16 type, whereas
the test code is using unorm8x4 type, hence disabling some intrinsic
paths.)
Fixes: 6f4083143bb8 ("gallivm: use llvm jit code for decoding s3tc")
Reviewed-by: Jose Fonseca <[email protected]>
Tested-by: Michel Dänzer <[email protected]>
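For reference, a minimal scalar sketch of what pavgb computes per byte lane: the unsigned average of two bytes, rounded up. The helper names below are hypothetical and are not taken from gallivm; the widen/add/shift/truncate shape is only my understanding of the pattern that gets matched back to a pavgb, not a copy of the actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One lane of PAVGB: (a + b + 1) >> 1, computed in a wider type so the
 * carry out of bit 7 is not lost before the shift. */
static uint8_t
avg_round_u8(uint8_t a, uint8_t b)
{
   return (uint8_t)(((uint16_t)a + (uint16_t)b + 1u) >> 1);
}

/* Apply the lane operation across a 16-byte vector (hypothetical helper). */
static void
avg_round_u8x16(const uint8_t *a, const uint8_t *b, uint8_t *dst)
{
   assert(a != NULL && b != NULL && dst != NULL);
   for (int i = 0; i < 16; i++)
      dst[i] = avg_round_u8(a[i], b[i]);
}
```

The wider intermediate is the important part: adding two bytes in 8 bits would drop the carry and give the wrong average for large inputs.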
|
|
|
|
|
|
|
|
| |
The library is called libgalliumvl_stub - note singular.
Fixes: 42ea0631f10 ("meson: build clover")
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
|
|
|
|
|
| |
This is the right channel to report these things, so that end-users don't
need to know each driver's custom debug options.
|
|
|
|
|
|
|
| |
This lets the driver use pipe_debug_message() for GL_ARB_debug_output.
Signed-off-by: Rhys Kidd <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
This is the right channel to report these things, so that end-users don't
need to know each driver's custom debug options.
|
|
|
|
|
|
|
| |
This lets the driver use pipe_debug_message() for GL_ARB_debug_output.
Signed-off-by: Rhys Kidd <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
The shadow state is now in the sampler.
|
|
|
|
|
|
| |
i915 render nodes refuse the dumb ioctls, so the simulator would crash on
the original non-apitrace shader-db. Replace them with direct i915 calls
if we detect that we're on one of their gem fds.
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is (much) faster than using the util fallback.
(Note that there are two methods here: one would use a cache, similar to
the existing code (although the cache was disabled), except the block
decode is done with jit code; the other directly decodes the required
pixels. For now, don't use the cache (being direct-mapped is suboptimal,
but it's difficult to come up with something better which doesn't have
too much overhead).)
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
| |
This calls the expensive uif offset function once per utile, but it still
gets us a 212.218% +/- 2.41216% (n=10) win on 1024x1024 glTexImage over
calling it on each pixel.
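A rough sketch of the per-utile idea, not the driver's actual code. Everything here is an assumption made for illustration: the 8x8 one-byte-per-pixel tile size, the tile_base_offset() stand-in for the expensive offset function, and the simple row-major tile ordering. The point is only that the address computation runs once per tile, and the pixels inside the tile are then copied row by row.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tile geometry: 8x8 pixels, 1 byte per pixel. */
enum { TILE_W = 8, TILE_H = 8 };

/* Counts address computations so the saving is observable. */
static unsigned offset_calls;

/* Stand-in for an expensive address function (assumed layout: tiles stored
 * in row-major order, each tile's pixels contiguous). */
static size_t
tile_base_offset(unsigned tx, unsigned ty, unsigned tiles_per_row)
{
   offset_calls++;
   return ((size_t)ty * tiles_per_row + tx) * TILE_W * TILE_H;
}

/* Copy a linear image into the tiled layout, one offset call per tile. */
static void
store_tiled(uint8_t *dst, const uint8_t *src, unsigned width, unsigned height)
{
   assert(width % TILE_W == 0 && height % TILE_H == 0);
   unsigned tiles_per_row = width / TILE_W;

   for (unsigned ty = 0; ty < height / TILE_H; ty++) {
      for (unsigned tx = 0; tx < tiles_per_row; tx++) {
         uint8_t *tile = dst + tile_base_offset(tx, ty, tiles_per_row);
         for (unsigned row = 0; row < TILE_H; row++) {
            const uint8_t *line =
               src + (size_t)(ty * TILE_H + row) * width + tx * TILE_W;
            memcpy(tile + row * TILE_W, line, TILE_W);
         }
      }
   }
}
```

For a 16x16 image this makes 4 offset calls instead of 256, which is where the saving over a per-pixel call comes from.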
|
|
|
|
|
| |
These implementations of whole-utile loads/stores would be the same for
v3d, though the layout of blocks of utiles has changed.
|
|
|
|
|
|
|
| |
This lets us store the non-PBO glTexImage data directly into the tiled
image without making an extra untiled memcpy for the gallium transfer.
Improves 1024x1024 TexImage perf by ~19%, mostly from not thrashing around
in the kernel mapping and unmapping the transfer's temporary area.
|
| |
|
|
|
|
|
|
|
| |
They're in raster order anyway, so we'd hit an assertion failure as well as
waste bandwidth.
Fixes: 6ad9e8690d14 ("v3d: Add support for texturing from linear.")
|
|
|
|
|
|
|
|
|
|
| |
We're waiting for the jobs-completed count to increment (with wrapping),
not to reach its starting state. This mostly ended up working out because
the next v3d_hw_tick() for a submit CL would end up doing the TFU
operation first, but it did fail when a blit was used for glReadPixels()
at the end of a test.
Fixes: ee0549ff9ab3 ("v3d: Add the V3D TFU submit interface to the simulator.")
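The wait condition described above can be sketched as a small predicate. This is a simplified model under stated assumptions, not the simulator code: the counter width (JOBS_COUNT_MASK) and the function name are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed counter width, for illustration only. */
#define JOBS_COUNT_MASK 0x7fu

/* True once the counter has advanced by one job since submit, including the
 * case where it wrapped from JOBS_COUNT_MASK back to 0. Comparing against
 * the starting value instead would be satisfied immediately. */
static bool
job_completed(uint32_t count_at_submit, uint32_t count_now)
{
   assert(count_at_submit <= JOBS_COUNT_MASK);
   return (count_now & JOBS_COUNT_MASK) ==
          ((count_at_submit + 1u) & JOBS_COUNT_MASK);
}
```

A caller would sample the counter before submitting, then keep ticking the hardware model until job_completed() returns true.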
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the UAPI, the first BO is the destination, and the one the kernel
should do an exclusive reservation on. Currently we only do exclusive
reservations, anyway. However, in the simulator path I was only copying
back the "destination" BO (actually src in this case), and this caused
regressions once I fixed the simulator to actually complete TFU before
returning (since otherwise, the TFU op would happen at the start of the
next CL submit and the draw would get the right contents).
Fixes: 976ea90bdca2 ("v3d: Add support for using the TFU to do some blits.")
|
|
|
|
|
|
|
|
|
| |
Fixes build failure if the LLVM headers aren't in a standard include
directory.
Fixes: ec22dd34c88f ("radeonsi: move SI_FORCE_FAMILY functionality to winsys")
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
We can remove some duplicated code.
Reviewed-by: Elie Tournier <[email protected]>
|
|
|
|
|
|
| |
A resource is just a buffer with some metadata.
Reviewed-by: Elie Tournier <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we ignored the glUnmap(..) operation and
flushed before we flush the cbuf. Now, let's just flush
the data when we unmap.
Neither method is optimal, for example:
glMapBufferRange(.., 0, 100, GL_MAP_FLUSH_EXPLICIT_BIT)
glFlushMappedBufferRange(.., 25, 30)
glFlushMappedBufferRange(.., 65, 70)
We'll end up flushing 25 --> 70. Maybe we can fix this later.
v2: Add fixme comment in the code (Elie)
Reviewed-by: Elie Tournier <[email protected]>
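To illustrate why the example above ends up flushing 25 --> 70: if explicit flushes are accumulated into a single bounding range, the gap between them is included as well. The sketch below is a hypothetical model, not the driver's code. The struct and function names are made up, and the numbers are treated as [start, end) bounds, matching the informal notation in the message.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical accumulator for the region that must be flushed on unmap. */
struct flush_range {
   unsigned start;
   unsigned end;
   bool valid;
};

/* Grow the accumulated range so that it also covers [start, end). */
static void
flush_range_add(struct flush_range *r, unsigned start, unsigned end)
{
   assert(start <= end);
   if (!r->valid) {
      r->start = start;
      r->end = end;
      r->valid = true;
      return;
   }
   if (start < r->start)
      r->start = start;
   if (end > r->end)
      r->end = end;
}
```

Adding 25..30 and then 65..70 leaves the accumulator at 25..70, so the untouched bytes in between get flushed too. Tracking a list of disjoint ranges would avoid that at the cost of more bookkeeping.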
|
|
|
|
|
|
| |
We can reuse the helpers we created.
Reviewed-by: Elie Tournier <[email protected]>
|
|
|
|
|
|
|
| |
util_format_get_blocksize returns 1 for R8 formats (all
PIPE_BUFFERs are R8).
Reviewed-by: Elie Tournier <[email protected]>
|
|
|
|
|
|
|
|
| |
We could allocate and destroy transfers in one place.
v2: Keep l_stride around.
Reviewed-by: Elie Tournier <[email protected]>
|
|
|
|
| |
Reviewed-by: Elie Tournier <[email protected]>
|
|
|
|
|
|
| |
Will be reused.
Reviewed-by: Elie Tournier <[email protected]>
|
|
|
|
|
|
| |
Will be reused.
Reviewed-by: Elie Tournier <[email protected]>
|
|
|
|
|
|
| |
Will be reused.
Reviewed-by: Elie Tournier <[email protected]>
|
|
|
|
|
|
|
|
|
| |
With commit 89b479, we moved to tracking buffer cleanliness
when binding.
TEST=dEQP-GLES31.functional.image_load_store.buffer.load_store.r32ui
Reviewed-by: Elie Tournier <[email protected]>
|
|
|
|
|
|
| |
It's used for all types of resources.
Reviewed-by: Elie Tournier <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Remove a level of indirection to make the code more explicit -- should
make it easier to follow what's going on.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This is a move towards using composition instead of inheritance for
different query types.
This change weakens out-of-memory error reporting somewhat, though this
should be acceptable since we didn't consistently report such errors in
the first place.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
Other callers of si_set_constant_buffer don't need it.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
Reduce the number of places that encode buffer descriptors.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
This is rather important for merged VS/TCS as LSHS shaders...
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
There is never a read-after-write hazard because the command doesn't read.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
Prepare for some later refactoring.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This helps some debugging cases by initializing addrlib with
slightly more appropriate settings.
Reviewed-by: Marek Olšák <[email protected]>
|