| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
As documented in:
https://www.opengl.org/registry/specs/ARB/compute_shader.txt
const uvec3 gl_WorkGroupSize;
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Anonymous structures are only supported with newer versions of
GCC. They will not work with GCC 4.2.1 used by OpenBSD or
GCC 4.4.7 shipped with RHEL6 going by a commit to fix a similiar
problem in radeonsi earlier in the year
(74388dd24bc7fdb9e62ec18096163f5426e03fbf).
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Jonathan Gray <[email protected]>
|
|
|
|
|
|
|
|
| |
We need parenthesis around the expression which computes the number of
blocks per row.
Reviewed-by: Matt Turner <[email protected]>
Cc: "10.3 10.4" <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Was not called from any other place.
Reviewed-by: Kenneth Graunke <[email protected]>
|
| |
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
The pipe-loader code wasn't finding util/u_atomic.h
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
They are defined by windows.h, which got included slightly more
frequently than before with u_atomic.h
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The i965 backends pass something out of 'screen', which is allocated
per-process, making using this as a ralloc context not thread-safe.
All callers ra_alloc_interference_graph() already ralloc_free() its
return value.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was totally broken:
- p_atomic_dec_zero() was returning the negation of the expected value
- p_atomic_inc_return()/p_atomic_dec_return() was
post-incrementing/decrementing, hence returning the old value instead
of the new
- p_atomic_cmpxchg() was returning the new value on success, instead of
the old
It is clear this never used in the past. I wonder if it wouldn't be better to
yank it altogether.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was much easier for me to verify things build and run as expected
with this simple test, than building and testing whole Mesa.
With scons the test can be build and run merely by doing:
scons u_atomic_test
Building the test with autotools is left as a future exercise.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
like how C11's stdatomic.h provides generic functions. GCC's __sync_*
builtins already take a variety of types, so that's simple.
MSVC and Sun Studio don't, but we can implement it with something that
looks a little crazy but is actually quite readable.
Thanks to Jose for some MSVC fixes!
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
| |
GCC >= 4.1 support the __sync_* intrinsics. That seems like a
sufficiently old baseline.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
| |
There was already an intrinsics path that implemented all of the same
functions, plus more.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
| |
To be shared outside of Gallium.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This doesn't reschedule much currently, just tries to fit things into the
regfile A/B write-versus-read slots (the cause of the improvements in
shader-db), and hide texture fetch latency by scheduling setup early and
results collection late (haven't performance tested it). This
infrastructure will be important for doing instruction pairing, though.
shader-db2 results:
total instructions in shared programs: 61874 -> 59583 (-3.70%)
instructions in affected programs: 50677 -> 48386 (-4.52%)
|
|
|
|
| |
This is actually implicitly handled by the TLB operations.
|
|
|
|
|
| |
Prevents a regression with QPU scheduling, which happens to put the no-op
reads for unused VPM contents end up at the end of the program.
|
|
|
|
|
|
|
| |
We're supposed to be checking that nothing else writes r4, which is done
by the TMU result collection signal, not the coordinate setup.
Avoids a regression when QPU instruction scheduling is introduced.
|
|
|
|
| |
This was caught by an assertion in the simulator.
|
|
|
|
|
|
|
|
|
| |
Otherwise vertex shader can see stale cache data. This in particular
happens when the same vbo is updated and reused. Not sure yet if vbo's
at differing addresses but bound to same vertex buffer slot could have
issues, but seems safest to flush whenever new vertex buffers are bound.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
For drivers building up to GL(ES)3, only expose the actual extension if
the API will let it be used (e.g. via overrides/debug flags that enable
higher versions).
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
The mesa state tracker doesn't fall back on similar integer formats, so
they must all be provided. Remove the restriction against integer color
rendering.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
We need to produce a u32 destination type on integer sampling
instructions, so keep that in a shader key set based on the
currently-bound textures.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
Integer outputs end up getting mangled due to cov.f32f16, and float32
loses precision. Use full precision shaders in both of those cases.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
Also add support for the BLENDABLE bind flag, similarly predicated on
non-int formats.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
Just pass the data through unmolested. This probably has no effect since
blending isn't actually enabled.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
Leaving it around in the struct in case we want to use it later.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
Looks like none of the mad variants do u16 * u16 + u32, so just add in
the extra value "by hand".
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "10.3 10.4" <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
This lets us move emitting SP_FS_MRT_REG back to fd4_program_emit.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The table contains all the relevant information about each format. The
helper functions now just do lookups in the table.
Note that this adds support for a lot of formats that were previously
unsupported. Additionally it adds disabled support for integer render
buffers, which will require more work to actually enable.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Switch both of them from independently inconsistent conventions to having
UINT/SINT/UNORM/SNORM/FLOAT/FIXED suffixes.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
All the "util" helpers are actually format-related
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
This fixes arb_color_buffer_float-render GL_RGBA16F.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "10.3 10.4" <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
BRW_CACHE_VS_PROG is more easily associated with program caches than
plain BRW_VS_PROG.
While we're at it, rename BRW_WM_PROG to BRW_CACHE_FS_PROG, to move away
from the outdated Windowizer/Masker name.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|