| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
When there is a colormask active that does not cover all the channels,
enable reading in the destination like with a combining blend
operation. This fixes fbo-blending-formats on a3xx.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Enables ARB_depth_buffer_float. There is no sampling support for
interleaved Z32F_S8, so we store the two textures separately, one as
Z32F, the other as S8. As a result, we need a lot of additional logic
for restores and transfers.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
32-bit depth buffers are stored as unorm, and thus need special handling
when moving to and from gmem. They are copied into gmem by writing
depth, and resolved from gmem using a special resolve bit which
apparently float-ifies the data.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The bit to enable .z is still commented out, as it is triggering gpu
hangs in 0ad. But at least gl_FragCoord.w works now, and we know what
bits we are *supposed* to set for .z (with that uncommented all piglit
fragcoord tests are passing).
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
This was the missing bit to get dolphin-emu working on a4xx.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Similar to a3xx, the compiler needs to know the return type of the sam,
etc, instructions.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Update formats table with new formats that Ilia has figured out, and fix
sampling from srgb texture and integer vbo's.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
... to manage the LIBDRM*_CFLAGS. The former is the recommended approach
by the Android build system developers while the latter has been
depreciated for quite some time.
Cc: "10.4 10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
isaml needs to scale up coords based on LoD. Also fix bogus bary.f
varying # when there are non-bary frag shader inputs. And use sub.s of
a positive immediate rather than add.s of negative (since CP is better
about figuring out that those can be collapsed into the cat2 instr).
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
For now, completely flatten if/else blocks. That will almost certainly
change once we have flow control.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
We'll also want it in NIR f/e for implementing UBO support.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Basically just sync up the cmdstream emit parts to match the changes
already done on a3xx.
Also, fix scheduling for mem instructions. This is needed on a4xx, and
I am a bit surprised it isn't needed for a3xx.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
These correspond to the tgsi TXQ opcode
(plus sneak in a fix for two-sided color)
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
We'll need these in one or two other spots.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Just build up arrays for src0/src1, and use create_collect()..
Also add back missing .3d flag for 3d/cube textures.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
I noticed some cases where we where trying to copy-propagate indirect
src's into places they cannot go, like 2nd src for cat3 (mad, etc).
Expand out valid_flags() to be aware of relativ flag, and fix up a few
related spots.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we get in a scenario where we cannot schedule any more instructions
due to address register conflict, clone the instruction that writes the
address register, and switch the remaining unscheduled users for the
current address register over to the new clone.
This is simpler and more robust than the previous attempt (which tried
and sometimes failed to ensure all other dependencies of users of the
address register were scheduled first).. hint it would try to schedule
instructions that were not actually needed for any output value.
We probably need to do the same with predicate register, although so far
it isn't so heavily used so we aren't running into problems with it
(yet).
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
A bit fugly.. try and make this cleaner.. note if we hoist all the
get_addr() out of the loop we can drop the hashtable and just use
create_addr()..
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
It probably *should* be an assert, but for now TGSI f/e isn't very good
about dealing w/ CONST vs ABS/NEG. So for debug builds, print a warning
instead of crashing with an assert for now.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Without this, a3xx breaks.. a4xx would too if it had already implemented
support for passing driver params.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For a normal MAD (ie. not MADSH), if first source is gpr and second
source is const, we can swap the first two sources to avoid needing a
mov instruction.
This gives back the biggest advantage TGSI f/e had over NIR f/e for
common shaders, since TGSI f/e had this logic in the f/e. Note that
doing this in copy-prop step has the advantage that it will also work
for cases like:
MOV TEMP[b], CONST[x]
MAD TEMP[d], TEMP[a], TEMP[b], TEMP[c]
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The NIR compiler frontend is an alternative to the TGSI f/e, producing
the same ir3 IR and using the same backend passes for scheduling, etc.
It is not enabled by default yet, as there are still some regressions.
To enable, use 'FD_MESA_DEBUG=nir'. It is enough to use with, for
example, xonotic or supertuxkart.
With the NIR f/e, scalarizing and a number of other lowering steps
happen in NIR, so we don't have to do them in ir3. Which simplifies the
f/e and allows the lowered instructions to pass through other
optimization stages.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
Use the correct sprite replacement depending on the flip of the coord
mode, using either T or 1-T depending on whether we have an upper-left or
lower-left coordinate origin. This fixes all the point sprite piglits.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
Copies nouveau_buffer and radeon_buffer. This allows a write to proceed
to an uninitialized part of a buffer even when the GPU is using the
previously-initialized portions.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
Waiting on a bo being ready is handled in fd_bo_cpu_prep. No need to
keep separate timestamps around.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
A resource flush is an upload of a hypothetically-staging texture to the
GPU. For a UMA system, this will largely be a no-op or
cache-maintenance. Move the render flush logic into transfer_map where
it belongs, and clear out the transfer_flush function.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
pipe_sampler_view already contains a texture, remove the redundant
tex_resource member which pointed at the same thing.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
Fallback to picking based on semantic name.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Since NIR f/e currently encodes immediates in instructions (rather than
passing via const), we need to ensure that when const's are used the get
initialized to the proper values. Otherwise comparing NIR to TGSI
compiler, it will use proper immediate values in one case, and randomly
initialize values in the other. Which confuses ir3test.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Since we dropped the old compiler, we don't need this hack anymore.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Be smarter about propagating copies from const or immed, or with abs/neg
modifiers. Also, realize that absneg.s and absneg.f are really "fancy"
mov instructions.
This opens up the possibility to remove more copies. It helps the TGSI
frontend a bit, but will be really needed for the NIR f/e which builds
everything up in SSA form (ie. will *always* insert a mov from const or
immediate).
Signed-off-by: Rob Clark <[email protected]>
|