| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Even though luminance formats don't have alpha, we still want the alpha
output to go to the blender. This fixes the luminance blending tests.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0" <[email protected]>
(cherry picked from commit 545a3cbb011e0e7722c2accb330c0994aea5cc38)
|
|
|
|
|
|
|
| |
Fixes glamor, which wants to use R8 integer textures.
Signed-off-by: Rob Clark <[email protected]>
(cherry picked from commit 000e225360c020e8b3de142c4c898baad321d242)
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
(cherry picked from commit afb6c24a207fe7b9917644b940e4c5d1870c5c92)
|
|
|
|
|
|
|
|
|
|
| |
The hardware is capable of dealing with GL1-style user clip planes.
No clip vertex, no clip distances. Fixes a number of ucp tests, as well
as neverball.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0" <[email protected]>
(cherry picked from commit 58e24b4761ec8c348bf6825c2355a6e047599306)
|
|
|
|
|
|
|
|
|
| |
Since i965 is now using make_reg_conflicts_transitive and doesn't need
q-value computations, they are disabled on i965. They are enabled
everywhere else so that they get the old behavior. This reduces the time
spent in eglInitialize() on BDW by around 10-15%.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
To properly support the case of waiting on a fence with a 0 timeout, we
still need to call down to the kernel. Which requires the use of the
new fd_pipe_wait_timeout() API.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Don't take current timestamp/fence from current ring, as we might have
already rolled over to new rb.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
This only appears in cubemaps which have have packed layers, so are very
sensitive to any layout disagreement between sw and hw.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
A few other drivers do this, fixes the gl-1.4-polygon-offset piglit test
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
The default is to enable seamless cubemap filtering, but there's a bit
to turn it off.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
Also fixes mipmap level generation for srgb textures.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
a4xx can do both float and half-float, while a3xx can only do half-float
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74329
v2: add a CAP for half floats
drivers should not expose the CAPs if they don't support the formats
v3: update relnotes
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Basic texture buffer support. Should be straightforward to add first/
last_element support. And with a bit of work in ir3 emulate larger
texture buffer sizes. But this seems to be enough for stk gl31 render
paths.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This shows up with a glamor shader, which does a TXF and uses the result
for conditional kill. Before we wouldn't group the fanin (collect)
neighbors which need to be allocated adjacently at RA, resulting in
badness.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
When debugging compiler, this is useful to see.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
a4xx needs similar treatment as 995f55a6
Also fixup a few point-size and vpsrepl issues and drop fix_blit_fp()
hack previously needed for mem2gmem.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Move a few things around to group stuff that is common to a3xx/a4xx
together. Also, introduce is_ir3() for things that are more specific to
the compiler / shader-ISA than to the gpu generation.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
For gmem restore (mem2gmem), we swap blit programs, in order to have a
different frag shader for depth vs color restore. But we weren't
actually clearing the cached fp, so it would not actually change the
frag shader as expected.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
For gmem restore (mem2gmem), we swap blit programs, in order to have a
different frag shader for depth vs color restore. But we weren't
actually clearing the cached fp, so it would not actually change the
frag shader as expected.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
needed for MRT
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Both a3xx and a4xx need the same logic to decide if half-precision can
be used for blit shaders. So move it to core and simplify things a bit
with a helper that considers all render targets.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Collapse dirty/reading bools into status bitmask (and drop writing which
should really be the same as dirty). And use 'used_resources' list for
all tracking, including zsbuf/cbufs, rather than special casing the
color and depth/stencil buffers.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Should be in units of components, not vec4's
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
We hard-coded 4 or 8 as the max in various places. Switch it all to a
define since the limit will go up with a4xx (and maybe even again in the
future?)
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we had a fixed array to track kills, since they don't
generate an SSA value, and then cheated by stuffing them in the
outputs array before sending things through depth/sched/etc. But
store instructions will need similar treatment. So convert this
over to a more general array of instructions that must be kept
and fix up the places that were previously relying on kills being
in the output array.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
For store instructions, the "dst" register is a read register, not a
written register. (Ie. it is the address to store to.) Lets not
confuse register allocation, scheduling, etc, with these details.
Instead just leave a dummy instr->regs[0], and take "dst" from
instr->regs[1] and srcs following.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Add 'enum ir3_driver_param' to track driver-param slots, and a
create_driver_param() helper to avoid having the knowledge about
where driver params are placed in const regs spread throughout
the code as we add additional driver-params.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
With stream-out (transform-feedback) we have the case where resources
are *written* by the gpu, which needs basically the same tracking to
figure out when rendering must be flushed.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
This will be used for stream-out (transform-feedback)
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
A bit hard-coded configuration at the moment, but sufficient for now.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Details of the cmdstream packets are different between a3xx and a4xx,
but the logic about the layout of const registers is the same, as that
is dictated by the ir3 shader compiler. So rather than duplicating
logic that is tightly coupled to ir3 between a3xx and a4xx, move this
into ir3 and use per-generation callbacks for to build the cmdstream
packets.
This should make it easier to pass additional const regs (such as for
transform feedback). And it also keeps the layout internal to ir3 in
case we want to make the layout more dynamic some day.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since for transform-feedback, we'll need more than just the TGSI
tokens from the state object, just pass the entire state object to
ir3_shader_create(). This also cleans things up a bit for some
day in the future when we could take shader either as TGSI or
directly NIR (for ex, glsl2nir or spirv2nir paths). In the same
spirit, drop extra args from ir3_compile_shader_nir() (since it
can anyways get what it needs from the ir3_shader_variant).
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Sync updated cat6 encoding from freedreno.git, needed to properly encode
store instructions.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Generated by running:
git grep -l INLINE src/gallium/ | xargs sed -i 's/\bINLINE\b/inline/g'
git grep -l INLINE src/mesa/state_tracker/ | xargs sed -i 's/\bINLINE\b/inline/g'
git checkout src/gallium/state_trackers/clover/Doxyfile
and manual edits to
src/gallium/include/pipe/p_compiler.h
src/gallium/README.portability
to remove mentions of the inline define.
Signed-off-by: Ilia Mirkin <[email protected]>
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|