| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Signed-off-by: Jakob Sinclair <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The home-grown heap scheme (which is ultra-simple but probably not good
to always allocate and memset such a chunk of memory up front) was a
remnant of fdre (where the ir originally came from). But since we have
ralloc in mesa, lets just use that instead.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Normally this would never happen (constant-propagation in NIR would
eliminate the instruction), except it does happen for 'undef' which
we turn into immed 0.0 for bookkeeping purposes.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
See a7eb12d0.. but that wasn't restrictive enough. Fixes
dEQP-GLES3.functional.rasterization.primitives.line_strip_wide, and
similar
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We seem to need range reduction to get sane results. Fixes glmark2
jellyfish bench, and a whole bunch of
dEQP-GLES3.functional.shaders.builtin_functions.precision.{sin,cos,tan}.*
v2: squashed in android build fixes from Rob Herring
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Currently we were two restrictive, and would insert an output move in
cases like: MOV OUT[0], IN[0].xyzw
Loosen the restriction to allow the current instruction to appear in the
neighbor list but only at it's current possition.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Normally the offset in the group would be the same, but not always. For
example, in a sam(w) which only writes the 4th component.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This *seems* like a hw bug, and maybe only applies to certain a4xx
variants/revisions. But setting the SRGB bit in sampler view state
(texconst0) causes invalid alpha for ASTC textures. Work around this
setting up a second texture state and using that to sample alpha
separately.
This way, srgb->linear conversion happens in hw *prior* to
interpolation.
This fixes 546 dEQP tests: dEQP-GLES3.functional.texture.*astc*srgb*
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Better workaround in the following patch.
This reverts commit 899bd63acefd49a668e11c42d2ad92fa55aa157d.
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Fixes a whole bunch of dEQP-GLES3.functional.fragment_ops.random.* (now
they all pass)
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Pull in RB_BLEND_* fixes.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Even when begin_query succeeds, there can still be failures in query handling.
For example for radeon, additional buffers may have to be allocated when
queries span multiple command buffers.
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
Use PIPE_SWIZZLE_* everywhere.
Use X/Y/Z/W/0/1 instead of RED, GREEN, BLUE, ALPHA, ZERO, ONE.
The new enum is called pipe_swizzle.
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This *seems* like a hw bug, and maybe only applies to certain a4xx
variants/revisions. But setting the SRGB bit in sampler view state
(texconst0) causes invalid alpha for ASTC textures. Work around this
by doing the srgb->linear conversion in the shader instead.
This fixes 392 dEQP tests: dEQP-GLES3.functional.texture.*astc*srgb*
(The remaining fails seem to be a bug w/ ASTC + linear filtering, also
possibly a420.0 specific.)
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
The separate FS/VS entrypoints are no longer used since a3ed98f. So
just inline them.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we have something like:
MOV OUT[n], IN[m].wzyx
the existing grouping code was missing a potential conflict. Due to
input needing to be sequential scalar regs, we have:
IN: x <-> y <-> z <-> w
which would be grouped to:
OUT: w <-> z2 <-> y2 <-> x (where the 2 denotes a copy/mov)
but that can't actually work. We need to realize that x and w are
already in the same chain, not just that they aren't both already in
new chain being built.
With this fixed, we probably no longer need the hack from f68f6c0.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
The old version of the pass only worked on globals and locals and always
left inputs, outputs, uniforms, etc. alone.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 62fa868728c729152af0d7cecd1d3e47e831cb7d.
dEQP-GLES3.functional.occlusion_query.* was unhappy about that change.
Still not really sure *what* the other slots in the sample results
buffer are.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This one is slightly annoying, since trying to write RBRC from draw
would clobber values set in the tiling/gmem code. We could do command-
stream patching for RBRC, as is done on a3xx. Although since it seems
to be a rarely used feature, it is easier just to do RMW to set/clear
the bit.
Fixes dEQP-GLES3.functional.rasterizer_discard.basic.write_depth_triangles
and related tests.
a3xx still needs the same feature, although there it probably makes more
sense to take advantage of the existing cmdstream patching which is
required for RBRC for other reasons.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Seems like a4xx needs offset added to array index for all arrays,
whereas a3xx only for cubemap arrays. Fixes a whole swath of dEQP fails
(roughly *sampler2darray*).
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
We need to increment offset by # of vertices, not by # of prims. Fixes
a bunch of dEQP fails involving prims other than points. For example,
dEQP-GLES3.functional.transform_feedback.position.lines_separate
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If changed && append, we shouldn't be resetting the internal offset back
to zero. This fixes issues w/ sequences like:
glBeginTransformFeedback()
glDraw()
glPauseTransformFeedback()
glDraw()
glResumeTransformFeedback()
glDraw()
glEndTransformFeedback()
Fixes dEQP-GLES3.functional.transform_feedback.array.separate.points.lowp_vec3
and related tests.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
This should only count when TF is not paused.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
dEQP noticed that we were advertising completely bogus values. The
actual maximum is 127.0f.
*But* we have to use an artifically low maximum to work around a bug
in the dEQP test, which gets confused when the max line width is too
large and lines start going off-screen.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
There are still some edge cases which result in a neighbor-loop. Which
needs to be fixed, but this hack at least makes deqp tests finish.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Fixes a bunch of flat-varying fail on a4xx (where we need to use ldlv to
read the un-interpolated varying).
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since we cannot mov into a predicate register, the frontend uses a
'cmps.s p0.x, cond, 0' as a stand-in for mov to p0.x. It does this
since it has no way to know that the source cond instruction (ie.
for a kill, br, etc) will only be used to write the predicate reg.
Detect this, and re-write the instruction writing p0.x to skip the
original cmps.[sfu]. (It is done like this, rather than re-writing
the dest of the first cmps.[sfu] in case the first cmps.[sfu]
actually has other users.)
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add PIPE_CAP to determine if the GL extension
'GL_ARB_framebuffer_no_attachments' shall be
supported.
The driver is required to support 'PIPE_FORMAT_NONE'
via its 'is_format_supported()' callback in order
to determine the MSAA modes the hardware supports so
that values requested from the application using
'GL_ARB_framebuffer_no_attachments' may be quantized
to what the hardware expects.
V.2:
Fix doc for a more detailed description of the PIPE_CAP
and the corresponding GL constant.
V.3:
Renamed and repurposed once again.
V.4:
Remove CAP from cap_mapping array.
[airlied: fix damaged whitespace]
Signed-off-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We had an implicit assumption that the phi src was assigned in it's
source (pred) block leading into the phi. But this is not true with
NIR, so we can't just ignore the source block specified in the
nir_phi_src. Insert an extra mov in the source block. If it is not
required the CP pass will take it back out again.
Fixes:
./tests/spec/glsl-1.10/execution/vs-call-in-nested-loop.shader_test
./tests/spec/glsl-1.10/execution/vs-inner-loop-modifies-outer-loop-var.shader_test
and probably others.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The frontend inserts (abs) and (neg)'s to convert between NIR boolean
(~0/0) and native boolean (1/0). So we'd end up with things like:
cmps.s.ge r1.x, ...
absneg.s r1.x, (neg)r1.x
absneg.s r1.x, (abs)r1.x
sel.b32 r2.x, r0.x, r1.x, r0.y
The (neg) already gets collapsed due to the following (abs). Now by
realizing that r1.x comes from a cmps.s instruction, we can drop the
(abs) as well.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise we end up with funny things like:
mov.f32f32 r0.x, r1.y
mov.f32f32 r0.x, r1.y
(It doesn't happen as much after fixing the problem w/ CP into phi src,
but it can still happen since we aren't too clever about generating phi
sources in the first place.)
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
We want to consider all the vars, not 1/32nd of them, when extending
live-ranges.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
The block defining a phi source might not have been executed. If we
allow copy propagation, we could end up pointing to a src instruction in
the wrong block.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes some transform-feedback piglits, like:
bin/ext_transform_feedback-nonflat-integral
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Turned out to be useful to debug an issue in RA. Let's keep it.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
No longer used, so drop the extra arg to ir3_instr_create()
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Been on my TODO list for a while. If nothing else this will make gdb
properly grok the opc_t enum.
This first step preserves ir3_instruction::category (with an added
assert that category matches what is encoded in opc_t). Next step is
to drop the category field (and arg to ir3_instr_create()), but that
is split into next commit for bisectability and so that we can run
piglit in the intermediate state to flush out any problems.
Signed-off-by: Rob Clark <[email protected]>
|