| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
v2: reword comment about lower_helper_invocations to be more clear
that it might not work on all hardware
v3: add special variant of load_sample_id which does not imply per-
sample shading
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Annoyingly we still have to briefly drop the lock to unref resources..
but push the lock down into __fd_batch_destroy() so we can invalidate
the batch and reset resources before dropping the lock.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Re-allocate rather than re-use. Originally we had an unnecessarily
complex design to avoid re-allocating cmdstream buffers. But now that
support for "growable" cmdstream buffers has been in place for a couple
years, I guess we can care a bit less about the extra overhead on older
kernels.
But making the batches one-shot removes a class of potential race
conditions vs the flush_queue.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Instead of the reading batch setting a dependency on the writing batch,
simply flush the writing batch immediately. This avoids situations
where we have to flush the context's current batch later.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
This was basically to avoid a zero-dword IB (indirect-branch), but
instead just don't emit the IB packet in that case.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
pipe_framebuffer_state can have samples=0 in various cases, which is
actually the same thing as samples=1. So use the _get_num_samples()
helper to populate the key, to avoid this looking like two distinct
fb states to the cache.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
It is possible for a batch to be freed under our feet when flushing, so
it is best to hold a reference to all of them up-front.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
In Python 2, `print` was a statement, but it became a function in
Python 3.
Using print functions everywhere makes the script compatible with Python
versions >= 2.6, including Python 3.
Signed-off-by: Mathieu Bridon <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
Acked-by: Dylan Baker <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
| |
Fixes: cf0c7258ee0 freedreno/a5xx: MSAA
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This was mistakenly exposed, even though we want atomic counters to be
lowered to atomic ops on an SSBO like nearly every other GPU. Which
somehow recently started getting segfaults due to calling a null
pipe->set_hw_atomic_buffers().
Fixes a crash in stk, and probably other things.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Acked-by: Rob Clark <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Dave Airlie <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Acked-by: Rob Clark <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Dave Airlie <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This completely reworks the pass to support deref instructions and
delete support for old deref chains
Acked-by: Rob Clark <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Dave Airlie <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Now that it's rewritten for deref instructions, we can turn it back on.
Acked-by: Rob Clark <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Dave Airlie <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
It's only used by the ir3 stand-alone compiler and Rob said we could
delete it.
Acked-by: Rob Clark <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Dave Airlie <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vc4+vc5 is not really effected by the deref chain to deref instr
conversion, so it no longer needs this pass. For others, now that
all the passes mesa/st uses are using deref instructions, push the
lowering to deref chains back into driver.
Signed-off-by: Rob Clark <[email protected]>
Acked-by: Rob Clark <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Dave Airlie <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Acked-by: Rob Clark <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Dave Airlie <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To simplify the transition, and make things bisectable, split out a
legacy copy or lower_samplers. This way the i965 and gallium drivers
can independently switch over to deref instructions.
Since the lower_samplers_as_deref pass is only used by gallium drivers,
it can be converted in lock-step with moving the lower_deref_instrs
pass, and so does not need a corresponding _legacy clone.
This legacy pass will be removed in a future commit.
Signed-off-by: Rob Clark <[email protected]>
Acked-by: Rob Clark <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Dave Airlie <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This pass doesn't handle deref instructions yet. Making it handle both
legacy derefs and deref instructions would be painful. Since it's not
important for correctness, just disable it for now.
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Acked-by: Rob Clark <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Dave Airlie <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This inserts a call to nir_lower_deref_instrs at every call site of
glsl_to_nir, spirv_to_nir, and prog_to_nir.
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Acked-by: Rob Clark <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Dave Airlie <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
the format of the CLEAR_COLOR register doesn't depend on the target format
this fixes clear color when rendering to 32-bit RGBA and 16-bit targets
Signed-off-by: Jonathan Marek <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jonathan Marek <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
blend can be NULL, so check for that
Signed-off-by: Jonathan Marek <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
this patch adds support for a20x, which has some differences with a220:
-no VGT_MAX_VTX_INDX register
-no CLEAR_COLOR register
-set RB_BC_CONTROL in restore (hangs without)
-different CP_DRAW_INDX format
tested with kmscube and glmark2 scenes, on par with a220
Signed-off-by: Jonathan Marek <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
The offset field is 22 bit large.
11 bits are necessary because MaxVertexAttribRelativeOffset = 2047
Signed-off-by: Jonathan Marek <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Similar to the combined limit for VS+FS, there is an upper limit for
shader size to run from internel memory.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
| |
Fixes: c366f422f0a nir: Offset vertex_id by first_vertex instead of base_vertex
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v1 -> v2:
- nv30 is _NOT_ scalar as suggested by Ilia Mirkin.
- Change from a screen cap to a shader cap as suggested
by Eric Anholt.
- radeonsi is scalar as suggested by Marek Olšák.
- Change missing ones to be scalar.
v2 -> v3:
- r600 prefers vec4 as suggested by Marek Olšák.
Signed-off-by: Christian Gmeiner <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The scratch registers move again in a6xx.. so for post-a4xx let's just
move this into the backend, and move the one place it used to be needed
in core into fd5_emit_ib(). For a6xx we will do similar, calling
emit_marker6() from fd6_emit_ib().
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
dEQP-GLES31.stress.vertex_attribute_binding.buffer_bounds.bind_vertex_buffer_offset_near_wrap_10
This is kind of a hack, but really the only problem is the
debug_assert() in OUT_RELOC(). But the debug_assert() is
useful to catch real issues. So just add some #ifdef DEBUG
code to filter things out before we hit the assert.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
It is impolite, and a bit annoying with dEQP (all tests running in
single process).
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Fixes a problem seen with dEQP-GLES31.functional.ssbo.layout.single_basic_array.shared.row_major_mat4
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
If we aren't going to get full occupancy, then use TWO_QUAD.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Need a bit of hand-holding for stencil bordercolor, and add border color
values for sRGB.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
These never got updated in fd_context_all_dirty() so actually trying to
rely on them (in the case of fd5_emit_images()) ends up in some cases
where state is not emitted but should be. Best to just rip this out.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
I think this ends up just setting uniform/const memory. But we upload
x/y/z stride differently. At best this is unneeded, at worst it could
possibly clobber other uniform/const memory.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Similar to txf case, we need to insert a 2nd coordinate (zero).
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Unlike textures, this doesn't get lowered for us. (Would be nice
if they were.. at least until we are ready to deal w/ indirect
indexing..)
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Seems I previously toally forgot about 2d-arrays, etc..
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Brian Paul <[email protected]> (v2)
Reviewed-by: Marek Olšák <[email protected]> (v2)
|
|
|
|
|
|
|
| |
At least for PIPE_BUFFER, we could get the resource used as (for
example) R32F imageBuffer. So using cpp=1 from the rsc is wrong.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
copy-pasta fail from how SSBO sizes are handled.
Signed-off-by: Rob Clark <[email protected]>
|