| Commit message (Collapse) | Author | Age | Files | Lines |
|
util_is_power_of_two_or_zero
The new name makes the zero-input behavior more obvious. The next
patch adds a new function with different zero-input behavior.
Signed-off-by: Ian Romanick <[email protected]>
Suggested-by: Matt Turner <[email protected]>
Reviewed-by: Alejandro Piñeiro <[email protected]>
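A minimal sketch of why the zero-input behavior matters, assuming the usual bit-trick test for powers of two. The function names and signatures below are illustrative assumptions, not copied from the patch: the plain bit test accepts zero, so a stricter variant has to reject it explicitly.

```c
#include <stdbool.h>
#include <stdint.h>

/* Plain bit test. For v == 0, v - 1 wraps to 0xffffffff and
 * 0 & 0xffffffff == 0, so zero is reported as a power of two. */
static inline bool
is_power_of_two_or_zero(uint32_t v)
{
   return (v & (v - 1)) == 0;
}

/* Hypothetical stricter variant: same bit test, but zero is rejected. */
static inline bool
is_power_of_two_nonzero(uint32_t v)
{
   return v != 0 && (v & (v - 1)) == 0;
}
```

The two helpers agree on every input except zero, which is the case the rename calls out.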
|
Signed-off-by: Aaron Watry <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
In the absence of a general NIR or VIR-level scheduler, this at least
avoids spilling in
GTF-GLES3.gtf.GL3Tests.uniform_buffer_object.uniform_buffer_object_storage_layouts
|
Just like TLB without a config uniform, we don't have a register index.
|
Fixes failure in
GTF-GLES3.gtf.GL3Tests.draw_instanced.draw_instanced_attrib_size
|
Our backend needs some sort of vertex position value to emit the scaled
viewport values and such. Fixes potential segfaults in
KHR-GLES3.copy_tex_image_conversions.required.cubemap_negx_cubemap_negx
|
Unfortunately TGSI doesn't record the type of the FS output like GLSL
does, but VC5's TLB writes depend on the output's base type. Just record
the type in the key at variant compile time when we've got a TGSI input
and then fix it up.
Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgba32i/ui and apparently a
GPU hang that breaks most tests that come after it.
|
As you're debugging register allocation, you may have changed the
intervals and not recomputed yet. Just skip the dump in that case.
|
Our register spilling support is nice to have, since vc4 couldn't spill
at all, but we're still very restricted: we can't spill during a TMU
operation or during the last segment of the program. (The last segment
would be nice to spill from when a long-lived value is passed through
with little modification from the start to the end.)
We could do better by emitting unspills for the last-segment values just
before the last thrsw, since the last segment is probably not the maximum
interference area.
Fixes GTF uniform_buffer_object_arrays_of_all_valid_basic_types and 3
others.
|
The point was to get the MOV, which the MOV_dest already returned.
|
This is nice for debugging when you've made a bad instruction.
|
This will let me do lowering late in compilation using the same
instruction builder as we use in nir_to_vir.
|
Anywhere we want to multiply, we probably want this.
|
Otherwise our start/end ips won't line up with the actual instructions.
|
This will be used for detecting last thread segment in register spilling.
|
These helpers will be used in register spilling to determine where to add
a last thrsw if needed, and might help refactor QPU scheduling.
|
The QPU scheduling code calling this function already separately checked
this signal.
|
This will be reused in register spilling.
|
Reviewed-by: Marek Olšák <[email protected]>
|
Obviously it would be good to have an ADD and a MUL and a signal together,
but we can even potentially have multiple signals merged, as well.
total instructions in shared programs: 100423 -> 97874 (-2.54%)
instructions in affected programs: 78812 -> 76263 (-3.23%)
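The percentages in these statistics are the relative change in instruction count. A small sketch of that arithmetic follows; the helper name `percent_change` is made up for illustration and is not part of any shader-db tooling.

```c
/* Relative change from `before` to `after`, in percent.
 * A negative result means fewer instructions after the change. */
static double
percent_change(long before, long after)
{
   return 100.0 * (double)(after - before) / (double)before;
}
```

Both lines drop by the same 2549 instructions; the second percentage is larger only because it is measured against the smaller total of the affected programs.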
|
We emit some MOVs to track lifetimes of payload registers, but we don't
need there to be actual MOV instructions for them.
total instructions in shared programs: 101045 -> 100423 (-0.62%)
instructions in affected programs: 37083 -> 36461 (-1.68%)
|
I must have misplaced it in the instruction packing rework.
|
We don't have a src1 to look up if the compare instruction is "i2b".
|
This will be used for freedreno and vc4 which require all inputs
and outputs to be copied to temps.
Reviewed-by: Marek Olšák <[email protected]>
|
After the 4.1 spec, 4.2 retroactively renamed patchid to barrierid because
it's used for other barriers in compute.
|
This adds the meson.build, meson_options.txt, and a few scripts that are
used exclusively by the meson build.
v2: - Remove accidentally included changes needed to test make dist with
LLVM > 3.9
Signed-off-by: Dylan Baker <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
Signed-off-by: Emil Velikov <[email protected]>
|
Prevents potential infinite loops when a non-dispatched or discarded
channel never triggers the loop break condition.
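The failure mode can be illustrated with a toy model of a masked SIMD loop. This is not the driver's code, and the commit's actual change is not shown here; `LANES`, `MAX_ITERATIONS` and `run_masked_loop` are invented for the sketch. Only dispatched lanes execute the loop body and its break test, so if non-dispatched lanes are counted as "still in the loop", the exit condition is never met.

```c
#include <stdbool.h>
#include <stdint.h>

#define LANES 16
#define MAX_ITERATIONS 1000 /* safety cap so the toy model terminates */

/* Each dispatched lane counts down from trips[i] and breaks at zero.
 * Returns the number of iterations run, or -1 if the cap was hit. */
static int
run_masked_loop(uint32_t dispatched, const int trips[LANES],
                bool ignore_inactive)
{
   uint32_t all = (1u << LANES) - 1;
   /* Lanes the loop still waits on before it may exit. */
   uint32_t in_loop = ignore_inactive ? (dispatched & all) : all;
   int remaining[LANES];

   for (int i = 0; i < LANES; i++)
      remaining[i] = trips[i];

   int iterations = 0;
   while (in_loop != 0) {
      if (iterations == MAX_ITERATIONS)
         return -1;
      iterations++;
      for (int i = 0; i < LANES; i++) {
         uint32_t bit = 1u << i;
         /* Only dispatched lanes run the body and the break test. */
         if ((dispatched & bit) && (in_loop & bit) && --remaining[i] <= 0)
            in_loop &= ~bit;
      }
   }
   return iterations;
}
```

With half of the lanes dispatched, waiting on all lanes hits the cap, while waiting only on dispatched lanes exits as soon as the last active lane breaks.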
|
I think this should be equivalent other than power, and it's the kind of
comparison we use for nir_op_ieq.
|
I was trying to do a NULL-destination UF, and it got removed.
|
Now that the actions are reused for centroid and nonperspective, give them
a more generic name.
|
The LDVARY signal now writes an arbitrary register, so I took out the
magic src register file and replaced it with an instruction with LDVARY
set so we have somewhere to hang a QFILE_TEMP destination for register
allocation.
|
The V3D 3.x scheme, a series of TMU writes whose meaning depends on the
texture type, is replaced with writes to a specific register for each
texture argument semantic.
|
V3D 4.x texturing changes enough that #ifdefs would just make a mess of
it.
|
For V4.1 texturing, I need the V4.1 XML, so the main compiler needs to
stop including V3.3 XML.
|
I want the library's entrypoints to still be unversioned, but the actual
packet dumping needs to be per-version.
|
This is a major performance boost on all of V3D, but is required on V3D
4.x where shaders are always either 2- or 4-threaded.
|
This fills in the delay slots of thread end as much as we can (other than
being cautious about potential TLBZ writes).
In the process, I moved the thread end THRSW instruction creation to the
scheduler. Once we start emitting THRSWs in the shader, we need to
schedule the thread-end one differently from other THRSWs, so having it in
there makes that easy.
|
Apparently the VPM writes need to be flushed out before we end the shader.
|
I had a .ifb being decoded weird in sampid, so this is to check that .ifb
is fine.
|