| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
These reworks were combined into this patch:
* Matt Turner: i965: Disable NoDDChk/NoDDClr test on Gen12+
* Francisco Jerez: intel/eu/validate/gen12: Disable
qword_low_power_no_depctrl eu_validate test.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Reworks:
* adjust 64-bit support, hiz (Jason Ekstrand)
* sim-id (Lionel Landwerlin)
* adjust threads, urb size (Rafael Antognolli)
* adjust urb size (Kenneth Graunke)
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
|
| |
Calculated the number for allocation and did not
reserve space ....
Fixes: 2117c53b723 "radv: Add temporary datastructure for submissions."
Reviewed-by: Samuel Pitoiset <[email protected]>
|
| |
|
| |
|
| |
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
| |
VGPR spilling is implemented via MUBUF instructions and scratch memory.
Reviewed-by: Rhys Perry <[email protected]>
|
|
|
|
|
|
|
| |
This patch also moves private_segment_buffer and
scratch_offset to Program to easily access it.
Reviewed-by: Rhys Perry <[email protected]>
|
|
|
|
| |
Reviewed-by: Rhys Perry <[email protected]>
|
|
|
|
| |
Reviewed-by: Rhys Perry <[email protected]>
|
|
|
|
| |
Reviewed-by: Rhys Perry <[email protected]>
|
|
|
|
| |
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
| |
Reviewed-by: Rhys Perry <[email protected]>
|
|
|
|
|
|
| |
predecessor
Reviewed-by: Rhys Perry <[email protected]>
|
|
|
|
| |
Reviewed-by: Rhys Perry <[email protected]>
|
|
|
|
|
|
|
| |
Variables spilled on both branch legs need to be assigned to the same spilling slot.
These affinities can be transitive through multiple merge blocks.
Reviewed-by: Rhys Perry <[email protected]>
|
|
|
|
| |
Reviewed-by: Rhys Perry <[email protected]>
|
|
|
|
| |
Reviewed-by: Rhys Perry <[email protected]>
|
|
|
|
|
|
|
| |
This patch makes the live variable analysis more precise
w.r.t. killed phi operands and the block's register pressure.
Reviewed-by: Rhys Perry <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Converting to 'Conventional SSA Form' ensures correctness w.r.t. spilling of phi nodes.
Previously, it was possible that phi operands have intersecting live-ranges, and thus,
couldn't get spilled to the same spilling slot. For this reason, ACO tried to avoid to
spill phis, even if it was beneficial.
This patch implements a conversion pass which is currently only called if spilling is necessary.
Reviewed-by: Rhys Perry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes these deqp tests (and more):
dEQP-GLES2.functional.draw.draw_arrays.points.single_attribute
dEQP-GLES2.functional.draw.draw_arrays.points.multiple_attributes
dEQP-GLES2.functional.draw.draw_arrays.points.default_attribute
dEQP-GLES2.functional.draw.draw_elements.points.single_attribute
dEQP-GLES2.functional.draw.draw_elements.points.multiple_attributes
dEQP-GLES2.functional.draw.draw_elements.points.default_attribute
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The final version of previous stencil fix patch ended up breaking one-sided
stencil.
Fixes remaining failures in these deqp tests (tested on GC3000/GC7000L):
dEQP-GLES2.functional.fragment_ops.depth_stencil.*
Note: deqp tests require --deqp-gl-config-name=rgba8888d24s8ms0
Fixes: 05da025f ("etnaviv: fix two-sided stencil")
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Fixes remaining failures in these deqp tests (tested on GC3000/GC7000L):
dEQP-GLES2.functional.polygon_offset.*
Fixes: 6c3c05dc ("etnaviv: fix polygon offset")
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On gen11 and older, compressed images are tiled and aligned to 4K. On
gen12 this 4K alignment restriction was removed. However, only aligning
the fast clear color buffer to 64B (a cacheline, as it's on the
documentation) is causing some bugs where the fast clear color is not
converted during the fast clear operation. Aligning things to 4K seems
to fix it.
v2: Fix typo case in the comment (Nanley)
v3: Rebase and fix conflicts.
v4: Fix rebase mistake (Nanley).
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On gen11 and older, compressed images are tiled and aligned to 4K. On
gen12 this 4K alignment restriction was removed. However, only aligning
the fast clear color buffer to 64B (a cacheline, as it's on the
documentation) is causing some bugs where the fast clear color is not
converted during the fast clear operation. Aligning things to 4K seems
to fix it.
v2: Assert that image->planes[plane].offset is 4K aligned (Nanley)
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
| |
While we're at it, make sure we error out if it's not supported when
required.
This brings us a bit closer to being able to test on SwiftShader, which
doesn't currently support KHR_external_memory_fd.
|
|
|
|
|
|
|
|
|
|
| |
Winsys semaphores without signal operation get silently ignored.
Not so for syncobjs, so actually signal them.
Fixes: 84d9551b232 "radv: Always enable syncobj when supported for all fences/semaphores."
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2030
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Closes: #1974
Signed-off-by: Rhys Perry <[email protected]>
|
|
|
|
|
|
|
| |
It was joining from the wrong blocks and block.kind is a bitmask instead
of an enum.
Reviewed-By: Timur Kristóf <[email protected]>
|
|
|
|
| |
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
|
|
| |
TGL uses different data (and even a different format!) for each source.
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
|
|
|
| |
TGL will have separate tables for src0 and src1, so the shared function
will no longer make sense.
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
| |
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The EU compaction unit test fuzzes the compaction code by flipping bits.
We use a simple skip_bits() function with a list of reserved bits to
ignore, but for more complex cases like invalid combinations of register
file:type, we need either machinery to check validity or for these
functions to simply inform us whether a combination was valid.
enum brw_reg_type a 4-bit field in brw_reg, so rather than expanding it
with an "INVALID" value, just return -1 and let the caller check for
that.
Scott suggested redefining unreachable() within the unit test to
longjmp() which would allow driver code like this to still use it and
allow the test to handle expected failures like this. If that plan works
out, I plan to revert this.
|
|
|
|
|
|
|
|
| |
Mostly for vertex formats, but they are supported as texture formats too
(untested however).
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Disable sdma on gfx10 until all timeouts bugs are fixed.
See:
https://gitlab.freedesktop.org/mesa/mesa/issues/1907
https://bugs.freedesktop.org/show_bug.cgi?id=111481
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
SDMA IB doesn't need to be padded for SDMA.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
If src/dst addresses are dw aligned and size is > 4 then we align
byte count to dw as well.
PAL implementation works like this.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
HW 8K decode support starts at Renoir
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Boyuan Zhang <[email protected]>
|
|
|
|
|
|
|
| |
Require increase of context buffer size
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Boyuan Zhang <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: remove accidental shaderInt16 change
v2: simplify can_move_down initialization
v2: simplify VMEM_CLAUSE_MAX_GRAB_DIST
Reviewed-by: Daniel Schürmann <[email protected]>
|
|
|
|
|
|
|
|
| |
Previously, the scheduler tried to move up instructions from below depending
VMEM instructions only to move them down again when scheduling the VMEM
instruction.
Reviewed-by: Rhys Perry <[email protected]>
|
|
|
|
|
|
|
|
| |
These got lost due to some refactoring.
Due to the way our scheduler works currently, for now
we add back the reorder flag for divergent loads only.
Reviewed-by: Rhys Perry <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This patch changes VMEM scheduling in a way that they can only
be moved upwards by previous VMEM instructions but not downwards.
This way, it improves the order of VMEM instructions in relation
to their users.
Reviewed-by: Rhys Perry <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Previously, we allowed all shaders to reduce the number of max_waves to as low as 5.
Restricting this on shaders with low register demand, increases the total number of waves
while the VMEM def-use distances hardly change.
This patch also changes the max number of move operations per MEM instruction.
Reviewed-by: Rhys Perry <[email protected]>
|
|
|
|
|
|
|
|
| |
This shaves around 4-5% off of a CPU-limited example running with the
Dawn WebGPU implementation.
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|