| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
Same as ARL, just has extra rounding.
Useful for st/nine.
Tested-by: Pavel Ondračka <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: David Heidelberg <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
| |
This code is already in if (!variable->C->is_r500) so no need check
twice.
Reviewed-by: Tom Stellard <[email protected]>
Signed-off-by: David Heidelberger <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The definition of rc_pair_regalloc_inputs_only() is no longer
around so drop the declaration.
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Acked-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The r300 gallium driver is using it outside of the Mesa tree, and I wanted
to do so for vc4 as well. Rather than make the multiple-definitions
problem even more complicated, just move it to more-shared code.
v2: Don't forget to delete the symlink in r300 (review by Matt).
Delete more r300-helper references (review by Emil)
Don't prefix util/ header inclusion with "util/" (review by Emil)
Reviewed-by: Matt Turner <[email protected]> (v1)
Reviewed-by: Emil Velikov <[email protected]> (v1)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In commit 567e2769b81863b6dffdac3826a6b729ce6ea37c ("ra: make the p, q
test more efficient") I unknowingly introduced a new requirement to the
register allocator API: the user must set the register class of all
nodes before setting up their interferences, because
ra_add_conflict_list() now uses the classes of the two interfering
nodes. i965 already did this, but r300g was setting up register classes
interleaved with setting up the interference graph. This led to us
calculating the wrong q total, and in certain cases
e78a01d5e6f77e075fe667a0f0ccb10d89c0dd58 (" ra: optimistically color
only one node at a time") made it so that this bug caused a segfault. In
particular, the error occurred if the q total was decremented to 1 below
0 for the last node to be pushed onto the stack. Since q_total is an
unsigned integer, it overflowed to 0xffffffff, which is what
lowest_q_total happens to be initialzed to. This means that we would
fail the "new_q_total < lowest_q_total" check on line 476 of
register_allocate.c, and so the node would never be pushed onto the
stack, which led to segfaults in ra_select() when we failed to ever give
it a register.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82828
Cc: "10.3" <[email protected]>
Signed-off-by: Connor Abbott <[email protected]>
Tested-by: Pavel Ondračka <[email protected]>
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- Add top_srcdir/src/gallium/winsys to GALLIUM_DRIVER_C{XXFLAGS}.
- Remove top_srcdir/src/gallium/drivers/radeon from the includes.
As a result:
- Common radeon headers are prefixed with 'radeon/'
- Winsys header inclusion is prefixed 'radeon/drm'
Cc: Marek Olšák <[email protected]>
Cc: Michel Dänzer <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
Fixes make check in that case.
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes piglit glean "do-loop with continue and break" on RS690
It's based on Tom Stellard patch and improved to handle CMP instruction.
[v2] handle CMP instruction
Reviewed-by: Tom Stellard <[email protected]>
Signed-off-by: David Heidelberger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, there were 3 entrypoints into parts of the actual allocator,
and an API called ra_allocate_no_spills() that called all 3. Nobody
would ever want to call any of the 3 entrypoints by themselves, so
everybody just used ra_allocate_no_spills(). So just make them static
functions, and while we're at it rename ra_allocate_no_spills() to
ra_allocate() since there's no equivalent "with spills," because the
backend is supposed to handle spilling.
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This gathers macros that have been included across components into util so
that the include chain can be more vertical. In particular, this makes
util stand on its own without any dependence whatsoever on the rest of
mesa.
Signed-off-by: "Jason Ekstrand" <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Tom Stellard <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
CC: "9.2" "10.0" <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
CC: "9.2" "10.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
rc_find_free_temporary_list() returns signed integer
(in case of lack of free temporary registers returns -1),
so new_index in radeon_rename_regs() should be signed.
https://bugs.freedesktop.org/show_bug.cgi?id=54867
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
| |
CC: "9.2" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TGSI_OPCODE_KIL and KILP had confusing names. The former was conditional
kill (if any src component < 0). The later was unconditional kill.
At one time KILP was supposed to work with NV-style condition
codes/predicates but we never had that in TGSI.
This patch renames both opcodes:
TGSI_OPCODE_KIL -> KILL_IF (kill if src.xyzw < 0)
TGSI_OPCODE_KILP -> KILL (unconditional kill)
Note: I didn't just transpose the opcode names to help ensure that I
didn't miss updating any code anywhere.
I believe I've updated all the relevant code and comments but I'm
not 100% sure that some drivers had this right in the first place.
For example, the radeon driver might have llvm.AMDGPU.kill and
llvm.AMDGPU.kilp mixed up. Driver authors should review their code.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
| |
https://bugs.freedesktop.org/show_bug.cgi?id=63520
NOTE: This is a candidate for the stable branches.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
| |
The assembly parser can be used to load r300 assembly dumps
and run them through any of the r300 compiler passes.
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
| |
https://bugs.freedesktop.org/show_bug.cgi?id=60503
NOTE: This is a candidate for the stable branches.
|
|
|
|
|
|
|
|
|
|
| |
The OMOD value was only being folded to one instruction in cases where
the MUL instruction was reading a value written by more than one
instruction.
NOTE: This is a candidate for the stable branches.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Now you can convert assembly strings into a full struct radeon_compiler
object and use it to test individual compiler pases.
NOTE: This is a candidate for the stable branches.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
This way make check can report whether or not the tests pass.
NOTE: This is a candidate for the stable branches.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
This will be used by the test suite in later commits.
NOTE: This is a candidate for the stable branches.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
These are all files that I authored, but forgot to add the license
headers.
NOTE: This is a candidate for the stable branches.
Signed-off-by: Tom Stellard <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
|
| |
The GLSL compiler can simplify clamp(v,0,1) to saturate. The state tracker
doesn't use it yet, but it will.
Reviewed-by: Tom Stellard <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If an instruction reads from a constant register that contains
immediates using an invalid swizzle, we can avoid generating MOV
instructions to fix up the swizzle by loading the immediates into a
different constant register that can be read using a valid swizzle.
This only affects r300 and r400 cards.
For example:
CONST[1] = { -3.5000 3.5000 2.5000 1.5000 }
MAD temp[4].xy, const[0].xy__, const[1].xz__, input[0].xy__;
========== Before this change would be lowered to: =========
CONST[1] = { -3.5000 3.5000 2.5000 1.5000 }
MOV temp[0].x, const[1].x___;
MOV temp[0].y, const[1]._z__;
MAD temp[4].xy, const[0].xy__, temp[0].xy__, input[0].xy__;
========== After this change is lowered to: ===============
CONST[1] = { -3.5000 3.5000 2.5000 1.5000 }
CONST[2] = { 0.0000 -3.5000 2.5000 0.0000 }
MAD temp[4].xy, const[0].xy__, const[2].yz__, input[0].xy__;
============================================================
This change reduces one of the Lightsmark shaders from 133 to 91
instructions.
v2:
- Fix crash caused by swizzles with only inline constants.
|
| |
|
|
|
|
|
|
| |
Initializing the regalloc state is expensive, and since it is always
the same for every compile we only need to initialize it once per
context. This should help improve shader compile times for the driver.
|
| |
|
|
|
|
|
|
| |
This allows the user to pass precomputed q values to the allocator.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch has been generated by the following Coccinelle semantic
patch:
// Don't cast the return value of malloc/realloc.
//
// Casting the return value of malloc/realloc only stands to hide
// errors.
@@
type T;
expression E1, E2;
@@
- (T)
(
_mesa_align_calloc(E1, E2)
|
_mesa_align_malloc(E1, E2)
|
calloc(E1, E2)
|
malloc(E1)
|
realloc(E1, E2)
)
|
|
|
|
| |
This fixes some integer division tests.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
No driver supports this extension, and it seems unlikely than any driver
ever will. I think r300c may have supported it at one time, but that
driver has already been removed.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
| |
|
|
|
|
|
|
|
|
| |
This way we correctly report "Too many temporaries" errors.
https://bugs.freedesktop.org/show_bug.cgi?id=48680
Note: This is a candidate for the stable branches.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instruction attributes like WriteALUResult and ALUResultCompare
were being discarded during the some of the local transformations.
This fixes the following piglit tests:
glsl1-inequality (vec2, pass)
loopfunc
fs-any-bvec2-using-if
fs-op-ne-bvec2-bvec2-using-if
fs-op-ne-ivec2-ivec2-using-if
fs-op-ne-mat2-mat2-using-if
fs-op-ne-vec2-vec2-using-if
fs-op-ne-mat2x3-mat2x3-using-if
fs-op-ne-mat2x4-mat2x4-using-if
https://bugs.freedesktop.org/show_bug.cgi?id=45921
NOTE: This is a candidate for the stable branches.
|
| |
|
|
|
|
|
| |
On R500 chips, shader instructions can take 7-bit (3-bit mantissa, 4-bit
exponent) floating point values as inputs in place of registers.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
v2:
- s/$(top_builddir)/$(top_srcdir)/
- Always generate Makefile.in
v3:
- Fixes from Matt Turner
- Use Mesa CFLAGS
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
KILP instruction inside IF blocks were being lowered to an unconditional
KIL. Since r300 doesn't support branching, when the IF's were lowered
to conditional moves, the KIL would always be executed. This is not a
problem with the mesa state tracker, because the GLSL compiler handles
lowering IF's, but this bug was appearing in the VDPAU state tracker,
which does not use the GLSL compiler.
Note: This is a candidate for the stable branches.
|
| |
|
|
|
|
|
|
| |
This fixes a memory leak on i965 context destruction.
NOTE: This is a candidate for the 8.0 branch.
|