| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
| |
The fix in the vc4-jobs series ended up triggering the fallback path on
GLX apps that use depth but not stencil.
|
|
|
|
|
| |
I was curious if my Z/S buffer was actually ZS or ZX, and the vc4 format
of "0" didn't tell me much.
|
| |
|
|
|
|
|
|
|
| |
This slightly reduces instructions on shader-db, but I think it's just
perturbing register allocation -- the allocator should have always
trivially colored these nodes, before. This commit is just to make QIR
code failing more intelligible when register allocation fails.
|
|
|
|
|
|
|
|
|
| |
If a conditional assignment is only conditioned on the exec mask, that's
still screening off the value in the executed channels (and, since we're
not storing to the unexcuted channels, we don't care what's in there).
Fixes a bunch of extra register pressure on Processing's Ribbons demo,
which is failing to allocate.
|
|
|
|
|
|
| |
We would assertion fail in setting up the simulator the second time
around. This at least postpones the assertion failure until we've closed
all of the first set of screens and started opening a new set.
|
|
|
|
|
| |
Fixes 100 piglit tests since the assertions were added to nir.h. What's
amazing is that these tests used to pass, even when casting garbage.
|
|
|
|
|
|
|
|
|
| |
Checking if MAD is supported is definitely wrong, and it's
more likely a typo I introduced few days ago which breaks
NV50 because SHLADD is not supported there.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
When the chipset is forced with NV50_PROG_CHIPSET, we actually
only want to output the binary if NV50_PROG_DEBUG is also
enabled. Otherwise, this pollutes the shader-db output.
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Only expose 512 threads/block on Fermi to not be limited by
32 GPRs/thread.
v4: - use 512 threads on Fermi, 1024 on Kepler+
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a variable local size is defined as specified by
ARB_compute_variable_group_size, the fixed local size is set to 0
and a SIGFPE occurs when we compute the maximum number of regs.
This allows to use 64 GPRs/thread.
v4: - use 512 threads on Fermi, 1024 on Kepler+
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v3: - use a new case statement in r600_pipe_common.c
- fix compilation of softpipe...
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
helped some ue4 demos and divinity OS shaders
total instructions in shared programs : 2818674 -> 2818606 (-0.00%)
total gprs used in shared programs : 379273 -> 379273 (0.00%)
total local used in shared programs : 9505 -> 9505 (0.00%)
total bytes used in shared programs : 25837792 -> 25837192 (-0.00%)
local gpr inst bytes
helped 0 0 33 33
hurt 0 0 0 0
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Pierre Moreau <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
One of NIR's invariants is that control flow lists always start and end
with blocks. There's no good reason why we should return a cf_node from
these functions since we know that it's always a block. Making it a block
lets us remove a bunch of code.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
| |
Otherwise it won't be picked in the tarball and the build will fail.
Fixes: 0035f7f1365 ("svga: add guest statistic gathering interface")
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
| |
Currently, program binaries are only dumped at upload time, but
when the chipset has been forced via NV50_PROG_CHIPSET we might
want to show the generated code, especially with shaderdb.
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
There are VM faults without this.
Cc: 12.0 <[email protected]>
Acked-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Not returning garbage in .zw seems pretty important.
This fixes:
GL45-CTS.shader_multisample_interpolation.render.interpolate_at_*_check.*
Cc: 11.2 12.0 <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
fixes a crash in the case simplify reports an error
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
| |
Check for device reset on flush. It would be nicer if the kernel just
reported this as an error on the submit ioctl (and similarly for fences),
but this will do for now.
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
v2: slab_alloc_st -> slab_alloc
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97894
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
This is enabled automatically if shader printing is enabled, or separately
by R600_DEBUG=checkir. Catch mal-formed IR before it crashes in a later
pass.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
Caught by R600_DEBUG=checkir (next commit).
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This changes the order of basic blocks to be equal to the order of code in the
original TGSI, which is nice for making sense of shader dumps.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
In particular, we no longer emit an else block when there is no ELSE
instruction.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some of the existing code is needlessly complicated. The basic principle
should be: control-flow opcodes emit branches to properly terminate the
current block, _unless_ the current block already has a terminator (which
happens if and only if there was a BRK or CONT).
This also fixes a bug where multiple terminators were created in a block.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97887
Cc: [email protected]
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Really fix the bug that was supposed to be fixed by commits 3e7cced4b and
a48bf02d: even when virtual addresses are used, the legacy relocation-based
method with offsets relative to the kernel's buffer object are used for
video submissions.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97969
Reviewed-by: Christian König <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
I guess this is not needed because dead code elimination removes
the declaration.
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
| |
We can finally do this, because the opcodes are scalar now.
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
| |
si_llvm_emit_ddxy is called once per element, so we don't have to generate
code for 4 elements at once.
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
|
| |
do it at bind time, so that pipe_sampler_view is immutable with regard to
buffer reallocations and we don't have to remember all existing buffer
views.
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|