| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Marek v2: add a cap
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
GK20A is mostly compatible with GK104, but uses the SM35 ISA. Use
the GK110 path when this chip is detected.
Signed-off-by: Alexandre Courbot <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
GK20A is mostly compatible with GK104, but features a new 3D
class. Add it to the relevant header and use it when GK20A is
detected.
Signed-off-by: Alexandre Courbot <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
Provide an accelerated path for ARB_clear_buffer_object
Signed-off-by: Tobias Klausmann <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
In commit af38ef907, I added a "fix" to color outputs not being assigned
correctly when sample mask was being output. This was totally wrong --
the color indices (i.e. "si" values) were the ones that were wrong. Undo
that hunk.
Signed-off-by: Ilia Mirkin <[email protected]>
Acked-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
| |
But don't count their size towards the allocated memory, since that
belongs to whoever created it.
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
[imirkin: moved default case out of switch]
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
|
|
|
| |
Note that predicated instructions with defs are still not supported
because transformation to SSA doesn't handle them yet.
Reviewed-by: Ilia Mirkin <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
|
| |
[imirkin: add logic to also clear the "regular" scissors]
Reviewed-by: Ilia Mirkin <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Make sure to normalize the z coordinates as well as the x/y ones when
there are mipmaps present. Fixes 3d mipmap generation, which now uses
the blit path.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "10.2" <[email protected]>
Reviewed-by: Ben Skeggs <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
These instructions can come in either through IMUL_HI/UMUL_HI TGSI
opcodes, or from OP_DIV constant folding.
Also make sure that the constant foldings which delete the original
instruction still get counted as having done something.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "10.1 10.2" <[email protected]>
Reviewed-by: Ben Skeggs <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Retrieving the high 32 bits of a signed multiply is rather annoying. It
appears that the simplest way to do this is to compute the absolute
value of the arguments, and perform a u32 x u32 -> u64 operation. If the
arguments' signs differ, then negate the result. Since there is no u64
support in the cvt instruction, we have the perform the 2's complement
negation "by hand".
This logic can come into use by the IMUL_HI instruction (very unlikely
to be seen), as well as from constant folding of division by a constant.
Fixes dolphin's divisions by 255.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "10.1 10.2" <[email protected]>
Reviewed-by: Ben Skeggs <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
UNION appears to expect that all of its sources are conditionally
defined. Otherwise it inserts an unpredicated mov instruction which
overwrites the desired result. This fixes tests that use UMUL_HI, and
much less directly, unsigned integer division by a constant, which uses
this functionality in a peephole pass.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "10.1 10.2" <[email protected]>
Reviewed-by: Ben Skeggs <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "10.2" <[email protected]>
Reviewed-by: Ben Skeggs <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ben Skeggs <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The big missing part here is proper sched data calculations, but
hopefully the chosen placeholder will be sufficient for now.
Passes piglit as well as GK107 does.
Signed-off-by: Ben Skeggs <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ben Skeggs <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ben Skeggs <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ben Skeggs <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
SM50 backend requires 21 bits per instruction, not 8.
Signed-off-by: Ben Skeggs <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
Maxwell doesn't have immediate-mode.
Signed-off-by: Ben Skeggs <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ben Skeggs <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
Maxwell no longer has the methods to set constant attributes, and we'll
want to be treating stride 0 vtxbufs the same as for stride > 0.
Signed-off-by: Ben Skeggs <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ben Skeggs <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ben Skeggs <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ben Skeggs <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Need to adjust coordinates since the shader receives the array index as
depth in z, but the TEX instruction expects it to be the second
coordinate for a 1D array texture. This fixes fbo-generatemipmap-array.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Ben Skeggs <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes the new logic of the conditional rendering piglit test.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Ben Skeggs <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Different textures may be bound to each slot for each stage. So we need
to be able to upload ms parameters for each one without stages
overwriting each other.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Ben Skeggs <[email protected]>
Cc: "10.1 10.2" <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Ben Skeggs <[email protected]>
Cc: "10.2 10.1" <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Should fix comparison opcodes like SGE/SLT/etc which expected a float to
be returned. These were previously getting integer 0/-1 values.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Ben Skeggs <[email protected]>
Cc: 10.2 <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The old condition disallowed load propagation any time flags were
defined, even with e.g. set and a constbuf reference. The new condition
disallows it only with immediate propagation. (There are no opcodes that
set the condition flag and have an immediate argument.)
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
S8_UINT will become useful when ARB_texture_stencil8 becomes supported by
mesa. The other stencil formats are needed for ARB_stencil_texturing.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
This fixes textureGather(2DRect) piglit tests, and does not appear to
have any adverse effects.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes textureGatherOffset when used with a shadow sampler. Also verified
against blob compiler with textureLodOffset manually (no piglit tests
for texture[Lod]Offset + shadow samplers).
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
This allows us to have non-constant offsets for textureGatherOffset and
textureGatherOffsets.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This adds support for:
IBFE, UBFE, BFI, LSB, IMSB, UMSB, BREV, POPC
Which are all required for ARB_gs5 support.
Signed-off-by: Ilia Mirkin <[email protected]>
|