| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is some special-casing needed in a competent back-end. However, they
can do their special-casing easily enough based on whether or not the
offset is a constant. In the mean time, having the *_indirect variants
adds special cases a number of places where they don't need to be and, in
general, only complicates things. To complicate matters, NIR had no way to
convdert an indirect load/store to a direct one in the case that the
indirect was a constant so we would still not really get what the back-ends
wanted. The best solution seems to be to get rid of the *_indirect
variants entirely.
This commit is a bunch of different changes squashed together:
- nir: Get rid of *_indirect variants of input/output load/store intrinsics
- nir/glsl: Stop handling UBO/SSBO load/stores differently depending on indirect
- nir/lower_io: Get rid of load/store_foo_indirect
- i965/fs: Get rid of load/store_foo_indirect
- i965/vec4: Get rid of load/store_foo_indirect
- tgsi_to_nir: Get rid of load/store_foo_indirect
- ir3/nir: Use the new unified io intrinsics
- vc4: Do all uniform loads with byte offsets
- vc4/nir: Use the new unified io intrinsics
- vc4: Fix load_user_clip_plane crash
- vc4: add missing src for store outputs
- vc4: Fix state uniforms
- nir/lower_clip: Update to the new load/store intrinsics
- nir/lower_two_sided_color: Update to the new load intrinsic
NIR and i965 changes are
Reviewed-by: Kenneth Graunke <[email protected]>
NIR indirect declarations and vc4 changes are
Reviewed-by: Eric Anholt <[email protected]>
ir3 changes are
Reviewed-by: Rob Clark <[email protected]>
NIR changes are
Acked-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
We still have several failures in the newly enabled tests in simulation:
sRGB downsampling is done as if it was just linear, stencil blits are not
supported on MSAA either, and derivatives are still not supported
(breaking some MSAA simulation shaders). So, other than sRGB downsampling
quality, things seem to be in good shape.
|
|
|
|
|
| |
The pipe_transfer_map API requires that we do an implicit
downsample/upsample and return a mapping of that.
|
|
|
|
|
|
|
|
| |
This is the core of ARB_texture_multisample. Most of the piglit tests for
GL_ARB_texture_multisample require GL 3.0, but exposing support for this
lets us use the gallium blitter for multisample resolves. We can
sometimes multisample resolve using just the RCL, but that requires that
the blit is 1:1, unflipped, and aligned to tile boundaries.
|
|
|
|
|
|
|
|
| |
This includes GL_SAMPLE_COVERAGE, GL_SAMPLE_ALPHA_TO_ONE, and
GL_SAMPLE_ALPHA_TO_COVAGE.
I haven't implemented a dithering function yet, and gallium doesn't give
me a good chance to do so for GL_SAMPLE_COVERAGE.
|
|
|
|
|
|
| |
I only stumbled on this while experimenting due to reading about HW-2905.
I don't know if the EZ disable in the Z-clear is actually necessary, but
go with it for now.
|
| |
|
| |
|
|
|
|
|
| |
I was thinking this was the only MSAA resolve thing, so it should be noted
separately, but actually load/store general also do MSAA resolve.
|
|
|
|
|
|
|
| |
The recent unaligned fix successfully prevented RCL blits that weren't
aligned inside of the surface, but we also want to be able to do RCL blits
for the whole surface when the width or height of the surface aren't
aligned (we don't care what renders inside of the padding).
|
|
|
|
| |
I keep typing variants of this while debugging RCL blits for MSAA.
|
|
|
|
|
| |
This was a typo in 3a508a0d94d020d9cd95f8882e9393d83ffac377 that didn't
show up in testcases at that moment.
|
|
|
|
| |
I missed this when bringing over the kernel changes.
|
|
|
|
|
|
|
|
|
|
| |
Use NULL tests of the form `if (ptr)' or `if (!ptr)'.
They do not depend on the definition of the symbol NULL.
Further, they provide the opportunity for the accidental
assignment, are clear and succinct.
Signed-off-by: Edward O'Callaghan <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
Even if the rasterizer has scissor disabled, we'll have whatever
vc4->scissor bounds were last set when someone set up a scissor, so we
shouldn't clip to them in that case.
Fixes piglit fbo-blit-rect, and a lot of MSAA tests once they're enabled.
|
|
|
|
|
|
|
|
| |
We could potentially handle scissored blits when they're tile aligned, but
it doesn't seem worth it. If you're doing a scissored blit, you're
probably a testcase.
Fixes piglit's fbo-scissor-blit fbo
|
| |
|
| |
|
|
|
|
|
|
| |
For MSAA, we store full resolution tile buffer contents, which have their
own tiling format. Since they're full resolution buffers, we have to
align their size to full tiles.
|
|
|
|
|
| |
From the API perspective, writing 1 bits can't turn on pixels that were
off, so we AND it with the sample mask from the payload.
|
|
|
|
|
|
|
|
| |
We were checking that the blit started at 0 and was 1:1, but not that it
went to the full width of the surface, or that the width was aligned to a
tile. We then told it to blit to the full width/height of the surface,
causing contents to be stomped in a bunch of MSAA tests that happen to
include half-screen-width blits to 0,0.
|
| |
|
|
|
|
|
|
| |
We can't dump it in the real driver, since the kernel doesn't give us a
handle to it (except after a GPU hang, using a root ioctl). In the
simulator we can.
|
|
|
|
|
|
|
|
| |
In the pipe-loader reworks, it was missed in one of the new directories it
was used.
Cc: [email protected]
Reviewed-by: Emil Velikov <[email protected]>
|
| |
|
|
|
|
|
|
|
|
| |
I think I may have regressed this in the NIR conversion. TGSI-to-NIR is
putting the PSIZ in the .x channel, not .w, so we were grabbing some
garbage for point size, which ended up meaning just not drawing points.
Fixes glean pointAtten and pointsprite.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DEQP likes to do math on uniforms, and the "fmaxabs dst, uni, uni" to get
the absolute value would get lowered. The lowering doesn't bother to try
to restrict the lifetime of the lowered uniforms, so we'd end up register
allocation failng due to this on 5 of the tests (More tests still fail in
RA, which look like we'll need to reduce lowered uniform lifetimes to
fix).
No changes on shader-db, though fewer extra MOVs are generated on even
glxgears (MOVs pair well enough that it ends up being the same instruction
count).
|
|
|
|
|
| |
This does actually happen in the wild (particularly fabs of a uniform), so
we'd like to support it.
|
| |
|
|
|
|
|
|
|
|
| |
It looks like nir_lower_idiv is going to use it soon, so add support.
With Ilia's change, this fixes one case in fs-op-div-large-uint-uint (with
GL 3.0 forced on).
Cc: "11.0" <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Since X has undefined contents in new pixmaps, it will allocate new
textures for an FBO and draw to them without an explicit clear. For
VC4, it's much faster to emit a clear than the load of the actual
undefined memory contents, so just do that instead.
|
|
|
|
|
|
|
| |
I'm not sure what the caller does is appropriate (just have a NULL sampler
at this slot), but it fixes the immediate crash.
Cc: "11.0" <[email protected]>
|
|
|
|
|
|
|
| |
I was afraid our callers weren't prepared for this, but it looks like
at least for resource creation, mesa/st throws an error appropriately.
Cc: "11.0" <[email protected]>
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
This greatly increases the pressure you can put on the driver before
create fails. Ultimately we need to let the kernel take control of
our cached BOs and just take them from us (and other clients)
directly, but this is a very easy patch for the moment.
Cc: "11.0" <[email protected]>
|
|
|
|
|
| |
It's surprising to see "0kb" printed for debug on short shaders, while
4kb alignment won't be suprising.
|
|
|
|
| |
60MB of cached BOs are a lot less scary than 600MB.
|
|
|
|
|
| |
Improves low-settings openarena performance by 31.9975% +/- 0.659931%
(n=7).
|
|
|
|
|
|
| |
For ARB_copy_image.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
total instructions in shared programs: 89251 -> 87862 (-1.56%)
instructions in affected programs: 52971 -> 51582 (-2.62%)
|
|
|
|
| |
Another step in reducing the special-casing of instructions.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This paves the way for copy propagating our unpacks. We end up with a
small change on shader-db:
total instructions in shared programs: 89390 -> 89251 (-0.16%)
instructions in affected programs: 19041 -> 18902 (-0.73%)
which appears to be because we no longer convert MOVs for an FMAX dst,
r4.unpack, r4.unpack (instead of the previous MOV dst, r4.unpack), and
this ends up with a slightly better schedule.
|
|
|
|
|
|
|
|
|
| |
At one point I thought packs and unpacks were in the same field of the
instruction. They aren't. These instructions therefore never cause a
pack.
total instructions in shared programs: 89472 -> 89390 (-0.09%)
instructions in affected programs: 15261 -> 15179 (-0.54%)
|
|
|
|
|
| |
I'm going to introduce some more types of MOV, which also want the elision
of raw MOVs.
|
|
|
|
| |
We can do 16a/16b from float as well. No difference on shader-db.
|
| |
|
|
|
|
| |
No problems being fixed, but needed for the new unpack changes.
|