| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
Fixes texturing from EGL images created from cubemap faces, as in
dEQP-EGL.functional.image.create.gles2_cubemap_negative_x_rgba_texture
Cc: [email protected]
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
We're going to use UBO loads for implementing YUV linear-to-T-format
blits.
|
|
|
|
|
| |
Previously we would assertion fail about having no hardware format. This
is enough to get kmscube -M nv12-2img working.
|
|
|
|
|
|
|
| |
When we set up the shadow resource we were copying the original resource
as the template, including its prsc->next field. When we shadowed the
first YUV plane's resource for linear-to-tiled conversion, we would end up
unbalancing the refcount on the shadow resource's destruction.
|
|
|
|
| |
Improves u_debug_refcount output.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Because vc4 can control the order that tiles are rasterized in, we can use
it to implement overlapping blits using normal drawing and
GL_ARB_texture_barrier, as long as we can tell the kernel what order to
render the tiles in.
v2: Fix on the simulator.
v3: Add the cap (disabled) to other drivers, add rst docs for the cap.
v4: Rebase on PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS
v5: Split from the core gallium commit, drop some unnecessary code related
to glBlitFramebuffer(), fix a crash with clears before state has been
bound.
|
|
|
|
|
|
|
| |
This has proven to be incredibly useful for debugging CMA allocation
failures and driving memory management improvements. However, we don't
want to burden entry and exit from the BO cache with the labeling ioctl's
overhead on release builds.
|
|
|
|
|
|
|
|
|
|
|
| |
I was overwriting view->texture with the shadow resource when we need to
do shadow copies (retiling or baselevel rebase), but that tripped up some
critical new sanity checking in state_tracker (making sure that stObj->pt
hasn't changed from view->texture through TexImage-related paths).
To avoid that, move the shadow resource to the vc4_sampler_view struct.
Fixes: f0ecd36ef8e1 ("st/mesa: add an entirely separate codepath for setting up buffer views")
|
|
|
|
|
|
|
|
|
|
| |
This gets our vc4_emit.c size back down a bit:
before:
1020 0 0 1020 3fc src/gallium/drivers/vc4/.libs/vc4_emit.o
after:
968 0 0 968 3c8 src/gallium/drivers/vc4/.libs/vc4_emit.o
|
|
|
|
|
| |
We only ever attached one vtbl, so it was a waste of space and
indirections.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
pipe_draw_info::indexed is replaced with index_size. index_size == 0 means
non-indexed.
Instead of pipe_index_buffer::offset, pipe_draw_info::start is used.
For indexed indirect draws, pipe_draw_info::start is added to the indirect
start. This is the only case when "start" affects indirect draws.
pipe_draw_info::index is a union. Use either index::resource or
index::user depending on the value of pipe_draw_info::has_user_indices.
v2: fixes for nine, svga
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
| |
The format isn't flagged as enabled at runtime yet, because we need kernel
validation support.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Track rendering to each FBO independently and flush rendering only when
necessary. This lets us avoid the overhead of storing and loading the
frame when an application momentarily switches to rendering to some other
texture in order to continue rendering the main scene.
Improves glmark -b desktop:effect=shadow:windows=4 by 27%
Improves glmark -b
desktop:blur-radius=5:effect=blur:passes=1:separable=true:windows=4
by 17%
While I haven't tested other apps, this should help X rendering a lot, and
I've heard GLBenchmark needed it too.
|
|
|
|
|
| |
This is a preparation step for having multiple jobs being queued up at the
same time.
|
|
|
|
|
| |
Signed-off-by: Kai Wasserbäch <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
v1 → v2:
- Fixed indentation (noted by Brian Paul)
- Removed second assert from nouveau's switch statements (suggested by
Brian Paul)
Signed-off-by: Kai Wasserbäch <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Fixes failure to initialize the force_first_level flag, causing
failures in piglit levelclamp.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To support general GL_TEXTURE_BASE_LEVEL we have to copy to a temporary
miptree. However, if a single level is being selected, we can use the
existing miptree and force all the sampling to be from that particular
level.
This avoids a ton of software fallbacks in glGenerateMipmaps(), which uses
base levels in the blit implementation in gallium. Improves "glmark2 -b
terrain" from 2 fps to 3 (perhaps some more precision would be useful?),
and cuts its CPU usage during the benchmarking from ~30% to ~10% (total
CPU time from 8.8s to 7.6s).
|
|
|
|
|
|
| |
The compiler uses the per-stage flags already, so it didn't need this.
vc4_uniforms was using it, so just replace it with both of the stage flags
for now.
|
|
|
|
| |
We use a big VC4_DIRTY_FRAGTEX/VC4_DIRTY_VERTEX on the stage, instead.
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
This is apparently a weirdness of gallium -- nr_samples==1 is occasionally
used and means the same thing as nr_samples==0. Fixes a bunch of
ARB_framebuffer_srgb blit cases in piglit.
|
| |
|
| |
|
|
|
|
|
|
|
| |
Since X has undefined contents in new pixmaps, it will allocate new
textures for an FBO and draw to them without an explicit clear. For
VC4, it's much faster to emit a clear than the load of the actual
undefined memory contents, so just do that instead.
|
|
|
|
|
|
|
| |
I'm not sure what the caller does is appropriate (just have a NULL sampler
at this slot), but it fixes the immediate crash.
Cc: "11.0" <[email protected]>
|
|
|
|
|
| |
Improves low-settings openarena performance by 31.9975% +/- 0.659931%
(n=7).
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can't do this all the time, because you want blending to be done in
linear space, and sRGB would lose too much precision being done in 4x8.
The win on instructions is pretty huge when you can, though.
total uniforms in shared programs: 32065 -> 32168 (0.32%)
uniforms in affected programs: 327 -> 430 (31.50%)
total instructions in shared programs: 92644 -> 89830 (-3.04%)
instructions in affected programs: 15580 -> 12766 (-18.06%)
Improves openarena performance at 1920x1080 from 10.7fps to 11.2fps.
|
|
|
|
|
| |
Cuts another 12% of vc4_uniforms.o, in exchange for computing it at
CSO creation time.
|
|
|
|
|
| |
In exchange for a bit of space and computation in CSO setup, we cut
vc4_uniform.c (draw time) code size by 4.8%.
|
|
|
|
|
| |
No code generation changes from this, but it'll be useful to have this
next time I go checking -Wdouble-promotion.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The idea I had when I wrote the original shadow code was that you'd see a
set_index_buffer to the IB, then a bunch of draws out of it. What's
actually happening in openarena is that set_index_buffer occurs at every
draw, so we end up making a new shadow BO every time, and converting more
of the BO than is actually used in the draw.
While I could maybe come up with a better caching scheme, for now just
do the simple thing that doesn't result in a new shadow IB allocation
per draw.
Improves performance of isosurf in drawelements mode by 58.7967% +/-
3.86152% (n=8).
|
|
|
|
|
|
| |
Not sure what happened in my testing that made the previous shadow
code fix glxgears swapbuffering, but this also fixes lots of CopyArea
in X (like dragging xlogo around in metacity).
|
|
|
|
|
| |
We don't know who else has written to it, so we'd better update it every
time. This makes the gears spin in X again.
|
|
|
|
|
|
|
|
|
|
|
| |
So, it turns out my simulator doesn't *quite* match the hardware. And the
errata about raster textures tells you most of what's wrong, but there's
still stuff wrong after that. Instead, if we're asked to sample from
raster, we'll just blit it to a tiled temporary.
Raster textures should only be screen scanout, and word is that it's
faster to copy to tiled using the tiling engine first than to texture from
an entire raster texture, anyway.
|
|
|
|
| |
This is the same basic logic from the original Broadcom driver.
|
|
|
|
|
| |
When we upload shadow indices at draw time, we need the source offset.
Fixes the piglit draw-elements test.
|
|
|
|
|
|
|
| |
If the writemask doesn't compress, then we want to put in the uncompressed
writemask, not the compressed writemask failure value (all-on).
Fixes glean's stencil2 and fbo-clear-formats on stencil.
|
|
|
|
|
| |
Fixes regressions in the next bugfix, because gallium util stuff leaves
the back stencil state as 0 if !back->enabled.
|
|
|
|
| |
Fixes assertion failures in 14 piglit tests (half of which now pass).
|
|
|
|
|
|
| |
GLES2 doesn't have GL_TEXTURE_BASE_LEVEL, so the hardware doesn't. Fixes
piglit levelclamp, tex-miplevel-selection, and texture-storage/2D mipmap
rendering.
|
|
|
|
| |
Fixes about 15 piglit tests about interpolation and clipping.
|
|
|
|
|
|
|
|
| |
The texturing hardware takes the POT level 0 width/height and minifies
those. This is different from what we were doing, for example, for
273-wide's level 5: POT(273>>5) == 8, while POT(273)>>5 == 16.
Fixes piglit-depthstencil-render-miplevels 273.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The non-base NPOT levels are stored as POT-aligned images. We get that
POT alignment by minifying the POT-aligned base level.
This means that level strides are also POT aligned, so we have to tell the
rendering mode config that our resource is larger than the actual
requested area.
Fixes the fbo-generatemipmap-formats NPOT cases. Regresses
depthstencil-render-miplevels 273 * -- the texture presentation now works
(where it was completely broken before), it looks like there's some
overflow of image bounds happening at the lower miplevels.
|
|
|
|
| |
This is the support for both the global and per-vertex modes.
|
|
|
|
|
| |
Setting the bit without setting the offset values is kind of useless.
Fixes piglit polygon-offset (but not polygon-mode-offset).
|
|
|
|
|
|
|
| |
This is just the GL 1.1 flat shading of colors -- we don't need to support
TGSI constant interpolation bits, because we don't do GLSL 1.30.
Fixes 7 piglit tests.
|
|
|
|
|
|
|
| |
While depth test state is passed through the fragment shader as sideband,
data, the stencil test state has to be set by the fragment shader itself.
Many tests are still failing, but this gets most of hiz/ passing.
|