| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Alessandro Pignotti noted when I added this code in commit
0e723b135bfd59868c92c3ae243f1adaedaec3a5 that it's in the else block for
"if (busy)", so this debug print couldn't happen.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
v2: Support Ironlake as well.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Normally when submitting the first batch buffer after a flush, we
check whether the GPU has completed processing of the first batch
buffer of the previous frame. If it hasn't, we wait for it to finish
before submitting any more batches. This prevents GPU-heavy and
CPU-light applications from racing too far ahead of the current frame,
but at the expense of possibly lower frame rates. Sometimes when
benchmarking we want to disable this mechanism.
This patch adds the driconf option "disable_throttling" to disable the
throttling mechanism.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On an INTEL_DEBUG=perf piglit run on IVB, reduces the instances of "HW
workaround: blit" (the printouts from the misaligned-depth workaround
blits) from 725 to 675.
It doesn't totally eliminate the workaround blit, because we still have
problems with Y offsets that we can't fix (since texturing can only align
miplevels up to 2 or 4, not 8).
No regressions on piglit/es3conform on IVB.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since apps typically begin rendering with a call to glClear(), it is
likely that when brw_workaround_depthstencil_alignment() moves a
miplevel to a temporary buffer, it can avoid doing a blit, since the
contents of the miplevel are about to be erased.
This patch adds the necessary plumbing to determine when
brw_workaround_depthstencil_alignment() is being called as a
consequence of glClear(), and avoids the unnecessary blit when it is
safe to do so.
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
v2: Eliminate unnecessary call to _mesa_is_depthstencil_format(). Fix
handling of depth buffer in depth/stencil format.
v3: Use correct bitfields for clear_mask. Fix handling of depth
buffer in depth/stencil format when hardware uses separate stencil.
When invalidating, make sure we still reassociate the image to the new
miptree.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Ander Conselvan de Oliveira <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This debug flag prints out the native GEN assembly for a blitting
shader produced using BLORP. Hopefully this should be useful in
developing additional BLORP features.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
| |
The only format returned by _mesa_get_format_base_format() that
satisfies _mesa_is_depthstencil_format() is GL_DEPTH_STENCIL, so we
can simplify the check.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
I was looking at the list to see what might be interesting to document for
application developers, and it turns out some are completely dead.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Mesa core is the place for encoding what format/type matches a mesa
format, so rely on that.
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
| |
We were allowing things like copying RG1616 to a user's ARGB8888
format, while we were denying anything that wasn't ARGB8888 or
RGB565.
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is similar code to intel_miptree_copy_slice, but the knobs
are all set differently.
v2: fix whitespace
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
I'm trying to move us away from the region structure, and all the
callers are currently dereferencing a miptree to get the region.
In this change, the map_refcount is dropped. However, the bo->virtual is
itself map refcounted, so that's already dealt with.
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
| |
The point of tracking the value was removed in February 2012
(65b096aeddd9b45ca038f44cc9adfff86c8c48b2), and this should have
been removed at the same time.
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
I don't see any reason for it -- it was introduced with the DRI2
invalidate work by krh in 2010 with no explanation. I suspect it was
something about wanting the same drm_intel_bo struct underneath multiple
openings of the BO within one process, but that's covered by libdrm at
this point. As far as the struct region goes, it is not threadsafe, so
multiple contexts sharing a region could have mixed up the map_count and
assertion failed or worse.
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
| |
I was testing the ARB_debug_output code and wrote an obvious sample that
should have hit this, and got confused that my ARB_debug_output was
broken.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
| |
I tried to ensure that performance in the non-debug case doesn't change
(we still just check one condition up front), and I think the impact is
small enough in the debug context case to warrant including all of it.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
| |
They're about to change to handle GL_ARB_debug_output, so just make one
function.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This doesn't provide detailed error type information, but it's important
to get these relatively severe but rare error messages out to the
developer through whatever mechanism they are using.
v2: Rebase on new WARN_ONCE additions.
Reviewed-by: Jordan Justen <[email protected]> (v1)
|
|
|
|
|
|
|
|
|
| |
The second digit was off by one, which meant we accidentally treated
GTn as GT(n-1). This also meant no support for GT1 at all.
NOTE: This is a candidate for stable branches.
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
V2: Works on Ivy Bridge now too, so this can be 6+.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Gen7 has an erratum affecting the ld_mcs message, making it unsafe to
use when the surface doesn't have an associated MCS.
From the Ivy Bridge PRM, Vol4 Part1 p77 ("MCS Enable"):
"If this field is disabled and the sampling engine <ld_mcs>
message is issued on this surface, the MCS surface may be
accessed. Software must ensure that the surface is defined
to avoid GTT errors."
To allow the shader to treat all surfaces uniformly, force UMS if the
surface is to be used as a multisample texture, even if CMS would have
been possible.
V3: - Quoted erratum text
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
V2: - Fix for state moving from texobj to image
- Rebased onto Paul's logical/physical cleanup
- Fixed missing quantization of sample count
- Fold in IMS renderbuffer wrapper fixes from later in the series
- Use correct physical slice offset for UMS/CMS surfaces on Gen7
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
| |
Reviewed-and-tested-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The GLX extension lets you expose visuals that explicitly guarantee you
that the GL_FRAMEBUFFER_SRGB_CAPABLE flag will be set, but we can set
the flag even while the visual doesn't provide the guarantee. This
appears to be consistent with other implementations, as we've seen
several apps now that don't require an srgb visual and assume sRGB will
work without checking the GL_FRAMEBUFFER_SRGB_CAPABLE flag.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55783
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60633
Reviewed-and-tested-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we have W-tiled S8, we can't just region_map and poke at bits --
there has to be some swizzling. Rely on intel_miptree_map to get that job
done. This should also get the highest performance path we know of for the
mapping (interesting if I get around to finishing movntdqa some day).
v2: Fix stale name of the bit in a comment.
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
| |
I want to reuse intel_miptree_map() to replace some region mapping that's
broken for separate stencil, but doing so would result in new demands on
ETC transcode that we actually don't want to happen.
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Without this set, dri_util.c:dri2CreateContextAttribs
will reject requests to create a context with
__DRI_API_OPENGL_CORE.
This prevents a 3.2 core profile context from being created
even when MESA_GL_OVERRIDE_VERSION=3.2 is used.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
If the override is version is >= 3.1, then update the
max_gl_core_version. Otherwise, update max_gl_compat_version.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The field was equivalent to (etc_format != MESA_FORMAT_NONE), and
therefore duplicate information.
This patch removes field and replaces all references to it with
`etc_format != MESA_FORMAT_NONE`.
No Piglit ETC test regresses on Intel Sandybridge.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Regardless of what we put in the screen structure, all of the extensions
that compute_version_es2 checks are present and 3.0 will be exposed
anyway.
NOTE: This is a candidate for the 9.1 branch.
Signed-off-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The default alignment is 4, so this fast path was rarely hit. Rather
than introduce logic to handle alignment, just use the Mesa core
function.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=46632
Cc: [email protected]
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ARB_texture_rgb10_a2ui pre-GEN4
Older hardware cannot do ARB_texture_rgb10_a2ui, and the translation
code for OES_compressed_ETC1_RGB8_texture was never implemented in the
i915 driver.
NOTE: This is a candidate for all stable branches.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes unused pointer value defect reported by Coverity.
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previous to this patch, there were 13 identical definitions of this
macro in Mesa source. That's ridiculous. This patch consolidates 6
of them to a single definition in src/mesa/main/macros.h.
Unfortunately, I wasn't able to eliminate the remaining definitions,
since they occur in places that don't include src/mesa/main/macros.h:
- include/pci_ids/pci_id_driver_map.h
- src/egl/drivers/dri2/egl_dri2.h
- src/egl/main/egldefines.h
- src/gbm/main/backend.c
- src/gbm/main/gbm.c
- src/glx/glxclient.h
- src/mapi/mapi/stub.c
I'm open to suggestions as to how to deal with the remaining redundancy.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the i965 driver enabled EXT_framebuffer_multisample even
on pre-gen6 chipsets. However, since we don't support multisampling
on these chips, we set GL_MAX_SAMPLES=1 (the minimum allowed by
EXT_framebuffer_multisample), and if the client ever requested a
multisample buffer, we quietly supplied them with a single-sampled
buffer instead.
After some discussion on the mailing list (see thread
"ext_framebuffer_multisample: check for num_samples<=1"), it's clear
that this was the wrong approach. The correct approach is to only
expose EXT_framebuffer_multisample when we truly support
multisampling; that frees us to set a sensible value of
GL_MAX_SAMPLES=0 on other chipsets, so that we never have to deal with
a client requesting a multisample buffer when multisampling isn't
supported.
This change causes the following piglit tests to be skipped on
chipsets prior to Gen6:
- "ARB_framebuffer_sRGB/blit {renderbuffer,texture}
{linear,linear_to_srgb,srgb,srgb_to_linear}
{downsample,msaa,upsample} {disabled,enabled}"
- EXT_framebuffer_multisample/blit-mismatched-formats
- EXT_framebuffer_multisample/blit-mismatched-sizes
- EXT_framebuffer_multisample/dlist
- EXT_framebuffer_multisample/interpolation 0 *
- EXT_framebuffer_multisample/minmax
- EXT_framebuffer_multisample/negative-copypixels
- EXT_framebuffer_multisample/negative-copyteximage
- EXT_framebuffer_multisample/negative-max-samples
- EXT_framebuffer_multisample/negative-mismatched-samples
- EXT_framebuffer_multisample/negative-readpixels
- EXT_framebuffer_multisample/renderbuffer-samples
- EXT_framebuffer_multisample/renderbufferstorage-samples
- EXT_framebuffer_multisample/samples
This is expected, since the above tests exercise MSAA functionality,
and shouldn't be run on systems prior to Gen6.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The BLT engine has many limitations. Currently, it can only blit
X-tiled buffers (since we don't have a kernel API to whack the BLT
tiling mode register), which means all depth/stencil operations get
punted to meta code, which can be very CPU-intensive.
Even if we used the BLT engine, it can't blit between buffers with
different tiling modes, such as an X-tiled non-MSAA ARGB8888 texture
and a Y-tiled CMS ARGB8888 renderbuffer. This is a fundamental
limitation, and the only way around that is to use BLORP.
Previously, BLORP only handled BlitFramebuffer. This patch adds an
additional frontend for doing CopyTexSubImage. It also makes it the
default. This is partly to increase testing and avoid hiding bugs,
and partly because the BLORP path can already handle more cases. With
trivial extensions, it should be able to handle everything the BLT can.
This helps PlaneShift massively, which tries to CopyTexSubImage2D
between depth buffers whenever a player casts a spell. Since these
are Y-tiled, we hit meta and software ReadPixels paths, eating 99% CPU
while delivering ~1 FPS. This is particularly bad in an MMO setting
because people cast spells all the time.
It also helps Xonotic in 4X MSAA mode. At default power management
settings, I measured a 6.35138% +/- 0.672548% performance boost (n=5).
(This data is from v1 of the patch.)
No Piglit regressions on Ivybridge (v3) or Sandybridge (v2).
v2: Create a fake intel_renderbuffer to wrap the destination texture
image and then reuse do_blorp_blit rather than reimplementing most
of it. Remove unnecessary clipping code and conditional rendering
check.
v3: Reuse formats_match() to centralize checks; delete temporary
renderbuffers. Reorganize the code.
v4: Actually copy stencil when dealing with separate stencil buffers but
packed depth/stencil formats. Tested by a new Piglit test.
NOTE: This is a candidate for the 9.1 branch.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]> [v4]
Reviewed-by: Ian Romanick <[email protected]> [v3]
Reviewed-and-tested-by: Carl Worth <[email protected]> [v2]
Tested-by: Martin Steigerwald <[email protected]> [v3]
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Strangely, the DRIimage interface we have passes the pitch in pixels
instead of bytes, which anholt missed in the change to using bytes for
region pitch.
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=60212
Tested-by: Scott Moreau <[email protected]>
Tested-by: Tiago Vignatti <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Save miptree level info to DRIImage:
- Appropriately-aligned base offset pointing to the image
- Additional x/y adjustment offsets from above.
v8: -Bump intelImageExtension version
v9: -Don't use internal _eglError but implement error reporting in new DRI inteface
instead. This fixes Android build problems based on feedback from
Adrian M Negreanu and Chad Versace.
-Move the non-tile-aligned check and error-reporting to intel_set_texture_image_region
v10: -Don't #include "egl/main/eglcurrent.h". [chadv]
Reviewed-by: Eric Anholt <[email protected]> (v6)
Acked-by: Chad Versace <[email protected]> (v10)
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We need to take account the offset from original bo when using glTexSubImage()
and other functions that manipulate the subregion of an exported texture.
Offsets are appended to mapped region address and when blitting from a source
region.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When binding a region to a texture image, re-create the miptree base-level
considering the offset and dimension information exported by DRIImage.
v8: - Move the alignment surface address checks from the image-from-texture
code to the texture-from-image side. This allows the error reporting to conform to
OES_EGL_Image and to prevent mixing up EGL and GL errors. Reported by Chad Versace.
- Addressed an existing issue in renderbuffer case where there is a
a possibility of creating EGL images out of depthstencil textures which isn't
really possible. This was spotted by Eric earlier.
Reviewed-by: Eric Anholt <[email protected]> (v6)
Reviewed-by: Chad Versace <[email protected]> (v8)
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Add helper to calculate fine-grained x and y adjustment pixels
to an image within a miptree level for tiled regions.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
|
|
| |
v8: - Append has_depthstencil field in DRIImage structure.
Reviewed-by: Eric Anholt <[email protected]> (v6)
Reviewed-by: Chad Versace <[email protected]> (v8)
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The packet length may change at some point in the future. Specifying it
explicitly (rather than hardcoding it in the command #define) allows us
to change it much more easily in the future.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before, we were keeping a CPU-only buffer to accumulate the batchbuffer in,
which was an improvement over mapping the batch through the GTT directly
(since any readback or other failure to stream through write combining
correctly would hurt). However, on LLC-sharing architectures we can do better
by mapping the batch directly, which reduces the cache footprint of the
application since we no longer have this extra copy of a batchbuffer around.
Improves performance of GLBenchmark 2.1 offscreen on IVB by 3.5% +/- 0.4%
(n=21). Improves Lightsmark performance by 1.1 +/- 0.1% (n=76). Improves
cairo-gl performance by 1.9% +/- 1.4% (n=57).
No statistically significant difference in GLB2.1 on SNB (n=37). Improves
cairo-gl performance by 2.1% +/- 0.1% (n=278).
|
|
|
|
| |
Comment change only.
|
|
|
|
|
| |
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|