| Commit message | Author | Age | Files | Lines |
| |
This makes dump_instructions more useful.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
| |
Commit 33599433c7 began setting the texture swizzle mode to XYZ1 for
RED, RG, and RGB textures in order to force alpha to 1.0 in case we
actually stored the texture as RGBA.
This had an unforeseen performance implication: the shader precompile
assumes that the texture swizzle mode will be XYZW for non-shadow
sampler types. By setting it to XYZ1, this means every shader used with
a RED, RG, or RGB texture has to be recompiled. This is a very common
case.
Unfortunately, there's no way to improve the precompile, since RGBA
textures still need XYZW, and there's no way to know by looking at
the shader source what texture formats might be used.
However, we only need to smash alpha to 1.0 if the texture's memory
format actually has alpha bits. If not, the sampler already returns 1.0
for us without any special swizzling. XRGB8888, for example, is a very
common case where this occurs.
This partially fixes a performance regression since commit 33599433c7.
More work is required to fully fix it in all cases. This at least helps
Warsow.
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Carl Worth <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
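
The decision described in this message can be sketched as follows. This is a minimal illustration, not the driver's code: the enums and the function `choose_swizzle` are hypothetical names invented for the example, and the real driver works with its own format and swizzle definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's types; names are illustrative. */
enum base_format { BASE_RED, BASE_RG, BASE_RGB, BASE_RGBA };
enum swizzle_mode { SWIZZLE_MODE_XYZW, SWIZZLE_MODE_XYZ1 };

/*
 * Pick the swizzle for a texture.
 *   base:       the base format the application asked for
 *   alpha_bits: alpha bits in the format actually used in memory
 */
static enum swizzle_mode
choose_swizzle(enum base_format base, unsigned alpha_bits)
{
   bool wants_opaque_alpha = (base != BASE_RGBA);

   /* Only override alpha when the memory format really stores it.
    * With no alpha bits (XRGB8888, for example) the sampler already
    * returns 1.0, so the default XYZW swizzle still matches what the
    * precompile assumed. */
   if (wants_opaque_alpha && alpha_bits > 0)
      return SWIZZLE_MODE_XYZ1;

   return SWIZZLE_MODE_XYZW;
}
```

An RGB texture stored without alpha bits keeps XYZW and can reuse the precompiled shader; an RGB texture stored as RGBA still gets XYZ1, which is the case this commit leaves unfixed.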
| |
With the old context creation mechanism, an application asked the GL to
give it a context. Failing to produce a context was a fatal error.
Now, with GLX_ARB_create_context, the application can request a specific
version. If it's higher than the maximum version we support, context
creation will fail. But this is a normal error that applications
recover from.
In particular, the new glxinfo tries to create OpenGL 4.3, 4.2, 4.1,
4.0, 3.3, and 3.2 contexts before finally succeeding at creating a 3.1
context. This led to it printing the following message 6 times:
"brwCreateContext: failed to init intel context"
There's no need to alarm users (and developers) with such a message.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
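
To illustrate why a failed creation is routine here, the sketch below models a client walking down a list of versions until one succeeds. Everything in it is an assumption for illustration: `try_create_context` is a stand-in for the real creation path, and the 3.1 limit is taken from the glxinfo example above.

```c
#include <stdbool.h>
#include <stddef.h>

struct gl_version { int major, minor; };

/* Hypothetical driver limit, chosen to match the example in the text. */
static const struct gl_version max_supported = { 3, 1 };

/* Stand-in for context creation: fails quietly for unsupported versions. */
static bool
try_create_context(struct gl_version v)
{
   if (v.major != max_supported.major)
      return v.major < max_supported.major;
   return v.minor <= max_supported.minor;
}

/* Try each version in order. Returns the index of the first version that
 * works, or -1 if none does, and counts the failures along the way. */
static int
pick_version(const struct gl_version *list, size_t n, unsigned *failures)
{
   *failures = 0;
   for (size_t i = 0; i < n; i++) {
      if (try_create_context(list[i]))
         return (int)i;
      (*failures)++;   /* expected outcome, not worth a warning */
   }
   return -1;
}
```

With the list 4.3, 4.2, 4.1, 4.0, 3.3, 3.2, 3.1 this records six failures before succeeding, which corresponds to the six messages mentioned above.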
| |
On an INTEL_DEBUG=perf piglit run on IVB, reduces the instances of "HW
workaround: blit" (the printouts from the misaligned-depth workaround
blits) from 725 to 675.
It doesn't totally eliminate the workaround blit, because we still have
problems with Y offsets that we can't fix (since texturing can only align
miplevels up to 2 or 4, not 8).
No regressions on piglit/es3conform on IVB.
Reviewed-by: Kenneth Graunke <[email protected]>
| |
This makes it possible to identify gl_TexCoord and gl_PointCoord
for drivers where sprite coordinate replacement is restricted.
The new PIPE_CAP_TGSI_TEXCOORD decides whether these varyings
should be hidden behind the GENERIC semantic or not.
With this patch only nvc0 and nv30 will request that they be used.
v2: introduce a CAP so other drivers don't have to bother with
the new semantic
v3: adapt to introduction of gl_varying_slot enum
| |
Prior to this patch, when using fixed function fragment shading,
bit VARYING_BIT_POS of brw_wm_prog_key::proj_attrib_mask was being set
differently during precompiles and normal usage. During precompiles
it was being set only if the fragment shader reads from window
position (which it never does), so it was always being set to 0.
During normal usage it was being set if the vertex shader writes to
all 4 components of gl_Position (which it usually does), so it was
usually being set to 1. As a result, we were almost always doing an
extra recompile for the fixed function fragment shader.
The recompile was totally unnecessary, though, because
brw_wm_prog_key::proj_attrib_mask is only consulted for
fs_visitor::emit_general_interpolation(), which isn't used for
VARYING_SLOT_POS.
This patch avoids the unnecessary recompile by always setting bit
VARYING_BIT_POS of brw_wm_prog_key::proj_attrib_mask to 1.
Reviewed-by: Kenneth Graunke <[email protected]>
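
A rough sketch of why a single differing key bit forces a recompile, assuming the program cache compares keys byte for byte. The struct, the bit value, and both functions are simplified stand-ins invented for this example; the real brw_wm_prog_key has many more fields.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Illustrative bit position for this sketch only. */
#define SKETCH_VARYING_BIT_POS (UINT64_C(1) << 0)

/* Simplified stand-in for a program key. */
struct sketch_prog_key {
   uint64_t proj_attrib_mask;
};

/* A cached program is reusable only if the keys are identical. */
static bool
key_matches(const struct sketch_prog_key *a, const struct sketch_prog_key *b)
{
   return memcmp(a, b, sizeof(*a)) == 0;
}

/* Build a key. pos_condition models the state-dependent test that the
 * precompile and draw-time paths evaluated differently; force_pos_bit
 * models always setting the bit. */
static struct sketch_prog_key
make_key(bool pos_condition, bool force_pos_bit)
{
   struct sketch_prog_key key;
   memset(&key, 0, sizeof(key));
   if (force_pos_bit || pos_condition)
      key.proj_attrib_mask |= SKETCH_VARYING_BIT_POS;
   return key;
}
```

When the two paths compute the bit from different conditions, the keys differ and the precompiled program is never found. Forcing the bit on in both paths makes the keys equal.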
| |
Previously, right after calling _mesa_glsl_link_shader(), the fixed
function fragment shader code made several calls with the ostensible
purpose of setting up uniforms for the fragment shader it just
created.
These calls are unnecessary, since _mesa_glsl_link_shader() calls
driver->LinkShader(), which takes care of calling these functions (or
their equivalent). Also, they are dangerous to call after
_mesa_glsl_link_shader() has returned, because on back-ends such as
i965 which do precompilation, _mesa_glsl_link_shader() may have
already cached pointers to the existing uniform structures; attempting
to set up the uniforms again invalidates those cached pointers.
It was only by sheer coincidence that this wasn't manifesting itself
as a bug. It turns out that i965's precompile mechanism was always
setting bit 0 of brw_wm_prog_key::proj_attrib_mask to 0 for fixed
function fragment shaders, but during normal usage this bit usually
gets set to 1. As a result, the precompiled shader (with its invalid
uniform pointers) was not being used.
I'm about to introduce some changes that cause bit 0 of
proj_attrib_mask to be set consistently between precompilation and
normal usage, so to avoid regressions I need to get rid of the
dangerous duplicate uniform setup code first.
Reviewed-by: Ian Romanick <[email protected]>
| |
Since apps typically begin rendering with a call to glClear(), it is
likely that when brw_workaround_depthstencil_alignment() moves a
miplevel to a temporary buffer, it can avoid doing a blit, since the
contents of the miplevel are about to be erased.
This patch adds the necessary plumbing to determine when
brw_workaround_depthstencil_alignment() is being called as a
consequence of glClear(), and avoids the unnecessary blit when it is
safe to do so.
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
v2: Eliminate unnecessary call to _mesa_is_depthstencil_format(). Fix
handling of depth buffer in depth/stencil format.
v3: Use correct bitfields for clear_mask. Fix handling of depth
buffer in depth/stencil format when hardware uses separate stencil.
When invalidating, make sure we still reassociate the image to the new
miptree.
Reviewed-by: Eric Anholt <[email protected]>
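
The core test can be sketched as below. The mask bits and the function `need_copy_on_move` are hypothetical, and the sketch deliberately ignores any other conditions the real code checks before deciding that skipping the blit is safe.

```c
#include <stdbool.h>

/* Hypothetical clear-mask bits for this sketch only. */
#define SKETCH_CLEAR_DEPTH   (1u << 0)
#define SKETCH_CLEAR_STENCIL (1u << 1)

/* Decide whether the old contents must be copied when a miplevel is
 * moved to a temporary buffer.
 *   clear_mask:  planes the pending clear will overwrite (0 if the move
 *                was not triggered by a clear)
 *   has_stencil: whether the buffer being moved also holds stencil data */
static bool
need_copy_on_move(unsigned clear_mask, bool has_stencil)
{
   unsigned needed = SKETCH_CLEAR_DEPTH;

   if (has_stencil)
      needed |= SKETCH_CLEAR_STENCIL;

   /* Skip the copy only if every plane stored in the buffer is about
    * to be erased anyway. */
   return (clear_mask & needed) != needed;
}
```

The `has_stencil` case reflects the v2 and v3 notes: a depth-only clear of a combined depth/stencil buffer must still preserve the stencil data, so the copy cannot be skipped.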
| |
Taken from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/14-fix-osmesa-build.diff;h=00581d0e1833c5492d9050e1bf3d5e658cad782e;hb=refs/heads/ubuntu%2B1
v2: Move the added line immediately after -I$(top_srcdir)/src/mapi
NOTE: This is a candidate for the 9.1 and 9.0 branches.
Acked-by: Kenneth Graunke <[email protected]> (v1)
Reviewed-by: Matt Turner <[email protected]>
| |
Taken from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/02_use-ieee-fp-on-s390-and-m68k.patch;h=d3d6c1d7fec3c72ecf320706167deb61c52636c3;hb=refs/heads/ubuntu%2B1
Fixes Debian bug #349437.
Patch written by David Nusinow.
NOTE: This is a candidate for stable branches.
Acked-by: Kenneth Graunke <[email protected]>
Acked-by: Matt Turner <[email protected]>
| |
They shouldn't be necessary any more.
Signed-off-by: Christian König <[email protected]>
| |
Instead of allocating everything as temporaries, use the
new array allocation functions.
v2: fix bug in simplify_cmp, declare arrays on demand
Signed-off-by: Christian König <[email protected]>
| |
Signed-off-by: Christian König <[email protected]>
| |
Reviewed-by: Ander Conselvan de Oliveira <[email protected]>
| |
This debug flag prints out the native GEN assembly for a blitting
shader produced using BLORP. Hopefully this should be useful in
developing additional BLORP features.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
| |
The only format returned by _mesa_get_format_base_format() that
satisfies _mesa_is_depthstencil_format() is GL_DEPTH_STENCIL, so we
can simplify the check.
Reviewed-by: Eric Anholt <[email protected]>
| |
Fast depth clears have the same depth/stencil alignment requirements
as other drawing operations. Therefore, we need to call
brw_workaround_depthstencil_alignment() from both the clear and
drawing paths.
Without this fix, we get image corruption if the following conditions
hold: (a) the first ever drawing operation to a depth miplevel (or the
first drawing operation after having used the texture for sampling) is
a clear, (b) the depth miplevel has a size that is eligible for fast
depth clears, and (c) the depth miplevel has an offset within the
miptree that isn't 8x8 aligned.
Fixes piglit "depthstencil-render-miplevels" tests with size 273.
NOTE: This is a candidate for stable branches
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
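
Condition (c) amounts to a simple alignment test, sketched here with a hypothetical helper name. The point of the fix is that the clear path has to apply the same test as the drawing path.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if a miplevel's x/y offset within the miptree is 8x8 aligned.
 * Helper name and signature are assumptions made for this sketch. */
static bool
offset_is_8x8_aligned(uint32_t x_offset, uint32_t y_offset)
{
   /* The low three bits of each offset must be zero. */
   return (x_offset & 7) == 0 && (y_offset & 7) == 0;
}
```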
| |
This patch makes the following search-and-replace changes:
gl_frag_attrib -> gl_varying_slot
FRAG_ATTRIB_* -> VARYING_SLOT_*
FRAG_BIT_* -> VARYING_BIT_*
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
| |
Now that there is no difference between the enums that represent
vertex outputs and fragment inputs, there's no need for a conversion
function.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
| |
Now that there is no difference between the enums that represent
vertex outputs and fragment inputs, there's no need for a conversion
function. But we still need to be able to detect when a given vertex
output has no corresponding fragment input. So it is replaced by a
new function, _mesa_varying_slot_in_fs(), which tells whether the
given varying slot exists as an FS input or not.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
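
The shape of such a predicate can be sketched with a toy enum. The enum below is an illustrative subset with made-up names, not Mesa's gl_varying_slot list, and the function is only modeled on the description of _mesa_varying_slot_in_fs() above.

```c
#include <stdbool.h>

/* Toy subset of a unified varying-slot enum; not Mesa's actual list. */
enum sketch_varying_slot {
   SKETCH_SLOT_POS,
   SKETCH_SLOT_COL0,
   SKETCH_SLOT_PSIZ,   /* point size: written by the VS, not an FS input */
   SKETCH_SLOT_EDGE,   /* edge flag: likewise has no FS input */
   SKETCH_SLOT_TEX0
};

/* Tells whether the given slot exists as a fragment shader input. */
static bool
sketch_varying_slot_in_fs(enum sketch_varying_slot slot)
{
   switch (slot) {
   case SKETCH_SLOT_PSIZ:
   case SKETCH_SLOT_EDGE:
      return false;
   default:
      return true;
   }
}
```

With one shared enum, no value translation is needed. The only question left is whether a slot has an FS counterpart at all.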
| |
This paves the way for eliminating the gl_frag_attrib enum entirely.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
| |
This patch makes the following search-and-replace changes:
gl_geom_result -> gl_varying_slot
GEOM_RESULT_* -> VARYING_SLOT_*
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
| |
This paves the way for eliminating the gl_geom_result enum entirely.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
| |
This patch makes the following search-and-replace changes:
gl_geom_attrib -> gl_varying_slot
GEOM_ATTRIB_* -> VARYING_SLOT_*
GEOM_BIT_* -> VARYING_BIT_*
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
| |
This paves the way for eliminating the gl_geom_attrib enum entirely.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
| |
This patch makes the following search-and-replace changes:
gl_vert_result -> gl_varying_slot
VERT_RESULT_* -> VARYING_SLOT_*
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
| |
This paves the way for eliminating the gl_vert_result enum entirely.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
| |
Future patches will make use of the enum. It will eventually take the
place of the existing enums gl_vert_result, gl_geom_attrib,
gl_geom_result, and gl_frag_attrib, all of which represent essentially
the same information but using inconsistent values.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
| |
This patch updates the bitfields brw_context::wm.input_size_masks,
tracker::size_masks, and brw_wm_prog_key::proj_attrib_mask, all of
which are indexed by gl_frag_attrib, from 32-bit to 64-bit.
This paves the way for supporting geometry shaders, and for merging
the gl_frag_attrib and gl_vert_result enums. The combination of these
two will require at least 55 bits in the bitfields.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
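
A short sketch of why the wider type matters once slot indices can exceed 31. The helper names are hypothetical; only the figure of 55 bits comes from the message above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit for a given slot index in a 64-bit mask. The shift is done on a
 * 64-bit constant; shifting a plain int by 32 or more is undefined. */
static uint64_t
slot_bit64(unsigned slot)
{
   assert(slot < 64);
   return UINT64_C(1) << slot;
}

/* Whether a mask with one bit per slot fits in 32 bits. */
static bool
fits_in_32_bits(unsigned slot_count)
{
   return slot_count <= 32;
}
```

With 55 slots the highest index is 54. Its bit lies entirely in the upper half of the 64-bit mask and would be lost if the mask were truncated to 32 bits.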
| |
This option is needed for some applications that neglect to request
a depth buffer when choosing a visual/fbconfig.
The Linux app Topogun is an example of this problem.
| |
Move the options into the proper section (Debug, Quality, Performance,
etc).
Update comments and add some whitespace to improve readability.
| |
Untyped Atomic Operation messages are illegal for non-RAW formats. The
IVB hardware proceeds happily (after all, who cares what the format of the
surface is if you're doing untyped ops on it?), but later hardware
apparently doesn't. The simulator for gen7 does complain, though.
v2: Rebase against updates to previous patches. (by anholt)
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
| |
This is basically a copy and paste of gen7_create_constant_surface, but
with the parameters filled in to offer a simpler interface.
It will diverge shortly.
I didn't bother adding it to the vtable for now since shader time is only
exposed on Gen7+.
v2: Replace tabs in the new code (by anholt)
Add back dropped memset() and add a comment about HSW channel selects.
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
| |
Haswell's "Data Cache" data port is a single unit, but split into two
SFIDs to allow for more message types without adding more bits in the
message descriptor.
Untyped Atomic Operations are now message 0010 in the second data cache
data port, rather than 6 in the first.
v2: Use the #defines from the previous commit. (by anholt)
NOTE: This is a candidate for the 9.1 branch.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]> (v1)
| |
We were sparsely using some of these message types, but I'll just fill
them all in now. It will be used for fixing shader_time on HSW.
v2: Add missing MEDIA_BLOCK_READ.
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <[email protected]>
| |
This avoids some snooping overhead between EUs processing separate shaders
(so VS versus FS).
Improves performance of a minecraft trace with shader_time by 28.9% +/-
18.3% (n=7), and performance of my old GLSL demo by 93.7% +/- 0.8% (n=4).
v2: Add a define for the stride with a comment explaining its units and
why.
Reviewed-by: Kenneth Graunke <[email protected]>
| |
The framebuffer blit operation should be skipped if any dimension
(width or height) of the src or dst rect is zero.
V2: Move the dimension check after error checking in _mesa_BlitFramebuffer.
Fixes: fbblit(negative.nullblit.zeroSize) in Intel oglconform
https://bugs.freedesktop.org/show_bug.cgi?id=59495
Note: Candidate for all the stable branches.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
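
The zero-size test can be sketched like this, assuming rectangles are given as two corners in the style of glBlitFramebuffer's arguments. The struct and function names are made up for the example, and per the V2 note the test is meant to run only after the usual error checks.

```c
#include <stdbool.h>

/* A blit rectangle given by two corners. Hypothetical type for this sketch. */
struct sketch_rect { int x0, y0, x1, y1; };

/* A rectangle is empty when its width or height is zero. */
static bool
rect_is_empty(const struct sketch_rect *r)
{
   return r->x0 == r->x1 || r->y0 == r->y1;
}

/* True if the blit can be skipped as a no-op. */
static bool
blit_is_noop(const struct sketch_rect *src, const struct sketch_rect *dst)
{
   return rect_is_empty(src) || rect_is_empty(dst);
}
```

Comparing corners for equality, rather than testing for a positive width, keeps flipped rectangles (x1 < x0) valid, since only exactly zero-sized ones are skipped.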
| |
Fixes the scons build.
| |
Fixes this build error with make check:
CC collision.o
In file included from ../../../../../src/mesa/main/hash_table.h:34:0,
from collision.c:31:
../../../../../src/mesa/main/compiler.h:51:53: fatal error: c99_compat.h: No such file or directory
Signed-off-by: Vinson Lee <[email protected]>
| |
Should get the builds going again.
| |
To allow rendering in 16-bit/channel RGBA buffers.
Reviewed-by: José Fonseca <[email protected]>
| |
Reviewed-by: Eric Anholt <[email protected]>
| |
Reviewed-by: Eric Anholt <[email protected]>
| |
Handled by top level .gitignore.
Reviewed-by: Eric Anholt <[email protected]>
| |
Reviewed-by: Eric Anholt <[email protected]>
| |
Reviewed-by: Eric Anholt <[email protected]>
| |
One fewer place to have to update.
Reviewed-by: Eric Anholt <[email protected]>
| |
We were in four already...
NOTE: Candidate for the stable branches.
Reviewed-by: Brian Paul <[email protected]>
| |
Fixes "mixing enum types" defects reported by Coverity.
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>