| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
This looks just like the VS dump for now.
|
|
|
|
|
|
|
|
|
| |
This saves both register space and upload bandwidth for unused values.
Note that previously we were relying on the visitor not initially
generating references to different sets of uniforms between the 8-wide
and 16-wide code generation, and now we're relying on them dead-code
eliminating the same stuff, too.
|
|
|
|
| |
This saves some 35MB when the program only uses GLSL shaders.
|
|\
| |
| |
| |
| |
| | |
Conflicts:
src/mesa/state_tracker/st_atom_pixeltransfer.c
src/mesa/state_tracker/st_program.c
|
| | |
|
| |
| |
| |
| |
| |
| | |
These looked more like copy-and-paste to me than the others (which
looked more like possibly someone forgot to write some code in a
refactor), so I didn't verify where they came from.
|
| |
| |
| |
| |
| | |
These have been unused since this function's introduction in the FBO
support development around 2009.
|
| |
| |
| |
| | |
These have been unused since 2009.
|
| |
| |
| |
| | |
r100 doesn't support 3D GL_EXT_texture3D.
|
| |
| |
| |
| | |
This has been around since the initial import in 2003 and never used.
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This makes piglit a lot more happy. The errors are logged when
INTEL_DEBUG=fallbacks because the application is about to hit a big
software fallback. We frequently ask people to run applications that
are hitting software fallbacks with INTEL_DEBUG=fallbacks so the we
can help them debug the reason for the software fallback.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This can only happen in GLSL shaders because assembly shaders that use
too many temps are rejected by core Mesa. It is easiest to make this
happen with shaders that contain flow-control that could not be lowered.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
For power-of-two sizes, h0 == mt->height0 since it's already a multiple
of two. However, for NPOT, they're different; h1 should be computed
based on the original size.
Fixes piglit test "cubemap npot" and oglconform test "textureNPOT".
NOTE: This is a candidate for stable release branches.
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
|
| |
| |
| |
| | |
Removes 0.8% of the fragment shader instructions on Unigine Tropics.
|
| |
| |
| |
| |
| | |
This appears in our instruction stream as a result of the
brw_vs_constval.c handling.
|
| | |
|
| |
| |
| |
| |
| |
| |
| | |
This is part of fixing a ~1% performance regression in OpenArena when
changing the fixed function fragment shader to using the new backend.
Right now this just avoids the LINTERP of the projector, not the math
using it.
|
| |
| |
| |
| |
| |
| | |
This reverts commit 3412069e23b7fa5656262f3dd1aa86f66980594d. We're
about to start using it in fragment shaders to handle avoiding
projection for fixed function.
|
| |
| |
| |
| |
| | |
The old style has gone out of favor in the project, but I kept copy
and pasting from existing iterator code.
|
| |
| |
| |
| |
| |
| |
| | |
This was done in the old codegen path, but not the new one. Caught by
piglit fbo tests after the conversion to GLSL ff_fragment_shader.
Reviewed-by: Kenneth Graunke <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When we do a glReadPixels into the temporary buffer, we don't want to
use GL_LUMINANCE, GL_LUMINANCE_ALPHA or GL_INTENSITY since they will
compute L=R+G+B which is not what we want.
This bug has existed all along but was only exposed by the elimination
of the driver hook for glCopyTexImage() in
5874890c26f434f54e9218b83fae4eb8175c24e9.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=39604
Tested-by: Ian Romanick <[email protected]>
|
| |
| |
| |
| |
| |
| |
| | |
The previous commit removed the last use of this field.
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The purpose of the (irb->draw_offset & 4095) != 0 check was to ensure
that we don't have XYy offsets into a tile, since Gen4 hardware doesn't
support that. However, it's insufficient: there are cases where
draw_offset & 4095 is 0 but we still have a Y-offset. This leads to an
assertion failure in brw_update_renderbuffer_surface with tile_y != 0.
Instead, simply call intel_renderbuffer_tile_offsets to compute the
actual X/Y offsets and check if either are non-zero. This makes both
the workaround and the assertion check the same things.
Fixes piglit test fbo-generatemipmap-formats, and should also fix
bugs #34009 and #39487.
NOTE: This is a candidate for stable release branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34009
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=39487
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We were neglecting to load dvdx and dvdy. v is not optional.
Fixes glslparsertests tex-grad-0[12345].frag on Broadwater/Crestline.
(We still need an execution test using sampler1D.)
NOTE: This is a candidate for the 7.11 branch.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
|
| |
| |
| |
| | |
Reviewed-by: Ian Romanick <[email protected]>
|
| |
| |
| |
| |
| | |
Signed-off-by: Tobias Droste <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The index buffer state emit only occurred if there was an IB in place
and we were in either a new batch or a new IB state. But because we
only flagged new IB state if IB state changed from the last IB state
we calculated, we could simply never emit IB state after batchbuffer
wraps if the first draw didn't use the IB and we didn't actually
change the IB.
Fixes piglit glx-multi-context-ib-1.
|
| |
| |
| |
| | |
Improves firefox-talos-gfx around 5%.
|
| |
| |
| |
| |
| |
| |
| |
| | |
Fixes user-clip on 965 with 3D clears enabled. I created a separate
flag because I wanted to avoid the overhead of the matrix operations
in this path.
Reviewed-by: Brian Paul <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
It turns out that internally the texture cache gets flushed in a
couple of cases, particularly around 2D operations mixed with 3D. In
almost all cases one of those happens between rendering to an
FBO-attached texture and rendering from that texture. However, as of
the next patch, glean tfbo (and the new fbo-flushing-2 test) would
manage to get stale texture values because one of those flushes didn't
occur. The intention of this code was always to get the render cache
cleared and ready to be used from the sampler cache (and it does on <=
gen4), so this just catches gen5 up.
This patch was also tested to fix fbo-flushing on gen7.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When emitting a MAC instruction in a vertex shader, brw_vs_emit()
calls accumulator_contains() to determine whether the accumulator
already contains the appropriate addend; if it does, then we can avoid
emitting an unnecessary MOV instruction.
However, accumulator_contains() wasn't checking the val.negate or
val.abs flags. As a result, if the desired value was the negation, or
the absolute value, of what was already in the accumulator, we would
generate an incorrect shader.
Fixes piglit test vs-refract-vec4-vec4-float.
Tested on Gen5 and Gen6.
Reviewed-by: Eric Anholt <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
On Ivybridge, the shadow comparitor goes in the first slot, rather than
at the end. It's not necessary to send u, v, and r.
Fixes tests texturing/texdepth and glean/fbo.
NOTE: This is a candidate for the 7.11 branch.
Signed-off-by: Kenneth Graunke <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Commit 53c89c67f33639afef951e178f93f4e29acc5d53 ("i965: Avoid generating
MOVs for assignments of expressions.") added the line "this->result =
reg_undef" all over the code. Unfortunately, since Eric developed his
patch before I landed Ivybridge support, he missed adding it to
fs_visitor::emit_texture_gen7() after rebasing.
Furthermore, since I developed TXD support before Eric's patch, I
neglected to add it to the gradient handling when I rebased.
Neglecting to set this causes the visitor to use this->result as storage
rather than generating a new temporary. These missing statements
resulted in the same register being used to store several different
values.
Fixes the following piglit tests on Ivybridge:
- glsl-fs-shadow2dproj.shader_test
- glsl-fs-shadow2dproj-bias.shader_test
NOTE: This is a candidate for the 7.11 branch.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
| |
| |
| |
| |
| |
| | |
Fixes i965 piglit vs-varying-array-mat[234]-row-rd.
Reviewed-by: Eric Anholt <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Fixes i965 piglit:
vs-varying-array-mat[234]-col-row-wr
vs-varying-array-mat[234]-index-col-row-wr
vs-varying-array-mat[234]-index-row-wr
vs-varying-array-mat[234]-row-wr
vs-varying-mat[234]-col-row-wr
vs-varying-mat[234]-row-wr
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|\ \ |
|
| | | |
|
| | | |
|
| | | |
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The behavior of flushes in the hardware is a maze of twisty passages,
and strangely the VS constants appear to be loaded during a pipeline
flush instead of at the time of the packet emit according to the
simulator. On moving the STATE_BASE_ADDRESS packet to where it really
needed to live (in order for data loads by other packets to be
correct), we sometimes no longer got a flush between those packets
where we apparently needed it. This replicates the flushes implied by
a STATE_BASE_ADDRESS update, fixing the GPU hangs in OGLC and the
"engine" demo.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36821
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=39257
Tested-by: Keith Packard <[email protected]> (bzflag and etracer fixed)
Acked-by: Kenneth Graunke <[email protected]>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
There's scary stuff going on in PIPE_CONTROL internals, and if the
BSpec says to do this to make PIPE_CONTROL work, I'll go ahead and do
it because we'll probably never be able to debug it after the fact.
v2: Use stall at scoreboard instead of depth stall, as noted by Ken.
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
For this and occlusion queries, we're trying to avoid setting
I915_GEM_DOMAIN_RENDER for the write domain, because the data written
is definitely not going through the render cache, but we do need to
tell the kernel that the object has been written. However, with using
I915_GEM_DOMAIN_GTT, the kernel on retiring the batchbuffer sees that
the w/a BO has a write domain of GTT, and puts it on the flushing
list. If something tries to wait for that BO to finish rendering
(such as the AUB dumper reading the contents of BOs), we get into
wait_request (since obj->active) but with a 0 seqno (since the object
is on the flushing list, not actually on a ringbuffer), and BUG_ONs.
To avoid the kernel bug (which I'm hoping to delete soon anyway), just
use I915_GEM_DOMAIN_INSTRUCTION like occlusion queries do. This
doesn't result in more flushing, because we invalidate INSTRUCTION on
every batchbuffer now that we're state streaming, anyway.
Reviewed-by: Kenneth Graunke <[email protected]>
Tested-by: Kenneth Graunke <[email protected]>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
Improves firefox-talos-gfx performance under GL when 3D clears are
enabled:
[ 0] gl-before firefox-talos-gfx 20.193 20.251 0.27% 3/3
[ 0] gl-after firefox-talos-gfx 18.013 18.040 0.19% 3/3
|
|/ /
| |
| |
| |
| |
| |
| |
| |
| | |
This cuts out a large portion of the overhead of glClear() from
resetting the texenv state and recomputing the fixed function
programs. It also means less use of fixed function internally in our
GLES2 drivers, which is rather bogus.
Reviewed-by: Brian Paul <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Until now, the stencil buffer was allocated as a Y tiled buffer, because
in several locations the PRM states that it is. However, it is actually
W tiled. From the PRM, 2011 Sandy Bridge, Volume 1, Part 2, Section
4.5.2.1 W-Major Format:
W-Major Tile Format is used for separate stencil.
The GTT is incapable of W fencing, so we allocate the stencil buffer with
I915_TILING_NONE and decode the tile's layout in software.
This fix touches the following portions of code:
- In intel_allocate_renderbuffer_storage(), allocate the stencil
buffer with I915_TILING_NONE.
- In intel_verify_dri2_has_hiz(), verify that the stencil buffer is
not tiled.
- In the stencil buffer's span functions, the tile's layout must be
decoded in software.
This commit mutually depends on the xf86-video-intel commit
dri: Do not tile stencil buffer
Author: Chad Versace <[email protected]>
Date: Mon Jul 18 00:38:00 2011 -0700
On Gen6 with separate stencil enabled, fixes the following Piglit tests:
bugs/fdo23670-drawpix_stencil
general/stencil-drawpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-copypixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-drawpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-readpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-copypixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-drawpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-readpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-copypixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-drawpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-readpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-copypixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-drawpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-readpixels
spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-copypixels
spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-readpixels
spec/EXT_packed_depth_stencil/readpixels-24_8
Note: This is a candidate for the 7.11 branch.
Signed-off-by: Chad Versace <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
| |
| |
| |
| |
| | |
The previous define was the full 32-bit header, while the new define
was just the top 16 bits.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Including the full "3DSTATE_VF_STATISTICS" should make it easier to
cross-reference the code and documentation.
Also, move the 965/GM45 suffix to the beginning for consistency with
newer #defines.
Signed-off-by: Kenneth Graunke <[email protected]>
|
| |
| |
| |
| |
| |
| | |
This makes our code use the same names as the documentation.
Signed-off-by: Kenneth Graunke <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| | |
The documentation uses 3DSTATE_DRAWING_RECTANGLE, and we already had it
defined in brw_defines.h; we were simply using an old #define from
intel_reg.h.
Signed-off-by: Kenneth Graunke <[email protected]>
|