| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
This patch makes the following search-and-replace changes:
gl_vert_result -> gl_varying_slot
VERT_RESULT_* -> VARYING_SLOT_*
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
This paves the way for eliminating the gl_vert_result enum entirely.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Future patches will make use of the enum. It will eventually take the
place of the existing enums gl_vert_result, gl_geom_attrib,
gl_geom_result, and gl_frag_attrib, all of which represent essentially
the same information but using inconsistent values.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch updates the bitfields brw_context::wm.input_size_masks,
tracker::size_masks, and brw_wm_prog_key::proj_attrib_mask, all of
which are indexed by gl_frag_attrib, from 32-bit to 64-bit.
This paves the way for supporting geometry shaders, and for merging
the gl_frag_attrib and gl_vert_result enums. The combination of these
two will require at least 55 bits in the bitfields.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
This option is needed for some applications that neglect to request
a depth buffer when choosing a visual/fbconfig.
The Linux app Topogun is an example of this problem.
|
|
|
|
|
|
|
| |
Move the options into the proper section (Debug, Quality, Performance,
etc).
Update comments and add some whitespace to improve readability.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Untyped Atomic Operation messages are illegal for non-RAW formats. The
IVB hardware proceeds happily (after all, who cares what the format of the
surface is if you're doing untyped ops on it?), but later hardware
apparently doesn't. The simulator for gen7 does complain, though.
v2: Rebase against updates to previous patches. (by anholt)
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is basically a copy and paste of gen7_create_constant_surface, but
with the parameters filled in to offer a simpler interface.
It will diverge shortly.
I didn't bother adding it to the vtable for now since shader time is only
exposed on Gen7+.
v2: Replace tabs in the new code (by anholt)
Add back dropped memset() and add a comment about HSW channel selects.
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Haswell's "Data Cache" data port is a single unit, but split into two
SFIDs to allow for more message types without adding more bits in the
message descriptor.
Untyped Atomic Operations are now message 0010 in the second data cache
data port, rather than 6 in the first.
v2: Use the #defines from the previous commit. (by anholt)
NOTE: This is a candidate for the 9.1 branch.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]> (v1)
|
|
|
|
|
|
|
|
|
|
| |
We were sparsely using some of these message types, but I'll just fill
them all in now. It will be used for fixing shader_time on HSW.
v2: Add missing MEDIA_BLOCK_READ.
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This avoids some snooping overhead between EUs processing separate shaders
(so VS versus FS).
Improves performance of a minecraft trace with shader_time by 28.9% +/-
18.3% (n=7), and performance of my old GLSL demo by 93.7% +/- 0.8% (n=4).
v2: Add a define for the stride with a comment explaining its units and
why.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Framebuffer blitting operation should be skipped if any of the
dimensions (width/height) of src/dst rect is zero.
V2: Move the dimension check after error checking in _mesa_BlitFramebuffer.
Fixes: fbblit(negative.nullblit.zeroSize) in Intel oglconform
https://bugs.freedesktop.org/show_bug.cgi?id=59495
Note: Candidate for all the stable branches.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
| |
Fixes the scons build.
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes this build error with make check.
CC collision.o
In file included from ../../../../../src/mesa/main/hash_table.h:34:0,
from collision.c:31:
../../../../../src/mesa/main/compiler.h:51:53: fatal error: c99_compat.h: No such file or directory
Signed-off-by: Vinson Lee <[email protected]>
|
|
|
|
| |
Should get the builds going again.
|
|
|
|
|
|
| |
To allow rendering in 16-bit/channel RGBA buffers.
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
Handled by top level .gitignore.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
One fewer place to have to update.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
We were in four already...
NOTE: Candidate for the stable branches.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes mixing enum types defects reported by Coverity.
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
s/brw_state_upload/brw_upload_state/
Found because the link was broken.
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After the previous fix that almost removes an allocation of 4*n^2
bytes, we can use a bitset to reduce another allocation from n^2 bytes
to n^2/8 bytes.
Between the previous commit and this one, the peak heap size for an
oglconform ARB_fragment_program max instructions test on i965 goes from
4GB to 255MB.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55825
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
We were allocating an adjacency_list entry for every possible
interference that could get created, but that usually doesn't happen.
We can save a lot of memory by resizing the array on demand.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We're already walking the list, and we can easily know when something
has no reason to be in the list any longer, so take a brief extra step
to reduce our worst-case runtime (an oglconform test that emits the
maximum instructions in a fragment program). I don't actually know what
the worst-case runtime was, because it was too long and I got bored.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can execute way fewer instructions by doing our boolean manipulation
on an "int" of bits at a time, while also reducing our working set size.
Reduces compile time of L4D2's slowest shader from 4s to 1.1s
(-72.4% +/- 0.2%, n=10)
v2: Remove redundant masking (noted by Ken)
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were handling the the dependency workaround for the first written reg
of a send preceding the one we're fixing up, but didn't consider the other
regs. Thus if you had two sampler calls that got allocated to the same
set of regs, one might, rarely, ovewrite the other. This was occurring in
XBMC's GLSL shaders.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44567
NOTE: This is a candidate for the stable branches.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
When forcing the compiler to always generate pull constants instead of
push constants (in order to have an easy to use testcase), improves
performance of my old GLSL demo 23.3553% +/- 1.42968% (n=7).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60866
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The lowering process creates a new vgrf on gen7 that should be represented
in live interval analysis. As-is, it was getting a conflicting allocation
with gl_FragDepth in the dolphin emulator, producing broken rendering.
NOTE: This is a candidate for the 9.1 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61317
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
I was going to fix the code above like the previous commit, but we already
had that covered (otherwise all our uniform access would have been broken,
unlike just pull constants).
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were allowing a compressed instruction to write a register that
contained the last use of a uniform pull constant (either UBO load or push
constant spillover), so it would get half its values smashed.
Since we need to see the actual instruction to decide this, move the
pre-gen6 pixel_x/y logic here, which should improve the performance of
register allocation since virtual_grf_interferes() is called more than
once per instruction.
NOTE: This is a candidate for the stable branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=61317
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
I was looking at the list to see what might be interesting to document for
application developers, and it turns out some are completely dead.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
NOTE: This is a candidate for the stable branches.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
Force C++ linking of i965_dri.so by adding a dummy C++ source file.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
Fixes piglit's oes_compressed_etc2_texture-miptree tests on Desktop GL.
Reported-by: Marek Olšák <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
| |
https://bugs.freedesktop.org/show_bug.cgi?id=61947
Note: this is a candidate for the stable branches
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a crash when a display list is created in one context
but executed from a second one. The vbo_save_context::vertex_store
memeber will be NULL if we never created a display list with the
context. Just check for that before dereferencing the pointer.
Fixes http://bugzilla.redhat.com/show_bug.cgi?id=918661
Note: This is a candidate for the stable branches.
|
|
|
|
|
|
|
|
|
|
|
| |
If the sampler object has been deleted on another context, an
alternative context may reference the old sampler. So ensure the sampler
object still exists.
Note: this is a candidate for the stable branch.
Signed-off-by: Alan Hourihane <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This change specifically unbinds a sampler object from the texture unit
if it's bound to a unit. The spec calls for default object when deleting
sampler objects which are currently bound.
Note: this is a candidate for the stable branches
Signed-off-by: Alan Hourihane <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
This was only necessary because our bounds checking was off by one, and
thus we read an extra pair of values.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
If we've written N pairs of values to the buffer, then last_index = N,
but the values are 0 .. N-1. Thus, we need to use <, not <=.
This worked anyway because we fill the buffer with zeroes, so we just
added an extra (0 - 0) to our results.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Mesa core is the place for encoding what format/type matches a mesa
format, so rely on that.
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
| |
We were allowing things like copying RG1616 to a user's ARGB8888
format, while we were denying anything that wasn't ARGB8888 or
RGB565.
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is similar code to intel_miptree_copy_slice, but the knobs
are all set differently.
v2: fix whitespace
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
I'm trying to move us away from the region structure, and all the
callers are currently dereferencing a miptree to get the region.
In this change, the map_refcount is dropped. However, the bo->virtual is
itself map refcounted, so that's already dealt with.
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
| |
The point of tracking the value was removed in February 2012
(65b096aeddd9b45ca038f44cc9adfff86c8c48b2), and this should have
been removed at the same time.
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
I don't see any reason for it -- it was introduced with the DRI2
invalidate work by krh in 2010 with no explanation. I suspect it was
something about wanting the same drm_intel_bo struct underneath multiple
openings of the BO within one process, but that's covered by libdrm at
this point. As far as the struct region goes, it is not threadsafe, so
multiple contexts sharing a region could have mixed up the map_count and
assertion failed or worse.
Reviewed-by: Chad Versace <[email protected]>
|