| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
A convenient front end to indices generate/translate code, for emulating
primitives which are not supported natively by the driver.
This handles saving/restoring index buffer state, etc.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
Add 'start' parameter to generator/translator.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
pull in some fixes to draw-initiator/prim-type.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The idea of the original order was that you'd dead code eliminate accesses
to push constants. But I've never seen a case of that (nor has
shader-db), while we frequently see sparse accesses of large constant
arrays that would overflow into pull constants.
Cuts pull constant use on csgo, serious sam, planeshift, and the cave:
total instructions in shared programs: 1695103 -> 1688795 (-0.37%)
instructions in affected programs: 92024 -> 85716 (-6.85%)
GAINED: 339
LOST: 0
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
* Use more consistant data sources
* Fix improper color space assignments
* Remove unnecessary comments and code
* Drop unnecessary round_up function (this was leftover
from moving winsys code out of renderer)
Acked-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
* Instead of assuming the displaytarget is the same
stride / colorspace as the destination, lets
actually check the source bitmap.
* Fixes random stride issues in rendering
Acked-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=70891.
Reported-by: Bruno Jiménez <[email protected]>
|
|
|
|
| |
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
| |
The MRF variant is going to be used extensively by the atomic counter
intrinsics to assemble untyped atomic and surface read messages
easily.
Reviewed-by: Paul Berry <[email protected]>
|
| |
|
|
|
|
| |
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The maximum number of atomic buffer objects is somewhat arbitrary, we
can change it in the future easily if it turns out it's not enough...
v2: Add comments with the relevant mesa dirty bits. Fix usage of
BRW_NEW_UNIFORM_BUFFER in the GS ABO state atom.
v3: Update binding table layout diagrams.
v4: Resolve conflicts with the recent dynamic surface index assignment changes.
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
| |
And add Gen7 implementation.
v2: Fix off by one error in buffer size calculation.
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
| |
Almost a trivial change, it boils down to renaming a few identifiers
so their names still make sense for opaque types other than sampler.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
ARB_shader_atomic_counters.
v2: Represent atomics as GLSL intrinsics.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix the linker to deal with intrinsic functions which are undefined
all the way down to the driver back-end, and introduce intrinsic
definition helpers in the built-in generator.
We still need to figure out what kind of interface we want for drivers
to communicate to the GLSL front-end which of the supported intrinsics
should use a default GLSL implementation and which should use a
hardware-specific override. As there's no default GLSL implementation
for atomic ops, this seems like something we can worry about later on.
Reviewed-by: Ian Romanick <[email protected]>
v2: Define local helper function to generate ir_call nodes in the
builtin generator.
|
|
|
|
|
|
|
|
|
|
|
| |
And use it to forbid comparisons of opaque operands. According to the
GL 4.2 specification:
> Except for array indexing, structure member selection, and
> parentheses, opaque variables are not allowed to be operands in
> expressions.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2: Fix GLSL version in which the type became available. Add
contains_atomic() convenience method. Split off atomic counter
comparison error checking to a separate patch that will handle all
opaque types. Include new ir_variable fields for atomic types.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Paul Berry <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements the common support code required for the
ARB_shader_atomic_counters extension. It defines the necessary data
structures for tracking atomic counter buffer objects (from now on
"ABOs") associated with some specific context or shader program, it
implements support for binding buffers to an ABO binding point and
querying the existing atomic counters and buffers declared by GLSL
shaders.
v2: Fix extension checks. Drop unused MAX_ATOMIC_BUFFERS constant.
Acked-by: Paul Berry <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
Add XML file for the dispatch code generator, update the
dispatch_sanity test and add stub definition for the new entry point.
Reviewed-by: Paul Berry <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
These ralloc contexts belong to a specific object and are being
deallocated manually from the class destructor. Now that we've hooked
up destructors to ralloc there's no reason for them to be children of
any other context, and doing so might to lead to double frees under
some circumstances. The class destructor has all the responsibility
of freeing class memory resources now.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes sure that class destructors are called as they should
be when a C++ object allocated by ralloc is released.
Based on a previous patch by Kenneth Graunke, but it doesn't exhibit
the ~0.8% performance regression in shader compilation times because
we now use the HAS_TRIVIAL_DESTRUCTOR() macro to detect the typical
case where the indirect function call can be avoided because the
object's destructor doesn't need to do anything.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
destructible.
Only implemented on GCC and Clang for now. Other compilers use a
dummy implementation that always returns false, which should be a safe
[but slightly inefficient] assumption in all cases.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
This will let us use strcasecmp() from anywhere inside Mesa without
having to worry about the fact that it doesn't exist in MSVC.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The layer coming from GS needs to be clamped (not sure if that's actually
the correct error behavior but we need something) as the number can be higher
than the amount of layers in the fb. However, this code was using the layer
calculation from the scene, and this was actually calculated in
lp_scene_begin_rasterization() hence too late (so setup was using the value
from the _previous_ scene or just zero if it was the first scene).
Since the value is used in both rasterization and setup, move calculation up
to lp_scene_begin_binning() though it's a bit more inconvenient to calculate
there. (Theoretically could move _all_ code which was in
lp_scene_begin_rasterization() to there, because ever since we got rid of
swizzled render/depth buffers our "map" functions preparing the fb data for
render don't actually change the data in there at all, but it feels like
it would be a hack.)
v2: improve comments
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Before we were only checking the st->vertex_array_out_of_memory flag
after updating array state. But if there's two consecutive glDrawArrays
calls and the first one is skipped because of OOM, the second one should
be skipped too.
Cc: 9.2 <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Orbital Explorer was generating a 4000 instruction geometry shader, which
was taking 275 trips through dead code elimination and register
coalescing, each of which updated live variables to get its work done, and
invalidated those live variables afterwards.
By using bitfields instead of bools (reducing the working set size by a
factor of 8) in live variables analysis, it drops from 88% of the profile
to 57%, and reduces overall runtime from I-got-bored-and-killed-it (Paul
says 3+ minutes) to 10.5 seconds.
Compare to f179f419d1d0a03fad36c2b0a58e8b853bae6118 on the FS side.
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This prevents unnecessary (and wrong) register allocation in the
scheduler for preloaded values in fixed registers.
Fixes interpolation-mixed.shader_test on rv770
(and probably on all other pre-evergreen chips).
Signed-off-by: Vadim Girlin <[email protected]>
Tested-by: Andreas Boll <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
I noticed this in a shader in Unigine Heaven that was spilling. While it
doesn't really reduce register pressure, it shaves a few instructions
anyway (7955 -> 7882).
v2: Fix turning "0 >> x" into "x" instead of "0" (caught by Erik
Faye-Lund).
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
While ir_builder is slightly less efficient, we're only increasing the
work when there's actual optimization being done, and it's way more
readable code.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Matt and I had each screwed up these common required patterns recently, in
ways that wouldn't have been noticed for a long time if not for code
review. Just enforce it in the caller so that we don't rely on code
review catching these bugs.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is nothing in the OpenGL specification which prevents the user from
calling glGenQueries to generate a new query object while another object is
active. Neither is there anything in the Mesa implementation which prevents
this. So remove the INVALID_OPERATION errors in this case.
Similarly, it is explicitly allowed by the OpenGL specification to delete an
active query, so remove the assertion for that case, replacing it with the
necesssary state updates to end the query, (clear the bindpt pointer and call
into the driver's EndQuery hook).
CC: <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The normal drawing path does this, and it's necessary on Ivybridge,
so let's try it on Sandybridge too. It's not explicitly documented
as necessary, but might help with hangs.
Signed-off-by: Kenneth Graunke <[email protected]>
Tested-by: Xinkai Chen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Cc: "9.2" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the documentation:
"[DevIVB] 3DSTATE_DEPTH_BUFFER must always be programmed along with the
other Depth/Stencil state commands(i.e. 3DSTATE_CLEAR_PARAMS,
3DSTATE_STENCIL_BUFFER, or 3DSTATE_HIER_DEPTH_BUFFER)."
We normally do this, but BLORP was failing to do so in the case where it
disables depth.
Not observed to fix anything yet.
Signed-off-by: Kenneth Graunke <[email protected]>
Tested-by: Xinkai Chen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Cc: "9.2" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
For some reason, we put the flush in the caller, rather than just before
emitting the packet. This is more than a cosmetic problem: BLORP calls
gen6_emit_3dstate_multisample() directly, and so it missed the flush.
Signed-off-by: Kenneth Graunke <[email protected]>
Tested-by: Xinkai Chen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Cc: "9.2" <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Non-pipelined commands need this flush.
Signed-off-by: Kenneth Graunke <[email protected]>
Tested-by: Xinkai Chen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Cc: "9.2" <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is another non-pipelined command that needs a flush on Sandybridge.
Signed-off-by: Kenneth Graunke <[email protected]>
Tested-by: Xinkai Chen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Cc: "9.2" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the comments above intel_emit_post_sync_nonzero_flush:
"[DevSNB-C+{W/A}] Before any depth stall flush (including those
produced by non-pipelined state commands), software needs to first
send a PIPE_CONTROL with no bits set except Post-Sync Operation != 0."
This suggests that every non-pipelined (0x79xx) command needs a
post-sync non-zero flush before it.
Signed-off-by: Kenneth Graunke <[email protected]>
Tested-by: Xinkai Chen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Cc: "9.2" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise the gen6 w/a in the kernel won't kick in and the write will
land nowhere.
Inspired by a patch Ken pointed me at which had the same issue (but
isn't yet merged and also for a gen7+ feature). An audit of the entire
driver didn't reveal any other case than the one in in the write_reg
helper used by the gen6 queryobj code.
Acked-by: Kenneth Graunke <[email protected]>
Signed-off-by: Daniel Vetter <[email protected]>
Tested-by: Xinkai Chen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Cc: "9.2" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Setting bilinear_filter flag in case of multisample blits with
GL_LINEAR filter causes incorrect behavior in translate_dst_to_src()
function. This broke Modern Warfare (1, 2 and 3) on SNB, IVB and HSW.
Tested on SNB and IVB, no Piglit regressions. Trace file of the game
(taken with apitrace) works fine with this patch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69078
Cc: [email protected]
Signed-off-by: Anuj Phogat <[email protected]>
Reported-by: Armin K <[email protected]>
Tested-by: Armin K <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rico Schüller <[email protected]>
Signed-off-by: Brian Paul <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main purpose of this patch is to increase readability of
the array code by introducing is_unsized_array() to glsl_types.
Some redundent is_array() checks are also removed, and small number
of other related clean ups.
The introduction of is_unsized_array() should also make the
ARB_arrays_of_arrays code simpler and more readable when it arrives.
V2: Also replace code that checks for unsized arrays directly with the
length variable
Signed-off-by: Timothy Arceri <[email protected]>
v3 (Paul Berry <[email protected]>): clean up formatting.
Separate whitespace cleanups to their own patch.
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Signed-off-by: Timothy Arceri <[email protected]>
v2 (Paul Berry <[email protected]>): Separate from "glsl: Add
check for unsized arrays to glsl types".
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Timothy Arceri <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
| |
Add alot of missing fields as well.
Signed-off-by: Christian König <[email protected]>
|
|
|
|
| |
Signed-off-by: Christian König <[email protected]>
|