| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With NIR, it actually hurts things.
total instructions in shared programs: 6529329 -> 6528888 (-0.01%)
instructions in affected programs: 14833 -> 14392 (-2.97%)
helped: 299
HURT: 1
In all affected programs I inspected (including the single hurt one) the
pass CSE'd some multiplies and caused some reassociation (e.g., caused
(A * B) * C to be A * (B * C)) when the original intermediate result was
reused elsewhere.
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
instructions in affected programs: 44204 -> 43762 (-1.00%)
helped: 221
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
We're not using any fs_inst fields, and the next commit will make the
peephole used by the vec4 backend.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We never emit IF instructions with an embedded comparison (lost in the
switch to NIR), so this code is not used. If we want to readd support,
we should have a pass that merges a CMP instruction with an IF or a
WHILE instruction after other optimizations have run.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to the Intel Software Development Manual (Volume 1: Basic
Architecture, 12.10.3 Streaming Load Hint Instruction):
Streaming loads may be weakly ordered and may appear to software to
execute out of order with respect to other memory operations.
Software must explicitly use fences (e.g. MFENCE) if it needs to
preserve order among streaming loads or between streaming loads and
other memory operations.
That is, a memory fence is needed to preserve the order between the GPU
writing the buffer and the streaming loads reading it back.
Reported-by: Joseph Nuzman <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are three types of fast clears:
a. fast depth clears
b. fast singlesample color clears
c. fast multisample color clears
Function intel_miptree_is_fast_clear_capable() checks if a miptree
supports fast clears of type (b).
Rename the function to disambiguate what it does:
old: intel_miptree_is_fast_clear_capable
new: intel_miptree_supports_non_msrt_fast_clear
The functionally accidentally rejected multisampled color surfaces
because it thought they were singlesample array surfaces. Fix that by
explicitly rejecting surfaces with samples > 1.
This fix would have been needed before we enabled layered fast
singlesample color clears (introduced in gen8), which we want to do
eventually. For now, though, this patch changes no behavior; it just
fixes how the driver chooses its behavior.
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
| |
intel_tiling_supports_non_msrt_mcs() and
intel_miptree_is_fast_clear_capable() are not used outside of
intel_mipmap_tree.c.
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need a virtual destructor when at least one of the class' methods is virtual.
Failure to do so might lead to undefined behavior when destructing derived classes.
Fixes the following warning:
brw_vec4_gs_visitor.cpp: In function 'const unsigned int* brw::brw_gs_emit(brw_context*, gl_shader_program*, brw_gs_compile*, void*, unsigned int*)':
brw_vec4_gs_visitor.cpp:703:11: warning: deleting object of polymorphic class type 'brw::vec4_gs_visitor' which has non-virtual destructor might cause undefined behaviour [-Wdelete-non-virtual-dtor]
delete gs;
Curro: This shouldn't be causing any actual bugs at the moment because
gen6_gs_visitor is the only subclass of vec4_visitor destroyed through
a pointer of a base class (vec4_gs_visitor *) and its destructor is
basically the same as its parent's. Anyway it seems sensible to change
this so it doesn't bite us in the future.
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes following Piglit test:
global-scope-binding-qualifier.frag
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
|
| |
In theory we can't break this assertion since the compiler frontend checks
that we don't exceed any of the individual limits, but it does not hurt to
be extra safe.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
These share the space with UBO surfaces but we need to make sure we
allocate enough space for both sets (12 of each)
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Instead of using hard-coded values.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Instead of using hard-coded values.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
The thing you want to do with the output files is diff them, which is
made more difficult by line numbers changing.
Reviewed-by: Alejandro Piñeiro <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
It seems like things are either coming in slighly wrong, or perhaps
uploaded incorrectly, but either way passing them through the translate
module seems to fix everything. Eventually we should figure out what's
going wrong and fix it "for real", but this should do for now.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
| |
This puts us in line with what the DDX/DRI2 st are expecting. It also
happens to work... no idea why, but seems better to have it work than to
ask lots of questions.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
|
| |
Fixes Gallium based DRI drivers failing to load on big endian hosts
because they can't find any matching fbconfigs.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71789
Signed-off-by: Michel Dänzer <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
The uniform will only be of a single type so store the data for
opaque types in a single array.
Cc: Francisco Jerez <[email protected]>
Cc: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
Unfortunately it has to stay in gen6_gs_visitor.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Geometry and tessellation shaders process multiple vertices; their
inputs are arrays indexed by the vertex number. While GLSL makes
this look like a normal array, it can be very different behind the
scenes.
On Intel hardware, all inputs for a particular vertex are stored
together - as if they were grouped into a single struct. This means
that consecutive elements of these top-level arrays are not contiguous.
In fact, they may sometimes be in completely disjoint memory segments.
NIR's existing load_input intrinsics are awkward for this case, as they
distill everything down to a single offset. We'd much rather keep the
vertex ID separate, but build up an offset as normal beyond that.
This patch introduces new nir_intrinsic_load_per_vertex_input
intrinsics to handle this case. They work like ordinary load_input
intrinsics, but have an extra source (src[0]) which represents the
outermost array index.
v2: Rebase on earlier refactors.
v3: Use ssa defs instead of nir_srcs, rebase on earlier refactors.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
get_io_offset() already walks the dereference chain and discovers
whether or not we have an indirect; we can just return that rather than
computing it a second time via deref_has_indirect(). This means moving
the call a bit earlier.
By returning a nir_ssa_def *, we can pass back both an existence flag
(via NULL checking the pointer) and the value in one parameter. It
also simplifies the code somewhat. nir_lower_samplers works in a
similar fashion.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
| |
Now st/mesa won't generate 2 variants for this state.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
| |
This will be a derived state used for changing center->sample and
centroid->sample at runtime.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
This is only a half of the work. The next patch will handle
gl_SampleID/SamplePos, which is the other half of ARB_sample_shading.
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
Required by ARB_sample_shading for drivers that don't want a shader variant
in st/mesa.
Reviewed-by: Ilia Mirkin <[email protected]>
Acked-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
Nothing sets it.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
Nothing overrides it.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
Nothing overrides it.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
Nothing overrides it.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
Nothing overrides it.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
Nothing overrides it.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
Nothing overrides it.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
Nothing overrides it.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
Nothing overrides it.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
Nothing overrides it.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
Nothing sets it.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
Nothing reimplements it.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
Nothing reimplements it.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
Nothing sets it.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
Nothing sets it.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
Nothing sets them.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|