| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reorder the uniforms to load first the dvec4-aligned variables in the
push constant buffer and then push the vec4-aligned ones. It takes
into account that the relocated uniforms should be aligned to their
channel size.
This fixes a bug were the dvec3/4 might be loaded one part on a GRF and
the rest in next GRF, so the region parameters to read that could break
the HW rules.
v2:
- Fix broken logic.
- Add a comment to explain what should be needed to optimise the usage
of the push constant buffer slots, as this patch does not pack the
uniforms.
v3:
- Implemented the push constant buffer usage optimization.
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Cc: "17.1" <[email protected]>
Acked-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
offset
It was setting XYWZ swizzle and writemask to all uniforms, no matter if they
were a vector or scalar, so this can lead to problems when loading them
to the push constant buffer.
Moreover, 'shift' calculation was designed to calculate the offset in
DWORDS, but it doesn't take into account DFs, so the calculated swizzle
for the later ones was wrong.
The indirect case is not changed because MOV INDIRECT will write
to all components. Added an assert to verify that these uniforms
are aligned.
v2:
- Fix 'shift' calculation (Curro)
- Set both swizzle and writemask.
- Add assert(shift == 0) for the indirect case.
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Cc: "17.1" <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vec4_gs_visitor execution
We are going to add a packing feature to reduce the usage of the push
constant buffer. One of the consequences is that 'nr_params' would be
modified by vec4_visitor's run call, so we need to restore it if one of
them failed before executing the fallback ones. Same thing happens to the
uniforms values that would be reordered afterwards.
Fixes GL45-CTS.arrays_of_arrays_gl.InteractionFunctionCalls2 when
the dvec4 alignment and packing patch is applied.
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Cc: "17.1" <[email protected]>
Acked-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 6facb0c0 ("android: fix libz dynamic library dependencies")
unconditionally adds libz as a dependency to all shared libraries.
That is unnecessary.
Commit 85a9b1b5 introduced libz as a dependency to libmesa_util.
So only the shared libraries that use libmesa_util need libz.
Fix Android Lollipop build by adding the include path of zlib to
libmesa_util explicitly instead of getting the path implicitly
from zlib since it doesn't export the include path in Lollipop.
Fixes: 6facb0c0 "android: fix libz dynamic library dependencies"
Signed-off-by: Chih-Wei Huang <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
|
|
|
|
|
|
|
|
| |
All combined depth stencil buffers (even those with just stencil)
require a 4x4 alignment on Sandy Bridge. The only depth/stencil buffer
type that requires 4x2 is separate stencil.
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Ivy Bridge PRM provides a nice table that handles most of the
alignment cases in one place. For standard color buffers we have a
little freedom of choice but for most depth, stencil and compressed it's
hard-coded. Chad's original functions split halign and valign apart and
implemented them almost entirely based on restrictions and not the
table. This makes things way more confusing than they need to be. This
commit gets rid of the split and makes us implement the exact table
up-front. If our surface isn't one of the ones in the table then we
have to make real choices.
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The reasoning Chad gave in the comment for choosing a valign of 4 is
entirely bunk. The fact that you have to multiply pitch by 2 is
completely unrelated to the halign/valign parameters used for texture
layout. (Not completely unrelated. W-tiling is just Y-tiling with a
bit of extra swizzling which turns 8x8 W-tiled chunks into 16x4 y-tiled
chunks so it makes everything easier if miplevels are always aligned to
8x8.) The fact that RENDER_SURFACE_STATE::SurfaceVerticalAlignmet
doesn't have a VALIGN_8 option doesn't matter since this is gen7 and you
can't do stencil texturing anyway.
v2 (Jason Ekstrand):
- Delete most of Chad's comment and add a more descriptive commit
message.
Signed-off-by: Topi Pohjolainen <[email protected]>
Cc: "17.0 17.1" <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
| |
Broken by commit a7217e909ce6 ("i965: Pass pointer and end of assembly
to brw_validate_instructions").
Reported-by: Aaron Watry <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
Which will allow us to print validation errors found in shader assembly
in GPU hang error states.
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
| |
Newer Gens' names don't have the brackets. Having common names will make
some later patches simpler.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
intel_asm_annotation.c is part of libintel_compiler.la, which contains
code for disassembling and validating shaders that we want to call in
aubinator_error_decode.
dump_assembly() calls nir_print_instr() to print annotations, and
although dump_assembly() is not called by aubinator_error_decode (nor is
any function in intel_asm_annotation.c) it causes undefined references
to nir_print_instr().
To work around, provide a no-op weak symbol to resolve against.
|
|
|
|
|
|
|
|
| |
This will allow the validator to run on shader programs we find in the
GPU hang error state.
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
This will allow us to more easily run brw_validate_instructions() on
shader programs we find in GPU hang error states.
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
| |
In the unlikely case the parsing of genxml files fails, we were
leaking an xml parser object.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
| |
Use an alias for this field on 3DSTATE_INDEX_BUFFER on gen6+, so we can set
the same value as the defines.
Signed-off-by: Rafael Antognolli <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
| |
It's actually not that much code.
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
| |
We use Instruction State Base Address on Ironlake, so we want KSP to be
an offset not an actual pointer. Gen4/G45 use pointers.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For whatever reason, we had an INTEL_DEBUG=stats option that enabled
various statistics counters on Gen4-5 systems. It's been around
forever, though I can't think of a single time that it's been useful.
On Gen6+, we enable statistics all the time because they're necessary
to support various query object targets. Turning them off would break
those queries.
Gen4-5 don't support those queries, so the statistics counters generally
aren't useful; we disabled them by default. This patch disables them
altogether.
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
|
|
| |
After successful drmGetDevices2() call, drmFreeDevices() needs to be
called.
Fixes: b1fb6e8d "anv: do not open random render node(s)"
Signed-off-by: Grazvydas Ignotas <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]> # radv version
|
|
|
|
|
|
|
|
|
|
| |
drmGetDevices2 takes count and not size. Probably hasn't caused problems
yet in practice and was missed as setups with more than 8 DRM devices
are not very common.
Fixes: b1fb6e8d "anv: do not open random render node(s)"
Signed-off-by: Grazvydas Ignotas <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The only thing still using it is INVOCATION_ID for geometry shaders.
That's easily enough inlined into the nir_intrinsic_load_invocation_id
handling code.
Reviewed-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
We're already doing this in the FS back-end. This just does the same
thing in the vec4 back-end.
Reviewed-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The NIR pass already handles remapping system values to attributes for
us so we delete the system value code as part of the conversion.
We also change nir_lower_vs_inputs to take an explicit inputs_read
bitmask and pass in the inputs_read from prog_data instead from pulling
it out of NIR. This is because the version in prog_data may get
EDGEFLAG added to it on some old platforms.
Reviewed-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
We also add a nice little comment to make it more clear exactly what
happens with the edge flag copy.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
NIR calls these system values but they come in from the VF unit as
vertex data. It's terribly convenient to just be able to treat them as
such in the back-end.
Reviewed-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The vec4 backend will want to count in units of vec4s, not scalar
components. The simplest solution is to move the multiplication by 4
into the scalar backend. This also improves consistency with how we
count varyings.
Reviewed-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Now that we have nice block iterators, there's no good reason for this
to be off on it's own. While we're here, we convert to using the NIR
const index getters/setters instead of whacking const_index values
directly.
Reviewed-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit e1af20f18a86f52a9640faf2d4ff8a71b0a4fa9b changed the shader_info
from being embedded into being just a pointer. The idea was that
sharing the shader_info between NIR and GLSL would be easier if it were
a pointer pointing to the same shader_info struct. This, however, has
caused a few problems:
1) There are many things which generate NIR without GLSL. This means
we have to support both NIR shaders which come from GLSL and ones
that don't and need to have an info elsewhere.
2) The solution to (1) raises all sorts of ownership issues which have
to be resolved with ralloc_parent checks.
3) Ever since 00620782c92100d77c660f9783504c6d80fa1d58, we've been
using nir_gather_info to fill out the final shader_info. Thanks to
cloning and the above ownership issues, the nir_shader::info may not
point back to the gl_shader anymore and so we have to do a copy of
the shader_info from NIR back to GLSL anyway.
All of these issues go away if we just embed the shader_info in the
nir_shader. There's a little downside of having to copy it back after
calling nir_gather_info but, as explained above, we have to do that
anyway.
Acked-by: Timothy Arceri <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
CID: 1399477, 1399478 (Integer handling issues)
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
| |
CID: 1399470: (Control flow issues)
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We should get either 0 or 1 here.
CID: 1373562 (Control flow issues)
Signed-off-by: Lionel Landwerlin <[email protected]>
Acked-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
| |
CID: 1405919 (Error handling issues)
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The application might not give an output structure.
CID: 1405765 (Null pointer dereferences)
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
| |
This fixes the build when not building against valgrind headers.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100945
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
| |
Reviewed-by: Juan A. Suarez Romero <[email protected]>
|
|
|
|
|
|
|
| |
Now that we can allocate states larger than the block size, we no longer
need a block size of 1MB which can be rather wasteful.
Reviewed-by: Juan A. Suarez Romero <[email protected]>
|
|
|
|
| |
Reviewed-by: Juan A. Suarez Romero <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the maximum size of a state that could be allocated from a
state pool was a block. However, this has caused us various issues
particularly with shaders which are potentially very large. We've also
hit issues with render passes with a large number of attachments when we
go to allocate the block of surface state. This effectively removes the
restriction on the maximum size of a single state. (There's still a
limit of 1MB imposed by a fixed-length bucket array.)
For states larger than the block size, we just grab a large block off of
the block pool rather than sub-allocating. When we go to allocate some
chunk of state and the current bucket does not have state, we try to
pull a chunk from some larger bucket and split it up. This should
improve memory usage if a client occasionally allocates a large block of
state.
This commit is inspired by some similar work done by Juan A. Suarez
Romero <[email protected]>.
Reviewed-by: Juan A. Suarez Romero <[email protected]>
|