| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
It is always the draw ring. Except for a5xx queries like time-elapsed,
where we will eventually want to emit cmds into both binning and draw
rings.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some queries on a4xx and all queries on a5xx can do result accumulation
on CP so we don't need to track per-tile samples. We do still need to
handle pausing/resuming while switching batches (in case the query is
active over multiple draws which are executed out of order).
So introduce new accumulated-query helpers for these sorts of queries,
since it doesn't really fit in cleanly with the original query infra-
structure.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Move a bit more of the logic shared by all query types (active tracking,
etc) into common code. This avoids introducing a 3rd copy of that logic
for a5xx.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
For a5xx (and actually some queries on a4xx) we can accumulate results
in the cmdstream, so we don't need this elaborate mechanism of tracking
per-tile query results. So make it into vfuncs so generation specific
backend can use it when it makes sense.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
opt_register_coalesce() was optimizing sequences such as:
mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D
mach(8) vgrf5.xy:D, attr18.xyyy:D, attr19.xyyy:D
mov(8) m4.zw:F, vgrf5.xxxy:F
into:
mul(8) acc0:D, attr18.xyyy:D, attr19.xyyy:D
mach(8) m4.zw:D, attr18.xxxy:D, attr19.xxxy:D
This doesn't work - if we're going to reswizzle MACH, we'd need to
reswizzle the MUL as well. Here, the MUL fills the accumulator's .zw
components with attr18.yy * attr19.yy. But the MACH instruction expects
.z to contain attr18.x * attr19.x. Bogus results ensue.
No change in shader-db on Haswell. Prevents regressions in Timothy's
patches to use enhanced layouts for varying packing (which rearrange
code just enough to trigger this pre-existing bug, but were fine
themselves).
Acked-by: Timothy Arceri <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we were only making sure types were the same within a
single stage. This looks to have regressed with 953a0af8e3f73.
Fixes: 953a0af8e3f73 ("mesa: validate sampler uniforms during gluniform calls")
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
https://bugs.freedesktop.org/show_bug.cgi?id=97524
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From Chapter 5 'Shared Objects and Multiple Contexts' of
the OpenGL 4.5 spec:
"Objects which contain references to other objects include
framebuffer, program pipeline, query, transform feedback,
and vertex array objects. Such objects are called container
objects and are not shared"
For we leave locking in place for framebuffer objects because
the EXT fbo extension allowed sharing.
We could maybe just replace the hash with an ordinary hash table
but for now this should remove most of the unnecessary locking.
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
This pattern was only useful when we used mutex locks, which the previous
commit removed.
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From Chapter 5 'Shared Objects and Multiple Contexts' of
the OpenGL 4.5 spec:
"Objects which contain references to other objects include
framebuffer, program pipeline, query, transform feedback,
and vertex array objects. Such objects are called container
objects and are not shared"
For we leave locking in place for framebuffer objects because
the EXT fbo extension allowed sharing.
V2: (Timothy Arceri)
- rebased and dropped changes to framebuffer objects
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
We should never get here if this is 0 unless there is a
bug. Replace the check with an assert.
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Since commit ce562f9e3fa, two new files are generated.
We don't want to track them.
Signed-off-by: Elie Tournier <[email protected]>
Reviewed-by: Plamena Manolova <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As mentioned in the manual - comparing pthread_t handles via the C
comparison operator is incorrect and pthread_equal() should be used
instead.
Cc: Timothy Arceri <[email protected]>
Fixes: d8d81fbc316 ("mesa: Add infrastructure for a worker thread to process GL commands.")
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Plamena Manolova <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
As pointed out by compiler
./llvm/codegen.hpp:52:22: error: ‘<::’ cannot begin a template-argument list [-fpermissive]
./llvm/codegen.hpp:52:22: note: ‘<:’ is an alternate spelling for ‘[’. Insert whitespace between ‘<’ and ‘::’
Cc: Francisco Jerez <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Vedran Miletić <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This function is actually a wrapper for component_slots()
and it always returns 1 (or N) for samplers. Since
component_slots() now return 1 for samplers, it can go.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It looks inconsistent to return 1 for image types and 0 for
sampler types. Especially because component_slots() is mostly
used by values_for_type() which always returns 1 for samplers.
For bindless, this value will be bumped to 2 because the
ARB_bindless_texture states that bindless samplers/images
should consume two components.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This reverts commit 4d4558411db166d2d66f8cec9cb581149dbe1597.
This was a wrong call, while it fixed issue with 3DMark it
actually introduced regression elsewhere.
Signed-off-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This enables support on GM200+ for:
- GL_AMD_vertex_shader_layer
- GL_AMD_vertex_shader_layer_viewport_index
- GL_ARB_shader_viewport_layer_array
Signed-off-by: Ilia Mirkin <[email protected]>
[lyude: add relnotes/TES cap]
Signed-off-by: Lyude <[email protected]>
[imirkin: move relnotes to right place, add features.txt]
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
EMIT only applies to geometry shaders. For everything else, we want to
export the viewport normally.
Signed-off-by: Lyude <[email protected]>
Reviewed-by: Boyan Ding <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
For consistency, doesn't really impact performance.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
| |
This breaks the guts of MI_MATH (the instruction part) out into its own
structure with proper named values.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
| |
Acked-by: Kenneth Graunke <[email protected]>
Reviewed by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
|
|
| |
Looking at some Talos shaders vs radeonsi, I noticed they use
tex_lz in a few places, so we should be able to as well.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
for lower overhead.
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
This is nicer on caches, and the next commit will need to access
the structure from a different place.
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
Cc: <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
min/max_index are typically hints for the u_vbuf module, not the driver.
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
Most drivers don't need it and shouldn't need it because it can't be used
in some cases (indirect draws, primitive restart, count from streamout).
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: previously getWithDereferenceableBytes() exists, but addAttr() doesn't take that type
Signed-off-by: Christoph Haag <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Tested-and-reviewed-by: Mike Lothian <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
getuid() and geteuid() are not present on Windows.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
Build VS with alternating output for the current simd16 fe double-pump
of a simd8 shader.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Bas Nieeuwenhuizen <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Gives me approximately a 2% perf increase in bot dota2 & talos.
Having descriptors (both sets and vertex buffers) prefetched
didn't help so I didn't include that.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
In cases where it is used it is always 1.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
With this we don't have any operations on a pool with non-freeable
descriptors left that have O(#descriptors) complexity.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|