| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Trivial.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the plan was "if the width/height we have to load/store isn't
the size the user is planning on writing, then we need to load the old
contents out beforehand to prevent writing back undefined".
However, when we're doing glTexImage() we often end up aligning the
width/height into the padding of the texture, and we don't actually
need to read out that padding.
Improves x11perf -aatrapezoid100 performance from ~460/sec to
~700/sec.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Regression introduced by
ba0274c7d6c3b77a36bbe1b444f427b0c873e2f3
Check the resource exists before assigning it
a flag (and use This->base.resource instead
of pResource, since the former may have a newly
allocate resource, while the latter would be
NULL).
This should reintroduce the behaviour of previous
code.
Signed-off-by: Axel Davy <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since 1604efa6fda9b780e8537a131ad77f3e83e5a67a,
lconsti and lconstb don't need to be initialized.
Remove some leftovers from the previous code (which
has now invalid use of ARRAY_SIZE on a pointer instead
of an array).
Reported by Coverity.
Signed-off-by: Axel Davy <[email protected]>
|
|
|
|
|
|
|
|
| |
Use uint64_t instead of int64_t in the calculation,
as the result is uint64_t.
Signed-off-by: Axel Davy <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
All ARB_enhanced_layouts piglit tests pass without any changes
in our compiler.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
The table was copied from the Vulkan driver. The comment lines are as long
as the table for cosmetic reasons.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is a serious performance fix. Discovered by luck.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94354
Cc: 12.0 <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
so that decompress blits aren't needed and depth texturing needs less
memory bandwidth.
Z16 and Z24 are promoted to Z32_FLOAT by the driver, because TC-compatible
HTILE only supports Z32_FLOAT. This doubles memory footprint for Z16.
The format promotion is not visible to state trackers.
This is part of TC-compatible renderbuffer compression, which has 3 parts:
DCC, HTILE, FMASK. Only TC-compatible FMASK compression is missing now.
I don't see a measurable increase in performance though.
(I tested Talos Principle and DiRT: Showdown, the latter is improved by
0.5%, which is almost noise, and it originally used layered Z16,
so at least we know that Z16 promoted to Z32F isn't slower now)
Tested-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For performance tuning in drivers. It filters out window system
framebuffers and OpenGL renderbuffers.
radeonsi will use this to guess whether a depth buffer will be read
by a shader. There is no guarantee about what will actually happen.
This is a departure from PIPE_BIND flags which are defined to be strict
but they are useless in practice.
Acked-by: Roland Scheidegger <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Caused by a bad rebase when pushing commit 76a940893.
|
|
|
|
|
|
|
| |
Fixes GL45-CTS.shader_image_load_store.basic-allTargets-atomic*
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Recent fix for non-const offsets broke the case of a single offset (vs 4
offsets). The later code relies on the offs array to contain null values
to tell whether they should be added onto the srcs list.
Fixes: 5239bd592 ("nvc0/ir: fix overwriting of value backing non-constant gather offset")
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
|
| |
The offset needs to be properly copied over to the phi value, otherwise
it will get assigned to the base of the merge instead of the proper
location.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
| |
v2: mark llvmpipe & softpipe properly as well (Jason Wood)
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
For specifying an exact location/component.
v2: change the order of parameters (Dave)
Reviewed-by: Edward O'Callaghan <[email protected]> (v1)
Reviewed-by: Dave Airlie <[email protected]> (v1)
|
|
|
|
|
|
|
| |
v2: change the order of parameters (Dave)
Reviewed-by: Edward O'Callaghan <[email protected]> (v1)
Reviewed-by: Dave Airlie <[email protected]> (v1)
|
|
|
|
|
|
|
| |
v2: remove a tautological left-over assert (Marek)
Reviewed-by: Edward O'Callaghan <[email protected]> (v1)
Reviewed-by: Dave Airlie <[email protected]> (v1)
|
|
|
|
|
|
|
|
| |
This is a screen cap because drivers are expected to support it either
for all shader types or for none of them.
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
This patch requires LLVM r284024 or newer.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
The existing function only worked for integer types.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
radeonsi no longer supports pixel shaders without interpolation optimizations,
which led to assertion failures in si_shader_ps when running shader-db.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
total instructions in shared programs :2286901 -> 2284473 (-0.11%)
total gprs used in shared programs :335256 -> 335273 (0.01%)
total local used in shared programs :31968 -> 31968 (0.00%)
local gpr inst bytes
helped 0 41 852 852
hurt 0 44 23 23
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This should make the code more robust if a shader tries to use inputs which
aren't defined by the vertex element layout (which usually shouldn't happen).
No piglit change.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Reviewed-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
| |
Only stat and counter events are saved to the event files.
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
| |
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
| |
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
| |
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
| |
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
| |
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
|
| |
SwrStoreTiles now takes a mask of surfaces to store. Reduces
overhead when storing multiple render targets.
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
| |
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
| |
Add template for generating code to save events to a file.
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
| |
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
| |
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Align and use streaming store instructions for BE fifo queues.
Provides slightly faster enqueue and doesn't pollute the caches.
Add appropriate memory fences to ensure streaming writes are
globally visible.
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
| |
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
| |
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On Intel Pineview M hardware, the i915 gallium driver doesn't output
the correct gl_FragCoord. It seems to always have an X coord of 0.0
and a Y coord of the window's height in pixels, e.g. 600.0f or such.
I believe this is a regression caused in part by this commit:
afa035031ff9e0c07a2297d864e46c76f7bfff58
The old behavior used the output at index zero, while the new behavior
uses actual zeroes. In the case of gl_FragCoord the output at index
zero happened to be the correct one, so the behavior appeared correct
although the code already had a bug.
Fixed by checking for I915_SEMANTIC_POS when setting up texCoords. If
the generic_mapping is I915_SEMANTIC_POS, look for the
TGSI_SEMANTIC_POSITION instead of a TGSI_SEMANTIC_GENERIC output.
https://bugs.freedesktop.org/show_bug.cgi?id=97477
Reviewed-by: Stéphane Marchesin <[email protected]>
Tested-by: Stéphane Marchesin <[email protected]>
|
|
|
|
|
|
|
| |
Fixes a wine test crash
Signed-off-by: Axel Davy <[email protected]>
Reviewed-by: Patrick Rudolph <[email protected]>
|
|
|
|
|
|
|
| |
Add debug output to ease debugging.
Signed-off-by: Patrick Rudolph <[email protected]>
Reviewed-by: Axel Davy <[email protected]>
|