| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Incrementing the iteration count was intended to fix an off-by-one error
when the first terminator was superseded by a later terminator. If
there is no first terminator or later terminator, there is no off-by-one
error. Incrementing the loop count creates one. This can be seen in
loops like:
do {
if (something) {
// No breaks or continues here.
}
} while (false);
Reviewed-by: Timothy Arceri <[email protected]>
Tested-by: Abel Briggs <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110953
Fixes: 646621c66da ("glsl: make loop unrolling more like the nir unrolling path")
|
|
|
|
|
|
|
|
|
|
| |
We were pessimistically uploading all of it in case of indirection,
but we can just bump that when we encounter indirection.
total constlen in shared programs: 2529623 -> 2485933 (-1.73%)
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
ir3_nir_analyze_ubo_ranges() has already told us how much of cb0 we
need to upload (all of it, since it will lower indirect UBO 0 accesses
from load_ubo back to indirection on the constant buffer).
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the NIR-level analysis decided to move UBO loads to the constant
file, but the backend decided not to load those constants, we could
upload past the end of constlen. This is particularly relevant for
pre-a6xx, where we emit a different constlen between bin and render
variants.
(Fix by Rob, commit message by anholt)
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
This is the hardware max, as far as I can tell.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
Just a little spring cleanup, extending UBOs to vertex shaders in the
process.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
UBOs and uniforms now use a common code path with an explicit `index`
argument passed, enabling UBO reads.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
Prevents an assert(0) later in this (not so edge) case. We still have to
have a dummy there.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
We've known about this for a while, but it was never formally in the
machine header files / decoder, so let's add them in.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
Now that all the counting is sorted, it's a matter of passing along a
GPU address and going.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
We already uploaded UBOs, but only a fixed number (1) for uniforms;
let's upload as many as we compute we need.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
We look at the highest set bit in the UBO enable mask to work out the
maximum indexable UBO, i.e. the UBO count as we need to report to the
hardware.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
We refactor panfrost_constant_buffer to mirror v3d's constant buffer
handling, to enable UBOs as well as a single set of uniforms.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
This doesn't handle Y-flipping, but it's good enough to render the stars
in Neverball.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
In preparation for lowering point sprites, track them like we track
alpha testing state.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Those either depend on information filled by the NIR linking steps OR
are restricted by those:
- gl_nir_lower_samplers: depends on UniformStorage being set by the
linker.
- brw_nir_lower_image_load_store: After 6981069fc80 "i965: Ignore
uniform storage for samplers or images, use binding info" we want
this pass to happen after gl_nir_lower_samplers.
- gl_nir_lower_buffers: depends on UniformBlocks and
SharedStorageBlocks being set by the linker.
For the regular GLSL code path, those datastructures are filled
earlier. For NIR linking code path we need to generate the nir_shader
first then process it -- and currently the processing works with all
shaders together. So move the passes out of brw_create_nir into its
own function, called by the brwProgramStringNotify and
brw_link_shader().
This patch prepares ground for ARB_gl_spirv, that will make use of NIR
linker.
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The iterator `i` already walks the right amount now that is
incremented by `dmul`, so no need to `* 2`. Fixes invalid memory
access in upcoming ARB_gl_spirv tests.
Failure bisected by Arcady Goldmints-Orlov.
Fixes: b019fe8a5b6 "glsl/nir: Fix handling of 64-bit values in uniform storage"
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
For n_columns == 1, we have a vector which is handled by the else
case. Fixes invalid memory access in upcoming ARB_gl_spirv tests.
Failure bisected by Arcady Goldmints-Orlov.
Fixes: 81e51b412e9 "nir: Make nir_constant a vector rather than a matrix"
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
bitfield_select.
bitfield_select is defined as:
bitfield_select(mask, base, insert) = (mask & base) | (~mask & insert)
matching the behavior of AMD's BFI instruction.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
| |
This lets us use the optimization pattern
(('ult', 31, ('iand', b, 31)), False) to remove the
bcsel instruction for code originating in D3D shaders.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
| |
The [iu]bfe and bfm instructions are defined to only use the five
least significant bits.
This optimizes a common pattern from D3D -> SPIR-V translation.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
That is: the five least significant bits provide the values of
'bits' and 'offset' which is the case for all hardware currently
supported by NIR and using the bfm/bfe instructions.
This patch also changes the lowering of bitfield_insert/extract
using shifts to not use bfm and removes the flag 'lower_bfm'.
Tested-by: Eric Anholt <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
friends.
These optimizations are based on the fact that
'and(a,b) <= umin(a,b)'.
For AMD, this series moves the optimization from LLVM to NIR,
so currently no vkpipeline-db changes here.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Andreas Baierl <[email protected]>
Reviewed-by: Qiang Yu <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Andreas Baierl <[email protected]>
Reviewed-by: Qiang Yu <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Andreas Baierl <[email protected]>
Reviewed-by: Qiang Yu <[email protected]>
|
|
|
|
|
|
|
| |
Suggested-by: Chris Wilson <[email protected]>
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
| |
Since we cannot handle ffma in ppir, lower it on nir level already.
Signed-off-by: Andreas Baierl <[email protected]>
Reviewed-by: Qiang Yu <[email protected]>
|
|
|
|
|
|
|
|
| |
This simple extension might be useful for debugging purposes.
GAPID has support for it.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110939
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED is used, then the platform
gralloc module will select a format based on the usage flags provided by
the camera device and the other endpoint of the stream.
The patch fixes crash in vulkan when the test is run with camera stream
set to HAL_PIXEL_FORMAT_IMPLEMENTATION_DEFINED.
Test: android.graphics.cts.CameraVulkanGpuTest#testCameraImportAndRendering
on chromebook with camera HAL3.
v2: use AHARDWAREBUFFER_FORMAT_IMPLEMENTATION_DEFINED and take
AHARDWAREBUFFER_USAGE_CAMERA_MASK in to account (Gurchetan)
Fixes: f1654fa7e31 "anv/android: support creating images from external format"
Signed-off-by: Nataraj Deshpande <[email protected]>
Signed-off-by: Gurchetan Singh <[email protected]>
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Gurchetan Singh <[email protected]>
Acked-by: Lionel Landwerlin <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit moves the sysvals to a separate, new constant buffer
at the end (before the shader constants). It also allows us to
remove the special handling we had for cbuf0, and enables all
constant buffers to support user-specified resources and user
buffers.
v2: (by Kenneth Graunke)
- Rebase on the previous patch to fix system value uploading.
- Fix disk cache num_cbufs calculation
- Fix passthrough TCS to report num_cbufs = 1 so upload actually occurs
- Change upload_sysvals to assert that num_cbufs > 0 when
num_system_values > 0.
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
I neglected to mark cbuf0_needs_upload = false after uploading it.
The obvious fix regressed user clip plane tests, because of a second
bug: we also forgot to mark that they may need re-uploading when
changing shader programs (which may have more or less system values).
Thanks to Timur Kristóf for catching the original issue.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Timur Kristóf <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
eglapi.c"
This reverts commits cc4b68a80193e2a132cb62309292984a9428f2bb and
b27fb3eacab906ec06cd61b7d01e3425c3b3cbfc.
These caused a bunch of EGLSync tests to crash when they were previously
failing.
I have a hunch the tests are doing something wrong, like using
extensions without checking for they support, but until the issue is
investigated I'm just reverting these commits.
Signed-off-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2: drop them altogether, they should never get called in the
first place (Emil)
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
| |
No need to add a function that returns `false` only to be cast into
a pointer, we can just use the existing `return NULL` :)
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This way other functions added in these entrypoints don't need to check
anything.
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
| |
There is always a BO.
|
|
|
|
|
|
| |
This reverts commit db8f57a5cb4ab8e1ad789793678797c04e95de21.
This is bonkers. There will always be a BO.
|
|
|
|
|
|
|
| |
total constlen in shared programs: 2485933 -> 2462236 (-0.95%)
Reviewed-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
| |
We only ever return the shader we were passed in (but internally
modified).
Reviewed-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We need the constants uploaded to cover the NIR offset plus the size,
not the aligned-down start of our upload range plus the size. Fixes
mistaken UBO analysis with mat3 loads.
Fixes: 893425a607a6 ("freedreno/ir3: Push UBOs to constant file")
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|