| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
These were previously handled by the spirv_to_nir, but that changed to
be an explict pass in 23d30f4099f "spirv,nir: lower
frexp_exp/frexp_sig inside a new NIR pass"
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
It only accepts 32-bit integers so it should have a more descriptive
name. This patch should not be a functional change.
Reviewed-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Rather than checking __GLIBC__/__UCLIBC__ macros as a proxy for
execinfo.h presence, just check directly. This allows the build to work
on musl.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
| |
Nothing uses its results yet, that will come with the following commits.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Apparently, we're supposed to look at the texture object's built-in
sampler object's sRGB decode setting in order to decide whether to
decode/downsample/re-encode, or simply downsample as-is. Previously,
we had just respected the pipe_resource's format.
Fixes SKQP's Skia_Unit_Tests.SRGBMipMaps test.
(This ports commit 337a808062c756b474ee80a9ac04b5a3dbbeb67e from i965
to st/mesa for Gallium drivers.)
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
It is similar with YUYV
Fixes: 165e704719b85c ("i965/i915: Add UYVY as the supported format")
Signed-off-by: Haihao Xiang <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
Used to enable INTEL_shader_atomic_float_minmax.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use hash_table_u64 instead of hash_table directly, since the former
will also handle the special keys (deleted and freed) and allow use
the whole u64 space.
Fixes crash in INTEL_DEBUG=bat when using a key with value 0 -- the
current value for a freed key.
Fixes: b38dab101ca "util/hash_table: Assert that keys are not reserved pointers"
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main motivation for this change is API ergonomics: most operations
on dynarrays are really on elements, not on bytes, so it's weird to have
grow and resize as the odd operations out.
The secondary motivation is memory safety. Users of the old byte-oriented
functions would often multiply a number of elements with the element size,
which could overflow, and checking for overflow is tedious.
With this change, we only need to implement the overflow checks once.
The checks are cheap: since eltsize is a compile-time constant and the
functions should be inlined, they only add a single comparison and an
unlikely branch.
v2:
- ensure operations are no-op when allocation fails
- in util_dynarray_clone, call resize_bytes with a compile-time constant element size
v3:
- fix iris, lima, panfrost
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Add missing cases for fp32 and fp16 formats.
Fixes: c68334ffc0a9 "st/mesa: add floating point formats in st_new_renderbuffer_fb()"
Signed-off-by: Kevin Strasser <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Tells whether or not the driver can handle gl_LocalInvocationIndex and
gl_GlobalInvocationID. If not supported (the default), state tracker
will lower those on behalf of the driver.
v2: Add case to u_screen.c. (Anholt)
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Perform those before some derefs are gone when we lower the buffers
after the st_nir_opts() call.
Intel SKL shader-db results:
total instructions in shared programs: 15593685 -> 15590708 (-0.02%)
instructions in affected programs: 378078 -> 375101 (-0.79%)
helped: 777
HURT: 44
helped stats (abs) min: 1 max: 68 x̄: 4.07 x̃: 4
helped stats (rel) min: 0.04% max: 31.58% x̄: 2.88% x̃: 1.37%
HURT stats (abs) min: 1 max: 24 x̄: 4.20 x̃: 2
HURT stats (rel) min: 0.17% max: 8.00% x̄: 1.60% x̃: 1.27%
95% mean confidence interval for instructions value: -4.02 -3.23
95% mean confidence interval for instructions %-change: -2.93% -2.35%
Instructions are helped.
total loops in shared programs: 4815 -> 4815 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0
total cycles in shared programs: 371965528 -> 371788566 (-0.05%)
cycles in affected programs: 184190307 -> 184013345 (-0.10%)
helped: 3650
HURT: 2855
helped stats (abs) min: 1 max: 59400 x̄: 99.45 x̃: 15
helped stats (rel) min: <.01% max: 43.18% x̄: 2.60% x̃: 1.02%
HURT stats (abs) min: 1 max: 16362 x̄: 65.16 x̃: 10
HURT stats (rel) min: <.01% max: 66.22% x̄: 2.78% x̃: 0.81%
95% mean confidence interval for cycles value: -53.73 -0.68
95% mean confidence interval for cycles %-change: -0.39% -0.08%
Cycles are helped.
total spills in shared programs: 11936 -> 11956 (0.17%)
spills in affected programs: 443 -> 463 (4.51%)
helped: 0
HURT: 8
total fills in shared programs: 25644 -> 25619 (-0.10%)
fills in affected programs: 2306 -> 2281 (-1.08%)
helped: 24
HURT: 2
LOST: 7
GAINED: 16
Total CPU time (seconds): 1679.04 -> 1695.69 (0.99%)
shader-db results radeonsi (VEGA64):
Totals from affected shaders:
SGPRS: 180160 -> 179552 (-0.34 %)
VGPRS: 115368 -> 114544 (-0.71 %)
Spilled SGPRs: 5627 -> 5603 (-0.43 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 7808364 -> 7803268 (-0.07 %) bytes
LDS: 192 -> 192 (0.00 %) blocks
Max Waves: 19202 -> 19340 (0.72 %)
Wait states: 0 -> 0 (0.00 %)
Radeonsi results provided by Timothy.
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
BLORP now handles this so there's no reason to fall back.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Both GLSL IR and NIR perform the same mod -> floor lowering for 32-bit
types. But nir_lower_double_ops is slightly more defensive against
lowered drcp precision loss, and handles mod(x, x) = 0 directly. This
works well...assuming nir_lower_double_ops actually gets an fmod op to
lower in the first place.
The previous patches enabled NIR-based lowering for the remaining
drivers, so we can stop using the GLSL IR lowering when using NIR.
Fixes KHR-GL45.gpu_shader_fp64.builtin.mod_dvec[234] on iris.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Since NIR_PASS no longer swaps out the NIR pointer when NIR_TEST_* is
enabled, we can just take a single pointer and not a pointer to pointer.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Now that NIR_TEST_* doesn't swap the shader out from under us, it's
sufficient to just modify the shader rather than having to return in
case we're testing serialization or cloning.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This essentially reverts 20234cfe3a20.
Fixes piglit test:
tests/spec/arb_get_program_binary/execution/uniform-after-restore.shader_test
Fixes: 20234cfe3a20 "st/mesa: don't propagate uniforms when restoring from cache"
Reviewed-by: Tapani Pälli <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110784
|
|
|
|
|
|
|
|
| |
Android build settings require format strings to be string literals.
Fixes: d2906293c43 "mesa: EXT_dsa add selectorless matrix stack functions"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110833
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Program parser allocates parameter list.
In case of parsing error some variables will not be freed.
Patch adds freeing of it.
Signed-off-by: Sergii Romantsov <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Allows the legacy matrix stacks to be manipulated without disturbing the
matrix mode selector.
Adapted from a patch from Chris Forbes.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Split this out from glMatrixMode since we're about to need it
independently for EXT_DSA.
Adapted from Chris Forbes commit.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This extension is huge and this gives us a TODO list of functions
to implement.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Improvements related to the patch that removed native_integers:
* In glsl_to_nir, special cases for i2f,u2f,etc are no longer needed
* In prog_to_nir, use sge/slt and let lower_scmp lower it if needed
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
debugoptimized builds don't define NDEBUG, but they also don't define
DEBUG. We want to enable cheap debug code for these builds.
I only chose those occurences that I care about.
Reviewed-by: Mathias Fröhlich <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
spirv_to_nir() returned the nir_function corresponding to the
entrypoint, as a way to identify it. There's now a bool is_entrypoint
in nir_function and also a helper function to get the entry_point from
a nir_shader.
The return type reflects better what the function name suggests. It
also helps drivers avoid the mistake of reusing internal shader
references after running NIR_PASS on it. When using NIR_TEST_CLONE or
NIR_TEST_SERIALIZE, those would be invalidated right in the first pass
executed.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Replace its use with checking for is_entrypoint.
This is a preparation to change spirv_to_nir() return type.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
This shouldn't be allowed in GLES 1/2.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
KHR_blend_equation_advanced_coherent isn't exposed on OpenGL ES 1.x, so
we shouldn't allow its enums there either.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This enum shouldn't be allowed on OpenGL ES 1.x, so let's instead
use the extenion-helpers, and check for desktop and gles extensions
separately.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This extension isn't enabled for GLES 1.x, so we shouldn't allow the
state there. Let's use the extension-helpers instead of CHECK_EXTENSION
for this.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This just makes the logic of the checks for this enum the same for
gl{Enable,Disable} and for glIsEnabled. They are already functionally
the same, so this is just a minor code-cleanup.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
{En,Dis}ableClientState(PRIMITIVE_RESTART_NV) should only work on
compatibility contextxs. While we're at it, modernize the code a bit,
by using the extension helpers instead of open-coding.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We scalarize IO to enable further optimizations, such as propagating
constant components across shaders, eliminating dead components, and
so on. This patch attempts to re-vectorize those operations after
the varying optimizations are done.
Intel GPUs are a scalar architecture, but IO operations work on whole
vec4's at a time, so we'd prefer to have a single IO load per vector
rather than 4 scalar IO loads. This re-vectorization can help a lot.
Broadcom GPUs, however, really do want scalar IO. radeonsi may want
this, or may want to leave it to LLVM. So, we make a new flag in the
NIR compiler options struct, and key it off of that, allowing drivers
to pick. (It's a bit awkward because we have per-stage settings, but
this is about IO between two stages...but I expect drivers to globally
prefer one way or the other. We can adjust later if needed.)
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes the egl_mesa_platform_surfaceless piglit test as well
as the new egl_ext_device_base piglit test on classic swrast.
v2: Fix swrast surfaceless contexts on the driver side.
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Mathias Fröhlich <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 55376cb31e2f495a4d872b4ffce2135c3365b873.
It's been over a year and both QT 5.9.5 and 5.11.0 contained a fix for the
original issue. It seems i965 only ever applied this workaround to the
18.0 branch.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 12bf7cfecf52083c484602f971738475edfe497e.
This commits caused lots of problems:
https://bugs.freedesktop.org/show_bug.cgi?id=110721
https://bugs.freedesktop.org/show_bug.cgi?id=110761
Fixes: 12bf7cfecf52 ("mesa: unreference current winsys buffers when unbinding winsys buffers")
Pushing without review as we need to get it into next stable.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The difference between imov and fmov has been a constant source of
confusion in NIR for years. No one really knows why we have two or when
to use one vs. the other. The real reason is that they do different
things in the presence of source and destination modifiers. However,
without modifiers (which many back-ends don't have), they are identical.
Now that we've reworked nir_lower_to_source_mods to leave one abs/neg
instruction in place rather than replacing them with imov or fmov
instructions, we don't need two different instructions at all anymore.
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Vasily Khoruzhick <[email protected]>
Acked-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Unless source modifiers are present, fmov and imov are the same.
There's no good reason for having two helpers.
Reviewed-by: Kristian H. Kristensen <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This flag has caused more confusion than good in most cases. You can
validly use imov for floats or fmov for integers because, without source
modifiers, neither modify their input in any way. Using imov for floats
is more reliable so we go that direction.
Reviewed-by: Kristian H. Kristensen <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
Both of these passes predate the nir_channel helper. We should just use
it instead of hand-rolling it in both passes.
Reviewed-by: Kristian H. Kristensen <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
| |
st/mesa now exposes KHR_blend_equation_advanced_coherent and
EXT_shader_framebuffer_fetch if the new capability is supported.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This extension requires the ability to read from all render targets,
so we only enable it if PIPE_CAP_FBFETCH >= PIPE_CAP_MAX_RENDER_TARGETS.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TGSI's FBFETCH instruction currently only supports reading from a single
render target, but NIR intrinsics can support multiple render targets.
radeonsi can only support fetching from RT 0, but other drivers may be
able to support fetching from any render target.
To express this, this patch renames PIPE_CAP_TGSI_FS_FBFETCH to simply
PIPE_CAP_FBFETCH, and converts it from a boolean "is FBFETCH supported?"
to an integer number of render targets which can be fetched.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
INTEL_conservative_rasterization isn't exposed on compatibility
contexts, nor for GLES 3.0 and below. We already do this correctly for
gl{Enable,Disable}, but we should do the same for glIsEnabled as well.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
IsEnabled(FRAGMENT_PROGRAM) isn't supposed to be allowed, but our
check allowed this anyway. Let's make these checks consistent, and
while we're at it, modernize them a bit.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
IsEnabled(TEXTURE_CUBE_MAP) isn't supposed to be allowed, but our
check allowed this anyway. Let's make these checks consistent, and
while we're at it, modernize them a bit.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
These are already defined as the exactly same, so let's get rid of
the duplicate definitions.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|