| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
We won't have a var to load from, so don't try to the processing
required if we don't need it.
This avoids crashes in:
dEQP-VK.spirv_assembly.instruction.compute.variable_pointers.compute.workgroup_two_buffers
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
For variable pointers we really don't want to case the pointers to int
without a good reason, just add a wrapper for bcsel loading and result
storing.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Meson test has a concepts of suites, which allow tests to be grouped
together. This allows for a subtest of tests to be run only (say only
the tests for nir). A test can be added to more than one suite, but for
the most part I've only added a test to a single suite, though I've
added a compiler group that includes nir, glsl, and glcpp tests.
To use this you'll need to invoke meson test directly, instead of ninja
test (which always runs all targets). it can be invoked as:
`meson test -C builddir --suite $suitename` (meson test has addition
options that are pretty useful).
Tested-By: Gert Wollny <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's no point reverting to the last saved point if that save point is
the empty batch, we will just repeat ourselves.
v2: Merge with new commits, changes was minimized, added the 'fixes' tag
v3: Added in to patch series
v4: Fixed the regression which was introduced by this patch
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108630
Reported-by: Mark Janes <[email protected]>
The solution provided by: Jordan Justen <[email protected]>
CC: Chris Wilson <[email protected]>
Fixes: 3faf56ffbdeb "intel: Add an interface for saving/restoring
the batchbuffer state."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107626
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108630 (fixed in v4)
Signed-off-by: Andrii Simiklit <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we detect the module and if missing, the glXGetMsc* API is
effectively a stub, always returning false.
This is what effectively has been happening with our meson build :-(
Thus users have no chance of using it - they cannot even distinguish
if the failure is due to a misconfigured build.
There's no reason for keeping xf86vidmode optional - it has been
available in all distributions for years.
Cc: [email protected]
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Fixes: a47c525f3281a2753180e "meson: build glx"
|
|
|
|
|
|
|
|
| |
Fixes an assertion in SoTTR.
Fixes: dd0172e865 ("radv: Use structured intrinsics instead of indexing workaround for GFX9.")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These force the index to be used in the instruction so we don't need the
workaround.
Totals:
SGPRS: 1321642 -> 1321802 (0.01 %)
VGPRS: 943664 -> 943788 (0.01 %)
Spilled SGPRs: 28468 -> 28480 (0.04 %)
Spilled VGPRs: 88 -> 89 (1.14 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 80 -> 80 (0.00 %) dwords per thread
Code Size: 52415292 -> 52338932 (-0.15 %) bytes
LDS: 400 -> 400 (0.00 %) blocks
Max Waves: 233903 -> 233803 (-0.04 %)
Wait states: 0 -> 0 (0.00 %)
Totals from affected shaders:
SGPRS: 238344 -> 238504 (0.07 %)
VGPRS: 232732 -> 232856 (0.05 %)
Spilled SGPRs: 13125 -> 13137 (0.09 %)
Spilled VGPRs: 88 -> 89 (1.14 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 80 -> 80 (0.00 %) dwords per thread
Code Size: 15752712 -> 15676352 (-0.48 %) bytes
LDS: 139 -> 139 (0.00 %) blocks
Max Waves: 31680 -> 31580 (-0.32 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The existing backend code assumed that if VARYING_SLOT_CLIP_DIST0
was written, then VARYING_SLOT_CLIP_DIST1 would be as well. That's
true with the current lowering, but not necessary if there are 4 or
fewer clip distances. Separate out the checks to allow this.
The new NIR-based lowering will trigger this case, which would have
caused backend validation errors (src is null) without this patch.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The way nir_lower_clip_vs() works with store_output intrinsics makes a
ton of assumptions about the driver_location field.
In i965 and iris, I'd rather do this lowering early and work with
variables. v3d may want to switch to that as well, and ir3 could too,
but I'm not sure exactly what would need updating. For now, handle
both methods.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
I'll want the variables in the next patch.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
It's now called exactly once, and there's not really any distinction.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Check if the base ends up with no variable, and continue
if we see that case outside the loop.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
I posted a load of hacks before to do this, Jason suggested this,
just check the deref mode, not the variable mode and delay getting
the variable until we know the type.
avoids crashes when derefing shared memory pointers.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
../src/intel/compiler/brw_fs_nir.cpp:3534:46: warning: comparison of integer expressions of different signedness: ‘unsigned int’ and ‘int’ [-Wsign-compare]
assert(nir_intrinsic_write_mask(instr) ==
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
(1 << instr->num_components) - 1);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This was caused by 6339aba775ecdc which added these completely valid
checks. However clang likes to complain about signedness mismatches.
Fixes: 6339aba775ecdc "intel/compiler: Lower SSBO and shared..."
Reviewed-by: Alejandro Piñeiro <[email protected]>
|
|
|
|
|
|
|
|
| |
It's not at all intel-specific; the formula is dictated by OpenGL and
Vulkan. The only intel-specific thing is that we need the lowering. As
a nice side-effect, the new version is variable-group-size ready.
Reviewed-by: Plamena Manolova <[email protected]>
|
|
|
|
|
|
|
| |
Fixes: d971a4230d54069c996bc "loader: Factor out the common driver
opening logic from each loader."
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows to fast clear the depth part (or the stencil part)
of a depth+stencil surface when HTILE is enabled. I didn't test
on GFX8, so it's disabled currently.
This gives a very nice boost, for example when clearing the depth
aspect of a 4096x4096 D32_SFLOAT_S8_UINT image (18x faster).
BEFORE: 235 us
AFTER: 13 us
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This path will be eventually improved later but as it's only
used on SI (or with RADV_DEBUG=noibs), I'm not sure if that
matters much.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
The chained submission is the fastest path and it should now
be used more often than before. This removes some EOP events.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
At least GC2000 seems to push some dirt from the PE color cache into
the last bound render target when drawing depth only. Newer cores
seem to behave properly and don't do this, but I have found no way
to fix it on GC2000. Flushes and stalls don't seem to make any
difference.
In order to stop the core from pushing the dirt into a precious real
render target, plug in dummy buffer when rendering without a color
buffer.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Philipp Zabel <[email protected]>
|
|
|
|
|
|
|
|
| |
The in_fence_fd needs to be initialised to -1.
Fixes: d1a1c21e7 (virgl: native fence fd support)
Reviewed-by: Robert Foss <[email protected]>
|
|
|
|
|
|
|
|
|
| |
DCC and FMASK also imply a fast-clear eliminate, so it should be
safe to reset the predicate unconditionally. We still only skip
FMASK or CMASK decompressions for now.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
This is just a small cleanup.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
We read 4 values out of sample_locs_8x, so make sure the array is
big enough.
Fixes: ac76aeef20 ("radeonsi: switch back to standard DX sample positions")
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With 5d517a streamout info is only attached to the shader for which the
transform feedback is actually recorded, but the driver set the context info
with each state submitted, thereby always using the info data that was
attached to the vertex shader.
Pass the streamout stride info to the context only from the shader that
actually has outputs. (Thanks to Marek Olšák for pointing me in the right
direction)
Fixes regresion with: dEQP-GLES31.functional.tessellation.invariance.*
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108734
Fixes: 5d517a599b1eabd1d5696bf31e26f16568d35770
st/mesa: Don't record garbage streamout information in the non-SSO case.
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
FRAMEBUFFER_INCOMPLETE_DIMENSIONS is not supported for GLES 3.0 and later and
not defined for Desktop OpenGL. Instead use FRAMEBUFFER_UNSUPPORTED like it
was done before.
Thanks to Iago Toral and Andrey Simiklit for pointing out the problem and the
details.
Fixes: ebcde3454552adc6d3fea8af2207aafaba857796
i965: be more specific about FBO completeness errors
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The structure qdws is not allocated at this point, nor is the
file descriptor set to it's member. Use the fd directly instead.
Fixes: d1a1c21e7621b5177febf191fcd3d3b8ef69dc96
virgl: native fence fd support
Signed-off-by: Gert Wollny <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Emulate MESA_FORMAT_R_SRGB8 by using L8_UNORM_SRGB. This is possible
because component swizzling is handled based on the mesa format and,
hence, the a r001 swizzling can be used to correct the components.
Enables and makes pass (tested on Kabylake)
dEQP-GLES31.functional.srgb_texture_decode.skip_decode.sr8.*
dEQP-GLES31.functional.texture.filtering.cube_array.formats.sr8*
Signed-off-by: Gert Wollny <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
| |
This makes it possible to use a hardware luminance format as RED format.
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The driver was returning GL_FRAMEBUFFER_UNSUPPORTED for all cases of an
incomplete fbo, be a bit more specific about this following the description
of glCheckFramebufferStatus.
This helps to keeps dEQP happy when adding EXT_texture_sRGB_R8 support.
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
| |
As the name says, the format is an sRGB format.
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
| |
Remove a dead variable, a int->bool conversion and some
whitespace changes.
Signed-off-by: Robert Foss <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
| |
This is all leftover from the i965 split.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On nv50, certain operations must happen on regs below 64, due to
encoding requirements. First of all, we add infrastructure to enforce
this. Secondly we change the spill order to first spill RIG nodes that
are unconstrained, followed by ones that are.
This makes the gamecube logo shadertoy compile properly. Curiously, if
we adjust the spill order so that we first spill the constrained RIG
nodes instead, the RA also succeeds. However it seems more logical to
first spill the unconstrained ones.
While we're at it, drop the nv50 max register to reserve r127 as the
zero register of last resort (r63 is preferred).
Signed-off-by: Ilia Mirkin <[email protected]>
Acked-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of the size restriction existing in two places, and potentially
being applied twice, we move this together. Ops with 16-bit register
addresses can only take a short reg, and ops with immediates can only
take a short reg.
Of course we leave the immediate 0 in place since we know that it will
be replaced by r63/r127 down the line, so don't treat zeroes as an
immediate.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We removed the op from the BB, but it was still listed in its sources'
uses. This could trip up some logic down the line which analyzes all the
uses of an l-value, e.g. spilling.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we would print errors on the console like:
libEGL debug: EGL user error 0x3001 (EGL_NOT_INITIALIZED) in eglInitialize
When we had everything we needed for:
libEGL debug: EGL user error 0x3001 (EGL_NOT_INITIALIZED) in eglInitialize: DRI2: failed to find EGLDevice
(for a gbm error in my case)
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I copied the code from egl_dri2.c, but the functionality was equivalent
between all the loaders other than their particular environment variables.
v2: Drop the logging function equivalent to loader_default_logger()
(requested by Eric, Emil). Move the SCons workaround across. Drop
the now-unused driGetDriverExtensions() declaration that was lost in a
rebase.
Reviewed-by: Eric Engestrom <[email protected]> (v1)
Reviewed-by: Emil Velikov <[email protected]> (v1)
|
|
|
|
|
|
|
|
| |
I need other types from the header now, and "gl.h is big" is not a good
reason to duplicate definitions.
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
| |
Everyone needs to call it, and platform_x11 forgot to.
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The only thing you do with a dri driver handle is get the extensions
pointer, so just fold it in to simplify the callers.
v2: Add the declaration of driGetDriverExtensions() that got lost in a
rebase.
Reviewed-by: Eric Engestrom <[email protected]> (v1)
Reviewed-by: Emil Velikov <[email protected]> (v1)
|
|
|
|
|
|
|
|
|
|
|
| |
You can tell by "Mesa/configs/default" how old this is. Your build system
really has to provide the DEFAULT_DRIVER_DIR, or other loaders will break.
v2: Move the bad (non-prefix-dependent) define to the SConscript to avoid
breaking it.
Reviewed-by: Eric Engestrom <[email protected]> (v1)
Reviewed-by: Emil Velikov <[email protected]> (v1)
|
|
|
|
|
|
|
|
|
|
|
|
| |
After doing a bunch of benchmarks, primitive binning helps
some games like The Talos Principle (+5%) or Serious Sam 2017
(+3%). For other titles, either it doesn't change anything or
it hurts very few (less than 1%).
This only affects GFX9.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|