| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
There is a fairly simple relation to turn this into ufind_msb.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
ufind_msb is easily expressed in terms of clz, and we can reduce ifind_msb
to that.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
V3D doesn't have opcodes for ibfe/ubfe, so we need to lower similarly to
glsl/lower_instructions.cpp.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
If you don't have HW to do bfi, then lowering bitfieldInsert to bfi makes
things harder than keeping the "bits" argument around.
This still uses bfm, but I've added the obvious lowering of bfm if you
need it.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Cc: Emil Velikov <[email protected]>
Cc: Daniel Stone <[email protected]>
Cc: Andres Gomez <[email protected]>
Cc: Dylan Baker <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
| |
Cc: Emil Velikov <[email protected]>
Cc: Daniel Stone <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
| |
Cc: Emil Velikov <[email protected]>
Cc: Daniel Stone <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
| |
GLSL ES 1.0.17 specifies that "double" is a keyword reserved
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106823
Signed-off-by: zhaowei yuan <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Discovered by Roland Scheidegger.
The resource_create code uses GPU memory for PIPE_BIND_CUSTOM, but
malloc'd memory otherwise. Vertex and index buffers should use malloc'd
memory.
Cc: 18.0 18.1 <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The memcpy had the wrong size and this was causing crashes on 32-bit
builds of the driver.
Fixes: 6a9525bf6729a8 "intel/eu: Switch to a logical state stack"
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106830
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Using the image format is incorrect when the view has a different
format than the image. Instead, the view format needs to be used.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
CC: 18.1 <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106687
|
|
|
|
|
|
|
| |
You'd need src/broadcom/cle/ in the -I previously, for srcdir != builddir.
nir was fine at that, but automake didn't have it.
Bugzilla: https://github.com/anholt/mesa/issues/104
|
|
|
|
|
|
|
|
| |
except for the odd one out.
This should support many more formats.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106778
Fixes: cc41603d6d ("intel/tools: new intel_sanitize_gpu tool")
Reviewed-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106801
Fixes: 943fecc569 ("util: Add a randomized test for the virtual memory allocator")
Reviewed-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106776
Fixes: 943fecc569 ("util: Add a randomized test for the virtual memory allocator")
Tested-by: Vinson Lee <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were trying to read twice as many as the X server sent us, which
upset XCB:
[xcb] Too much data requested from _XRead
[xcb] This is most likely caused by a broken X extension library
[xcb] Aborting, sorry about that.
glx-free-context: ../../src/xcb_io.c:732: _XRead: Assertion `!xcb_xlib_too_much_data_requested' failed.
Fixing this takes 3 GLX piglit tests from crash to pass.
Fixes: 085216295033 "glx: Be more tolerant in glXImportContext (v2)"
Reviewed-by: Adam Jackson <[email protected]>
|
|
|
|
|
|
|
|
| |
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=106784
Fixes: 17201a2eb0b1b85387136 "radv: port to using updated anv
entrypoint/extension generator."
Acked-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
| |
I'd like to eventually drop support for the confusing "an array of
a single empty string is meant to be interpreted as an empty array", so
let's start by not using it anymore.
Reviewed-by: Dylan Baker <[email protected]>
Signed-off-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Looks like we forgot to update this bit of the driver for softpin.
Fixes: 4affeba1e9eb42 ("anv: Soft-pin everything else")
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
Bug: https://bugs.freedesktop.org/show_bug.cgi?id=106779
Fixes: ff904978a1d299a36b587 "gallium/util: Android backtrace support"
Reviewed-by: Dylan Baker <[email protected]>
Signed-off-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
| |
Fixes: 0ed6a87a106b6e2266e0 "meson: fix platforms=[]"
Reported-by: Christoph Haag <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
Signed-off-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The recent patch
mesa: Remove FLUSH_VERTICES from VAO state changes.
Pending draw calls on immediate mode or display list calls do
not depend on changes of the VAO state. So, remove calls to
FLUSH_VERTICES and flag _NEW_ARRAY as appropriate.
uncovered a problem that non immediate mode draw calls do only
flush outstanding immediate mode draws if FLUSH_UPDATE_CURRENT
is set in ctx->Driver.NeedFlush.
In that case, due to the sequence of _mesa_set_draw_vao commands
we could end up with the VAO from the FLUSH_VERTICES call set
into gl_context::Array._DrawVAO when the array draw is executed.
So the change pulls FLUSH_CURRENT out of _mesa_validate_* calls
into the array draw calls being validated.
The change introduces a new macro FLUSH_FOR_DRAW beside FLUSH_VERTICES
and FLUSH_CURRENT that flushes on changed current attributes as well
as on outstanding immediate mode draw calls. Use FLUSH_FOR_DRAW
in the non immediate mode draw code paths.
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Kai Wasserbäch <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106594
Signed-off-by: Mathias Fröhlich <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Let's add another field to caps v2, that can help report boolean
values.
Suggested-by: Gert Wollny <[email protected]>
Suggested-by: Dave Airlie <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the SSBO analogue to fe0647. User supplied data must
be a multiple of GL_SHADER_STORAGE_BUFFER_OFFSET_ALIGNMENT.
This fixes 44 GLES31 tests on airlied@'s GLES31 sketch branches with
Nvidia hardware, but this patch standalone can applied to master. The
alignment restriction on Nvidia is 32, hence the default value.
Example tests:
dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.0
dEQP-GLES31.functional.ssbo.layout.multi_basic_types.single_buffer.std430
v2: Move to a better place in case statement
v3: Rebase
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If EXEC_OBJECT_PINNED is set, we don't want to emit any relocations.
We simply want to add the BO to the validation list, and possibly mark
it as writeable. The new brw_use_pinned_bo() interface does just that.
To avoid having to make every caller consider both the relocation and
softpin cases, we make emit_reloc() call brw_use_pinned_bo() when given
a softpinned buffer.
We also can't grow buffers that are softpinned - the mechanism places a
larger BO at the same offset as the original, which requires moving BOs
around in the VMA. With softpin, we only allocate enough VMA for the
original size of the BO.
v2: Assert that BOs aren't pinned if the kernel says we should move them
(feedback from Chris Wilson)
Reviewed-by: Scott D Phillips <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This introduces a new fast virtual memory allocator integrated with our
BO cache bucketing. For larger objects, it falls back to the simple
free-list allocator (util_vma).
This puts the allocators in place but doesn't enable softpin yet.
v2:
(feedback from Chris Wilson)
- Check (bo->kflags & EXEC_OBJECT_PINNED) instead of a global flag
- Avoid vma_free(0ull) on the err_free path.
- Only enable if the kernel says we have full PPGTT support
- Make bucketing allocators more resistant to failing to grow arrays
(feedback from Scott Phillips)
- Don't use node after popping it from the list.
- Avoid undefined behavior in canonicalization by reusing new helper
- Comment updates
(feedback from myself)
- Avoid __vma_alloc vs. vma_alloc by making a zero_high_bits helper
to return a non-canonical address with the high bits zeroed.
- Don't shadow loop variable 'i' when destroying things (ugly; worked)
v3:
- Replace zero_high_bits with new common gen_48b_address helper.
Reviewed-by: Scott D Phillips <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If window system supports Y-tiling but not CCS_E, we currently create an
internal CCS for any window system buffers and then resolve right before
handing it off to X or Wayland. In the case of the single-sampled
shadow of a multi-sampled window system buffer, this is pointless
because the only thing we do with it is use it as a MSAA resolve target
so we do MSAA resolve -> CCS resolve -> hand to the window system.
Instead, just disable CCS for the shadow and then the MSAA resolve will
write uncompressed directly into it. If the window system supports
CCS_E, we will still use CCS_E, we just won't do internal CCS.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Instead of having it be a general "is this a winsys image" boolean, make
it more specific to the actual purpose.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Instead of the state stack that's based on copying a dummy instruction
around, we start using a logical stack of brw_insn_states. This uses a
bit less memory and is way less conceptually bogus.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Prior to gen8, the flag [sub]register number is in a different spot on
3src instructions than on other instructions. Starting with Broadwell,
they made it consistent. This commit fixes bugs that occur when a
conditional modifier gets propagated into a 3src instruction such as a
MAD.
Cc: [email protected]
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Instead of doing a memcpy, this moves us to start with a blank
instruction (memset to zero) and copy the fields over one at a time.
Cc: [email protected]
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
This is much cleaner than everything that wants a default value poking
at the bits of p->current directly.
Cc: [email protected]
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The emitted buffer_subdata/texture_subdata call didn't match the
respective signatures.
v2: Actually emit buffer_subdata call.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
Requires LLVM trunk r329166.
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On pre-4.13 kernels, which don't support I915_EXEC_BATCH_FIRST, we move
the validation list entry to the end...but incorrectly left the exec_bo
array alone, causing a mismatch where exec_bos[0] no longer corresponded
with validation_list[0] (and similarly for the last entry).
One example of resulting breakage is that we'd update bo->gtt_offset
based on the wrong buffer. This wreaked total havoc when trying to use
softpin, and likely caused unnecessary relocations in the normal case.
Fixes: 29ba502a4e28471f67e4e904ae503157087efd20 (i965: Use I915_EXEC_BATCH_FIRST when available.)
Reviewed-by: Chris Wilson <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the i-th target format is set, all previous target formats
must be non-zero to avoid hangs. In other words, without this
if a fragment shader exports mrt0, mrt2 and mrt3, the GPU hangs
because the target format of mrt1 is zero.
This fixes DXVK GPU hangs with "Seven: The Days Long Gone",
"GTA V" and probably more games.
Cc: "18.0" 18.1" <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise on pre-GFX9, if the constant layout allows both TESS_EVAL and
GEOMETRY shaders, but the PIPELINE has only GEOMETRY, it would return the
GEOMETRY shader for the TESS_EVAL shader.
This would cause the flush_constants code to emit the GEOMETRY constants
to the TESS_EVAL registers and then conclude that it did not need to set
the GEOMETRY shader registers.
Fixes: dfff9fb6f8d "radv: Handle GFX9 merged shaders in radv_flush_constants()"
CC: 18.1 <[email protected]>
Reviewed-by: Alex Smith <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This pass turns:
if (cond) {
} else {
do_work();
}
into:
if (!cond) {
do_work();
} else {
}
Here's the vkpipeline-db stats (from affected shaders) on Polaris10:
Totals from affected shaders:
SGPRS: 17272 -> 17296 (0.14 %)
VGPRS: 18712 -> 18740 (0.15 %)
Spilled SGPRs: 1179 -> 1142 (-3.14 %)
Code Size: 1503364 -> 1515176 (0.79 %) bytes
Max Waves: 916 -> 911 (-0.55 %)
This pass only affects Serious Sam 2017 (Vulkan) on my side. The
stats are not really good for now. Some shaders look quite dumb
but this will be improved with further NIR passes, like ifs
combination.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Rename and change the prototype for consistency regarding
nir_tex_instr_is_query(). This function will be used in the
following patch.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
These wrappers were introduces, so start using them.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Signed-off-by: Juan A. Suarez Romero <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Juan A. Suarez Romero <[email protected]>
(cherry picked from commit aba161e63a25a07c3c24fec01b6c63c43874b805)
|
|
|
|
|
| |
Signed-off-by: Juan A. Suarez Romero <[email protected]>
(cherry picked from commit ca0037aaefcb06ff8e1eb6fbde8f313c45789921)
|
|
|
|
|
|
| |
LLVM 5.0 requires additional Win32 libraries, and MinGW with pthreads.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Scott D Phillips <[email protected]>
|
|
|
|
|
|
|
| |
This just separates the reloc list vs. BO set cases and lets us avoid an
allocation if relocs->deps->entries == 0.
Reviewed-by: Scott D Phillips <[email protected]>
|
|
|
|
| |
Reviewed-by: Scott D Phillips <[email protected]>
|
|
|
|
|
|
|
|
| |
v2 (Jason Ekstrand):
- Break up Scott's mega-patch
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Scott D Phillips <[email protected]>
|