| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Shaders can use quite a bit of uniform data. Better to put it in the
upload buffers, like we do for client vertex data, rather than the
batch buffer state area, which is primarly used for indirect state.
This should free up batch space, allowing us to emit more commands in a
batch before flushing. Because BRW_NEW_BATCH also causes a lot of state
to be re-emitted, it may also reduce CPU overhead a little bit.
We took this approach on Gen4-5, but switched to using the batch area
on Gen6+ because buffer 0 is relative to Dynamic State Base Address by
default, which is set to the start of the batch.
On Gen9+, we already use a relocation due to a workaround, so this is
trivial to change and has basically no downside.
Unfortunately we can't change compute shader push constants because
MEDIA_CURBE_LOAD always uses an offset from dynamic state base address.
Improves performance in GLBenchmark 2.7/TRex Offscreen by:
- Skylake GT4e: 0.52821% +/- 0.113402% (n = 190)
- Apollolake: 0.510225% +/- 0.273064% (n = 70)
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
I don't think CS push constant uploading uses the section of L3
controlled by 3DSTATE_PUSH_CONSTANT_ALLOC_XS. So I don't think
it needs to be re-emitted when that space is reallocated.
The programming note in gen7_allocate_push_constants doesn't
indicate this is necessary, at least.
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
| |
The instruction encodings only allow for immediates. Don't try to
replace a zero (which is dumb to have in that op in any case) with RZ.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
| |
Just like is done on desktop and what is expected by the build-id code.
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As gen_builder.hpp file is generated, it contains information that is
specific to the LLVM version it originates from.
As suggested by Tim, the file seems to be forwards compatible. So in
order to produce ship a file which will work everywhere we should be
using earlies supported LLVM - 3.9.
With this we're back on track and can build all of mesa without
python/mako/flex and friends.
In the long term we might want to see if the python generators can be
updated to produce LLVM version agnostic files. At least within the
range supported by SWR.
Cc: <[email protected]>
Cc: Chuck Atkins <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we fallback currently the gl_program objects are re-allocated.
This is likely to change when the i965 cache lands, but for now
this fixes a crash when using MESA_GLSL=cache_fb. This env var
simulates the fallback path taken when a tgsi cache item doesn't
exist due to being evicted previously or some kind of error.
Unlike i965 we are always falling back at link time so it's safe to
just re-allocate everything. We will be unnecessarily freeing and
re-allocate a bunch of things here but it's probably not a huge deal,
and can be changed when the i965 code lands.
Fixes: 0e9991f957e2 ("glsl: don't reference shader prog data during cache fallback")
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For the gallium state tracker a tgsi binary may have been evicted
from the cache to make space. In this case we would take the
fallback path and recompile/link the shader.
On i965 there are a number of reasons we can get to the program
upload stage and have neither IR nor a valid cached binary.
For example the binary may have been evicted from the cache or
we need a variant that wasn't previously cached.
This environment variable enables us to force the fallback path that
would be taken in these cases and makes it easier to debug these
otherwise hard to reproduce scenarios.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This will explicitly state that we are following the fallback
path when we find invalid/corrupt cache items. It will also
output the fallback message when the fallback path is forced
via an environment variable, the following patches will allow
this.
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Tested-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
Cc: Christian König <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-and-Tested-by: Leo Liu <[email protected]>
|
|
|
|
|
|
|
|
| |
OMX and VA can optionally use the X11 DRI2/DRI3, thus we should link
only as required.
Cc: <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The vl_*_screen_create API properly falls back to a NOP when we're
building without specific platforms. So the only thing we need is to
handle the lack of X11/Xlib.h and provide a dummy Display define.
Cc: <[email protected]>
Cc: Christian König <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
En route to a X11-less builds
Cc: <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Nayan Deshmukh <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Nayan Deshmukh <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Nayan Deshmukh <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
It's been like this since the code was introduced.
Fixes: 86eb4131a90 (st/va: add headless support, i.e. VA_DISPLAY_DRM)
Cc: <[email protected]>
Cc: Julien Isorce <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Nayan Deshmukh <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
| |
... and make it const, since we shouldn't tinker with it.
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Nayan Deshmukh <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Provide a dummy stub when the user has opted w/o said platform, thus
we can build the binaries without unnecessarily requiring X11/other
headers.
In order to avoid build and link-time issues, we remove the HAVE_DRI3
guards in the VA and VDPAU state-trackers.
With this change st/va will return VA_STATUS_ERROR_ALLOCATION_FAILED
instead of VA_STATUS_ERROR_UNIMPLEMENTED. That is fine since upstream
users of libva such as vlc and mpv do little error checking, let
alone distinguish between the two.
Cc: Leo Liu <[email protected]>
Cc: Guttula, Suresh <[email protected]>
Cc: [email protected]
Cc: Christian König <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
| |
Pretty much every other place does the same.
Cc: [email protected]
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we are having the XCB_DRI3 dependencies duplicated,
partially.
Just do a once-off check and add all of the respective CFLAGS/LIBS
where needed.
As a nice side effect this helps us solve a couple of FIXMEs.
DRI3 is not a thing w/o X11 so disable it in such cases.
Cc: [email protected]
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Rather than having multiple places that define the macros, do it just
once in configure. Makes existing code a bit shorter and easier to
manage as we fix the VL targets with follow-up commits.
Cc: [email protected]
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
| |
Rename the remaining references to omit the egl part.
Cc: [email protected]
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Analogous to others earlier, these will be used to control the platform
for more than the EGL driver.
Cc: <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's still an error after my recent clean-up if LLVM is not patched to
enable AMDGPU target:
external/mesa3d/src/amd/common/ac_llvm_util.c:38:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetInfo' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
LLVMInitializeAMDGPUTargetInfo();
^
external/mesa3d/src/amd/common/ac_llvm_util.c:39:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTarget' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
LLVMInitializeAMDGPUTarget();
^
external/mesa3d/src/amd/common/ac_llvm_util.c:40:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetMC' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
LLVMInitializeAMDGPUTargetMC();
^
external/mesa3d/src/amd/common/ac_llvm_util.c:41:2: error: implicit declaration of function 'LLVMInitializeAMDGPUAsmPrinter' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
LLVMInitializeAMDGPUAsmPrinter();
^
We need to drop libmesa_amd_common when LLVM is disabled, however there's
still a dependency on include paths for ac_binary.h. So explicitly add the
include path when LLVM is disabled.
Signed-off-by: Rob Herring <[email protected]>
Acked-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 3dfe61ed6ec6 ("gallium: decrease the size of pipe_box - 24 -> 16
bytes") changed the size of pipe_box, but the virgl code was relying on
pipe_box and drm_virtgpu_3d_box structs having the same size/layout doing
a struct copy. Copy the fields one by one instead.
Cc: Marek Olšák <[email protected]>
Cc: Dave Airlie <[email protected]>
Fixes: 3dfe61ed6ec ("gallium: decrease the size of pipe_box - 24 -> 16 bytes")
Signed-off-by: Rob Herring <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Fixes: ce562f9e3fa ("EGL: Implement the libglvnd interface for EGL (v3)")
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Function droid_swap_buffers may get called without dri2_surf->buffer set,
in these cases we don't have a back buffer set either. Patch fixes segfault
seen with 3DMark that uses android.opengl.GLSurfaceView for rendering it's UI.
backtrace:
#00 pc 00013f88 /system/lib/egl/libGLES_mesa.so (droid_swap_buffers+104)
#01 pc 000117b2 /system/lib/egl/libGLES_mesa.so (dri2_swap_buffers+50)
#02 pc 000058b2 /system/lib/egl/libGLES_mesa.so (eglSwapBuffers+386)
#03 pc 00011329 /system/lib/libEGL.so (eglSwapBuffersWithDamageKHR+553)
#04 pc 000118e7 /system/lib/libEGL.so (eglSwapBuffers+55)
#05 pc 000754dc /system/lib/libandroid_runtime.so
Note, this is v1 as v2 caused dEQP regressions.
Fixes: 2acc69d ("EGL/Android: Add EGL_EXT_buffer_age extension")
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Acked-by: Emil Velikov <[email protected]>
Cc: "17.1" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 9ca6711faa03 changed the Wayland winsys to only block for the
frame callback inside SwapBuffers, rather than get_back_bo. get_back_bo
would perform a single non-blocking Wayland event dispatch, to try to
find any release events which we had pulled off the wire but not
actually processed. The blocking dispatch was moved to SwapBuffers.
This removed a guarantee that we would've processed all events inside
get_back_bo(), and introduced a failure whereby the server could've sent
a buffer release event, but we wouldn't have read it. In clients
unconstrained by SwapInterval (rendering ~as fast as possible), which
were being displayed directly without composition (buffer release delayed),
this could lead to get_back_bo() failing because there were no free
buffers available to it.
The drawing rightly failed, but this was papered over because of the
path in eglSwapBuffers() which attempts to guarantee a BO, in order to
support calling SwapBuffers twice in a row with no rendering actually
having been performed.
Since eglSwapBuffers will perform a blocking dispatch of Wayland
events, a buffer release would have arrived by that point, and we
could then choose a buffer to post to the server. The effect was that
frames were displayed out-of-order, since we grabbed a frame with random
past content to display to the compositor.
Ideally get_back_bo() failing should store a failure flag inside the
surface and cause the next SwapBuffers to fail, but for the meantime,
restore the correct behaviour such that get_back_bo() no longer fails.
Signed-off-by: Daniel Stone <[email protected]>
Reported-by: Eero Tamminen <[email protected]>
Acked-by: Pekka Paalanen <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98833
Fixes: 9ca6711faa03 ("Revert "wayland: Block for the frame callback in get_back_bo not dri2_swap_buffers"")
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
During display initialisation, we need a separate event queue to handle
the registry events, which is correctly handled. But we also need
separate per-surface event queues to handle swapchain-related events,
such as surface frame events and buffer release events. This avoids two
surfaces from the same EGLDisplay, both current on separate threads,
dispatching each other's events.
Create separate per-surface event queues, create wl_surface and wl_drm
proxy wrapper objects per surface, so we eliminate the race around
sending events to the wrong queue. swrast buffers do not need a
dedicated proxy wrapper, as the wl_shm_pool used to create the
wl_buffers, being transient, can itself be assigned to a queue.
Signed-off-by: Daniel Stone <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Fixes: 36b9976e1f99 ("egl/wayland: Avoid race conditions when on non-main thread")
Cc: [email protected]
|
|
|
|
|
|
|
|
|
|
| |
wl_display_roundtrip_queue() exists and can replace roundtrip(). The
API was introduced with wayland 1.6, while we currently require 1.11.
Signed-off-by: Daniel Stone <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Though most swapchain operations used a queue, they were racy in that
the object was created with the queue only set later, meaning that its
event could potentially be dispatched from the default queue in between
these two steps.
Use proxy wrappers to avoid this race, also assigning wl_buffers created
for the swapchain to the event queue.
Signed-off-by: Daniel Stone <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
|
| |
Calling random callbacks on the display's event queue is hostile, as
we may call into client code when it least expects it. Create our own
event queue, one per wsi_wl_display, and use that for the registry.
Signed-off-by: Daniel Stone <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's no need to call wl_display_roundtrip() after trying to create a
buffer through wl_drm; if it succeeds then everything is fine, and if it
fails, then we get a fatal protocol error so can't recover anyway.
Additionally, doing a roundtrip on the default / main application queue,
is destructive anyway, so would need to be its own queue.
Signed-off-by: Daniel Stone <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
| |
Untangle the exit cleanup paths so we don't try to use the registry
variable before it's been initialised.
Signed-off-by: Daniel Stone <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The procedure for decompressing an opaque DXT1 OpenGL format is
dependant on the comparison of two colors stored in the first 32 bits of
the compressed block. Here's the specified OpenGL behavior for
reference:
The RGB color for a texel at location (x,y) in the block is given by:
RGB0, if color0 > color1 and code(x,y) == 0
RGB1, if color0 > color1 and code(x,y) == 1
(2*RGB0+RGB1)/3, if color0 > color1 and code(x,y) == 2
(RGB0+2*RGB1)/3, if color0 > color1 and code(x,y) == 3
RGB0, if color0 <= color1 and code(x,y) == 0
RGB1, if color0 <= color1 and code(x,y) == 1
(RGB0+RGB1)/2, if color0 <= color1 and code(x,y) == 2
BLACK, if color0 <= color1 and code(x,y) == 3
The sampling operation performed on an opaque DXT1 Intel format essentially
hard-codes the comparison result of the two colors as color0 > color1.
This means that the behavior is incompatible with OpenGL. This is stated
in the SKL PRM, Vol 5: Memory Views:
Opaque Textures (DXT1_RGB)
Texture format DXT1_RGB is identical to DXT1, with the exception that the
One-bit Alpha encoding is removed. Color 0 and Color 1 are not compared, and
the resulting texel color is derived strictly from the Opaque Color Encoding.
The alpha channel defaults to 1.0.
Programming Note
Context: Opaque Textures (DXT1_RGB)
The behavior of this format is not compliant with the OGL spec.
The opaque and non-opaque DXT1 OpenGL formats are specified to be
decoded in exactly the same way except the BLACK value must have a
transparent alpha channel in the latter. Use the four-channel BC1 Intel
formats with the alpha set to 1 to provide the behavior required by the
spec. Note that the alpha is already set to 1 for RGB formats in
brw_get_texture_swizzle().
v2: Provide a more detailed commit message (Kenneth Graunke).
v3: Ensure the alpha channel is set to 1 for DXT1 formats.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100925
Cc: <[email protected]>
Acked-by: Tapani Pälli <[email protected]> (v1)
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The procedure for decompressing an opaque BC1 Vulkan format is dependant on the
comparison of two colors stored in the first 32 bits of the compressed block.
Here's the specified OpenGL (and Vulkan) behavior for reference:
The RGB color for a texel at location (x,y) in the block is given by:
RGB0, if color0 > color1 and code(x,y) == 0
RGB1, if color0 > color1 and code(x,y) == 1
(2*RGB0+RGB1)/3, if color0 > color1 and code(x,y) == 2
(RGB0+2*RGB1)/3, if color0 > color1 and code(x,y) == 3
RGB0, if color0 <= color1 and code(x,y) == 0
RGB1, if color0 <= color1 and code(x,y) == 1
(RGB0+RGB1)/2, if color0 <= color1 and code(x,y) == 2
BLACK, if color0 <= color1 and code(x,y) == 3
The sampling operation performed on an opaque DXT1 Intel format essentially
hard-codes the comparison result of the two colors as color0 > color1. This
means that the behavior is incompatible with OpenGL and Vulkan. This is stated
in the SKL PRM, Vol 5: Memory Views:
Opaque Textures (DXT1_RGB)
Texture format DXT1_RGB is identical to DXT1, with the exception that the
One-bit Alpha encoding is removed. Color 0 and Color 1 are not compared, and
the resulting texel color is derived strictly from the Opaque Color Encoding.
The alpha channel defaults to 1.0.
Programming Note
Context: Opaque Textures (DXT1_RGB)
The behavior of this format is not compliant with the OGL spec.
The opaque and non-opaque BC1 Vulkan formats are specified to be decoded in
exactly the same way except the BLACK value must have a transparent alpha
channel in the latter. Use the four-channel BC1 Intel formats with the alpha
set to 1 to provide the behavior required by the spec.
v2 (Kenneth Graunke):
- Provide a more detailed commit message.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100925
Cc: <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is mostly for running in our CI system to prevent dEQP from
continuing on to the next test if we get a GPU hang. As it currently
stands, dEQP uses the same VkDevice for almost all tests and if one of
the tests hangs, we set the anv_device::device_lost flag and report
VK_ERROR_DEVICE_LOST for all queue operations from that point forward
without sending anything to the GPU. dEQP will happily continue trying
to run tests and reporting failures until it eventually gets crash that
forces the test runner to start over. This circumvents the problem by
just aborting the process if we ever get a GPU hang. Since this is not
the recommended behavior most of the time, we hide it behind an
environment variable.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
We weren't wrapping this before because anv_cmd_buffer_execbuf may throw
a more meaningful error message. However, we do change the error code
into VK_ERROR_DEVICE_LOST, so we should print a new message.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On GFX9 with only 4K CE RAM, define the range of slots that will be
allocated in CE RAM. All other slots will be uploaded directly. This will
switch dynamically according to which slots are used by current shaders.
GFX9 CE usage should now be similar to VI instead of being often disabled.
Tested on VI by taking the GFX9 CE allocation codepath and setting
num_ce_slots = 2 everywhere to get frequent switches between both modes.
CE is still disabled on GFX9.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
This was only needed by LOAD_CONST_RAM, which is now only used to load
whole CE.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
This decreases the size of CE RAM dumps to L2, or the size of descriptor
uploads without CE.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
A later commit will only upload descriptors used by shaders, so we won't do
full dumps anymore, so the only way to have a complete mirror of CE RAM
in memory is to do a separate dump after the last draw call.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
All updates of descriptors_dirty also set dirty_mask, so the return is
unnecessary. The next commit will want this function to be executed
even if dirty_mask == 0.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
We'll do partial uploads of descriptor arrays, so we need to clamp
against what shaders declare.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Sampler slots: slot[8], .. slot[39] (ascending)
Image slots: slot[7], .. slot[0] (descending)
Each image occupies 1/2 of each slot, so there are 16 images in total,
therefore the layout is: slot[15], .. slot[0]. (in 1/2 slot increments)
Updating image slot 2n+i (i <= 1) also dirties and re-uploads slot 2n+!i.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Constant buffers: slot[16], .. slot[31] (ascending)
Shader buffers: slot[15], .. slot[0] (descending)
The idea is that if we have 4 constant buffers and 2 shader buffers, we only
have to upload 6 slots. That optimization is left for a later commit.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|