| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
They can vary at call sites if the intrinsic is NOT a legacy SI intrinsic.
We need this to force readnone or inaccessiblememonly on some amdgcn
intrinsics.
This is only used with LLVM 4.0 and later. Intrinsics only used with
LLVM <= 3.9 don't need the LEGACY flag.
gallivm and ac code is in the same patch, because splitting would be
more complicated with all the LEGACY uses all over the place.
v2: don't change the prototype of lp_add_function_attr.
Reviewed-by: Jose Fonseca <[email protected]> (v1)
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since both r600 and radeonsi use code from libamd_common they need to
static link it. At the same time, adding a common library to LIB_DEPS is
fragile [can lean to multiple symbol definitions] and non-obvious - I
had to do a double-take how things work atm.
So follow the libradeon.la approach and put common libraries in
TARGET_RADEON_COMMON
Fixes: 936f5407a7d ("gallium/radeon: Add libamd_common.a to TARGET_LIB_DEPS also for r600")
Cc: Timothy Arceri <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Marek Olšák <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
Tested-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes build failure with --enable-opencl --enable-xvmc:
make[4]: Entering directory '/home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/targets/xvmc'
CXXLD libXvMCgallium.la
../../../../src/gallium/drivers/r600/.libs/libr600.a(evergreen_compute.o): In function `evergreen_create_compute_state':
/home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/drivers/r600/../../../../../src/gallium/drivers/r600/evergreen_compute.c:254: undefined reference to `ac_elf_read'
../../../../src/gallium/drivers/r600/.libs/libr600.a(evergreen_compute.o): In function `r600_shader_binary_read_config':
/home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/drivers/r600/../../../../../src/gallium/drivers/r600/evergreen_compute.c:189: undefined reference to `ac_shader_binary_config_start'
/home/daenzer/src/mesa-git/mesa/build-amd64/src/gallium/drivers/r600/../../../../../src/gallium/drivers/r600/evergreen_compute.c:189: undefined reference to `ac_shader_binary_config_start'
collect2: error: ld returned 1 exit status
Makefile:760: recipe for target 'libXvMCgallium.la' failed
Fixes: dc4c551a345d ("radeon/ac: switch from radeon_elf_read() to ac_elf_read()")
Acked-by: Timothy Arceri <[email protected]>
Tested-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Fixes build regression caused by d90bf4ef3e1db7
|
|
|
|
|
|
| |
We now use the shared code in AMD common instead.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
For radeonsi we could probably switch to
ac_shader_binary_read_config(). However the functions have
diverged so just share this helper for now.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
this allows to pass the generated files directly to llc or bugpoint
v2: add atomic counter ID
v3: remove extra scope operator, constify
Signed-off-by: Jan Vesely <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
| |
If i-th thread could not be created it means we have i threads,
not i+1, because we start from 0.
Fixes: 404d0d5 "gallium/u_queue: add an option to have multiple worker threads"
Signed-off-by: Grazvydas Ignotas <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 4aea8fe ("gallium/u_queue: fix random crashes when the app calls
exit()") added a atexit handler which calls
util_queue_killall_and_wait() for each queue to stop the threads.
However the app is also free to use atexit handlers to clean up things,
leading to util_queue_destroy() call which will also call
util_queue_killall_and_wait() for the same queue again, causing threads
being joined twice, and that is undefined. This happens with libglut,
for example. A simple fix is to just set num_threads to 0 as there are
no more valid threads after util_queue_killall_and_wait() returns.
Fixes: 4aea8fe "gallium/u_queue: fix random crashes when the app calls exit()"
Signed-off-by: Grazvydas Ignotas <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes 4a883966c1f74f43afc145d2c3d27af7b8c5e01a where the
PIPE_CAP was removed.
Now USER_INDEX_BUFFERS are always enabled remove the check and only
check for cmst_active directly.
v2: Axel pointed out the code was still needed when cmst was inactive,
Rebase on master too
v3: Drop struct member user_ibufs also && fixup shortlog (Edward).
v4: Fix negation
v5: Use the right variable name csmt != cmst
Fixes: 4a883966c1f7 ("gallium: remove PIPE_CAP_USER_INDEX_BUFFERS")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99953
Reported-and-tested-by: Vinson Lee <[email protected]> (v1)
Cc: Marek Olšák <[email protected]>
Cc: Axel Davy <[email protected]>
Signed-off-by: Edward O'Callaghan <[email protected]>
Signed-off-by: Mike Lothian <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Make use of common uploaders that landed recently to Mesa
v2: fixed formatting, broken due to thunderbird configuration
v3: per Axel comment: added a comment into NineDevice9_DrawPrimitiveUP
v4: per Axel comment: changed style of the comment
|
|
|
|
|
|
|
| |
Need to specify the zero for the struct initializer. My earlier test
of the patch series was with MinGW, not MSVC.
Trivial.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reduces register pressure in both types of shaders, by reordering the
input loads from the var->data.driver_location order to whatever order
they appear first in the NIR shader. These instructions aren't
reorderable at our QIR scheduling level because the FS takes two in
lockstep to do an interpolation, and the VS takes multiple read
instructions in a row to get a whole vec4-level attribute read.
shader-db impact:
total instructions in shared programs: 76666 -> 76590 (-0.10%)
instructions in affected programs: 42945 -> 42869 (-0.18%)
total max temps in shared programs: 9395 -> 9208 (-1.99%)
max temps in affected programs: 2951 -> 2764 (-6.34%)
Some programs get their max temps hurt, depending on the order that the
load_input intrinsics appear, because we end up being unable to copy
propagate an older VPM read into its only use.
|
|
|
|
| |
It's going gain most of ntq_setup_inputs(), so simplify it first.
|
|
|
|
|
| |
This will be used for delaying our VPM reads (which must be unconditional)
until just before they're used.
|
|
|
|
|
|
|
| |
We need to be paying attention to optimization's impact on this -- even if
we reduce instruction count, increasing max temps in general is likely to
cause us to fail to register allocate on some shaders, which means that
those won't run at all.
|
|
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99850
Cc: 13.0 17.0 <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Not needed. ddebug does the same thing. The limitation is that drivers
can only use pipe_resource::screen through pipe_resource_reference.
This unbreaks trace, because pipe_context uploaders aren't wrapped,
so trace doesn't understand buffers returned by them.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
all drivers support it
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Tested-by: Brian Paul <[email protected]> (VMware driver only)
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Tested-by: Brian Paul <[email protected]> (VMware driver only)
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
v3: split from the etnaviv patch; fix new_ib.buffer leak
Reviewed-by: Brian Paul <[email protected]>
Tested-by: Brian Paul <[email protected]> (VMware driver only)
|
|
|
|
|
|
| |
the format of the rt can be different than the one of the texture, so must
propagate the format explicitly to the helper. Broken since
3f9c5d62441eba38e8b1592aba965ed5db6fd89b (but unused by st/mesa).
|
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Andreas Boll <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Use the utility u_copy_nv12_from_yv12 to implement this similarly to
how it's been done in the VPAU state tracker. The old code mixed up
planes and fields and didn't correctly handle video surfaces in
interlaced format.
Acked-by: Christian König <[email protected]>
Signed-off-by: Thomas Hellstrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
mplayer likes putting YV12 data, and if there is a buffer format mismatch,
the vdpau state tracker would try to reallocate the video surface as an
YV12 surface. A virtual driver doesn't like reallocating and doesn't like YV12
surfaces, so if we can't support YV12, try an YV12 to NV12 conversion
instead.
Also advertize that we actually can do the getBits and putBits conversion.
v2: A previous version of this patch prioritized conversion before
reallocating. This has been changed to prioritize reallocating in this version.
Cc: Christian König <[email protected]>
Acked-by: Christian König <[email protected]>
Signed-off-by: Thomas Hellstrom <[email protected]>
|
|
|
|
|
|
|
|
| |
Passes all corresponding piglit tests.
Signed-off-by: Lars Hamre <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
| |
Passes all corresponding piglit tests.
Signed-off-by: Lars Hamre <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v3: have util_clear_texture mirror the pipe function (Roland Scheidegger)
v2: rework util clear functions such that they operate on a resource
instead of a surface (Roland Scheidegger)
Creates a util_clear_texture function for implementing the GL_ARB_clear_texture
in softpipe and llvmpipe.
Signed-off-by: Lars Hamre <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix issue with index buffers that do not contain a 0 index. 0 index
can be a non-valid index if the (copied) vertex buffers are a subset of the
user's (which happens because we only copy the range between min & max).
Core will use an index passed in from the driver to replace invalid indices.
Only do this for calls that contain non-zero indices, to minimize performance
Reviewed-by: Bruce Cherniak <[email protected]>
cost.
|
|
|
|
|
|
|
|
|
| |
For now, the cache key is all of FETCH_COMPILE_STATE.
Use new/delete for swr_vertex_element_state, since we have to call the
constructors/destructors of the struct elements.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Before releasing a shared context, flush the context
with ST_FLUSH_WAIT to make sure all commands are executed.
This ensures that rendering to any shared resources is completed
before they will be referenced by another context.
Fixes an intermittent flickering with Photoshop. (VMware bug# 1779340)
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
When st_context_flush() is called with ST_FLUSH_WAIT,
the function will return after the fence is completed.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
For gpu generations that use LLVM we create a timestamp string
containing both the LLVM and Mesa build times, otherwise we just
use the Mesa build time.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
V2: Provide more detail in callback description and add description to
screen.rst
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit f1e5dfbe3c8951a6c8acf41bf5e6c2d090098b2c.
For a detailed discussion see
https://lists.freedesktop.org/archives/mesa-dev/2017-February/145283.html
Acked-by: Christian König <[email protected]>
Signed-off-by: Thomas Hellstrom <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Nayan Deshmukh <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
sizeof(struct pipe_draw_info) = 104 -> 88
Also, vertices_per_patch is switched to ubyte, because it can't be more
than 32.
Seemed-reasonable-to: Roland Scheidegger
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
v2: use UINT64_MAX / 11
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
it's cleaner this way.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes:
vdpauinfo: ../lib/CodeGen/TargetPassConfig.cpp:579: virtual void
llvm::TargetPassConfig::addMachinePasses(): Assertion `TPI && IPI &&
"Pass ID not registered!"' failed.
v2: use list_head, switch the call order in destroy
Cc: 13.0 17.0 <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|