| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Create shader_lib during build, link with shaders at DLL generation time
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a new define (USE_SIMD16_VS), to denote calling a 16-wide vertex shader.
This is needed because the mesa driver can do 16-wide shaders, but rasty
cannot yet, so we need to distinguish.
Create a new VertexID entry (VertexID16) for the USE_SIMD16_VS case, since
we need to format the vertex id in a way that is digestible by the 16-wide VS
Disabled for now. To be enabled in a future checkin when driver work
is complete.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
Add name argument to x86 autogenerated macros.
Add useful variable names for DCL_inputVec implementation.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
in shader and fetch dump files
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
- Move debug .ll files to JIT_CACHE_DIR
- Don't link against jitter SRGBLut table, add global data to shader that needs it.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
Added support for Fetch / Sample / LD functions
Added DLL link to JitCache implementation
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Adds ability to step into jitted llvm IR in Visual Studio.
- Updated llvm type generation script to also generate corresponding debug types.
- New module pass inserts debug metadata into the IR for each function
Disabled by default.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
+ ZeroMemory() macro definition for non win32-compilation in common/os.h
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
The user SGPR location can change between pipelines, so we need to
emit it again to the pottentially changed SGPR index.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
We don't have the meta kludge with 0 viewports anymore,
so we can always enable them.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Growing the batch/state buffer is a lot more dangerous than I thought.
A number of places emit multiple state buffer sections, and then write
data to the returned pointer, or save a pointer to brw->batch.state.bo
and then use it in relocations. If each call can grow, this can result
in stale map references or stale BO pointers. Furthermore, fences refer
to the old batch BO, and that reference needs to continue working.
To avoid these woes, we avoid ever swapping the brw->batch.*.bo pointer,
instead exchanging the brw_bo structures in place. That way, stale BO
references are fine - the GEM handle changes, but the brw_bo pointer
doesn't. We also defer the memcpy until a quiescent point, so callers
can write to the returned pointer - which may be in either BO - and
we'll sort it out and combine the two properly in the end.
v2/v3:
- Handle stale pointers in the shadow copy case, where realloc may or
may not move our shadow copy to a new address.
- Track the partial map explicitly, to avoid problems with buffer reuse
where multiple map modes exist (caught by Chris Wilson).
v4:
- Don't use realloc in the CPU shadow case, it isn't safe.
Fixes: 2dfc119f22f257082ab0 "i965: Grow the batch/state buffers if we need space and can't flush."
Reviewed-by: Iago Toral Quiroga <[email protected]> [v3]
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
| |
'aux' is a very generic name, suggesting it can be a bunch of things.
However, it's always the brw_*_prog_data structure. So, call it that.
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Part 2 of 2 (part 1 is autoconf changes, part 2 is C++ changes)
When only a single SWR architecture is being used, this allows that
architecture to be builtin rather than as a separate libswrARCH.so that
gets loaded via dlopen. Since there are now several different code
paths for each detected CPU architecture, the log output is also
adjusted to convey where the backend is getting loaded from.
This allows SWR to be used for static mesa builds which are still
important for large HPC environments where shared libraries can impose
unacceptable application startup times as hundreds of thousands of copies
of the libs are loaded from a shared parallel filesystem.
Based on an initial implementation by Tim Rowley.
v2: Refactor repetitive preprocessor checks to reduce code duplication
v3: Formatting changes per Bruce C. Also delay screen creation until end
to avoid leaks when failure conditions are hit.
Signed-off-by: Chuck Atkins <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
CC: Tim Rowley <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Part 1 of 2 (part 1 is autoconf changes, part 2 is C++ changes)
When only a single SWR architecture is being used, this allows that
architecture to be builtin rather than as a separate libswrARCH.so that
gets loaded via dlopen. Since there are now several different code
paths for each detected CPU architecture, the log output is also
adjusted to convey where the backend is getting loaded from.
This allows SWR to be used for static mesa builds which are still
important for large HPC environments where shared libraries can impose
unacceptable application startup times as hundreds of thousands of copies
of the libs are loaded from a shared parallel filesystem.
Based on an initial implementation by Tim Rowley.
v2: Fix comment placement pointed out by Bruce C.
Signed-off-by: Chuck Atkins <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
CC: Tim Rowley <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: Don't annotate, but remove the unused ctx parameter
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: Don't annotate, but remove the unused ctx parameter
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
As a followup to the previous patch propagate the change of numSamples
from int to unsigned to gl_config::samples and consequently fix some
-Wsign-compare warnings.
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
According to the ARB_multisample num_samples is a non-negative integer.
Consequently define it as such, fail in glx/choose_visual if a negative
number is given.
v2: split patch into gallium and mesa part
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Cc: Emil Velikov <[email protected]>
Cc: Juan A. Suarez Romero <[email protected]>
Signed-off-by: Andres Gomez <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Juan A. Suarez <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
For scons, windows/mingw dealing with LLVM_CONFIG is done before
untarring. This is also more convenient for copy and paste.
Cc: Emil Velikov <[email protected]>
Cc: Juan A. Suarez Romero <[email protected]>
Signed-off-by: Andres Gomez <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Juan A. Suarez <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Cc: Emil Velikov <[email protected]>
Cc: Juan A. Suarez Romero <[email protected]>
Signed-off-by: Andres Gomez <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Juan A. Suarez <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Cc: Emil Velikov <[email protected]>
Cc: Juan A. Suarez Romero <[email protected]>
Signed-off-by: Andres Gomez <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Juan A. Suarez <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Cc: Emil Velikov <[email protected]>
Cc: Juan A. Suarez Romero <[email protected]>
Signed-off-by: Andres Gomez <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Juan A. Suarez <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Grazvydas Ignotas <[email protected]>
Reviewed-by: Christian König <[email protected]>
Cc: [email protected]
|
|
|
|
| |
Signed-off-by: Juan A. Suarez Romero <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Juan A. Suarez Romero <[email protected]>
(cherry picked from commit bc1503b13fcf8190262757ea7f86613e14e25981)
|
|
|
|
|
| |
Signed-off-by: Juan A. Suarez Romero <[email protected]>
(cherry picked from commit 80f5f279b3f9fc752ba35b1cb2878a936f8ace90)
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vk_error() is a macro that calls __vk_errorf() with instance == NULL.
Then, __vk_errorf() passes a pointer to instance->debug_report_callbacks
to vk_debug_error(), which segfaults as this pointer is invalid but not
NULL.
Fixes: e5b1bd6ab8 "vulkan: move anv VK_EXT_debug_report implementation to common code."
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: Add forgotten argument and start offset.
Fixes: 91074bb11bda "radv/ac: Implement Float64 SSBO stores."
Tested-by: Timothy Arceri <[email protected]>
Acked-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
Fixes: 91074bb11bda "radv/ac: Implement Float64 SSBO stores."
Tested-by: Timothy Arceri <[email protected]>
Acked-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a channel was not set we also did not increase the LDS address,
while that obviously should happen.
The output loading code was inadvertently fixed which resulted in a
mismatch causing the SaschaWillems tessellation demo to result
in corrupt rendering.
Fixes: 7898eb9a60 "ac: rework load_tcs_{inputs,outputs}"
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The bindings also have an index field.
Fixes: 49d035122e "radv: Add single pipeline cache key."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104677
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Passes
dEQP-VK.api.smoke.*
dEQP-VK.wsi.android.*
with android-cts-7.1_r12 .
Unlike the initial anv implementation this does
use syncobjs instead of waiting on the CPU.
This is missing meson build coverage for now.
One possible todo is that linux 4.15 now has a
sycall that allows us to export amdgpu fence to
a sync_file, which allows us not to force all
fences and semaphores to use syncobjs. However,
I had trouble with my kernel crashing regularly
with NULL pointers, and I'm not sure how beneficial
it is in the first place given that intel uses
syncobjs for all fences if available.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
If we import an image, we might not have space in the
buffer for CMASK, even though it is compatible.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
Otherwise we get uninitialized variable warnings for es_vgpr_comp_cnt.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
Gives a warning when the assert is disabled, and not even
necessarily true.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
lp_build_interleave2_half was not doing the right thing for avx512-style
16-wide loads.
This path is hit in the swr driver with a 16-wide vertex shader. It is
called from lp_build_transpose_aos, when doing texel fetches and the
fetched data needs to be transposed to one component per output register.
Special-case the post-load swizzle operations for avx512 16x32 (16-wide
32-bit values) so that we move the xyzw components correctly to the outputs.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The optimization in change 8e4efdc895ea ("vbo: optimize some display
list drawing") missed the loopback case. This is used when the
glBegin/End primitive doesn't have a uniform set of vertex attributes.
The new Piglit gl-1.0-dlist-materials test hits this.
So check the aligned_vertex_buffer_offset(list) value and adjust the
buffer offset accordingly.
We also need to remove the 'start == 0' assertion in the loopback
code since it no longer applies.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
| |
Currently a couple of gallium targets race with xmlpool_options.h being
generated, don't do that.
Signed-off-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
V2: use LLVMIntTypeInContext()
Fixes: f4e499ec7914 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The view index user sgpr wasn't being accounted for properly,
this refactors out the code to decide if it's required and then
uses that info to account for it.
Fixes: 180c1b924e (ac/nir: Add shader support for multiviews.)
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Only one piglit test fails,
sso-vs-gs-fs-array-interleave
There are 3 tests using ssbo without checking sizes failing also
but those are test bugs.
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The kernel is moving to a $class$instance naming scheme in preparation
for accommodating more rings in the future in a consistent manner. It is
already using the naming scheme internally, and now we are looking at
updating some soft-ABI such as the error state to use the new naming
scheme. This of course means we need to teach aubinator_error_decode how
to map both sets of ring names onto its register maps.
Signed-off-by: Chris Wilson <[email protected]>
Cc: Michel Thierry <[email protected]>
Cc: Michal Wajdeczko <[email protected]>
Cc: Tvrtko Ursulin <[email protected]>
Cc: Lionel Landwerlin <[email protected]>
Cc: Kenneth Graunke <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Michel Thierry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Portal 2 appears to bind RGBA8888_UNORM textures to a sampler2DShadow,
and calls shadow2D() on it. This causes undefined behavior in OpenGL.
Unfortunately, our sampler appears to hang in this scenario, which is
not acceptable. Just give them a null surface instead, which returns
all zeroes.
Fixes GPU hangs in Portal 2 on Kabylake.
Huge thanks to Jason Ekstrand for noticing this crazy behavior while
sifting through crash dumps.
Cc: [email protected]
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104487
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|