| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
CID: 1416243, 1416244, 1416255
CC: [email protected]
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This can be used to guard support for EXT_memory_object and related
extensions.
v2: update gallium docs
v3 (Timothy Arceri):
- add cap to nv50
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
Move AVX512BW specific intrinics to be Core-only.
Move some AVX512F intrinsics back to common implementation file.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
Switch to a 1:1 mapping template:generated for future maintenance.
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
Unintentionally added with an apache2 license; relicense to match
the rest of the tree.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
| |
Copy/paste error was duplicating a gen_knobs.cpp rule.
Fixes: 5079c277b57 ("swr: [scons] Fix windows build")
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
Add "const" as appropriate in method/function signatures.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
Work in progress, disabled by default.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
Rename to reflect global nature.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Disable an optimization which implemented sse/avx operations on avx512
using avx512 intrinsics (to avoid switching between lane widths).
Compile with SIMD_OPT_128_AVX512 / SIMD_OPT_256_AVX512 defined to enable
these optimizations.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
Fix problems found when enabling USE_SIMD16_FRONTEND, mostly related to
vMask / movemask_ps(pd).
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
Replace use of Win32 GetCurrentThreadId() with portable
std::this_thread::get_id().
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: rename cap to PIPE_CAP_QUERY_SO_OVERFLOW and be a bit more explicit
in the documentation
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The shader that is used to copy vertex data out of the vs/gs shaders to
the user-specified buffer (streamout or SO shader) was not using the
correct offsets.
Adjust the offsets that are used just for the SO shader:
- Make sure that position is handled in the same special way
as in the vs/gs shaders
- Use the correct offset to be passed in the core
- consolidate register slot mapping logic into one function, since it's
been calculated in 2 different places (one for calcuating the slot mask,
and one for the register offsets themselves
Also make room for all attibutes in the backend vertex area.
Fixes:
- all vtk GL2PS tests
- 18 piglit tests (16 ext_transform_feedback tests,
arb-quads-follow-provoking-vertex and primitive-type gl_points
v2:
- take care of more SGV slots in slot mapping logic
- trim feState.vsVertexSize
- fix GS interface and incorporate GS while calculating vsVertexSize
Note that vsVertexSize is used in the core as the one parameter that
controls vertex size between all stages, so it has to be adjusted appropriately
for the whole vs/gs/fs pipeline.
Also note that GS and SO is not fully implemented. This will be addressed
later.
fixes:
- fixes total of 20 piglit tests
CC: 17.2 <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
| |
gcc prior to 4.9 didn't implement <regex>, causing a startup crash
in the swr knob parameter reading code.
CC: <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The variable name was missing a leading LD_, which resulted in a missing
check for unresolved symbols in the backend binaries.
With the link addressed with earlier patches, we can correct the typo.
Thanks to Laurent for the help spotting this.
v2: Split from a larger patch.
Cc: [email protected]
Cc: Bruce Cherniak <[email protected]>
Cc: Tim Rowley <[email protected]>
Cc: Laurent Carlier <[email protected]>
Fixes: 9475251145174882b532 "swr: standardize linkage and check for
unresolved symbols"
Reviewed-by: Eric Engestrom <[email protected]>
Reported-by: Laurent Carlier <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Analogous to previous commit but for the KNL/SKX backends.
Cc: Bruce Cherniak <[email protected]>
Cc: Tim Rowley <[email protected]>
Cc: Laurent Carlier <[email protected]>
Fixes: 1cb5a6061ce ("configure/swr: add KNL and SKX architecture targets")
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Seems like the backends have been using pthreads since day one, yet
we've been missing the link.
With later commit we'll fix a typo, hence the libraries will be build
with -Wl,no-undefined, aka failing the build on unresolved symbols.
v2: Split from a larger patch.
Cc: [email protected]
Cc: Bruce Cherniak <[email protected]>
Cc: Tim Rowley <[email protected]>
Cc: Laurent Carlier <[email protected]>
Fixes: c6e67f5a9373e916a8d2 "gallium/swr: add OpenSWR rasterizer"
Reviewed-by: Eric Engestrom <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Linux-specific gettid() syscall shouldn't be used in portable code.
Fix does assume a 1:1 thread:LWP architecture, but works for our
current target platforms and can be revisited later if needed.
Fixes unresolved symbol in linux scons builds.
v2: add comment in code about the 1:1 assumption.
Cc: [email protected]
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
Source/destination will not be AVX512 aligned, use the
unaligned load/store intrinsics.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
Prevents unalignment crashes with avx512 code on gcc/clang.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
Tested with clang-4.0 and gcc-6.3.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Not built by default. Currently only builds with icc.
v2:
* document knl,skx possibilities for swr_archs
* merge with changed loader lib selection code
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Allow configuration of the SWR architecture depend libraries
we build for with --with-swr-archs. Maintains current behavior
by defaulting to avx,avx2.
Scons changes made to make it still build and work, but
without the changes for configuring which architectures.
v2:
* add missing comma for swr_archs default
* check that at least one architecture is enabled
* modify loader logic to make it clearer how to add archs
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The last user of the function was removed with earlier commit.
Fixes: 50842e8a931 ("swr: replace gallium->swr format enum conversion")
Cc: Tim Rowley <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
|
| |
Fixes performance regression from f50aa21456d - was forcing internal
code generation to target AVX (no gather, etc).
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
Previous check-ins without testing with USE_SIMD16_FRONTEND have
introduced regressions. This fixes the build, not the regressions.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
Core will ensure hot tiles are loaded for read and write render targets,
and will skip all output merger for read-only render targets.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
WIP to support read-only render targets.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If size of client memory copy is too large, don't copy. The draw will
access user-buffer directly and then block. This is faster and more
efficient than queuing many large client draws.
Applications that still use large client arrays benefit from this. VMD
is an example.
The threshold for this path defaults to 32KB. This value can be
overridden by setting environment variable SWR_CLIENT_COPY_LIMIT.
v2: Use #define for default value, rather than hard-coded constant.
Reviewed-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
|
|
| |
Moved reading of environment config options out of
swr_create_screen_internal, into a separate swr_validate_env_options.
This is to keep from cluttering create_screen.
Reviewed-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
|
|
| |
Removed the hard-coded constant in favor of a #define. Also removed
TODO comment. The constant value doesn't need an environment
configurable option.
Reviewed-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
swr used to build and link the rasterizer to the driver, and to support
multiple architectures we needed to have multiple versions of the
driver/rasterizer combination, which needed to link in much of mesa.
Changing to having one instance of the driver and just building
architecture specific versions of the rasterizer gives a large reduction
in disk space.
libGL.so 6464 Kb -> 7000 Kb
libswrAVX.so 10068 Kb -> 5432 Kb
libswrAVX2.so 9828 Kb -> 5200 Kb
Total 26360 Kb -> 17632 Kb
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Use the SWR rasterizer API through the table returned from
SwrGetInterface rather than referencing the functions directly.
This will allow us to move to a model of having the driver dynamically
load the appropriate swr architecture library.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
Needed to expose SwrGetInterface
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Cacheline alignment of SWR_STATS to prevent sharing of cachelines
between threads (performance).
Gets rid of gcc-7.1 warning about using c++17's over-aligned new
feature.
Cc: [email protected]
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Define these in terms of setzero for ancient gcc versions which don't
have the undefined intrinsics.
Cc: [email protected]
Reviewed-by: Bruce Cherniak <[email protected]>
|