| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
|
|
|
|
|
|
| |
we need rounding modes on other conversions involving floats and it is easier
to rename f2f16_undef than renaming all the other ones.
v2: rebased on master
Reviewed-by: Jason Ekstrand <[email protected]>
Acked-by: Rob Clark <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
R600_DEBUG=gisel will tell LLVM to use GlobalISel rather than
SelectionDAG for instruction selection.
v2: mareko: move the helper to src/amd/common
Signed-off-by: Marek Olšák <[email protected]>
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
v2: Fixed dvec3 loads (bas)
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
LLVM 7 returns incorrect results when count is 0, something
has been broken since LLVM 6. Of course, the best solution is
to fix LLVM but this workaround works as expected for now.
Original workaround by Philippe Rebohle.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107276
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
| |
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
In Python 2, `print` was a statement, but it became a function in
Python 3.
Using print functions everywhere makes the script compatible with Python
versions >= 2.6, including Python 3.
Signed-off-by: Mathieu Bridon <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
Acked-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
In Python, dictionaries and sets are unordered, and as a result their
is no guarantee that running this script twice will produce the same
output.
Using ordered dicts and explicitly sorting items makes the build more
reproducible, and will make it possible to verify that we're not
breaking anything when we move the build scripts to Python 3.
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is basically LLVMTargetMachineEmitToMemoryBuffer inlined and reworked.
struct ac_compiler_passes (opaque type) contains the main pass manager.
ac_create_llvm_passes -- the result can go to thread local storage
ac_destroy_llvm_passes -- can be called by a destructor in TLS
ac_compile_module_to_binary -- from LLVMModuleRef to ac_shader_binary
The motivation is to do the expensive call addPassesToEmitFile once
per context or thread.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
Some of the compiler functions are no longer called outside
the util file.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This ports radv to the shared code, however due to a bug in LLVM
version prior to 7, radv cannot add target info at this stage,
as it would leak one for every shader compile, however I'd prefer
to keep this llvm damage in the shared code, since it isn't the
driver at fault here. We just add a flag to denote if the driver
can support leaking the target info or not, and the common code
does the right thing depending on the llvm version.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We want to share this code with radv in the future, so port
it out of radeonsi.
Add a return value as radv will want that to know if this
succeeds
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This doesn't do much yet, but it makes it easier to move the code
to a common shared code base.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
As precursor to moving init to common code, just rename the struct
and move it.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
This adds a inline always pass, but otherwise should work the
same.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
This just splits out the non-shared code and reuses ac_get_llvm_target in radv.
v2: rebase on Marek's patch - fixup brace position/whitespace
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
This removes some ugly code around module initialization.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
This removes useless s_waitcnt before barriers.
Only radeonsi uses this function.
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
cmask_size is changed to uint32_t because it can't be greater than 4GB.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
Acked-by: Rob Clark <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
Acked-by: Rob Clark <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Dave Airlie <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Acked-by: Rob Clark <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Dave Airlie <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Acked-by: Rob Clark <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Dave Airlie <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Acked-by: Rob Clark <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Dave Airlie <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Acked-by: Rob Clark <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Dave Airlie <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Acked-by: Rob Clark <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Dave Airlie <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Acked-by: Rob Clark <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Dave Airlie <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: Store the result in ctx->ssa_defs.
Acked-by: Rob Clark <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Dave Airlie <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
The values for the radeon winsys were copied from the kernel driver.
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
| |
We HTILE compress stencil-only surfaces too.
CC: 18.1 <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Somehow valgrind misses that the value is initialized by the ioctl.
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
| |
RADV might wanna use this helper too.
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
| |
The change from MIN2 to MAX2 is intentional.
Cc: 18.1 <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes the gcc warning:
snprintf’ output between 26 and 33 bytes into a destination of size 32
Fixes: d5f7ebda3ec0 ("ac: add LLVM build functions for subgroup instrinsics")
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The LLVM 6 code reduced it to a non-array call. We need to do that
with the new code too.
This fixes dEQP-VK.glsl.texture_functions.query.texturequerylod.*array* for radv.
Fixes: a9a79934412 "amd/common: use the dimension-aware image intrinsics on LLVM 7+"
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
Requires LLVM trunk r329166.
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Leftover from bring up.
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
WQM is pretty reliable now on LLVM 7, so let us just use
DPP + WQM.
This gives approximately a 1.5% performance increase on the
vrcompositor built-in benchmark.
v2: Use ac_build_quad_swizzle.
Reviewed-by: Nicolai Hähnle <[email protected]>
|