| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
It seems clear that trying to multiply two pairs of doubles would result
in the temporary register getting overwritten by the second pair. So
make the code more explicit.
Tested-by: Glenn Kennard <[email protected]>
Tested-by: James Harvey <[email protected]>
Cc: 17.0 <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Cc: "17.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
What a3xx docs call IJPERSPCENTERREGID.. the xy coord passed into
bary.f. We were incorrectly setting both this and gl_FragCoord.xy to
the same register resulting in all sorts of hilarity.
Fixes stk, vdrift, 0ad, probably a bunch others.
Signed-off-by: Rob Clark <[email protected]>
Cc: "17.0" <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Note spritelist (POINTLIST_PSIZE) seems not to be a thing anymore on
a5xx.
Signed-off-by: Rob Clark <[email protected]>
Cc: "17.0" <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Cc: "17.0" <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Cc: "17.0" <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Cc: "17.0" <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Cc: "17.0" <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Cc: "17.0" <[email protected]>
|
|
|
|
|
|
|
| |
This reverts commit 9e4d1d8a7c0d60a6975d186944cd870e06f94773.
It broke arb_vertex_type_10f_11f_11f_rev-draw-vertices, which has
first_non_void == -1.
|
|
|
|
|
|
|
|
|
|
| |
Add resource_changed to the ddebug, rbug, and trace wrappers. Since it
is optional, there is no need to add it to noop.
Signed-off-by: Philipp Zabel <[email protected]>
Suggested-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Lucas Stach <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
from imported buffers
Implement the resource_changed pipe callback to invalidate internal
resources derived from imported buffers. This is needed to update the
texture for re-imported renderables.
Signed-off-by: Philipp Zabel <[email protected]>
Reviewed-by: Reviewed-by: Christian Gmeiner <[email protected]>
Signed-off-by: Lucas Stach <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Imported resources already have contents that we want to be copied to
texture resources derived from them. Set initial seqno of imported
resources to 1, just as if it had already been rendered to.
Signed-off-by: Philipp Zabel <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
Signed-off-by: Lucas Stach <[email protected]>
|
|
|
|
|
|
|
|
| |
This should fix a coverity defect.
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes VM faults. Discovered by Samuel Pitoiset.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98975
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99450
Cc: 17.0 13.0 <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
At least on VI, texture gather doesn't work with a 24_8 data format, so
use 8_8_8_8 and a modified swizzle instead.
A bit of background: When creating a GL_STENCIL_INDEX8 texture, we select
the X24S8 pipe format because we don't support stencil-only render targets
properly. With mip-mapping this can lead to a setup where the tiling is
incompatible with stencil texturing, and a flushed stencil texture is
used. For the flushed stencil, a literal X24S8 is used because there were
issues with an 8bpp DB->CB copy.
Longer term, it would be good if we could get away from these workarounds,
i.e. properly support an S8 format for stencil-only rendering and flushed
stencil. Since stencil texturing is somewhat rare, it's not a high
priority.
Fixes GL45-CTS.texture_cube_map_array.sampling.
Cc: 17.0 <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Acked-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This commit makes si_update_poly_offset set poly_offset to NULL if
uses_poly_offset is false. This way poly_offset either points into the
currently queued rasterizer, or it is NULL.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99451
Cc: "13.0 17.0" <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
v2: now it should be correct
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
Only vertex buffers use a separate bool flag.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
the mutex lock is inside util_range_add.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
the next commit will use it in a clever way, because the CP DMA prefetch
doesn't need this.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
If the shader selector is created with a different context than
the shader variant, we should use the calling context's target machine
for the shader variant.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99419
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit b7ac0f567123c96b5cd9e3485b963a5c0a0db66a.
This is a half baked solution needs some rework to fixes issues
with reported counter bits (GL_QUERY_COUNTER_BITS_ARB).
Also it enables PIPE_CAP_QUERY_TIME_ELAPSED accidently.
Signed-off-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch is the porting to android of the following commits:
b838f64 "ac/debug: Move sid_tables.h generation to common code."
0ef1b4d "ac/debug: Move IB decode to common code."
Fixes android building errors due to sid_tables.h
and ac_debug.c, ac_debug.h moved to amd/common
Tested by building nougat-x86
Acked-by: Nicolai Hähnle <[email protected]>
Acked-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LLVMInitializeAMDGPU* functions need to be explicitly declared
and mesa expects them via <llvm-c/Target.h> header,
but LLVM needs to be instructed to invoke its own LLVM_TARGET(AMDGPU) macro,
or the functions will not be available.
A new llvm cflag (-DFORCE_BUILD_AMDGPU) serves this purpose,
the same mechanism is used also by other llvm targets e.g. FORCE_BUILD_ARM
A necessary prerequisite is to have AMDGPU target handled accordingly
in llvm config files i.e. {Target,AsmParser,AsmPrinter}.def
for llvm device build includes.
This avoids the following building errors:
external/mesa/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c:129:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetInfo' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
LLVMInitializeAMDGPUTargetInfo();
^
external/mesa/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c:130:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTarget' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
LLVMInitializeAMDGPUTarget();
^
external/mesa/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c:131:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetMC' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
LLVMInitializeAMDGPUTargetMC();
^
external/mesa/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c:132:2: error: implicit declaration of function 'LLVMInitializeAMDGPUAsmPrinter' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
LLVMInitializeAMDGPUAsmPrinter();
^
Acked-by: Nicolai Hähnle <[email protected]>
Acked-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LLVMInitializeAMDGPU* functions need to be explicitly declared
and mesa expects them via <llvm-c/Target.h> header,
but LLVM needs to be instructed to invoke its own LLVM_TARGET(AMDGPU) macro,
or the functions will not be available.
A new llvm cflag (-DFORCE_BUILD_AMDGPU) serves this purpose,
the same mechanism is used also by other llvm targets e.g. FORCE_BUILD_ARM
A necessary prerequisite is to have AMDGPU target handled accordingly
in llvm config files i.e. {Target,AsmParser,AsmPrinter}.def
for llvm device build includes.
This avoids the following building errors:
external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:121:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetInfo' [-Werror=implicit-function-declaration]
LLVMInitializeAMDGPUTargetInfo();
^
external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:122:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTarget' [-Werror=implicit-function-declaration]
LLVMInitializeAMDGPUTarget();
^
external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:123:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetMC' [-Werror=implicit-function-declaration]
LLVMInitializeAMDGPUTargetMC();
^
external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:124:2: error: implicit declaration of function 'LLVMInitializeAMDGPUAsmPrinter' [-Werror=implicit-function-declaration]
LLVMInitializeAMDGPUAsmPrinter();
^
Acked-by: Nicolai Hähnle <[email protected]>
Acked-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
No point in having the extra argument considering that it's effectively
unused since the function was introduced.
Cc: Ilia Mirkin <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This enables the PIPE_CAP_OCCLUSION_QUERY capability without adding an
occlusion query type.
This is necessary to get Mesa to report desktop GL 2.0 support (to run
exciting things such as ioq3's OpenGL 2 renderer), and should be valid
because exposing the capability does not guarantee that any
counters are actually implemented.
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
| |
Fixes compile warning introduced by commit a1c848.
Signed-off-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
| |
Fixes compile warning introduced by commit ee3ebe.
Signed-off-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
| |
Cc: 17.0 13.0 <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Address loading can often end up as shl + shr + shl combinations. The
latter two are equal shifts, which get converted into an and mask.
However if the previous shl is more than the mask is trying to remove
(in terms of low bits), we can just remove the and entirely. This
reduces some large shaders by as many as 3% of instructions (out of 2K).
total instructions in shared programs : 6495509 -> 6491076 (-0.07%)
total gprs used in shared programs : 954621 -> 954623 (0.00%)
local gpr inst bytes
helped 0 0 1014 1014
hurt 0 2 0 0
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We don't need to support all the color buffers for advanced blend, just
cb0. For Fermi, we use the special binding slots so that we don't
overlap with user textures, while Kepler+ gets a dedicated position for
the fb handle in the driver constbuf.
This logic is only triggered when a FBFETCH is actually present so it
should be a no-op most of the time.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
This is so that we can differentiate between flushing any framebuffer
reading caches from regular sampler caches.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
This had been updated in one place but not the other.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
The existing lowering is in place to lower that to RCP + MUL, or fancier
things down the line if necessary.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Vedran Miletić <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
v2: add u_bit_consecutive64
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
It should be close to the GPU load, but it can be much lower if something
is stalling shader execution (e.g. CP DMA).
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
The next patch will add SPI_BUSY monitoring.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
so that the graphs are independent from FPS.
Reviewed-by: Nicolai Hähnle <[email protected]>
|