| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
This is for preparation of ARB_clip_control.
v3:
Add comments.
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Mathias Froehlich <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This allows vc4_opt_cse.c to CSE-away operations involving the same
uniform values.
total instructions in shared programs: 37341 -> 36906 (-1.16%)
instructions in affected programs: 10233 -> 9798 (-4.25%)
total uniforms in shared programs: 10523 -> 10320 (-1.93%)
uniforms in affected programs: 2467 -> 2264 (-8.23%)
|
|
|
|
|
|
|
| |
This saves a bunch of extra flushes when texsubimaging a whole texture
that's been used for rendering, or subdataing a whole BO. In particular,
this massively reduces the runtime of piglit texture-packed-formats (when
the probes have been moved out of the inner loop).
|
|
|
|
| |
I'm going to want to make some other decisions here before flushing.
|
|
|
|
|
|
|
| |
total instructions in shared programs: 39022 -> 37341 (-4.31%)
instructions in affected programs: 26979 -> 25298 (-6.23%)
total uniforms in shared programs: 11242 -> 10523 (-6.40%)
uniforms in affected programs: 5836 -> 5117 (-12.32%)
|
|
|
|
|
|
| |
I'm going to be using VC4_DEBUG=shaderdb,norast to do shaderdb stats, but
when debugging regressions, I want to match shaderdb output to shader
disassembly.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 01e637114914453451becc0dc8afe60faff48d84.
Since then many Hyper-Z issues have been fixed or worked around.
Enable Hyper-Z by default so that we get enough feedback for the upcoming
mesa 10.4 release.
If you have issues with Hyper-Z try to disable Hyper-Z using the enviroment
variable R600_DEBUG=nohyperz and please report the issue on the bugtracker.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75011
See also: https://bugs.freedesktop.org/show_bug.cgi?id=75112
Signed-off-by: Andreas Boll <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
| |
|
|
|
|
|
|
|
| |
The compiler isn't privy to the knowledge that we're doing at least one
framebuffer write.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jan Vesely <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
| |
This reverts commit 94bb33617d1e8978dc52b8aaa4eb41bfb6703f79.
Which somehow broke gnome-shell.. and needs more investigation. For
now, revert..
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fd_bo_cpu_prep() doesn't realize the bo is already referenced in
unflushed cmdstream. It could be made to do so (but would have to be
implemented twice, ie. both for msm and kgsl). But we still can't do
the expected thing if the caller isn't using _NOSYNC. Because of the
way the tiling works, we need to build quite a bit of cmdstream at flush
time, which is not possible to do at the libdrm level.
So rather than trying to make fd_bo_cpu_prep() smarter than it can
possibly be, just *always* discard and reallocate if the
PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE flag is set.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
v2: test for libelf once, check in both radeon and clover
CC: Tom Stellard <[email protected]>
CC: Emil Velikov <[email protected]>
CC: Francisco Jerez <[email protected]>
Signed-off-by: Jan Vesely <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Seems to be a typo judging from the overall declaration of the
template.
Cc: EdB <[email protected]>
Cc: Francisco Jerez <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
- Use macro for munmap under Android - the STATIC_ASSERT uses
a off_t which is not used under Android for mmap. As loff_t size
does not vary as does off_t just ignore the assert.
- Wrap the long lines to improve readability.
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Fix the trivial typo in the variable name.
Cc: "10.2 10.3" <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Alon Levy <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Alon Levy <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Alon Levy <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The setMCJITMemoryManager method doesn't exist in LLVM 3.3.
I thought I had tested the latest version of my earlier change with LLVM
3.3, but it looks I missed it.
Trivial.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
JITMemoryManager was removed in LLVM 3.6, and replaced by its base class
RTDyldMemoryManager.
This change fixes our JIT memory managers specializations to derive from
RTDyldMemoryManager in LLVM 3.6 instead of JITMemoryManager.
This enables llvmpipe to run with LLVM 3.6.
However, lp_free_generated_code is basically a no-op because there are
not enough hook points in RTDyldMemoryManager to track and free the code
of a module. In other words, with MCJIT, code once created, stays
forever allocated until process destruction. This is not speicfic to
LLVM 3.6 -- it will happen whenever MCJIT is used regardless of version.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
Replace tabs with spaces.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We'll need to update gallivm for the interface changes in LLVM 3.6, and
the fewer the number of older LLVM versions we support the less hairy that
will be.
As consequence HAVE_AVX define can disappear. (Note HAVE_AVX meant
whether LLVM version supports AVX or not. Runtime support for AVX is
always checked and enforced independently.)
Verified llvmpipe builds and runs with with LLVM 3.3, 3.4, and 3.5.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
The functionality exposed by those extensions does not appear in ES3
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Trivial.
|
|
|
|
|
|
| |
Consistently use the function pointer we saved earlier.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
We need to create all fs shaders in this function.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
If the user calls util_blitter_cache_all_shaders() set a flag and assert
that we never try to create any new fragment shaders after that point.
If the assertions fails, it means we missed generating some shader in
util_blitter_cache_all_shaders().
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Avoids a crash in case of negative array index is used in a
shader program.
Cc: <[email protected]>
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently Mesa crashes with a shader like this:
[fragmnet shader]
float[5] array;
int idx = -2;
void main()
{
gl_FragColor = vec4(0.0, 1.0, 0.0, array[idx]);
}
Cc: <[email protected]>
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes use-after-free when the currently bound blend state is destroyed.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85267
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84140
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we generated an extra CMP instruction:
cmp.ge.f0(8) g6<1>D g1<0,4,1>F 0F
cmp.nz.f0(8) null g6<4,4,1>D 0D
(+f0) sel(8) g5<1>F g1.4<0,4,1>F g2<0,4,1>F
The first operand is always a boolean, and we want to predicate the SEL
on that. Rather than producing a boolean value and comparing it against
zero, we can just produce a condition code in the flag register.
Now we generate:
cmp.ge.f0(8) null g1<0,4,1>F 0F
(+f0) sel(8) g5<1>F g1.4<0,4,1>F g2<0,4,1>F
No difference in shader-db.
v2: Remember to delete the old code (thanks Matt).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using dst_reg(this, ir->type) automatically sets the writemask to the
proper size for the type; src_reg(dst_reg) preserves that. This should
be equivalent, but less code.
Note that src_reg(dst_reg) either uses SWIZZLE_XXXX or SWIZZLE_XYZW, so
the old code did need the manual writemask adjustment, since it
constructed the registers the other way around.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Nothing uses the vector_elements temporary variable.
Setting this->result.file is dead because we overwrite this->result a
few lines later.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we generated an extra CMP instruction:
cmp.ge.f0(8) g4<1>D g2<0,1,0>F 0F
cmp.nz.f0(8) null g4<8,8,1>D 0D
(+f0) sel(8) g120<1>F g2.4<0,1,0>F g3<0,1,0>F
The first operand is always a boolean, and we want to predicate the SEL
on that. Rather than producing a boolean value and comparing it against
zero, we can just produce a condition code in the flag register.
Now we generate:
cmp.ge.f0(8) null g2<0,1,0>F 0F
(+f0) sel(8) g124<1>F g2.4<0,1,0>F g3<0,1,0>F
total instructions in shared programs: 5473459 -> 5473253 (-0.00%)
instructions in affected programs: 6219 -> 6013 (-3.31%)
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A while back, Matt made the uniform upload functions simply upload
ctx->Const.UniformBooleanTrue for boolean values instead of 0/1, which
removed the need to convert it later. We also set UniformBooleanTrue to
1.0f for drivers which want to treat booleans as 0.0/1.0f.
Nothing ever sets these, so they are dead.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
Also fix z16 restore format which was completely wrong.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
We don't have a scissor enable bit in hw, so when a raster state change
results in scissor enable bit changing, we need to also mark scissor
state as dirty.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The optimization of avoiding restore (mem2gmem) if there was a clear
falls down a bit if you don't have a fullscreen scissor. We need to
make the decision logic a bit more clever to keep track of *what* was
cleared, so that we can (a) completely skip mem2gmem if entire buffer
was cleared, or (b) skip mem2gmem on a per-tile basis for tiles that
were completely cleared.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DataLayoutPass was added in LLVM 3.5 r202168, commit
57edc9d4ff1648568a5dd7e9958649065b260dca "Make DataLayout a plain
object, not a pass.".
This patch fixes this build error with LLVM 3.4.
CXX llvm/libclllvm_la-invocation.lo
llvm/invocation.cpp: In function 'void {anonymous}::optimize(llvm::Module*, unsigned int, const std::vector<llvm::Function*>&)':
llvm/invocation.cpp:324:18: error: expected type-specifier
PM.add(new llvm::DataLayoutPass(mod));
^
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85189
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The values are hardcoded in the LLVM backend, but the TGSI definitions are
going to be changed with tessellation, e.g. TGSI_PROCESSOR_COMPUTE will be
increased by 2.
We'll use VS for LS and HS, because there's nothing special about them
from the LLVM backend point of view, even though the hardware side is
different. We do the same for ES.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
v2: document the new functions
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
I'll need indexed loads without the meta data flag for tessellation later.
Also rename load_const to buffer_load_const to distinguish it from indexed
const loads.
v2: add comments
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|