| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
|
|
|
|
| |
I forgot that we cannot emit vertex shader state on a chip without VS.
In such a case, clip_halfz is handled by the Draw module.
|
|
|
|
|
| |
Cc: 10.2 10.3 [email protected]
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
| |
Fixes piglit/polygon-mode-offset.
Cc: 10.2 10.3 [email protected]
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
| |
Fixes piglit/polygon-mode-offset.
Cc: 10.2 10.3 [email protected]
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Caveat: Shaders using UBO/sampler indexing will
not be optimized by SB, due to SB not currently
supporting the necessary CF_INDEX_[01] index
registers.
Signed-off-by: Glenn Kennard <[email protected]>
|
|
|
|
|
|
| |
Requires evergreen/cayman
Signed-off-by: Glenn Kennard <[email protected]>
|
|
|
|
|
|
|
| |
This enables ARB_conditional_render_inverted.
Signed-off-by: Tobias Klausmann <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A fanin (merge) node requires its srcs to be "adjacent" in consecutive
scalar registers. Keep track of instruction neighbors in the
copy-propagation step and avoid eliminating mov's which would cause an
instruction to need multiple distinct left and/or right neighbors.
This lets us not fall on our face when we encounter things like:
1: MOV TEMP[2], IN[0].xyzw
2: TEX OUT[0].xy, TEMP[2], SAMP[0], SHADOW2D
3: MOV TEMP[2].xy, IN[0].yxzz
4: TEX OUT[0].zw, TEMP[2], SAMP[0], SHADOW2D
5: END
Signed-off-by: Rob Clark <[email protected]>
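The neighbor bookkeeping described above can be sketched roughly as follows. This is a simplified illustration, not the ir3 code: `struct instr`, `neighbor_ok` and `try_set_neighbors` are hypothetical names, and the sketch assumes that a missing neighbor (NULL) places no constraint on register assignment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified instruction: only the neighbor links matter here. */
struct instr {
    struct instr *left;   /* must end up in the previous scalar register */
    struct instr *right;  /* must end up in the next scalar register */
};

/* A wanted neighbor is compatible if nothing is recorded yet, nothing is
 * wanted, or both agree. (Assumption: NULL means "no constraint".) */
static bool neighbor_ok(const struct instr *have, const struct instr *want)
{
    return have == NULL || want == NULL || have == want;
}

/* Try to use `src` directly in a fanin slot between `left` and `right`.
 * On success the links are recorded and true is returned. On conflict
 * nothing changes and false is returned, meaning the mov feeding this
 * slot should be kept instead of being propagated away. */
static bool try_set_neighbors(struct instr *src,
                              struct instr *left, struct instr *right)
{
    assert(src != NULL);
    if (!neighbor_ok(src->left, left) || !neighbor_ok(src->right, right))
        return false;
    if (left)
        src->left = left;
    if (right)
        src->right = right;
    return true;
}
```

In the TGSI example above, the second TEX wants the components of IN[0] in a different order than the first, so a check of this kind would be expected to fail for it, leaving its mov's in place.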
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Always insert extra mov's for the tex coord into the fanin. This
simplifies things a bit, and avoids a scenario where multiple sam
instructions can have mutually exclusive inputs to their fanins, for
example:
1: TEX OUT[0].xy, IN[0].xyxx, SAMP[0], 2D
2: TEX OUT[0].zw, IN[0].yxxx, SAMP[0], 2D
The CP pass can always remove the mov's that are not actually needed,
so better to start out with too many mov's in the front end, than not
enough.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
dscis -> noscis
dbypass -> nobypass
A bit more consistent w/ nobin, etc., and IMO more sensible names.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Kills get added to the outputs list, to ensure they get scheduled. But
they aren't *really* outputs so skip them in the header comment block.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to test compiler changes more easily, spit out the assembled
shader with some header information so that we can know about
inputs/outputs more easily.
See: git://people.freedesktop.org/~robclark/ir3test
In ir3test we have a big collection of tgsi shaders and reference
ir3_compiler outputs. When making compiler changes, regenerate the
compiler outputs and feed to ir3test to compare the new vs reference
shader.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
The last few dwords were skipped if the total number of dwords was not a
multiple of 4. Change the formatting for better readability.
Signed-off-by: Chia-I Wu <[email protected]>
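A minimal sketch of a dword dump that does not drop the tail when the count is not a multiple of 4. The function name `dump_dwords` and the output layout (byte offset, then up to four dwords per row) are assumptions for illustration, not the driver's actual format.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: print `count` dwords, four per row, each row
 * prefixed with its byte offset. The last row may hold 1 to 3 dwords.
 * Returns the number of rows written. */
static int dump_dwords(FILE *out, const uint32_t *dw, int count)
{
    int rows = 0;

    assert(dw != NULL || count == 0);
    for (int i = 0; i < count; i += 4) {
        /* Clamp the row length so a partial final row is still printed. */
        int n = (count - i < 4) ? count - i : 4;

        fprintf(out, "0x%08x:", (unsigned)(i * 4));
        for (int j = 0; j < n; j++)
            fprintf(out, " 0x%08x", (unsigned)dw[i + j]);
        fputc('\n', out);
        rows++;
    }
    return rows;
}
```

Stepping the outer loop by 4 and clamping the inner loop avoids the failure mode where a loop over `count / 4` full rows silently skips the remaining `count % 4` dwords.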
|
|
|
|
|
|
|
|
| |
Fixes:
- https://bugs.freedesktop.org/show_bug.cgi?id=85377
- http://llvm.org/bugs/show_bug.cgi?id=21365
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
So that the order of test messages and gallivm/llvmpipe debug output is
preserved.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
In preparation for ARB_clip_control. Let the driver decide if
it supports pipe_rasterizer_state::clip_halfz being set to true.
v3:
Initially enable on ilo.
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Mathias Froehlich <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This allows vc4_opt_cse.c to CSE-away operations involving the same
uniform values.
total instructions in shared programs: 37341 -> 36906 (-1.16%)
instructions in affected programs: 10233 -> 9798 (-4.25%)
total uniforms in shared programs: 10523 -> 10320 (-1.93%)
uniforms in affected programs: 2467 -> 2264 (-8.23%)
|
|
|
|
|
|
|
| |
This saves a bunch of extra flushes when texsubimaging a whole texture
that's been used for rendering, or subdataing a whole BO. In particular,
this massively reduces the runtime of piglit texture-packed-formats (when
the probes have been moved out of the inner loop).
|
|
|
|
| |
I'm going to want to make some other decisions here before flushing.
|
|
|
|
|
|
|
| |
total instructions in shared programs: 39022 -> 37341 (-4.31%)
instructions in affected programs: 26979 -> 25298 (-6.23%)
total uniforms in shared programs: 11242 -> 10523 (-6.40%)
uniforms in affected programs: 5836 -> 5117 (-12.32%)
|
|
|
|
|
|
| |
I'm going to be using VC4_DEBUG=shaderdb,norast to do shaderdb stats, but
when debugging regressions, I want to match shaderdb output to shader
disassembly.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 01e637114914453451becc0dc8afe60faff48d84.
Since then many Hyper-Z issues have been fixed or worked around.
Enable Hyper-Z by default so that we get enough feedback for the upcoming
mesa 10.4 release.
If you have issues with Hyper-Z, try disabling it using the environment
variable R600_DEBUG=nohyperz and please report the issue on the bugtracker.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=75011
See also: https://bugs.freedesktop.org/show_bug.cgi?id=75112
Signed-off-by: Andreas Boll <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This reverts commit 94bb33617d1e8978dc52b8aaa4eb41bfb6703f79.
That commit somehow broke gnome-shell and needs more investigation, so
revert it for now.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
fd_bo_cpu_prep() doesn't realize the bo is already referenced in
unflushed cmdstream. It could be made to do so (but would have to be
implemented twice, ie. both for msm and kgsl). But we still can't do
the expected thing if the caller isn't using _NOSYNC. Because of the
way the tiling works, we need to build quite a bit of cmdstream at flush
time, which is not possible to do at the libdrm level.
So rather than trying to make fd_bo_cpu_prep() smarter than it can
possibly be, just *always* discard and reallocate if the
PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE flag is set.
Signed-off-by: Rob Clark <[email protected]>
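The "always discard and reallocate" policy can be sketched as below. Everything here is a stand-in: `struct bo`, `bo_new`, `bo_unref`, `map_for_write` and `XFER_DISCARD_WHOLE_RESOURCE` are hypothetical names that model a refcounted buffer object with plain heap memory, not the libdrm or freedreno API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE. */
#define XFER_DISCARD_WHOLE_RESOURCE (1u << 0)

/* Hypothetical refcounted buffer object. */
struct bo {
    int refcnt;
    size_t size;
};

static struct bo *bo_new(size_t size)
{
    struct bo *bo = calloc(1, sizeof(*bo));
    if (bo) {
        bo->refcnt = 1;
        bo->size = size;
    }
    return bo;
}

static void bo_unref(struct bo *bo)
{
    assert(bo != NULL && bo->refcnt > 0);
    if (--bo->refcnt == 0)
        free(bo);
}

struct resource {
    struct bo *bo;
};

/* If the caller discards the whole resource, swap in a fresh bo rather
 * than asking whether the old one is busy. Pending work keeps its own
 * reference to the old bo, so nothing has to be flushed or waited on.
 * (A real driver would still need to sync on the non-discard path.) */
static struct bo *map_for_write(struct resource *rsc, unsigned usage)
{
    if (usage & XFER_DISCARD_WHOLE_RESOURCE) {
        struct bo *fresh = bo_new(rsc->bo->size);
        if (fresh) {
            bo_unref(rsc->bo);
            rsc->bo = fresh;
        }
    }
    return rsc->bo;
}
```

The design choice is to trade a little extra allocation for never having to reason about whether the old buffer is referenced by unflushed cmdstream.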
|
|
|
|
|
|
| |
Fix the trivial typo in the variable name.
Cc: "10.2 10.3" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We'll need to update gallivm for the interface changes in LLVM 3.6, and
the fewer the number of older LLVM versions we support the less hairy that
will be.
As a consequence, the HAVE_AVX define can disappear. (Note that HAVE_AVX
indicated whether the LLVM version supports AVX. Runtime support for AVX
is always checked and enforced independently.)
Verified llvmpipe builds and runs with LLVM 3.3, 3.4, and 3.5.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes use-after-free when the currently bound blend state is destroyed.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85267
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84140
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
| |
Also fix z16 restore format which was completely wrong.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
We don't have a scissor enable bit in hw, so when a raster state change
results in scissor enable bit changing, we need to also mark scissor
state as dirty.
Signed-off-by: Rob Clark <[email protected]>
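A minimal sketch of the dirty-flag handling described above, assuming a context that keeps a bitmask of dirty state. The names `DIRTY_RASTERIZER`, `DIRTY_SCISSOR`, `struct rast_state`, `struct ctx` and `bind_rasterizer` are hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical dirty bits. */
enum {
    DIRTY_RASTERIZER = 1 << 0,
    DIRTY_SCISSOR    = 1 << 1,
};

/* Hypothetical rasterizer state: only the scissor enable matters here. */
struct rast_state {
    bool scissor;
};

struct ctx {
    const struct rast_state *rast;
    unsigned dirty;
};

/* Bind a new rasterizer state. Since the hw has no scissor enable bit,
 * the enable is folded into the scissor rectangle that gets emitted, so
 * a change of the enable must also re-emit scissor state. */
static void bind_rasterizer(struct ctx *c, const struct rast_state *next)
{
    assert(c != NULL);

    bool old_en = c->rast && c->rast->scissor;
    bool new_en = next && next->scissor;

    c->rast = next;
    c->dirty |= DIRTY_RASTERIZER;
    if (old_en != new_en)
        c->dirty |= DIRTY_SCISSOR;
}
```

Rebinding a state with the same scissor enable leaves the scissor flag untouched, so unrelated rasterizer changes do not cause redundant scissor emits.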
|
|
|
|
|
|
|
|
|
|
|
| |
The optimization of avoiding restore (mem2gmem) if there was a clear
falls down a bit if you don't have a fullscreen scissor. We need to
make the decision logic a bit more clever to keep track of *what* was
cleared, so that we can (a) completely skip mem2gmem if entire buffer
was cleared, or (b) skip mem2gmem on a per-tile basis for tiles that
were completely cleared.
Signed-off-by: Rob Clark <[email protected]>
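The decision logic in (a) and (b) can be sketched with a single containment test, assuming the driver remembers one cleared rectangle per buffer. `struct rect`, `rect_contains` and `needs_restore` are hypothetical names for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Half-open rectangle: covers [x0, x1) x [y0, y1). */
struct rect {
    int x0, y0, x1, y1;
};

/* True if `inner` lies entirely within `outer`. */
static bool rect_contains(const struct rect *outer, const struct rect *inner)
{
    return outer->x0 <= inner->x0 && outer->y0 <= inner->y0 &&
           outer->x1 >= inner->x1 && outer->y1 >= inner->y1;
}

/* Decide whether `area` needs a restore (mem2gmem). Pass the whole
 * framebuffer as `area` for case (a), or a single tile for case (b).
 * A restore can be skipped only if the clear covered all of `area`. */
static bool needs_restore(bool cleared, const struct rect *clear_area,
                          const struct rect *area)
{
    assert(area != NULL);
    if (!cleared)
        return true;
    return !rect_contains(clear_area, area);
}
```

With a 64x64 framebuffer and a clear scissored to the left half, tiles fully inside the left half skip the restore, while tiles that straddle or sit outside the cleared area still need it.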
|
|
|
|
|
|
|
|
|
|
|
|
| |
The values are hardcoded in the LLVM backend, but the TGSI definitions are
going to be changed with tessellation, e.g. TGSI_PROCESSOR_COMPUTE will be
increased by 2.
We'll use VS for LS and HS, because there's nothing special about them
from the LLVM backend point of view, even though the hardware side is
different. We do the same for ES.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
v2: document the new functions
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
I'll need indexed loads without the meta data flag for tessellation later.
Also rename load_const to buffer_load_const to distinguish it from indexed
const loads.
v2: add comments
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
st/mesa and gallium expect the DX9 format, so this is useless.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
| |
This reverts commit 032e5548b3d4b5efa52359218725cb8e31b622ad.
I've run glsl-max-varyings 30 times and it always passed.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
| |
The si_pm4_delete_state calls became useless, because the pm4 state is
always generated only once.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
| |
It seemed like the function needed a context pointer. Let's remove it
to make it less confusing.
Reviewed-by: Michel Dänzer <[email protected]>
|
| |
|
|
|
|
|
|
|
|
| |
With 5 shader stages and various combinations of enabled and disabled shaders,
the maximum number of outputs in one shader doesn't have to be equal to
the maximum number of inputs in the following shader.
v2: return 32 for softpipe and llvmpipe
|
|
|
|
| |
Fixes glean blendFunc.
|
|
|
|
|
|
|
| |
If the writemask doesn't compress, then we want to put in the uncompressed
writemask, not the compressed writemask failure value (all-on).
Fixes glean's stencil2 and fbo-clear-formats on stencil.
|
|
|
|
|
| |
Fixes regressions in the next bugfix, because gallium util stuff leaves
the back stencil state as 0 if !back->enabled.
|