| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
| |
|
| |
|
|
|
|
| |
Acked-by: Rob Clark <[email protected]>
|
|
|
|
| |
Acked-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Lots of drivers need to transform the weird instructions in TGSI into
reasonable scalar ops, and this code can make those translations
canonical.
Acked-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
The simulator assertion fails if you have a write to a reg and then a read
(for example, in the NOP side of an instruction), even if the read isn't
used for anything. By setting unused raddrs to NOP, we avoid the problem
(since only the phsyical registers are tracked).
|
|
|
|
| |
I'm going to be doing the same logic for some more fields next.
|
|
|
|
|
|
| |
v2: Do not demote items that are already in the pool
Signed-off-by: Niels Ole Salscheider <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Original patch by Petri Latvala <[email protected]>:
Add an optimization pass that drops min/max expression operands that
can be proven to not contribute to the final result. The algorithm is
similar to alpha-beta pruning on a minmax search, from the field of
AI.
This optimization pass can optimize min/max expressions where operands
are min/max expressions. Such code can appear in shaders by itself, or
as the result of clamp() or AMD_shader_trinary_minmax functions.
This optimization pass improves the generated code for piglit's
AMD_shader_trinary_minmax tests as follows:
total instructions in shared programs: 75 -> 67 (-10.67%)
instructions in affected programs: 60 -> 52 (-13.33%)
GAINED: 0
LOST: 0
All tests (max3, min3, mid3) improved.
A full shader-db run:
total instructions in shared programs: 4293603 -> 4293575 (-0.00%)
instructions in affected programs: 1188 -> 1160 (-2.36%)
GAINED: 0
LOST: 0
Improvements happen in Guacamelee and Serious Sam 3. One shader from
Dungeon Defenders is hurt by shader-db metrics (26 -> 28), because of
dropping of a (constant float (0.00000)) operand, which was
compiled to a saturate modifier.
Version 2 by Iago Toral Quiroga <[email protected]>:
Changes from review feedback:
- Squashed various cosmetic changes sent by Matt Turner.
- Make less_all_components return an enum rather than setting a class member.
(Suggested by Mat Turner). Also, renamed it to compare_components.
- Make less_all_components, smaller_constant and larger_constant static.
(Suggested by Mat Turner)
- Change mixmax_range to call its limits "low" and "high" instead of
"range[0]" and "range[1]". (Suggested by Connor Abbot).
- Use ir_builder swizzle helpers in swizzle_if_required(). (Suggested by
Connor Abbot).
- Make the logic more clearer by rearrenging the code and commenting.
(Suggested by Connor Abbot).
- Added comment to explain why we need to recurse twice. (Suggested by
Connor Abbot).
- If we cannot prune an expression, do not return early. Instead, attempt
to prune its children. (Suggested by Connor Abbot).
Other changes:
- Instead of having a global "valid" visitor member, let the various functions
that can determine this status return a boolean and check for its value
to decide what to do in each case. This is more flexible and allows to
recurse into children of parents that could not be prunned due to invalid
ranges (so related to the last bullet in the review feedback).
- Make sure we always check if a range is valid before working with it. Since
any use of get_range, combine_range or range_intersection can invalidate
a range we should check for this situation every time we use any of these
functions.
Version 3 by Iago Toral Quiroga <[email protected]>:
Changes from review feedback:
- Now we can make get_range, combine_range and range_intersection static too
(suggested by Connor Abbot).
- Do not return NULL when looking for the larger or greater constant into
mixed vector constants. Instead, produce a new constant by doing a
component-wise minmax. With this we can also remove of the validations when
we call into these functions (suggested by Connor Abbot).
- Add a comment explaining the meaning of the baserange argument in
prune_expression (suggested by Connor Abbot).
Other changes:
- Eliminate minmax expressions operating on constant vectors with mixed values
by resolving them.
No piglit regressions observed with Version 3.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76861
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Patch fixes following test case from 'shaders-with-varyings' WebGL
conformance suite: "vertex shader with unused varying and fragment
shader with used varying must succeed"
v2: emit still a warning if the condition happens (Ian)
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
Instead of crashing.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79155#c5
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
When a shader needs N surfaces, we should upload N surfaces and not depend on
how many are bound. This commit is larger than it should be because we did
not export how many surfaces a surface uses before.
Signed-off-by: Chia-I Wu <[email protected]>
|
|
|
|
|
|
|
| |
When a shader needs N samplers, we should upload N samplers and not depend on
how many are bound.
Signed-off-by: Chia-I Wu <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
v2: fix svga too
|
|
|
|
|
|
|
| |
It only needs the constant buffer with clip planes and read-write resources
for the GS->VS ring and streamout. That's 2 pointers.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
I hope this is correct.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
We only support 16 vertex attribs, not 32.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is a wrong place to flush caches to say the least.
I don't think we need to flush the instruction caches if we don't patch
shaders with DMA.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
| |
st/mesa has the same flag in its shader key, we don't need to do it
in the driver anymore.
Instead, use TGSI_INTERPOLATE_LOC_SAMPLE, which is what st/mesa sets.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
| |
The first compiled shader is sometimes useless, because the key doesn't match
the key for the draw call where it's used.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
| |
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Use an array of properties indexed by TGSI_PROPERTY_* definitions.
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
| |
I'll need this in radeonsi.
v2: use __builtin_popcountll if available
Reviewed-by: Michel Dänzer <[email protected]>
|
| |
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "10.2 10.3" <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "10.3" <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes a release build segfault when wglCreateContextAttribsARB()
calls the wglCreateContext() function.
Cc: "10.3" <[email protected]>
Reviewed-by: Matthew McClure <[email protected]>
|
|
|
|
|
|
|
| |
Fixes a few issues, including a potential empty-IB (which triggers gpu
hangs in piglit occlusion_query_meta_no_fragments)
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Possibly we should map the front color to black (zeroes). But not sure
there is a way to do that without generating a shader variant.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Shaders like:
FRAG
PROPERTY FS_COLOR0_WRITES_ALL_CBUFS 1
DCL IN[0], GENERIC[0], PERSPECTIVE
DCL OUT[0], COLOR
DCL SAMP[0]
DCL TEMP[0], LOCAL
IMM[0] FLT32 { 0.0000, 1.0000, 0.0000, 0.0000}
0: TEX TEMP[0], IN[0].xyyy, SAMP[0], 2D
1: MOV OUT[0], IMM[0].xyxx
2: END
cause unhappyness. They have an IN[], but once this is compiled the
useless TEX instruction goes away. Leaving a varying that is never
fetched, which makes the hw unhappy.
In the process fix a signed vs unsigned compare. If the vertex shader
has max_reg=-1, MAX2() vs an unsigned would not give the desired result.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Will investigate after XDC.
|
|
|
|
|
|
|
|
| |
This reverts commit 54e30dbf4db437748509d1319c3f6e4185f76c69.
Will investigate after XDC.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84557
|
|
|
|
| |
Added in commit f9dc7aab.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On Windows, the Piglit primitive-restart test was failing a
glGetError()==0 assertion when it was run w/out any command line
arguments. Piglit's all.py script only runs primitive-restart
with arguments so this case isn't normally hit during a full
piglit run.
The basic problem is Microsoft's opengl32.dll calls glFlush
from wglGetProcAddress() and Piglit uses wglGetProcAddress() to
resolve glPrimitiveRestartNV() which is called inside glBegin/End.
See comments in the code for more info.
Plus, improve the comments for _mesa_alloc_dispatch_table().
Cc: <[email protected]>
Acked-by: Sinclair Yeh <[email protected]>
|
|
|
|
|
|
|
| |
Still failing a bunch of the fairly picky texelFetch tests, but the
1D(Array) ones are full passes.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
Experimentally, this makes *ArrayShadow tex-miplevel-selection tests
pass.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
We're still doing something wrong for array textures.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
Logic shamelessly copied from nv50 lowering pass.
Signed-off-by: Ilia Mirkin <[email protected]>
|