| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
| |
hash_table_replace doesn't use get_node to avoid having to hash the key twice.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Drivers implementing GLSL 1.30 want to do integer modulus, and until we
can stop generating code via ir_to_mesa, it's easier to make it silently
generate rubbish code. Multiply will do.
Signed-off-by: Kenneth Graunke <[email protected]>
Tested-by: Ian Romanick <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
INLINE is still seen in some files (some generated files, etc) but this
is a good start.
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Lots of things set and copy this field around, but nothing uses it.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
See also 8aadd89.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If color material mode is enabled, constant buffer entries related
to the material coefficients will depend on glColor. So add
_NEW_CURRENT_ATTRIB to the bitset returned for material-related
constants in _mesa_program_state_flags().
This fixes a bug exercised by the new piglit draw-arrays-colormaterial
test.
Note: This is a candidate for the 7.11 branch.
|
|
|
|
|
|
|
|
|
|
| |
For hardware drivers, we only have ir_to_mesa called for the purposes
of potential swrast fallbacks (basically never on a 1.30 driver),
which we don't really care about. This will allow 1.30 to be
implemented without rewriting swrast for it.
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GLSL 1.30 requires us to use gl_ClipDistance for clipping if the
vertex shader contains a static write to it, and otherwise use
user-defined clipping planes. Since the driver needs to behave
differently in these two cases, we need a flag to record whether the
shader has written to gl_ClipDistance.
The new flag is called UsesClipDistance. We initially store it in
gl_shader_program (since that is the data structure that is available
when we check to see whethe gl_ClipDistance was written to), and we
later copy it to a flag with the same name in gl_vertex_program, since
that is a more convenient place for the driver to access it (in i965,
at least).
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
The depth should be in W.
v2: adjust the assertion, add a comment
|
| |
|
|
|
|
|
|
|
| |
This is a better, more fine-grained way of lowering if statements. Fixes the
game And Yet It Moves on nv50.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Using multiply and reciprocal for integer division involves potentially
lossy floating point conversions. This is okay for older GPUs that
represent integers as floating point, but undesirable for GPUs with
native integer division instructions.
TGSI, for example, has UDIV/IDIV instructions for integer division,
so it makes sense to handle this directly. Likewise for i965.
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Bryan Cain <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
At least for Intel, all our uniform components are of uint32_t size, either
float or signed or unsigned int. For uploading uniform data in the driver,
it's much easier to upload a full dword per uniform element instead of trying
to pick out the bool byte and then fill in the top 3 bytes of pad with 0.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Ian Romanick explained (Message-Id: <[email protected]>),
that the return type of non-API methods shouldn't use GLboolean but a
standard C++ bool.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Bryan Cain <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Kai Wasserbäch <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
One unique aspect of TXS is that it doesn't have a coordinate.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Q should not be significant for OPCODE_TEX, but it winds up getting
passed to the compute_lambda() function. Make sure it's 1.0 to
prevent garbage values, which is effectively what we get when the
swizzle is coord.xyzz (which is what GLSL gives us).
Part of the fix for piglit's fbo-generatemipmap-array test.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Shader Model 3.0[1] requires that shaders be able to execute at least
65536 instructions. Bump Mesa maxExec to that limit. This allows
several vertex shaders in the OpenGL ES 2.0 conformance test suite to
run to completion.
1: http://en.wikipedia.org/wiki/High_Level_Shader_Language
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This cleans up some code generated by the IR-to-Mesa pass for i915.
In particular, some shaders involving arrays of constant matrices
result in really bad code.
v2: Silence several warnings from merging the gl_constant_value work.
Fix DP[23] folding. Add support for a bunch more opcodes that appear
in piglit runs on i915.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
!a && b occurs frequently when nexted if-statements have been
flattened. It should also be possible use a MAD for (a && b) || c,
though that would require a MAD_SAT.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The operation ir_binop_all_equal is !(a.x != b.x || a.y != b.y || a.z
!= b.z || a.w != b.w). Logical-or is implemented using addition
(followed by clampling to [0,1]) on values of 0.0 and 1.0. Replacing
the logical-or operators with addition gives !bool((int(a.x != b.x) +
int(a.y == b.y) + int(a.z == b.z) + int(a.w == b.w)). This can be
implemented using a dot-product with a vector of all 1.0. After the
dot-product, the value will be an integer on the range [0,4].
Previously a SEQ instruction was used to clamp the resulting logic
value to [0,1] and invert the result. Using an SGE instruction on the
negation of the dot-product result has the same effect. Many older
shader architectures do not support the SEQ instruction. It must be
emulated using two SGE instructions and a MUL. On these
architectures, the single SGE saves two instructions.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The operation ir_binop_any_nequal is (a.x != b.x) || (a.y != b.y) ||
(a.z != b.z) || (a.w != b.w), and that is the same as any(bvec4(a.x !=
b.x, a.y != b.y, a.z != b.z, a.w != b.w)). Implement the any() part
the same way the regular ir_unop_any is implemented.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is just like the ir_binop_logic_or case. The operation
ir_unop_any is (a.x || a.y || a.z || a.w). Logical-or is implemented
using addition (followed by clampling to [0,1]) on values of 0.0 and
1.0. Replacing the logical-or operators with addition gives (a.x +
a.y + a.z + a.w). This can be implemented using a dot-product with a
vector of all 1.0.
Previously a SNE instruction was used to clamp the resulting logic
value to [0,1]. In a fragment shader, using a saturate on the
dot-product has the same effect. Adding the saturate to the
dot-product is free, so (at least) one instruction is saved.
In a vertex shader, using an SLT on the negation of the dot-product
result has the same effect. Many older shader architectures do not
support the SNE instruction. It must be emulated using two SLT
instructions and an ADD. On these architectures, the single SLT saves
two instructions.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Logical-or is implemented using addition (followed by clampling to
[0,1]) on values of 0.0 and 1.0. Replacing the logical-or operators
with addition gives a + b which has a result on the range [0, 2].
Previously a SNE instruction was used to clamp the resulting logic
value to [0,1]. In a fragment shader, using a saturate on the add has
the same effect. Adding the saturate to the add is free, so (at
least) one instruction is saved.
In a vertex shader, using an SLT on the negation of the add result has
the same effect. Many older shader architectures do not support the
SNE instruction. It must be emulated using two SLT instructions and
an ADD. On these architectures, the single SLT saves two
instructions.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Since our logic values are 0.0 (false) and 1.0 (true), 1.0 - x
accurately implements logical not.
Reviewed-by: Eric Anholt <[email protected]>
|
| |
|
| |
|
| |
|
|
|
|
|
|
| |
Fixes a regression in codegen quality for ff_fragment_shader
conversion to GLSL -- glean texCombine produces 7.5% fewer Mesa IR
instructions.
|
|
|
|
|
| |
This fixes the remaining regression from ff_fragment_shader in Mesa IR
instruction count, to now being a 1.9% win overall.
|
|
|
|
| |
Fixes assorted warnings about float vs. gl_constant_value pointers.
|
| |
|
|\
| |
| |
| |
| |
| | |
Conflicts:
src/mesa/state_tracker/st_atom_pixeltransfer.c
src/mesa/state_tracker/st_program.c
|
| |
| |
| |
| |
| | |
Disabled by default on all drivers. To enable it, change ctx->GLSLVersion to 130
in st_extensions.c. Currently, softpipe is the only driver with integer support.
|
| |
| |
| |
| |
| |
| | |
The functionality is not used by anything yet, and the glUniform functions will
need to be reworked before this can reach its full usefulness. It is
nonetheless a step towards integer support in the state tracker and classic drivers.
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Rely on the driver to do the right thing. This probably means falling
back to software. Page 88 of the OpenGL 2.1 spec specifically says:
"A shader should not fail to compile, and a program object should
not fail to link due to lack of instruction space or lack of
temporary variables. Implementations should ensure that all valid
shaders and program objects may be successfully compiled, linked
and executed."
There is no provision for saying "No" to a valid shader that is
difficult for the hardware to handle, so stop doing that.
On i915 this causes a large number of piglit tests to change from FAIL
to WARN. The warning is because the driver still emits messages to
stderr like "i915_program_error: Unsupported opcode: BGNLOOP".
It also fixes ES2 conformance CorrectFull_frag and CorrectParse1_frag
on i915 (and probably other hardware that can't handle loops).
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| | |
The functions were almost identical.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Leaving the unused registers with other values caused assertion
failures and other problems in places that blindly iterate over all
sources.
brw_vs_emit.c:1381: get_src_reg: Assertion `c->regs[file][index].nr !=
0' failed.
Fixes i965 piglit:
vs-uniform-array-mat[234]-col-row-rd
vs-uniform-array-mat[234]-index-col-row-rd
vs-uniform-array-mat[234]-index-row-rd
vs-uniform-mat[234]-col-row-rd
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Fixes i965 piglit:
vs-temp-array-mat[234]-col-row-wr
vs-temp-array-mat[234]-index-col-row-wr
vs-temp-array-mat[234]-index-row-wr
vs-temp-mat[234]-col-row-wr
Fixes swrast piglit:
fs-temp-array-mat[234]-col-row-wr
fs-temp-array-mat[234]-index-col-row-wr
fs-temp-array-mat[234]-index-row-wr
fs-temp-mat[234]-col-row-wr
vs-temp-array-mat[234]-col-row-wr
vs-temp-array-mat[234]-index-col-row-wr
vs-temp-array-mat[234]-index-row-wr
vs-temp-mat[234]-col-row-wr
Reviewed-by: Eric Anholt <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This fixes many cases of accessing arrays of matrices using
non-constant indices at each level.
Fixes i965 piglit:
vs-temp-array-mat[234]-index-col-rd
vs-temp-array-mat[234]-index-col-row-rd
vs-temp-array-mat[234]-index-col-wr
vs-uniform-array-mat[234]-index-col-rd
Fixes swrast piglit:
fs-temp-array-mat[234]-index-col-rd
fs-temp-array-mat[234]-index-col-row-rd
fs-temp-array-mat[234]-index-col-wr
fs-uniform-array-mat[234]-index-col-rd
fs-uniform-array-mat[234]-index-col-row-rd
fs-varying-array-mat[234]-index-col-rd
fs-varying-array-mat[234]-index-col-row-rd
vs-temp-array-mat[234]-index-col-rd
vs-temp-array-mat[234]-index-col-row-rd
vs-temp-array-mat[234]-index-col-wr
vs-uniform-array-mat[234]-index-col-rd
vs-uniform-array-mat[234]-index-col-row-rd
vs-varying-array-mat[234]-index-col-rd
vs-varying-array-mat[234]-index-col-row-rd
vs-varying-array-mat[234]-index-col-wr
Reviewed-by: Eric Anholt <[email protected]>
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
inline the hotpath of the reference remaining the same. This shouldn't
penalise the slow path at all but improve the hot path so we don't have
to jump to the function.
It also moves some assert checks under an #ifndef NDEBUG.
Minor clean-ups added by Brian.
Signed-off-by: Dave Airlie <[email protected]>
Signed-off-by: Brian Paul <[email protected]>
|