| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
The linker_error() function sets prog->LinkStatus to false. There's
no reason for the caller of linker_error() to also do so.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
If the size argument isn't a multiple of four, we would have read/
copied uninitialized memory.
Fixes an issue reported by Myles C. Maxfield <[email protected]>
|
|
|
|
|
|
|
|
|
| |
See my explanation in mtypes.h.
v2: don't do this in gallium
v3: also updated the comment at the gl_shader_type definition
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Every driver left in Mesa enables this extension all the time. There's
no reason to let it be optional.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Every driver left in Mesa enables this extension all the time. There's
no reason to let it be optional.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Assertions are not sufficient to check for null pointers as they don't
show up in release builds. So, return ZeroVec/dummyReg instead of NULL
pointer in get_{src,dst}_register_pointer(). This should calm down the
warnings from static analysis tool.
Note: This is a candidate for the 9.1 branch.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 551c991606e543c3a264a762026f11348b37947e tried to avoid spilling
registers that were trivially colorable. But since we do optimistic
coloring, the top of the stack also contains nodes that are not trivially
colorable, so we need to consider them for spilling (since they are some
of our best candidates).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58384
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63674
NOTE: This is a candidate for the 9.1 branch.
|
|
|
|
|
|
|
|
|
|
|
| |
This code had no relation to ir_to_mesa.cpp, since it was also used by
intel and state_tracker, and most of it was duplicated with the standalone
compiler (which has periodically drifted from the Mesa copy).
v2: Split from the ir_to_mesa to shaderapi.c changes.
Acked-by: Paul Berry <[email protected]> (v1)
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
There was nothing ir_to_mesa-specific about this code, but it's not
exactly part of the compiler's core turning-source-into-IR job either.
v2: Split from the ir_to_mesa to glsl/ commit, avoid renaming the sh
variable.
Acked-by: Paul Berry <[email protected]> (v1)
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I noticed this while trying to merge code with the builtin compiler, which
does set it.
Note that this causes two regressions in piglit in
default-precision-sampler.* which try to link without a vertex or fragment
shader, due to being run under the desktop glslparsertest binary (using
ARB_ES3_compatibility) that doesn't know about this requirement.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
| |
We were duplicating this code all over the place, and they all would need
updating for the next set of shader targets.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have ir->print() to do the old declaration of a visitor and having the
IR accept the visitor (yuck!). And now you can call _mesa_print_ir()
safely anywhere that you know what an ir_instruction is.
A couple of missing printf("\n")s are added in error paths -- when an
expression is handed to the visitor, it doesn't print '\n' (since it might
be a step in printing a whole expression tree).
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
| |
Signed-off-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
i965 and radeon use ra_set_node_reg() to force payload registers to
specific registers while exposing those registers to the allocator still.
We were treating those register nodes as unsuccessfully allocated in the
ra_simplify() step, leading to walking the registers again to do
optimistic coloring even if there was nothing left ot do.
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The problem is the sampler units are allocated from the same pool for all
shader stages, so if a vertex shader uses 12 samplers (0..11), the fragment
shader samplers start at index 12, leaving only 4 sampler units
for the fragment shader. The main cause is probably the fact that samplers
(texture unit -> sampler unit mapping, etc.) are tracked globally
for an entire program object.
This commit adapts the GLSL linker and core Mesa such that the sampler units
are assigned to sampler uniforms for each shader stage separately
(if a sampler uniform is used in all shader stages, it may occupy a different
sampler unit in each, and vice versa, an i-th sampler unit may refer to
a different sampler uniform in each shader stage), and the sampler-specific
variables are moved from gl_shader_program to gl_shader.
This doesn't require any driver changes, and it fixes piglit/max-samplers
for gallium and classic swrast. It also works with any number of shader
stages.
v2: - converted tabs to spaces
- added an assertion to _mesa_get_sampler_uniform_value
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Relaxes the validation of
OPTION ARB_precision_hint_{nicest,fastest};
to allow duplicate options. The spec says that both /nicest/ and
/fastest/ cannot be specified together, but could be interpreted
either way for respecification of the same option.
Other drivers (NVIDIA etc) accept this, and at least one Unity3D game
expects it to succeed (Kerbal Space Program).
V2: Add spec quote.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This will eventually replace do_vec_index_to_cond_assign. This lowering
pass is called in all the places where do_vec_index_to_cond_assign or
do_vec_index_to_swizzle is called.
v2: Use WRITEMASK_* instead of integer literals. Use a more concise
method of generating broadcast_index. Both suggested by Eric.
v3: Use a series of scalar compares instead of a single vector compare.
Suggested by Eric and Ken. It still uses 'if (cond) v.x = y;' instead
of conditional assignments because ir_builder doesn't do conditional
assignments, and I'd rather keep the code simple.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The new opcode is used to generate a new vector with a single field from
the source vector replaced. This will eventually replace
ir_dereference_array of vectors in the LHS of assignments.
v2: Convert tabs to spaces. Suggested by Eric.
v3: Add constant expression handling for ir_triop_vector_insert. This
prevents the constant matrix inversion tests from regressing. Duh.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The new opcode is used to get a single field from a vector. The field
index may not be constant. This will eventually replace
ir_dereference_array of vectors. This is similar to the extractelement
instruction in LLVM IR.
http://llvm.org/docs/LangRef.html#extractelement-instruction
v2: Convert tabs to spaces. Suggested by Eric.
v3: Add array index range checking to ir_binop_vector_extract constant
expression handling. Suggested by Ken.
v4: Use CLAMP instead of MIN2(MAX2()). Suggested by Ken.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This flag essentially tells the compiler whether it prefers
dot products or multiply/adds for matrix operations. As such,
ShaderCompilerOptions seems like the right place for it.
This also lets us specify it on a per-stage basis. This patch makes all
existing users set the flag for the Vertex Shader stage only, as it's
currently only used for fixed-function vertex programs. That will
change soon, and I wanted to preserve the existing behavior.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
do_common_optimization may need to make choices about whether to emit
certain kinds of instructions. gl_context::ShaderCompilerOptions
contains exactly that information, so it makes sense to pass it in.
Rather than passing the whole array, pass the structure for the stage
that's currently being worked on.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Const.MaxTextureImageUnits -> Const.FragmentProgram.MaxTextureImageUnits
Const.MaxVertexTextureImageUnits -> Const.VertexProgram.MaxTextureImageUnits
etc.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Do not propagate a copy if source and destination are identical.
Otherwise code like
MOV TEMP[0].xyzw, TEMP[0].wzyx
MOV TEMP[1].xyzw, TEMP[0].xyzw
is changed to
MOV TEMP[0].xyzw, TEMP[0].wzyx
MOV TEMP[1].xyzw, TEMP[0].wzyx
This fixes Piglit test shaders/glsl-copy-propagation-self-2 for drivers that
use Mesa IR.
NOTE: This is a candidate for the stable branches.
Signed-off-by: Fabian Bieler <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
i965/Gen7+ and Radeon/Evergreen+ have bfm/bfi instructions to implement
bitfieldInsert() from ARB_gpu_shader5.
v2: Add ir_binop_bfm and ir_triop_bfi to st_glsl_to_tgsi.cpp.
Remove spurious temporary assignment and dereference.
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
| |
v2: Move use of ir_binop_bfm and ir_triop_bfi to a later patch.
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous commit introduced extra words, breaking the formatting.
This text transformation was done automatically via the following shell
command:
$ git grep 'THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY' | sed 's/:.*$//' | xargs -I {} sh -c 'vim -e -s {} < vimscript
where 'vimscript' is a file containing:
/THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY/;/\*\// !fmt -w 78 -p ' * '
:wq
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This brings the license text in line with the MIT License as published
on the Open Source Initiative website:
http://opensource.org/licenses/mit-license.php
Generated automatically be the following shell command:
$ git grep 'THE AUTHORS BE LIABLE' | sed 's/:.*$//g' | xargs -I '{}' \
sed -i 's/THE AUTHORS/THE AUTHORS OR COPYRIGHT HOLDERS/' {}
This introduces some wrapping issues, to be fixed in the next commit.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Generated automatically be the following shell command:
$ git grep 'BRIAN PAUL BE LIABLE' | sed 's/:.*$//g' | xargs -I '{}' \
sed -i 's/BRIAN PAUL/THE AUTHORS/' {}
The intention here is to protect all authors, not just Brian Paul. I
believe that was already the sensible interpretation, but spelling it
out is probably better.
More practically, it also prevents people from accidentally copy &
pasting the license into a new file which says Brian is not liable when
he isn't even one of the authors.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Without this patch, $$.negate, $$.rgba_valid, and $$.xyzw_valid take
on garbage values. At the moment this problem is benign (the garbage
values happen to be zero), but in my experiments executing GL
operations on a background thread, the garbage values change, leading
to piglit failures.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
None of the remaining FEATURE_x symbols in mfeatures.h are used anymore.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
| |
For the sake of consistency.
Tested-by: Emil Velikov <[email protected]>
Reviewed-and-Tested-by: Andreas Boll <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
This should reduce shader recompilations with drivers that emulate fragment
color clamping, because we want the clamping to be enabled only if there is
a signed normalized or floating-point colorbuffer.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reported-by: `per` in #intel-gfx
The size of the cache key varies, so store the actual size as well as
the key blob itself, rather than just assuming it's the same as the size
passed in.
NOTE: This is a candidate for stable branches.
V2: Don't leave silly holes in structure; use unsigned instead of GLuint.
V3: Fix missing case for `last` match.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The way we were allocating registers before, packing into low register
numbers for Ironlake, resulted in an overly-constrained dependency graph
for instruction scheduling. Improves GLBenchmark 2.1 performance by
4.5% +/- 0.7% (n=26). No difference on my old GLSL demo (n=20). No
difference on nexuiz (n=15).
v2: Fix off-by-one bug that made the change only work for 16-wide on i965.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2 [mattst88]:
- Rebase.
- #define GL_ARB_texture_query_lod to 1.
- Remove comma after ir_lod in ir.h for MSVC.
- Handled ir_lod in ir_hv_accept.cpp, ir_rvalue_visitor.cpp,
opt_tree_grafting.cpp.
- Rename textureQueryLOD to textureQueryLod, see
https://www.khronos.org/bugzilla/show_bug.cgi?id=821
- Fix ir_reader of (lod ...).
v3 [mattst88]:
- Rename textureQueryLod to textureQueryLOD, pending resolution of
Khronos 821.
- Add ir_lod case to ir_to_mesa.cpp.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes the following search-and-replace changes:
gl_frag_attrib -> gl_varying_slot
FRAG_ATTRIB_* -> VARYING_SLOT_*
FRAG_BIT_* -> VARYING_BIT_*
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
This paves the way for eliminating the gl_frag_attrib enum entirely.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes the following search-and-replace changes:
gl_geom_result -> gl_varying_slot
GEOM_RESULT_* -> VARYING_SLOT_*
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes the following search-and-replace changes:
gl_geom_attrib -> gl_varying_slot
GEOM_ATTRIB_* -> VARYING_SLOT_*
GEOM_BIT_* -> VARYING_BIT_*
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes the following search-and-replace changes:
gl_vert_result -> gl_varying_slot
VERT_RESULT_* -> VARYING_SLOT_*
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
This paves the way for eliminating the gl_vert_result enum entirely.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
After the previous fix that almost removes an allocation of 4*n^2
bytes, we can use a bitset to reduce another allocation from n^2 bytes
to n^2/8 bytes.
Between the previous commit and this one, the peak heap size for an
oglconform ARB_fragment_program max instructions test on i965 goes from
4GB to 255MB.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55825
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
We were allocating an adjacency_list entry for every possible
interference that could get created, but that usually doesn't happen.
We can save a lot of memory by resizing the array on demand.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|