| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
q_total should never go below 0 (which is why it's defined as unsigned),
and if it does, then something is seriously wrong.
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
As noted in the previous commit, this was introduced in
567e2769b81863b6dffdac3826a6b729ce6ea37c ("ra: make the p, q test more
efficient"), but I forgot to mention it.
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Excess conversions considered harmful.
Recently Matt reworked the boolean uniform handling to use the value of
UniformBooleanTrue, rather than integer 1, when uploading uniforms:
mesa: Upload boolean uniforms using UniformBooleanTrue.
glsl: Use UniformBooleanTrue value for uniform initializers.
Marek then set the default to 1.0f for drivers without native integer
support:
mesa: set UniformBooleanTrue = 1.0f by default
However, ir_to_mesa was assuming a value of integer 1, and arranging for
it to be converted to 1.0f on upload. Since Marek's commit, we were
uploading 1.0f = 0x3f800000 which was being interpreted as the integer
value 1065353216 and converted to float as 1.06535322E9, which broke
assumptions in ir_to_mesa that "true" was exactly 1.0f.
+13 Piglits on classic swrast (fs-bool-less-compare-true,
{vs,fs}-op-not-bool-using-if, glsl-1.20/execution/uniform-initializer).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83573
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In commit 32f2fd1c5d6088692551c80352b7d6fa35b0cd09, several calls to
_mesa_calloc(x) were replaced with calls to calloc(1, x). This is strictly
equivalent to what the code was doing previously.
But for cases where "x" involves multiplication, now that we are explicitly
using the two-argument calloc, we can do one step better and replace:
calloc(1, A * B);
with:
calloc(A, B);
The advantage of the latter is that calloc will detect any overflow that would
have resulted from the multiplication and will fail the allocation, (whereas
the former would return a small allocation). So this fix can change
potentially exploitable buffer overruns into segmentation faults.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Coverity reported this, and I think this is the right solution,
since cache->items is struct cache_item ** not struct cache_item *,
we also realloc it using struct cache_item * at some point.
Reviewed-by: Tapani Pälli <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Now that saturate is implemented natively as instruction,
we can cut down on unneeded functionality.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
|
|
| |
Needed when vertex programs doesn't allow saturate
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The VertexProgram and FragmentProgram have a Cache member for dealing
with fixed function programs. There are no fixed function geometry
programs, so this should never have existed, and was just copy and
pasted.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Timothy Arceri <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
| |
Trivial.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before, when we encountered a situation where we had to optimistically
color a node, we would immediately give up and push all the remaining
nodes on the stack in the order of their index - which is a random, and
potentially not optimal, order. Instead, choose one node to
optimistically color in ra_select(), and then once we've optimistically
colored it, keep on going as normal in the hopes that we've opened up
more avenues for the normal select phase to make progress. In cases with
high register pressure, this helps make the order we push things on the
stack much better, and therefore increase the chance that we can allocate
successfully.
total instructions in shared programs: 4545447 -> 4545401 (-0.00%)
instructions in affected programs: 1353 -> 1307 (-3.40%)
GAINED: 124
LOST: 6
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we would consider any optimistically colored nodes for
spilling. However, spilling any optimistically colored nodes below the
node that we failed to color on the stack wouldn't help us make
progress, since it wouldn't help with allowing us to find a color for
the node currently failing to get colored. Only consider nodes
which were above the failing node on the stack for spilling, which
simplifies the logic, and comment the code better so people know what's
going on here.
No shader-db changes with BRW_MAX_GRF reduced to 90 (or with the normal
number of GRF's).
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can store the q total that pq_test() would've calculated in the node
itself, updating it when we add a node to the stack. This way, we only
have to walk the adjacency list when we push a node on the stack (i.e.
when the p, q test succeeds) instead of every time we do the p, q test.
No difference in shader-db run times, but I'm keeping this in because
the q total that it calculates will also be used in the next few commits.
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, there were 3 entrypoints into parts of the actual allocator,
and an API called ra_allocate_no_spills() that called all 3. Nobody
would ever want to call any of the 3 entrypoints by themselves, so
everybody just used ra_allocate_no_spills(). So just make them static
functions, and while we're at it rename ra_allocate_no_spills() to
ra_allocate() since there's no equivalent "with spills," because the
backend is supposed to handle spilling.
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the array index is not a constant expression, the existing support
will assume a zero offset (giving us the sampler index of the base of
the array).
For dynamically uniform indexing of sampler arrays, we need both that
and the indexing expression.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
| |
No need to return a value. Remove unused ctx parameter. Remove
_mesa_ prefix since it's static.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I think OpenVMS was the only platform that Mesa ran on that used a
non-IEEE representation for floats. We removed OpenVMS support a while
back, and this should alleviate the need to continue updating the
this-platform-uses-IEEE list.
The one bit of this patch that needs review is the IS_INF_OR_NAN,
because I'm not sure if MSVC supports isfinite.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82268
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Historically, we've implemented the rules for overriding built-in
functions by creating multiple ir_functions and relying on the symbol
table to hide the one containing built-in functions. That works, but
has a few drawbacks, so the next patch will change it.
Instead, we'll have a single ir_function for a particular name, which
will contain both built-in and user-defined signatures. Passing an
extra parameter to matching_signature makes it easy to ignore built-ins
when they're supposed to be hidden.
I didn't add the parameter to exact_matching_signature since it wasn't
necessary.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
On Haswell, this cuts 1-3 instructions from 183 vertex shaders in
"Shadowrun Returns", "Shatter", and "Trine 2." It adds 2 instructions
to a single fragment shader in "Closure."
total instructions in shared programs: 278803 -> 278546 (-0.09%)
instructions in affected programs: 41930 -> 41673 (-0.61%)
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For a long time, we've wanted a place to put utility code which isn't
directly tied to Mesa or Gallium internals. This patch creates a new
src/util directory for exactly that purpose, and builds the contents as
libmesautil.la.
ralloc seemed like a good first candidate. These days, it's directly
used by mesa/main, i965, i915, and r300g, so keeping it in src/glsl
didn't make much sense.
Signed-off-by: Kenneth Graunke <[email protected]>
v2 (Jason Ekstrand): More realloc uses and some scons fixes
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Instead of hand-rolling it.
v2 [mattst88]: Rename get_size to length. Expand comment in ir_reader.
Reviewed-by: Ian Romanick <[email protected]> [v1]
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
| |
Will be used to implement interpolateAt*() from ARB_gpu_shader5
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
As far as I can tell, the Intel mesa driver is the only driver in the world
still supporting this legacy extension. If someone wants to do bump
mapping, they can use shaders.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]> [v1]
Reviewed-by: Chris Forbes <[email protected]> [v2]
Reviewed-by: Ian Romanick <[email protected]> [v3]
|
|
|
|
|
|
|
|
|
|
|
| |
On Intel hardware when a geometry shader outputs GL_POINTS primitives we
only need to emit vertex control bits if it emits vertices to non-zero
streams, so use a flag to track this.
This flag will be set to TRUE when a geometry shader calls EmitStreamVertex()
or EndStreamPrimitive() with a non-zero stream parameter in a later patch.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
Cc: "10.1 10.2" <[email protected]>
|
|
|
|
|
|
|
|
| |
Check calloc return values in hash_table_insert() and
hash_table_replace()
Signed-off-by: Juha-Pekka Heikkila <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we pass in gl_shader_compiler_options, it makes sense to just
use options->MaxUnrollIterations, rather than passing a separate
parameter.
Half of the invocations already passed options->MaxUnrollIterations,
while the other half passed in a hardcoded value of 32.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The next few patches will introduce an optimization that only works when
integers are not represented as floating point values.
v2: Re-word-wrap a line, as requested by Ian Romanick.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
Nested for loops running through tables against which they
finally do an assert were ran also with optimized builds.
Signed-off-by: Juha-Pekka Heikkila <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
Add missing null check in program_parse.tab.c through
program_parse.y
Signed-off-by: Juha-Pekka Heikkila <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
% operator could return negative value which would cause
indexing before perm table. Change %256 to &0xff
Signed-off-by: Juha-Pekka Heikkila <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
program/ir_to_mesa.cpp:2008:1: warning: unused parameter 'ir' [-Wunused-parameter]
program/ir_to_mesa.cpp:2272:1: warning: unused parameter 'ir' [-Wunused-parameter]
program/ir_to_mesa.cpp:2278:1: warning: unused parameter 'ir' [-Wunused-parameter]
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Also clean up an old whitespace blooper.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Basically a sed but shaderapi.c and get.c.
get.c => GL_CURRENT_PROGAM always refer to the "old" UseProgram behavior
shaderapi.c => the old api stil update the Shader object directly
V2: formatting improvement
V3 (idr):
* Rebase fixes after a block of code was moved from ir_to_mesa.cpp to
shaderapi.c.
* Trivial reformatting.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
https://bugs.freedesktop.org/show_bug.cgi?id=76331
|
|
|
|
|
|
|
|
|
| |
This one saves about 2MB peak allocation in glsl-fs-algebraic-add-add-1,
with no performance difference on timing short shader-db runs (n=9/10,
warmup outlier removed).
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This should use 1/8 the memory.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Christoph Brill <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is a little easier to read.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Christoph Brill <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This isn't the GL API, so there's no reason to use GLboolean.
Using bool is safer: any non-zero value is treated as "true". When
converting a value to a GLboolean, all but the low byte is discarded,
which means that values like 256 will be incorrectly rendered as false.
Done via the following vim commands:
:%s/GLboolean/bool/g
:%s/GL_TRUE/true/g
:%s/GL_FALSE/false/g
and one line of manual whitespace tidying.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
Nothing uses this structure, removal fixes Klocwork error about
the possible oom condition in _mesa_symbol_table_iterator_ctor.
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
i965 recently moved debug printfs to use stderr, including ones which
trigger on MESA_GLSL=dump. This resulted in scrambled output.
For drivers using ir_to_mesa, print_program was already using stderr,
yet all the code around it was using stdout.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
While we want to be able to print to stdout for glsl_compiler, for
debugging drivers we want to be able to dump to stderr because that's
where other driver debug (like LIBGL_DEBUG) tends to go, and because some
apps actually close stdout to shut up their own messages (such as the X
Server, or NWN).
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2: Reuse the glsl_sampler_dim enum for images. Reuse the
glsl_type::sampler_* fields instead of creating new ones specific
to image types. Reuse the same constructor as for samplers adding
a new 'base_type' argument.
Reviewed-by: Paul Berry <[email protected]>
|