| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
| |
|
|\ |
|
| |
| |
| |
| | |
Reviewed-by: Ian Romanick <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
In the first pass of implementing exact handling, I made a mistake with
search-and-replace. In particular, we only really handled exact/inexact
on the root of the tree. Instead, we need to check every node in the tree
for an exact/inexact match. As an example of this, consider the following
GLSL code:
    precise float a = b + c;
    if (a < 0) {
        do_stuff();
    }
In that case, only the add will be declared "exact" and an expression that
looks for "b + c < 0" will still match and replace it with "b < -c" which
may yield different results. The solution is to simply bail if any of the
values are exact when matching an inexact expression.
Reviewed-by: Ian Romanick <[email protected]>
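The rule in the message above (check every node, not just the root) can be sketched roughly as follows. This is a simplified illustration, not the actual NIR search code: `expr_node`, `expr_pattern`, `match_expression` and the `OP_*` values are hypothetical names invented for the example, and the pattern's "inexact" property is passed as a plain flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical opcodes for this sketch only. */
enum { OP_ANY, OP_VAR, OP_ADD, OP_LT };

/* Hypothetical, simplified expression node. */
struct expr_node {
   int op;
   bool exact;                        /* set when the value is "precise" */
   const struct expr_node *src[2];    /* NULL for unused operands */
};

/* Hypothetical pattern node; OP_ANY matches any subtree. */
struct expr_pattern {
   int op;
   const struct expr_pattern *src[2]; /* NULL for unused operands */
};

static bool
match_expression(const struct expr_pattern *pat, bool pattern_inexact,
                 const struct expr_node *node)
{
   assert(pat != NULL && node != NULL);

   if (pat->op == OP_ANY)
      return true;
   if (node->op != pat->op)
      return false;

   /* Checked at every matched node, not only at the root of the tree. */
   if (pattern_inexact && node->exact)
      return false;

   for (size_t i = 0; i < 2; i++) {
      if (pat->src[i] == NULL)
         continue;
      if (node->src[i] == NULL)
         return false;
      if (!match_expression(pat->src[i], pattern_inexact, node->src[i]))
         return false;
   }
   return true;
}
```

With a tree for `b + c < 0` in which only the add is marked exact, an inexact pattern for `a + b < 0` no longer matches, while an exact-safe pattern of the same shape still does.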
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The SIN and COS instructions on Intel hardware can produce values
slightly outside of the [-1.0, 1.0] range for a small set of values.
Obviously, this can break everyone's expectations about trig functions.
According to an internal presentation, the COS instruction can produce
a value up to 1.000027 for inputs in the range (0.08296, 0.09888). One
suggested workaround is to multiply by 0.99997, scaling down the
amplitude slightly. Apparently this also minimizes the error function,
reducing the maximum error from 0.00006 to about 0.00003.
When enabled, fixes 16 dEQP precision tests
dEQP-GLES31.functional.shaders.builtin_functions.precision.
{cos,sin}.{highp,mediump}_compute.{scalar,vec2,vec3,vec4}.
at the cost of making every sin and cos call more expensive (about
twice the number of cycles on recent hardware). Enabling this
option has been shown to reduce GPUTest Volplosion performance by
about 10%.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
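The arithmetic behind the workaround can be checked with a small sketch. It only models the scaling step: `scale_trig_result` and the two constants' names are hypothetical, and the values themselves are the ones quoted in the message above. It is not the driver's actual lowering code.

```c
#include <assert.h>

/* Scale factor and worst-case raw magnitude quoted in the commit message. */
#define TRIG_SCALE        0.99997f
#define MAX_RAW_MAGNITUDE 1.000027f

/* Hypothetical helper: applies the amplitude scaling to a raw SIN/COS
 * result so that it lands back inside [-1.0, 1.0].
 */
static float
scale_trig_result(float raw)
{
   assert(raw >= -MAX_RAW_MAGNITUDE && raw <= MAX_RAW_MAGNITUDE);

   float scaled = raw * TRIG_SCALE;

   assert(scaled >= -1.0f && scaled <= 1.0f);
   return scaled;
}
```

The worst case 1.000027 * 0.99997 is roughly 0.999997, which is back inside the expected range, at the price of an exact 1.0 being reported as 0.99997.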
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
See commit 3b0279a69 - this restriction is documented in the "Surface
Format" field of RENDER_SURFACE_STATE.
Looking at newer documentation, this restriction appears to exist on
Haswell, but no longer applies on Gen8+.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|
| | |
|
| |
| |
| |
| | |
Signed-off-by: Emil Velikov <[email protected]>
|
| |
| |
| |
| |
| | |
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit e7fb889dcc002f87c316f3cdc6e7907a88c12697)
|
| |
| |
| |
| |
| | |
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit ff9ddb9eb1b3b25f40e71a95bb48421abfcb11d9)
|
| |
| |
| |
| |
| |
| |
| | |
This was returning the fragment shader value.
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
| |
| |
| |
| | |
Signed-off-by: Ilia Mirkin <[email protected]>
|
| |
| |
| |
| |
| |
| | |
Trivial.
Signed-off-by: Ilia Mirkin <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| | |
This extension is identical to ARB_base_instance. Reuse the same
entrypoints.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| | |
The extension spec was extended to also support ES. This functionality
is provided all the way back to ES 1.0.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
| |
| |
| |
| |
| | |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eduardo Lima Mitev <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
For adding .v4f32-like suffixes to intrinsics, taking special care of the
scalar case, which was often neglected.
This fixes invalid IR when doing mipmap filtering on SSE2 (the only
case where we'd use intrinsics with scalars.)
Reviewed-by: Roland Scheidegger <[email protected]>
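As an illustration of the scalar special case, a float-only sketch of such a formatting helper might look like this. The function name and parameters are hypothetical and do not reproduce the helper added by the commit; only the resulting LLVM-style names (`llvm.sqrt.f32`, `llvm.sqrt.v4f32`) are standard.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds an overloaded intrinsic name for a float
 * type with the given vector length and element width.  A length of 1
 * means a scalar, which must not get a "v1" prefix.
 */
static void
format_float_intrinsic_name(char *buf, size_t size, const char *base,
                            unsigned length, unsigned bit_width)
{
   assert(base != NULL && strlen(base) > 0);
   assert(length >= 1);

   if (length == 1)
      snprintf(buf, size, "%s.f%u", base, bit_width);
   else
      snprintf(buf, size, "%s.v%uf%u", base, length, bit_width);
}
```

Calling it with `("llvm.sqrt", 4, 32)` yields `llvm.sqrt.v4f32`, while `("llvm.sqrt", 1, 32)` yields `llvm.sqrt.f32` rather than an invalid `llvm.sqrt.v1f32`.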
|
| |
| |
| |
| |
| |
| |
| |
| | |
Also avoid double-adding the *sampler2DMS types when the array ext is
enabled.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
I can't tell whether this actually matters, but we're creating function
signatures with this predicate, so it should probably match when SSBOs
are available.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
| |
| |
| |
| |
| | |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
As the relevant extensions get implemented, the lines should be
uncommented. I believe this is (almost) everything needed for those GL
versions though.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Oddly a bunch of the features it adds are actually from ESSL 3.20. But
the spec is quite clear, oh well.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
| |
| |
| |
| |
| | |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
| |
| |
| |
| |
| |
| | |
Exactly the same code.
Reviewed-by: Roland Scheidegger <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We could unconditionally use these intrinsics, but performance with SSE2
would suck, as LLVM falls back to calling libm.
lp_test_arit.
Reviewed-by: Roland Scheidegger <[email protected]>
|
| |
| |
| |
| |
| |
| | |
For simulating less capable machines.
Reviewed-by: Roland Scheidegger <[email protected]>
|
| |
| |
| |
| | |
Trivial.
|
| |
| |
| |
| |
| |
| | |
It builds fine now. Probably due to C99 support.
Trivial.
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
LLVM often can't determine the mask elements are all ones/zeros, and
there doesn't seem to be a good way to hint that.
Thanks to Roland Scheidegger for spotting and analyzing the issue.
Reviewed-by: Roland Scheidegger <[email protected]>
|
| |
| |
| |
| |
| |
| |
| | |
No longer needed.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Only provide a fallback for LLVM 3.3.
One less dependency on LLVM C++ interface.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
| |
| |
| |
| |
| |
| | |
The if always returns so no need for an else.
Reviewed-by: Brian Paul <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| | |
The current DSQRT lowering code emits an OP_SELP, so we have to handle
its emission. This will eventually go away, but no harm supporting this
op.
Signed-off-by: Ilia Mirkin <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This fixes piglit tests like
tests/spec/glsl-1.10/execution/variable-indexing/vs-output-array-float-index-wr.shader_test
and related ones.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.1 11.2" <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Currently Mesa fails to build with the x32 ABI, as ms_abi is not defined
in that case.
The patch uses ms_abi only for amd64 targets and stdcall only for i386
targets to be sure that those are defined.
This patch additionally checks for __GNUC__ to guarantee that
__attribute__ is available.
CC: "11.1 11.2" <[email protected]>
Signed-off-by: Christian Schmidbauer <[email protected]>
Acked-by: Axel Davy <[email protected]>
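The selection logic described above could be sketched along these lines. `EXAMPLE_WINAPI`, `binary_fn`, `add_callback` and `call_binary` are hypothetical names for this example, and using `__ILP32__` to tell x32 apart from plain amd64 is an assumption of the sketch, not something stated in the message.

```c
#include <assert.h>
#include <stddef.h>

/* Pick a calling-convention attribute only where it is known to exist. */
#if defined(__GNUC__)
#  if defined(__x86_64__) && !defined(__ILP32__)
#    define EXAMPLE_WINAPI __attribute__((ms_abi))   /* amd64 only */
#  elif defined(__i386__)
#    define EXAMPLE_WINAPI __attribute__((stdcall))  /* i386 only */
#  else
#    define EXAMPLE_WINAPI                           /* x32, ARM, ... */
#  endif
#else
#  define EXAMPLE_WINAPI                             /* no __attribute__ */
#endif

typedef int (EXAMPLE_WINAPI *binary_fn)(int, int);

static int EXAMPLE_WINAPI
add_callback(int a, int b)
{
   return a + b;
}

static int
call_binary(binary_fn fn, int a, int b)
{
   assert(fn != NULL);
   return fn(a, b);
}
```

On targets that fall through to the empty definition, the callback simply uses the platform's default calling convention, so the same source builds everywhere.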
|
| |
| |
| |
| |
| |
| |
| | |
nvc0 and nve4 have been respectively replaced by gf100 and gk104.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
| |
| |
| |
| | |
Reviewed-by: Jose Fonseca <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Rather than the currently bound texture. This goes along with the
earlier patch to get away from examining bound textures and sampler
views during shader translation.
Fixes VMware bug 1632739.
Reviewed-by: Jose Fonseca <[email protected]>
|
| |
| |
| |
| | |
Reviewed-by: Jose Fonseca <[email protected]>
|
| |
| |
| |
| |
| |
| | |
is_ubo_var is true for both UBOs and SSBOs
Reviewed-by: Kenneth Graunke <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Previously we stored the buffer block index, i.e. the index into a
combined UBO/SSBO list.
Fixes several dEQP-GLES31.functional tests:
- program_interface_query.uniform.block_index.block_array
- program_interface_query.uniform.block_index.named_block
- program_interface_query.uniform.block_index.unnamed_block
- program_interface_query.uniform.random.10
- program_interface_query.uniform.random.15
- program_interface_query.uniform.random.22
- program_interface_query.uniform.random.24
- program_interface_query.uniform.random.26
- program_interface_query.uniform.random.28
- program_interface_query.uniform.random.3
- program_interface_query.uniform.random.31
- program_interface_query.uniform.random.38
- program_interface_query.uniform.random.5
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94116
Reviewed-by: Kenneth Graunke <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This allows us to simplify the code and drop InterfaceBlockStageIndex
which is a per stage array of integers the size of all blocks in the
program combined including duplicates across stages. Adding a stage
ref per block will use less memory.
Reviewed-by: Kenneth Graunke <[email protected]>
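A per-block stage reference can be as small as a bitmask. The sketch below shows the general idea only: the struct, enum and function names are hypothetical and are not the fields used in Mesa.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stage list for this sketch. */
enum example_stage {
   STAGE_VERTEX,
   STAGE_GEOMETRY,
   STAGE_FRAGMENT,
   STAGE_COUNT
};

/* Hypothetical block record: one bit per stage instead of one integer
 * per stage per block.
 */
struct example_block {
   const char *name;
   uint8_t stageref;   /* bit N is set when stage N references the block */
};

static void
block_add_stage(struct example_block *blk, enum example_stage stage)
{
   assert(stage < STAGE_COUNT);
   blk->stageref |= (uint8_t)(1u << stage);
}

static bool
block_used_by_stage(const struct example_block *blk, enum example_stage stage)
{
   assert(stage < STAGE_COUNT);
   return (blk->stageref & (1u << stage)) != 0;
}
```

Here each block carries a single byte regardless of the number of stages, whereas a per-stage index array grows with both the stage count and the block count.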
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This changes the code to use the buffer counts stored for each stage
rather than counting from scratch. It also moves the checks outside
of the for loop which means we now just get a single link error
message if we go over the max rather than X error messages where X
is the number we have exceeded the max by.
Reviewed-by: Kenneth Graunke <[email protected]>
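The effect on the number of error messages can be illustrated with a sketch that compares each stage's stored count against its limit once. All names here are hypothetical, and the function only counts the errors that would be reported instead of emitting real link errors.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_STAGE_COUNT 3

/* Hypothetical check: one comparison per stage using the stored count,
 * so a stage that exceeds its limit is reported once, no matter by how
 * many blocks it is over.
 */
static unsigned
count_limit_errors(const unsigned num_blocks[EXAMPLE_STAGE_COUNT],
                   const unsigned max_blocks[EXAMPLE_STAGE_COUNT])
{
   assert(num_blocks != NULL && max_blocks != NULL);

   unsigned errors = 0;
   for (size_t stage = 0; stage < EXAMPLE_STAGE_COUNT; stage++) {
      if (num_blocks[stage] > max_blocks[stage])
         errors++;
   }
   return errors;
}
```

A stage that uses 20 blocks against a limit of 12 contributes a single error here, where a check performed while counting block by block would have fired eight times.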
|
| |
| |
| |
| |
| |
| | |
We already have a count of active SSBOs per stage so use it.
Reviewed-by: Kenneth Graunke <[email protected]>
|
| |
| |
| |
| |
| |
| |
| | |
This will allow us to use them when checking resources in a
following patch and clean up a bunch of code.
Reviewed-by: Kenneth Graunke <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Since 8683d54d2be825 there is now a single instance of the buffer
block information that needs to be updated rather than one instance
for each stage.
Reviewed-by: Kenneth Graunke <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
With SSO, the GL_PROGRAM_INPUT and GL_PROGRAM_OUTPUT interfaces refer to
the first and last shader stage linked into a program. This may not be
the vertex and fragment shader stages.
So, subtracting VERT_ATTRIB_GENERIC0 and FRAG_RESULT_DATA0 is bogus.
We need to subtract VERT_ATTRIB_GENERIC0 for VS inputs,
FRAG_RESULT_DATA0 for FS outputs, and VARYING_SLOT_VAR0 for other cases.
Note that built-in variables get a location of -1.
Fixes 4 dEQP-GLES31.functional.program_interface_query tests:
- program_input.location.separable_fragment.var_explicit_location
- program_input.location.separable_fragment.var_array_explicit_location
- program_output.location.separable_vertex.var_explicit_location
- program_output.location.separable_vertex.var_array_explicit_location
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
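The three-way subtraction described above can be sketched as below. The enum values are placeholders chosen for the example (the real bases are defined in Mesa's headers), `resource_location` is a hypothetical function, and representing "no location" as a negative slot is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder base values for illustration only. */
enum {
   EX_VERT_ATTRIB_GENERIC0 = 16,
   EX_FRAG_RESULT_DATA0 = 2,
   EX_VARYING_SLOT_VAR0 = 32
};

/* Hypothetical stage list for this sketch. */
enum ex_stage { EX_STAGE_VERTEX, EX_STAGE_GEOMETRY, EX_STAGE_FRAGMENT };

/* Convert an internal slot to the location reported to the application.
 * The base to subtract depends on the stage and direction, not on
 * whether the interface is GL_PROGRAM_INPUT or GL_PROGRAM_OUTPUT.
 */
static int
resource_location(int slot, enum ex_stage stage, bool is_input)
{
   assert(slot >= -1);

   if (slot < 0)
      return -1;   /* built-ins: never do math on a negative value */

   if (is_input && stage == EX_STAGE_VERTEX)
      return slot - EX_VERT_ATTRIB_GENERIC0;
   if (!is_input && stage == EX_STAGE_FRAGMENT)
      return slot - EX_FRAG_RESULT_DATA0;

   return slot - EX_VARYING_SLOT_VAR0;   /* inputs/outputs between stages */
}
```

For a separable fragment program, the program inputs are fragment shader inputs, so they take the varying branch rather than the vertex attribute branch.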
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We were recording locations for all variables, even ones without an
explicit location set. Implement the rules from the spec, and record
-1 in the resource list accordingly. Make program_resource_location
stop doing math on negative values. Remove hacks that are no longer
necessary now that we've stopped doing that.
Fixes 4 dEQP-GLES31.functional.program_interface_query tests:
- program_input.location.separable_fragment.var
- program_input.location.separable_fragment.var_array
- program_output.location.separable_vertex.var
- program_output.location.separable_vertex.var_array
v2: Delete more code
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
A program will either have gl_VertexID or gl_VertexIDMESA (the lowered
zero-based version), not both. Just spoof it in the resource list so
the hacks are done in a single place.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|