| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
negative
Section 2.3.1 (Errors) of the OpenGL 4.5 spec says:
"If a negative number is provided where an argument of type sizei or
sizeiptr is specified, an INVALID_VALUE error is generated.
This patch adds checks for negative buffer size values passed to different APIs.
It also moves up the check on other APIs that already had it, making it the first
error check performed in the function, for consistency.
While there may be other APIs throughtout the code lacking this check (or at least
not at the beginning of the function), this patch focuses on the cases that break
the dEQP tests listed below. It could be a good excersize for the future to check
all other cases, and improve consistency in the order of the checks throughout the
whole Mesa code base.
This fixes 5 dEQP test:
* dEQP-GLES3.functional.negative_api.state.get_attached_shaders
* dEQP-GLES3.functional.negative_api.state.get_shader_source
* dEQP-GLES3.functional.negative_api.state.get_active_uniform
* dEQP-GLES3.functional.negative_api.state.get_active_attrib
* dEQP-GLES3.functional.negative_api.shader.program_binary
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Section 6.1.13 "Framebuffer Object Queries" of OpenGL ES 3.0 spec:
"If the default framebuffer is bound to target, then attachment must be
BACK, identifying the color buffer; DEPTH, identifying the depth buffer; or
STENCIL, identifying the stencil buffer."
OpenGL ES 3.0, section 2.5 (GL Errors):
"If a command that requires an enumerated value is passed a
symbolic constant that is not one of those specified as allowable
for that command, an INVALID_ENUM error is generated."
Then change the returned error to INVALID_ENUM.
Fixes:
dEQP-GLES3.functional.fbo.api.attachment_query_default_fbo
Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, Mesa uses the lowering pass MOD_TO_FRACT to implement
mod(x,y) as y * fract(x/y). This implementation has a down side though:
it introduces precision errors due to the fract() operation. Even worse,
since the result of fract() is multiplied by y, the larger y gets the
larger the precision error we produce, so for large enough numbers the
precision loss is significant. Some examples on i965:
Operation Precision error
-----------------------------------------------------
mod(-1.951171875, 1.9980468750) 0.0000000447
mod(121.57, 13.29) 0.0000023842
mod(3769.12, 321.99) 0.0000762939
mod(3769.12, 1321.99) 0.0001220703
mod(-987654.125, 123456.984375) 0.0160663128
mod( 987654.125, 123456.984375) 0.0312500000
This patch replaces the current lowering pass with a different one
(MOD_TO_FLOOR) that follows the recommended implementation in the GLSL
man pages:
mod(x,y) = x - y * floor(x/y)
This implementation eliminates the precision errors at the expense of
an additional add instruction on some systems. On systems that can do
negate with multiply-add in a single operation this new implementation
would come at no additional cost.
v2 (Ian Romanick)
- Do not clone operands because when they are expressions we would be
duplicating them and that can lead to suboptimal code.
Fixes the following 16 dEQP tests:
dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.mediump_*
dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.highp_*
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GLES 3.0.0 spec introduces context state PRIMITIVE_RESTART_FIXED_INDEX
(2.8.1 Transferring Array Elements, page 26) which is not currently
possible to query using glGet*() funcs.
Fixes 4 dEQP tests:
* dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getboolean
* dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getinteger
* dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getinteger64
* dEQP-GLES3.functional.state_query.boolean.primitive_restart_fixed_index_getfloat
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes the following 2 dEQP tests:
dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_const_vertex
dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_const_fragment
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes the following 2 dEQP tests:
dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_in_main_vertex
dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_in_main_fragment
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For code such as:
uint tmp1 = uint(in0);
uint tmp2 = -tmp1;
float out0 = float(tmp2);
We produce code like:
mov(8) g5<1>.xF -g9<4,4,1>.xUD
which does not produce correct results. This code produces the
results we would expect if tmp1 and tmp2 were signed integers
instead.
It seems that a similar problem was detected and addressed when
using negations with unsigned integers as part of condionals, but
it looks like the problem has a wider impact than that.
This patch fixes the problem by preventing copy-propagation of
negated UD registers in all scenarios, not only in conditionals.
Fixes the following 24 dEQP tests:
dEQP-GLES3.functional.shaders.operator.unary_operator.minus.*_uint_*
dEQP-GLES3.functional.shaders.operator.unary_operator.minus.*_uvec2_*
dEQP-GLES3.functional.shaders.operator.unary_operator.minus.*_uvec3_*
dEQP-GLES3.functional.shaders.operator.unary_operator.minus.*_uvec4_*
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Glenn Kennard <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Glenn Kennard <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
Replace the hard-coded 0's with the context clamp value.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Nothing enables the extension yet, but the values are now available.
The spec calls for it to only be exposed for GL 3.3+, which is core-only
in mesa. Instead we allow any driver to enable it, including in a compat
context for any GL version.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Glenn Kennard <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Glenn Kennard <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the ?: operator's condition is a constant value, and both branches
were pure expressions, we can just make the resulting value one or the
other.
Previously, we only did this if op[1] and op[2] were also constant
values - but there's no actual reason for that restriction.
No changes in shader-db, probably because we usually optimize this later
anyway. But it does make us generate less stupid code up front.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Paul originally had to reverse engineer these formulas based on the
description about how the sampler works. The description here is not
the easiest to follow - especially given that it's from the Sandybridge
era, when the hardware only did 4x multisampling.
Jordan and I recently found another part of the documentation where they
simply state that IMS dimensions must be adjusted by a set of formulas.
Quoting this section provides an easy to follow explanation for the
code, including 2x/4x/8x/16x.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
In preparation for glBlitNamedFramebuffer, the DD table function
BlitFramebuffer needs to accept two arbitrary framebuffer objects rather
than assuming ctx->ReadBuffer and ctx->DrawBuffer.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88841
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This limits the style changes to modes inherited from prog-mode. The
main reason to do this is to avoid setting fill-column for people
using Emacs to edit commit messages because 78 characters is too many
to make it wrap properly in git log. Note that makefile-mode also
inherits from prog-mode so the fill column should continue to apply
there.
v2: Apply to all the .dir-locals.el files, not just the one in the
root directory.
Acked-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
For GL_TEXTURE_1D_ARRAY targets we store the depth of the array
in the Height field and leave Depth=1 in the underlying texture
object. When we call intel_miptree_copy_teximage in the process
of re-creating a miptree (possibily because the number of miplevels
has changed) we didn't account for this, so we where only copying
texture images for the first slice.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I could have done this in the bit that generates the ANDs and ORs, but
it's probably generally useful. Sadly, I still need this even if I move
to NIR, because I can't yet express my read of the destination color in
NIR, which I would need to move my blend/logicop/colormask handling into
NIR.
total uniforms in shared programs: 13497 -> 13455 (-0.31%)
uniforms in affected programs: 101 -> 59 (-41.58%)
total instructions in shared programs: 40797 -> 40296 (-1.23%)
instructions in affected programs: 1639 -> 1138 (-30.57%)
|
|
|
|
|
| |
Since the VPM reads have to be in order, it's useful to see their indices
in the dump.
|
|
|
|
|
|
|
|
|
|
| |
The GL spec guarantees that glGetTexImage will never get a multisampled
texture, but this is not true for glReadPixels. If we get a multisampled
buffer, we have to do a multisample resolve on it before we can pull the
data down for the user. Since this isn't practical to handle in
tiled_memcpy, we just fall back to the other paths that can handle this.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
And remove the mocs argument of the emit_buffer_surface_state vtbl hook. Its
semantics vary greatly from one generation to another, so it kind of
encourages the caller to pass 0 which is the only valid setting across
generations. After this commit the hardware-specific code decides what the
best cacheability settings are for buffer surfaces, just like we do for
textures.
This together with some additional changes coming is expected to improve
performance of pull constants, buffer textures, atomic counters and image
objects on Gen7 and up.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The dri2_x11_add_configs_for_visuals() function happily matches a 32
bits EGLconfig with a 24 bits X visual. However it was passing 32bits
depth to xcb_put_image(), making X server unhappy:
https://github.com/apitrace/apitrace/issues/313#issuecomment-70571911
Cc: "10.4" <[email protected]>
|
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88841
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
| |
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
| |
This fixes a bug on BDW when our meta-based stencil blit path assert-fails
due to an invalid internal format even though we do support the
ARB_stencil_texturing extension.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
No functional change.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
| |
The size of a Node is always four bytes so no need for the old code
that was used when sizeof(Node)==8.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
sizeof(Node) is always 4 bytes.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
Just minor clean-up.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The _mesa_dlist_alloc() function is only guaranteed to return a pointer
with 4-byte alignment. On 64-bit systems which don't support unaligned
loads (e.g. SPARC or MIPS) this could lead to a bus error in the VBO code.
The solution is to add a new _mesa_dlist_alloc_aligned() function which
will return a pointer to an 8-byte aligned address on 64-bit systems.
This is accomplished by inserting a 4-byte NOP instruction in the display
list when needed.
The only place this actually matters is the VBO code where we need to
allocate a 'struct vbo_save_vertex_list' which needs to be 8-byte
aligned (just as if it were malloc'd).
The gears demo and others hit this bug.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88662
Cc: "10.4" <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes build with Windows SDK 7.0.7600.
Tested with u_atomic_test, both on x86 and x86_64.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
The intrinsics are universally available, whereas older Windows SDKs (e.g.
7.0.7600) don't have the non-intrisic entrypoint.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to the SKL bspec the 3DSTATE_CONSTANT_* commands only take
effect on the next corresponding 3DSTATE_BINDING_TABLE_POINTER_*
command. This patch just makes it set the BRW_NEW_SURFACES state when
uploading the push constants to ensure the binding tables will be
updated.
This fixes the fbo-blending-formats Piglit test and possibly others.
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When color buffers alone are concerned the depth is not needed.
No regression on BDW where meta blit is used instead of blorp. I
also disabled blorp temporarily for fbo-blits on IVB and saw no
regressions there either.
I also compared several graphics benchmarks on BDW and saw neither
regressions or improvements.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implementing an idea from Ken, on i965 the shader program for 2D
blits becomes significantly simpler.
Before:
pln(8) g6<1>F g4<0,1,0>F g2<8,8,1>F { align1 1Q compacted };
pln(8) g7<1>F g4.4<0,1,0>F g2<8,8,1>F { align1 1Q compacted };
send(8) g2<1>UW g6<8,8,1>F
sampler (1, 0, 0, 1) mlen 2 rlen 4 { align1 1Q };
mov(8) g123<1>F g2<8,8,1>F { align1 1Q compacted };
mov(8) g124<1>F g3<8,8,1>F { align1 1Q compacted };
mov(8) g125<1>F g4<8,8,1>F { align1 1Q compacted };
mov(8) g126<1>F g5<8,8,1>F { align1 1Q compacted };
mov(8) g127<1>F g2<8,8,1>F { align1 1Q compacted };
nop ;
sendc(8) null g123<8,8,1>F
render RT write SIMD8 LastRT Surface = 0 mlen 5 rlen 0 { align1 1Q EOT };
After:
pln(8) g6<1>F g4<0,1,0>F g2<8,8,1>F { align1 1Q compacted };
pln(8) g7<1>F g4.4<0,1,0>F g2<8,8,1>F { align1 1Q compacted };
send(8) g124<1>UW g6<8,8,1>F
sampler (1, 0, 0, 1) mlen 2 rlen 4 { align1 1Q };
sendc(8) null g124<8,8,1>F
render RT write SIMD8 LastRT Surface = 0 mlen 4 rlen 0 { align1 1Q EOT };
v2 (Matt): Removed unintended white-space change
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Currently all blit programs are unconditionally compiled with
gl_FragDepth.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
total instructions in shared programs: 5998190 -> 5997603 (-0.01%)
instructions in affected programs: 54276 -> 53689 (-1.08%)
helped: 293
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
total instructions in shared programs: 5998321 -> 5998287 (-0.00%)
instructions in affected programs: 4520 -> 4486 (-0.75%)
helped: 8
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
the search
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This allows you to match on an unknown value but only if it is of a given
type. 90% of the uses of this are for matching only booleans, but adding
the generality of arbitrary types is no more complex.
nir_algebraic.py doesn't handle this yet but that's ok because the C
language will ensure that the default type on all variables is void.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are some algebraic transformations that we want to do but only if
certain things are constants. For instance, we may want to replace
a * (b + c) with (a * b) + (a * c) as long as a and either b or c is constant.
While this generates more instructions, some of it will get constant
folded.
nir_algebraic.py doesn't handle this yet, but that's ok because the C
language will make sure that false is the default for now.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
This allows us to indicate a concept of an invalid type.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
since the address reg holds integer values, ARL/ARR do an implicit float-to-int
conversion, so clarify that. Thus it is also incorrect to say that FLR really
does the same as ARL.
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We end up with these from TGSI-to-NIR because the pass generating the
comparisons doesn't know if the arg is actually a bool input or not. vc4
results:
total instructions in shared programs: 41801 -> 41508 (-0.70%)
instructions in affected programs: 4253 -> 3960 (-6.89%)
Reviewed-by: Matt Turner <[email protected]>
|
| |
|