| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
| |
Dereffing all the values in the two callers was just pointless, and
the function isn't inlined so there was actual code impact.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
vb.inputs_read has never been a thing, even in the initial import.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
If you want debug, set --enable-debug in your config flags.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
The core Mesa code has just one more case than this (GL_BITMAP), so I
don't see any cause to special-case it.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
This was added in b93684f5f311f89c965960ab42bfea71a397b180, but there's
no need for it -- get_size has to succeed, and it has an assert for us
in debug builds.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
For our current types, the required alignment is actually just 1 byte.
When we get doubles, we have to worry (those have to be aligned to the
natural size), but we don't have doubles yet and they'll just be a
special case.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
This is the same logic from _mesa_map_function_array().
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
glGetUniformIndices requires that the block instance index not be
present in the name of queried uniforms. However,
gl_uniform_buffer_variable::Name will include the instance index. The
IndexName field is added to handle this difference.
Note that currently IndexName will always point to the same string as
Name. This will change soon.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Carl Worth <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Not used yet, but the UBO layout visitor will use this.
v2: Remove a spruious hunk. This is moved to the patch "glsl: Remove
ir_variable::uniform_block". Suggested by Carl Worth.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The way a variable is tested for this property is about to change, and
this makes the code easier to modify.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Carl Worth <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Interfaces are structurally identical to structures from the compiler's
point of view. They have some additional restrictions, and generally
GPUs use different instructions to access them. Using a different base
type should make this a bit easier.
This commit also adds the glsl_type::interface_packing fields. For
GLSL_TYPE_INTERFACE types, this will track the specified packing mode.
It is analogous to gl_uniform_buffer::_Packing.
v2: Add serveral missing GLSL_TYPE_INTERFACE cases in switch-statements.
v3: Add information about glsl_type::interface_packing. Move row_major
checking in glsl_type::record_key_compare from this patch to the
previous patch. Both suggested by Paul Berry.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This allows the next patch to verify that two uniform blocks match
without first calculating the locations of the fields.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes it easier to find switch-statements that need to be updated
after a new GLSL_TYPE_* is added because the compiler will generate a
warning.
Switch-statements that only had a small number of cases (e.g.,
everything in ir_constant_expression.cpp) were not modified. I may
regret that decision when we eventually add support for doubles.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Carl Worth <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Too much attention was paid to the first paragraphs, and not enough to
the last little note that "oh, by the way, the rendered things
themselves still have to be clipped to just 8192 wide/high".
Fixes GTF's clip.c test with 4096 or higher width on ivb, where one of
the triangles got the upper half of its pixels dropped.
Tested-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Khronos has apparently decided that depth textures with sized formats
(allowed with ARB_internalformat_query or ES 3.0) should be treated as
GL_RED, while unsized formats (an existing feature) should be treated
as GL_INTENSITY for compatibility with ES 2.0.
Ian is proposing changes to ARB_internalformat_query which will make
this actually legal and consistent.
A similar problem exists with GL 4.2, but we're going to ignore that
for the time being.
Tested on Ivybridge: no Piglit regressions; fixes 4 es3conform tests:
- depth_texture_fbo
- depth_texture_fbo_clear
- depth_texture_teximage
- depth_texture_texsubimage
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Since patch "i965: Validate requested GLES context version in
brwCreateContext", we have been able to create ES 3.0 contexts due to the
max version check. So...bump the max version.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: Add ARB_internalformat_query to the list of required extensions.
v3: Add OES_depth_texture_cube_map to the list of required extensions.
Signed-off-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
s/src/src_w/
That little typo, which sneaked into v4 of the previous patch, generates
incorrect fs code.
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: Remove lewd comment. [for idr]
v3: - Optimize away tmp register for packHalf2x16. [for anholt, paul]
- Improve comments. [for anholt, paul]
- Reduce near-duplicate code by removing vec4_visitor emit_pack/unpack
methods. [for chadv]
v4: Factor our UD/W register conversion into helper function. [for anholt]
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Ian Romanick <[email protected]> (v2)
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
FIXME: This patch emits VS code that violates documented hardware
restrictions and then relies on undocumented behavior that results from
that violation. This patch passes all tests, but should be fixed ASAP to
conform to the hardware documentation.
v2: Explain undocumented hardware behavior. Improve comments.
v3: Use ALU1 helper methods F32TO16() and F16TO32(). [for anholt]
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Ian Romanick <[email protected]> (v1)
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The GLSL ES 3.00 operations packHalf2x16 and unpackHalf2x16 will emit
these opcodes.
- Define the opcodes BRW_OPCODE_{F32TO16,F16TO32}.
- Add the opcodes to the brw_disasm table.
- Define convenience functions brw_{F32TO16,F16TO32}.
Reviewed-by: Ian Romanick <[email protected]>
Acked-by: Paul Berry <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On gen < 7, we fully lower all operations to arithmetic and bitwise
operations.
On gen >= 7, we fully lower the Snorm2x16 and Unorm2x16 operations, and
partially lower the Half2x16 operations.
v2:
- Comment that scalarization is needed only for SOA code [for idr].
- Replace switch-statement with if-statement [for idr].
- Remove misplaced hunk from previous patch [found by idr].
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Tuner <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Not all float32 values can be exactly represented as a float16.
_mesa_float_to_half() rounded such intermediate float32 values to zero by
truncating unrepresentable bits in the mantissa.
This patch improves _mesa_float_to_half() by rounding intermediate float32
values to the nearest float16; when the float32 is exactly between two
float16 values we round to the one with an even mantissa. This behavior is
preferred over the old behavior because:
- It has reduced bias relative to the old behavior.
- It reproduces the behavior of real hardware: opcode F32TO16 in
Intel's GPU ISA.
- By reproducing the behavior of the GPU (at least on Intel hardware),
compile-time evaluation of constant packHalf2x16 GLSL expressions will
result in the same value as if the expression were executed on the GPU.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move round_to_even's definition to mesa/main so that _mesa_float_to_half()
can use it in order to eliminate rounding bias.
In additon to moving the fuction definition, prefix its name with "_mesa",
just as all other functions in mesa/main are prefixed.
v2: Fix Android build.
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For each function {pack,unpack}{Snorm,Unorm,Half}2x16, add a corresponding
opcode to enum ir_expression_operation. Validate the new opcodes in
ir_validate.cpp.
Also, add opcodes for scalarized variants of the Half2x16 functions. (The
code generator for the i965 fragment shader requires that all vector
operations be scalarized. A lowering pass, to be added later, will
scalarize the Half2x16 functions).
v2: Fix assertion message in ir_to_mesa [for idr].
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Tuner <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The bug: The printed horizontal stride was the numerical value of the
BRW_HORIZONTAL_$N enum.
The fix: Translate the enum before printing.
Note: This is a candidate for the stable releases.
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When possible, glCopyTexSubImage calls are performed using the
hardware blitter. However, according to the Ivy Bridge PRM, Vol1
Part4, section 1.2.1.2 (Graphics Data Size Limitations):
The BLT engine is capable of transferring very large quantities of
graphics data. Any graphics data read from and written to the
destination is permitted to represent a number of pixels that
occupies up to 65,536 scan lines and up to 32,768 bytes per scan
line at the destination. The maximum number of pixels that may be
represented per scan line’s worth of graphics data depends on the
color depth.
With an RGBA32F color buffer (which has 16 bytes per pixel) this
imposes a maximum width of 2048 pixels. Other pixel formats have
accordingly larger limits.
To make matters worse, if the pitch of the buffer is 32k or greater,
intel_copy_texsubimage's call to intelEmitCopyBlit will overflow
intelEmitCopyBlit's src_pitch and dst_pitch parameters (which are
16-bit signed integers).
We can conveniently avoid both problems by avoiding use of the blitter
when the miptree's pitch is >= 32k.
Fixes gles3conform "framebuffer_blit_functionality_magnifying_blit"
tests when the buffer width is equal to 8192.
Note: this is very similar to the recent patch "intel: Fix ReadPixels
on buffers whose width >= 32kbytes" except that it applies to
glCopyTexSubImage instead of glReadPixels. In a future patch it would
be nice to refactor the code so that (a) overflow is avoided, and (b)
intelEmitCopyBlit is responsible for checking whether the blitter can
handle the width, so that all callers of intelEmitCopyBlit work
properly, rather than just these two.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch replaces the three ir_variable_mode enums:
- ir_var_in
- ir_var_out
- ir_var_inout
with the following five:
- ir_var_shader_in
- ir_var_shader_out
- ir_var_function_in
- ir_var_function_out
- ir_var_function_inout
This eliminates a frustrating ambiguity: it used to be impossible to
tell whether an ir_var_{in,out} variable was a shader in/out or a
function in/out without seeing where the variable was declared in the
IR. This complicated some optimization and lowering passes, and would
have become a problem for implementing varying structs.
In the lisp-style serialization of GLSL IR to strings performed by
ir_print_visitor.cpp and ir_reader.cpp, I've retained the names "in",
"out", and "inout" for function parameters, to avoid introducing code
churn to the src/glsl/builtins/ir/ directory.
Note: a couple of comments in the code seemed to indicate that we were
planning for a possible future in which geometry shaders could have
shader-scope inout variables. Our GLSL grammar rejects shader-scope
inout variables, and I've been unable to find any evidence in the GLSL
standards documents (or extensions) that this will ever be allowed, so
I've eliminated these comments.
Reviewed-by: Carl Worth <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
For texelFetchOffset(), we just add the texel offsets to the coordinate
rather than using the message header's offset fields. So we don't
actually need a header on Gen5+.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When possible, glReadPixels calls are performed using the hardware
blitter. However, according to the Ivy Bridge PRM, Vol1 Part4,
section 1.2.1.2 (Graphics Data Size Limitations):
The BLT engine is capable of transferring very large quantities of
graphics data. Any graphics data read from and written to the
destination is permitted to represent a number of pixels that
occupies up to 65,536 scan lines and up to 32,768 bytes per scan
line at the destination. The maximum number of pixels that may be
represented per scan line’s worth of graphics data depends on the
color depth.
With an RGBA32F color buffer (which has 16 bytes per pixel) this
imposes a maximum width of 2048 pixels.
To make matters worse, if the pitch of the buffer is 32k or greater,
intel_miptree_map_blit's call to intelEmitCopyBlit will overflow
intelEmitCopyBlit's src_pitch and dst_pitch parameters (which are
16-bit signed integers).
We can conveniently avoid both problems by avoiding the readpixels
blit path when the miptree's pitch is >= 32k.
Fixes gles3conform "half_float" tests when the buffer width is greater
than 2048.
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
I believe that the size used to vary, so the dynamic allocation is
necessary.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Always enable the use of pre-compressed texture data. The ability to
perform on-line compression still requires the presence of libtxc_dxtn
or an explicit driconf over-ride. Applications that just want to submit
precompessed data when an on-line compressor is not available can look
for the GL_EXT_texture_compression_dxt1 and
GL_ANGLE_texture_compression_dxt[35] extensions.
v2: Only enable the extensions that do not require on-line compression
by default. The previous statement "This should not impact many (if
any) real applications." proved to be false for at least Sauerbraten.
This application mostly submits pre-compressed data, but it also can
submit uncompressed data that it asks the driver to compress.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Jordan Justen <[email protected]> [v1]
Reviewed-by: Kenneth Graunke <[email protected]> [v1]
Acked-by: Eric Anholt <[email protected]> [v1]
Acked-by: Lee Salzman <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ANGLE_texture_compression_dxt in all APIs
This is technically outside the ANGLE spec, but it seems unlikely to
cause any harm.
v2: Simplify the extension checks by assuming the ANGLE extension will
always be enabled by any driver that enables the EXT. Suggested by
Eric Anholt.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Acked-by: Lee Salzman <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For non-generic compressed format we assert two things:
1. The format has already been validated against the set of available
extensions.
2. The driver only enables the extension if it supports all of the
formats that are part of that extension.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
compression
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Acked-by: Lee Salzman <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Similar to the previous commit, we may be using a texture with actual RGBA
storage for the GL_ALPHA format, so force the color values to 0.0.
This commit fixes the following piglit (sub) tests:
EXT_texture_snorm/fbo-blending-formats
GL_ALPHA16_SNORM
GL_ALPHA8_SNORM
GL_ALPHA_SNORM
Note: Haswell bypasses this swizzle code, so may require an independent fix
for this bug.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We may be using a texture with actual RGBA storage for these formats, so force
the alpha value read to 1.0.
This commit fixes the following piglit (sub) tests:
ARB_texture_float/fb-blending-formats
GL_RGB16F_ARB
EXT_framebuffer_object/fbo-blending-formats
GL_RGB10
GL_RGB12
GL_RGB16
EXT_texture_snorm/fbo-blending-formats
GL_RGB16_SNORM
GL_RGB8_SNORM
GL_RGB_SNORM
These test improvements depend on the previous commit as well. That commit
smashes alpha to 1.0 for the case of ReadPixels (so fixes "FBO testing" as
reported by this test), while this commit smashes alpha to 1.0 for the case of
texturing (fixed the "window testing" as reported by this test).
Note: Haswell bypasses this swizzle code, so may require an independent fix
for this bug.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When performing a ReadPixels operation, we may be reading from a buffer that
stores alpha values, but that is actually representing a buffer with no alpha
channel. In this case, while rebasing the values, touch up all alpha values
read to 1.0.
This commit fixes the following piglit (sub) tests:
ARB_texture_float/fbo-colormask-formats
GL_RBG16F_ARB
EXT_texture_snorm/fbo-colormask-formats
GL_RGB16_SNORM
GL_RGB8_SNORM
GL_RGB_SNORM
It likely improves the results of other tests as well, but a PASS remains
elusive due to additional bugs.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The renderbuffer's Format field may have an alpha channel even when the
underlying _BaseFormat does not. This can happen when mesa chooses to use
RGBA16 for an RGB16 format, for example.
So look at _BaseFormat when deciding whether to fixup the blend factors.
This test improves the results of at least the following piglit tests:
EXT_frambebuffer_object/fbo-blending-formats
{GL_RGB10, GL_RGB12, GL_RGB16}
EXT_texture_snorm/fbo-blending-formats
{GL_RGB16_SNORM, GLRGB8_SNORM, GL_RGB_SNORM}
But none of these actually change from FAIL to PASS yet. The R, G, and B probe
values are fixed with this commit, but the tests still fail because the alpha
values are still wrong.
Reviewed-by: Eric Anholt <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Thanks to Fredrik Höglund, all the hard work was already done.
Tested using a modified oglconform (that actually runs these tests on
our driver); it looks like there may be some bugs when using client
arrays. All applicable non-compatibility tests passed.
For now, only enable it in core profiles.
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
Fixes piglit's fbo-blit-stretch test.
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: José Fonseca <[email protected]>
|