| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Carl Worth <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Not used yet, but the UBO layout visitor will use this.
v2: Remove a spruious hunk. This is moved to the patch "glsl: Remove
ir_variable::uniform_block". Suggested by Carl Worth.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Not used yet, but the UBO layout visitor will use this.
v2: Add some commentary as to why row_major is always set to false in
process. Suggesed by Paul Berry.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Carl Worth <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For the first declaration below, there will be an ir_variable named
"instance" whose type and whose instance_type will be the same
glsl_type. For the second declaration, there will be an ir_variable
named "f" whose type is float and whose instance_type is B2.
"instance" is an interface instance variable, but "f" is not.
uniform B1 {
float f;
} instance;
uniform B2 {
float f;
};
v2: Copy the comment message documentation into the code. Suggested by
Paul Berry.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Carl Worth <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Carl Worth <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For variables that are in an interface block or are an instance of an
interface block, this is the GLSL_TYPE_INTERFACE type for that block.
Convert the ir_variable::is_in_uniform_block method added in the
previous commit to use this field instead of ir_variable::uniform_block.
v2: Fix the place-holder comment on ir_variable::interface_type.
Suggested by Paul Berry.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Carl Worth <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The way a variable is tested for this property is about to change, and
this makes the code easier to modify.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Carl Worth <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the block has an instance name, add the instance name to the symbol
table instead of the individual fields.
Fixes the piglit test interface-name-access-without-interface-name.vert
for real.
v2: Update the comment before the assertion that interface block
definitions won't generate instructions. Suggested by Paul Berry.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Interfaces are structurally identical to structures from the compiler's
point of view. They have some additional restrictions, and generally
GPUs use different instructions to access them. Using a different base
type should make this a bit easier.
This commit also adds the glsl_type::interface_packing fields. For
GLSL_TYPE_INTERFACE types, this will track the specified packing mode.
It is analogous to gl_uniform_buffer::_Packing.
v2: Add serveral missing GLSL_TYPE_INTERFACE cases in switch-statements.
v3: Add information about glsl_type::interface_packing. Move row_major
checking in glsl_type::record_key_compare from this patch to the
previous patch. Both suggested by Paul Berry.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For now, this will always be false. In the near future, an "interface"
type will be added that shares a lot of infrastructure with structures.
v2: Move row_major checking in glsl_type::record_key_compare from the
next patch to this patch. Suggested by Paul Berry.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Carl Worth <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This will soon also be used for processing interface block fields.
v2: Add a comment explaining the interface of
ast_process_structure_or_interface_block. Suggested by Paul Berry.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Carl Worth <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The size is parsed and stored in the AST, but it is not used yet.
Processing of the array size is added in the patch "glsl: Handle
instance array declarations"
v2: Update the commit message (suggested by Carl Worth). Add a comment
to ast_uniform_block::array_size (suggested by Paul Berry).
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In GLSL ES 3.00 (and GLSL 1.50), uniform blocks can have an associated
"instance name", which essentially namespaces the variables inside.
This patch adds basic parsing for this new feature, but doesn't yet hook
it up to actually do anything yet.
It does not support for arrays of interface blocks; a later commit will
take care of that.
This change temporarily regresses the piglit test
interface-name-access-without-interface-name.vert. This shader failed
to compile before (the expected result), but it failed to compile for
the wrong reason. This is not a real regression.
v2: Add some comments to ast_uniform_block::instance_name. Suggested by
Paul Berry.
Reviewed-by: Carl Worth <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The existing code has a lot of duplication; the only difference between
the two cases is whether we merge in an additional layout qualifier.
Apparently creating a layout_qualifieropt rule that can be empty causes
a lot of conflicts and confusion. However, refactoring out the guts of
the ast_uniform_block creation works fine.
Reviewed-by: Carl Worth <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also slightly change the compatibility test. Instead of comparing the
offsets of the block variables, compare the packing mode of the blocks.
Ideally we don't want to assign the offsets until a later stage of
linking.
This is put in a new file called link_uniform_blocks.cpp. Some new
functions related to uniform blocks are going to live in that file as
well.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Carl Worth <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This allows the next patch to verify that two uniform blocks match
without first calculating the locations of the fields.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes it easier to find switch-statements that need to be updated
after a new GLSL_TYPE_* is added because the compiler will generate a
warning.
Switch-statements that only had a small number of cases (e.g.,
everything in ir_constant_expression.cpp) were not modified. I may
regret that decision when we eventually add support for doubles.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Carl Worth <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Too much attention was paid to the first paragraphs, and not enough to
the last little note that "oh, by the way, the rendered things
themselves still have to be clipped to just 8192 wide/high".
Fixes GTF's clip.c test with 4096 or higher width on ivb, where one of
the triangles got the upper half of its pixels dropped.
Tested-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Khronos has apparently decided that depth textures with sized formats
(allowed with ARB_internalformat_query or ES 3.0) should be treated as
GL_RED, while unsized formats (an existing feature) should be treated
as GL_INTENSITY for compatibility with ES 2.0.
Ian is proposing changes to ARB_internalformat_query which will make
this actually legal and consistent.
A similar problem exists with GL 4.2, but we're going to ignore that
for the time being.
Tested on Ivybridge: no Piglit regressions; fixes 4 es3conform tests:
- depth_texture_fbo
- depth_texture_fbo_clear
- depth_texture_teximage
- depth_texture_texsubimage
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Since patch "i965: Validate requested GLES context version in
brwCreateContext", we have been able to create ES 3.0 contexts due to the
max version check. So...bump the max version.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: Add ARB_internalformat_query to the list of required extensions.
v3: Add OES_depth_texture_cube_map to the list of required extensions.
Signed-off-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes build regression introduced by commit
eac030e38e3cdd4ed4534516e3d3a50c8a372719.
Signed-off-by: Vinson Lee <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59835
|
|
|
|
|
|
|
|
|
| |
s/src/src_w/
That little typo, which sneaked into v4 of the previous patch, generates
incorrect fs code.
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: Remove lewd comment. [for idr]
v3: - Optimize away tmp register for packHalf2x16. [for anholt, paul]
- Improve comments. [for anholt, paul]
- Reduce near-duplicate code by removing vec4_visitor emit_pack/unpack
methods. [for chadv]
v4: Factor our UD/W register conversion into helper function. [for anholt]
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Ian Romanick <[email protected]> (v2)
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
FIXME: This patch emits VS code that violates documented hardware
restrictions and then relies on undocumented behavior that results from
that violation. This patch passes all tests, but should be fixed ASAP to
conform to the hardware documentation.
v2: Explain undocumented hardware behavior. Improve comments.
v3: Use ALU1 helper methods F32TO16() and F16TO32(). [for anholt]
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Ian Romanick <[email protected]> (v1)
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The GLSL ES 3.00 operations packHalf2x16 and unpackHalf2x16 will emit
these opcodes.
- Define the opcodes BRW_OPCODE_{F32TO16,F16TO32}.
- Add the opcodes to the brw_disasm table.
- Define convenience functions brw_{F32TO16,F16TO32}.
Reviewed-by: Ian Romanick <[email protected]>
Acked-by: Paul Berry <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On gen < 7, we fully lower all operations to arithmetic and bitwise
operations.
On gen >= 7, we fully lower the Snorm2x16 and Unorm2x16 operations, and
partially lower the Half2x16 operations.
v2:
- Comment that scalarization is needed only for SOA code [for idr].
- Replace switch-statement with if-statement [for idr].
- Remove misplaced hunk from previous patch [found by idr].
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Tuner <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Lower them to arithmetic and bit manipulation expressions.
v2: Rewrite using ir_builder [for idr].
v3: Comment typos. [for mattst88]
v4: Fix arithmetic error in comments.
Factor out a shift instruction.
Don't heap allocate factory.instructions.
[for paul]
Reviewed-by: Ian Romanick <[email protected]> (v2)
Reviewed-by: Matt Tuner <[email protected]> (v3)
Reviewed-by: Paul Berry <[email protected]> (v4)
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
| |
In ir_expression's constructor, the cases for {bit,logic}_{and,or,xor}
failed to handle the case when both operands were vectors.
Note: This is a candidate for the stable branches.
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Replace tabs with spaces. According to docs/devinfo.html, Mesa's
indetation style is:
indent -br -i3 -npcs --no-tabs infile.c -o outfile.c
This patch prevents whitespace weirdness in the next patch.
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Add two overloaded variants of
ir_if *if_tree()
The new functions allow one to chain together if-trees within a single C++
expression that resembles a real if-statement.
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
| |
Using this enum improves the readibility of calls to assign(), whose third
argument is a writemask.
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
| |
Add method ir_factory::constant. This little method constructs an
ir_constant using the factory's mem_ctx.
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add the following functions, each of which construct the similarly named
ir expression:
div, round_even, clamp
equal, less, greater, lequal, gequal
logic_not, logic_and, logic_or
bit_not, bit_or, bit_and, lshift, rshift
f2i, i2f, f2u, u2f, i2u, u2i
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
| |
This eliminates unexpected behavior due to unitialized values.
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
That is, evaluate constant expressions of the following functions:
packSnorm2x16 unpackSnorm2x16
packUnorm2x16 unpackUnorm2x16
packHalf2x16 unpackHalf2x16
v2: Reuse _mesa_pack_float_to_half and its inverse to evaluate
pack/unpackHalf2x16. [for idr]
v3: Whitespace fixes. [for mattst88]
Don't cast neg floats directly to uint16; use an intermediate cast to
int16. [for paul]
Reviewed-by: Ian Romanick <[email protected]> (v2)
Reviewed-by: Paul Berry <[email protected]>
Reviewed-by: Matt Tuner <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Not all float32 values can be exactly represented as a float16.
_mesa_float_to_half() rounded such intermediate float32 values to zero by
truncating unrepresentable bits in the mantissa.
This patch improves _mesa_float_to_half() by rounding intermediate float32
values to the nearest float16; when the float32 is exactly between two
float16 values we round to the one with an even mantissa. This behavior is
preferred over the old behavior because:
- It has reduced bias relative to the old behavior.
- It reproduces the behavior of real hardware: opcode F32TO16 in
Intel's GPU ISA.
- By reproducing the behavior of the GPU (at least on Intel hardware),
compile-time evaluation of constant packHalf2x16 GLSL expressions will
result in the same value as if the expression were executed on the GPU.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move round_to_even's definition to mesa/main so that _mesa_float_to_half()
can use it in order to eliminate rounding bias.
In additon to moving the fuction definition, prefix its name with "_mesa",
just as all other functions in mesa/main are prefixed.
v2: Fix Android build.
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A subsequent patch will add mesa/main/imports.c as a dependency to the
compiler, which in turn requires that _mesa_warning() be defined.
The real definition of _mesa_warning() is in mesa/main/errors.c, but to
pull that file into the standalone scaffolding would require transitively
pulling in the dispatch tables.
Reviewed-by: Ian Romanick <[email protected]>
Acked-by: Paul Berry <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For each function {pack,unpack}{Snorm,Unorm,Half}2x16, add a corresponding
opcode to enum ir_expression_operation. Validate the new opcodes in
ir_validate.cpp.
Also, add opcodes for scalarized variants of the Half2x16 functions. (The
code generator for the i965 fragment shader requires that all vector
operations be scalarized. A lowering pass, to be added later, will
scalarize the Half2x16 functions).
v2: Fix assertion message in ir_to_mesa [for idr].
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Tuner <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For each of the following functions, add a declaration to
builtins/profiles/300es.glsl and create new file
builtins/ir/${funcname}.ir:
packSnorm2x16 unpackSnorm2x16
packUnorm2x16 unpackUnorm2x16
packHalf2x16 unpackHalf2x16
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Tuner <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
| |
s/num_operands()/get_num_operands()/
Discovered because Eclipse failed to resolve the false reference.
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The bug: The printed horizontal stride was the numerical value of the
BRW_HORIZONTAL_$N enum.
The fix: Translate the enum before printing.
Note: This is a candidate for the stable releases.
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When possible, glCopyTexSubImage calls are performed using the
hardware blitter. However, according to the Ivy Bridge PRM, Vol1
Part4, section 1.2.1.2 (Graphics Data Size Limitations):
The BLT engine is capable of transferring very large quantities of
graphics data. Any graphics data read from and written to the
destination is permitted to represent a number of pixels that
occupies up to 65,536 scan lines and up to 32,768 bytes per scan
line at the destination. The maximum number of pixels that may be
represented per scan line’s worth of graphics data depends on the
color depth.
With an RGBA32F color buffer (which has 16 bytes per pixel) this
imposes a maximum width of 2048 pixels. Other pixel formats have
accordingly larger limits.
To make matters worse, if the pitch of the buffer is 32k or greater,
intel_copy_texsubimage's call to intelEmitCopyBlit will overflow
intelEmitCopyBlit's src_pitch and dst_pitch parameters (which are
16-bit signed integers).
We can conveniently avoid both problems by avoiding use of the blitter
when the miptree's pitch is >= 32k.
Fixes gles3conform "framebuffer_blit_functionality_magnifying_blit"
tests when the buffer width is equal to 8192.
Note: this is very similar to the recent patch "intel: Fix ReadPixels
on buffers whose width >= 32kbytes" except that it applies to
glCopyTexSubImage instead of glReadPixels. In a future patch it would
be nice to refactor the code so that (a) overflow is avoided, and (b)
intelEmitCopyBlit is responsible for checking whether the blitter can
handle the width, so that all callers of intelEmitCopyBlit work
properly, rather than just these two.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously I thought that varying structs had been added to GLSL ES
3.00 by mistake, because chapter 11 of the GLSL ES 3.00 spec
("Counting of Inputs and Outputs") failed to mention how structs
should be handled. Khronos has clarified
(https://cvs.khronos.org/bugzilla/show_bug.cgi?id=9828) that varying
structs are indeed required, and that chapter 11 will be modified to
indicate that the minimal reference packing algorithm flattens varying
structs to their individual components.
Mesa doesn't flatten varying structs to their individual components,
but this is ok, since it packs varyings of all kinds with no wasted
space at all (except where this is impossible due to differing
interpolation modes), so it will outperform the minimal reference
packing algorithm in all but the most pathological cases.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is not clear from the GLSL ES 3.00 spec how transform feedback is
supposed to apply to varying structs:
- There is no specification for how the structure is to be packed when
it is recorded into the transform feedback buffer.
- There is no reasonable value for GetTransformFeedbackVarying to
return as the "type" of the variable.
We currently have a Khronos bug requesting clarification on how this
feature is supposed to work
(https://cvs.khronos.org/bugzilla/show_bug.cgi?id=9856).
This patch just disables transform feedback of varying structs for
now; we can implement the proper behaviour once we find out from
Khronos what it is.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds code to lower_packed_varyings to handle varyings of
type struct. Varying structs are currently packed in the most naive
possible way (in declaration order, with no gaps), so there is a
potential loss of runtime efficiency. In a later patch it would be
nice to replace this with a "flattening" approach (wherein a varying
struct is flattened to individual varyings corresponding to each of
its structure elements), so that the linker can align each structure
element independently. However, that would require a significantly
more complex implementation.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch paves the way for allowing varying structs by generalizing
varying_matches::compute_packing_order to handle any type of varying.
Previously, we packed in the order (vec4, vec2, float, vec3), with
matrices being packed according to the size of their columns. Now, we
pack everything according to its number of components mod 4, in the
order (0, 2, 1, 3).
There is no behavioural change for vectors. Matrices are now packed
slightly differently:
- mat2x2 gets assigned PACKING_ORDER_VEC4 instead of
PACKING_ORDER_VEC2. This is slightly better, because it guarantees
that the matrix occupies a single varying slot.
- mat2x3 gets assigned PACKING_ORDER_VEC2 instead of
PACKING_ORDER_VEC3. This is kind of a wash. Previously, mat2x3 had
a 25% chance of having neither of its columns double parked, a 50%
chance of having exactly one of its columns double parked, and a 25%
chance of having both of its columns double parked. Now it always
has exactly one of its columns double parked.
- mat3x3 gets assigned PACKING_ORDER_SCALAR instead of
PACKING_ORDER_VEC3. This doesn't affect much, since in both cases
there is no guarantee of how the matrix will be aligned.
- mat4x2 gets assigned PACKING_ORDER_VEC4 instead of
PACKING_ORDER_VEC2. This is slightly better for the same reason as
in mat2x2.
- mat4x3 gets assigned PACKING_ORDER_VEC4 instead of
PACKING_ORDER_VEC3. This is slightly better for the same reason as
in mat2x2.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|