| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a geometry shader is present, the fragment shader gl_PrimitiveID
input acts like an ordinary varying, receiving data from the gs
gl_PrimitiveID output. When there's no geometry shader, we have to
ask the fixed function SF hardware to provide the primitive ID to the
fragment shader instead.
Previously, the SF setup code would handle this situation by
recognizing that the FS gl_PrimitiveID input didn't match to any VS
output; since normally an FS input with no corresponding VS output
leads to undefined data, the SF setup code used to just arbitrarily
assign it to receive data from attribute 0.
This patch changes the SF setup code so that instead of arbitrarily
using attribute 0, it assigns the unmatched FS input to receive
gl_PrimitiveID. In the case where the FS input really is
gl_PrimitiveID, this produces the intended result. In all other
cases, no harm is done since GL specifies that the behaviour is
undefined.
Fixes piglit test primitive-id-no-gs.
v2: If an attribute is already being overridden with point
coordinates, don't try to also override it with gl_PrimitiveID. This
is necessary to avoid regressing piglit tests such as
shaders/glsl-fs-pointcoord.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes 'make check' failures introduced with commit
80964226e9b8a05c39157f9305c06c0b2861e080.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70900
Signed-off-by: Vinson Lee <[email protected]>
|
|
|
|
| |
Fixes SCons build.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: Actually implement interop between the gallium
state tracker and the VDPAU backend.
v3: Make it also available in non legacy contexts,
fix video buffer sharing.
v4: deny interop if we don't have the same screen object
v5: rebased on upstream changes
v6: implemented VDPAUGetSurfaceivNV, improved error handling,
unregister all surfaces in VDPAUFiniNV
v7: squash merge with Mareks changes
Signed-off-by: Christian König <[email protected]>
|
|
|
|
|
|
| |
Otherwise OpenGL/VDPAU interop won't work as expected.
Signed-off-by: Christian König <[email protected]>
|
|
|
|
|
|
|
| |
Just let it be handled by the lowering pass.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
ir_txf expects an ivec* coordinate, and may be larger than ivec2;
shuffle things around so that this will work.
V2: Fix style nits, use ir_builder
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It turns out that nonzero offsets with gsampler2DRect don't work -- they
just return garbage. Work around this by folding the offset into the
coord.
Done as an IR pass rather than yet another hack in the visitors because
it's clear what's going on this way. Can possibly reuse this to replace
the existing txf coord+offset hacks.
V2: Use ir_builder
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rewrites textureGatherOffsets(s, p, offsets) into
gvec4(
textureGatherOffset(s, p, offsets[0]).w,
textureGatherOffset(s, p, offsets[1]).w,
textureGatherOffset(s, p, offsets[2]).w,
textureGatherOffset(s, p, offsets[3]).w
)
V2: Use ir_builder to be slightly clearer.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We don't have a message that does 4 independent offsets; a lowering
pass needs to lower it to 4 normal gather4s before reaching this
point.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
This is needed for textureGatherOffsets()
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Note that gather4_po_c's parameters are too long for SIMD16. It might be
worth emitting 2xSIMD8 messages in this case at some point.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
gather4_c's argument layout is straightforward -- refz just goes on the
end.
gather4_po_c's layout however -- the array index is replaced with refz.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
ARB_gpu_shader5's textureGather*() functions which take shadow samplers
have a separate `refz` parameter rather than adding it to the
coordinate.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
| |
V3: fixup crazy check for whether we need to emit the coordinate after
custom handling.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Some texturing ops are about to have nonconstant offset support; the
offset in the header in these cases should be zero.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The generator code ends up clearer this way than if we had to sniff
via the message length. Implemented via the gather4_po message in
hardware, which is present in Gen7 and later.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Prior to ARB_gpu_shader5 / GLSL 4.0, the offset is required to be
a constant expression.
With that extension, it is relaxed to be an arbitrary expression.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
- gsampler2DRect
- optional `comp` parameter
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since 062317d6671 (i965: Go back to using the kernel SOL reset feature.)
we've been flushing the batch on BeginTransformFeedback(). So it's not
necessary to do it on EndTransformFeedback(). A PIPE_CONTROL will work.
This makes gen7_end_transform_feedback() exactly the same as the gen6
variant. However, they'll diverge again shortly.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was a hack to avoid choosing to schedule all texturing before
consumption of any texture results due to the way dependency chains worked
out in the presence of MRFs. On gen7, we don't have MRFs, so the problem
doesn't apply, and this was just badly constraining our scheduling.
total instructions in shared programs: 1615306 -> 1612534 (-0.17%)
instructions in affected programs: 9958 -> 7186 (-27.84%)
GAINED: 259
LOST: 9
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The LIFO plan was simple: Take the most recently made available
instructions, and pick those first.
But because of the order we were pushing things onto our list of
available-to-schedule instructions, it meant that when a set of
instructions was made available at the same time (for example, everything
at the start of the program that didn't depend on other instructions) we'd
schedule them in reverse order.
If you had 10 texture calls in a row in your program, each with
independent argument setup, we'd set up the last texture call's args and
execute it first, even though we wouldn't be able to consume its results
until we'd finished the other 9 texture calls (assuming consumption of
texture results happens near each texture call, and combines it with
another texture result, which is normal for a convolution shader).
To fix this, walk the list for doing LIFO in the order that instructions
were originally generated in the program, but choose to push
newly-made-available instructions to the other end of the list instead.
total instructions in shared programs: 1587242 -> 1586290 (-0.06%)
instructions in affected programs: 7801 -> 6849 (-12.20%)
GAINED: 76
LOST: 67
Thanks to Chia-I Wu for pointing out the bug in my first version of the
patch that made it a huge loss.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
When PIPE_CAP_MIXED_FRAMEBUFFER_SIZES is not provided, parts of
ARB_framebuffer_object can't be supported, such as on NV30.
Signed-off-by: Ilia Mirkin <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This CAP will determine whether ARB_framebuffer_object can be enabled.
The nv30 driver does not allow mixing swizzled and linear zsbuf/cbuf
textures.
Signed-off-by: Ilia Mirkin <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
_XReply returns 1 on success, but indirect_bind_context returns 0 on
success.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70486
Reviewed-and-tested-by: Ian Romanick <[email protected]>
Signed-off-by: Adam Jackson <[email protected]>
|
|
|
|
|
|
| |
No shader-db changes, but seems like a good idea.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
A few Serious Sam 3 shaders affected:
instructions in affected programs: 4384 -> 4344 (-0.91%)
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
total instructions in shared programs: 1645011 -> 1644938 (-0.00%)
instructions in affected programs: 17543 -> 17470 (-0.42%)
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
| |
|
| |
|
|
|
|
|
|
| |
This fixes a lockup in piglit/spec/glsl-1.40/execution/tf-no-position.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In commit 1b4a737 (glsl: Support redeclaration of VS and GS
gl_PerVertex output), I added code to ensure that when an unnamed
gl_PerVertex interface block is redeclared, any ir_variables that
weren't included in the redeclaration are removed from the IR (and the
symbol table). This ensures that only those variables that were
explicitly redeclared may be used.
However, when I wrote this code, I neglected to match the variable
mode when finding variables to remove. This meant that redeclaring a
built-in output block might cause the built-in input gl_in to be
accidentally removed.
Fixes piglit test gs-redeclares-pervertex-out-only.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The GLSL 4.10 rules for redeclaration of built-in interface blocks
(which we've chosen to regard as clarifications of GLSL 1.50) only
require gl_PerVertex blocks to match in shaders that actually use
those blocks. The easiest way to implement this is to detect
situations where a compiled shader doesn't refer to any elements of
gl_PerVertex, and remove all the associated ir_variables from the
shader at the end of ast-to-ir conversion.
Fixes piglit tests
linker/interstage-{pervertex,pervertex-in,pervertex-out}-redeclaration-unneeded.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Normally when a built-in array (such as gl_ClipDistance) is
redeclared, we call get_variable_being_redeclared() to do the
redeclaration, and it in turn calls check_builtin_array_max_size() to
make sure that the redeclared array size isn't too large.
However when a built-in array is redeclared as part of redeclaring
gl_in, we don't call get_variable_being_redeclared() (since the
individual built-ins aren't each represented by their own ir_variable
anymore). So we need to add an explicit call to
check_builtin_array_max_size() to make sure the new array size isn't
too large.
Note: at the moment this is redundant with a test that's done at link
time, so there's no change to piglit results. But the patch that
follows will prevent link errors from being reported if gl_PerVertex
isn't used, so in order to prevent that patch from causing
regressions, we need to add the compile check now. Besides, it's
nicer to report this error at compile time anyhow.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The queries GEOMETRY_VERTICES_OUT, GEOMETRY_INPUT_TYPE, and
GEOMETRY_OUTPUT_TYPE (defined by GL 3.2) differ from the corresponding
queries in ARB_geometry_shader4 in the following ways:
- They use different enum values
- They can only be queried; they cannot be set.
- Attempting to query them yields INVALID_OPERATION if the program is
not linked, or lacks a geometry shader.
This patch switches us over from the ARB_geometry_shader4 behaviour to
the GL 3.2 behaviour.
Fixes piglit test query-gs-prim-types.
v2: Improve comment above has_core_gs.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When program_resource_visitor visits variables that were created by
lower_named_interface_blocks, it needs to do extra work to un-do the
effects of lower_named_interface_blocks and construct the proper API
names.
Fixes piglit test
spec/glsl-1.50/execution/interface-blocks-api-access-members.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
These variables will need to be treated specially by
program_resource_visitor, so that they can be addressed through the
API using their interface block name (and array index, for interface
block arrays).
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Fixes piglit tests:
- interface-block-interpolation-{array,named,unnamed}
- glsl-1.50-interface-block-centroid {array,named,unnamed}
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Later patches will use this information to do proper error checking of
interpolation qualifiers that appear inside of interface blocks.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
In future patches, we will need this in order to interpret
interpolation qualifiers that appear inside interface blocks.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Future patches will need to call this function when there isn't an
ir_varible present to refer to.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Although in principle there is no hardware limitation that prevents
gl_MaxGeometryInputComponents from being set to 128 on Gen7, we have
the following limitations in the vec4 compiler back end:
- Registers assigned to geometry shader inputs can't be spilled or
later re-used for any other purpose.
- The last 16 registers are set aside for the "MRF hack", meaning they
can only be used to send messages, and not for general purpose
computation.
- Up to 32 registers may be reserved for push constants, even if there
is sufficient register pressure to make this impractical.
A shader using 128 geometry input components, and having an input type
of triangles_adjacency, would use up:
- 1 register for r0 (which holds URB handles and various pieces of
control information).
- 1 register for gl_PrimitiveID.
- 102 registers for geometry shader inputs (17 registers per input
vertex, assuming DUAL_INSTANCED dispatch mode and allowing for one
register of overhead for gl_Position and gl_PointSize, which are
present in the URB map even if they are not used).
- Up to 32 registers for push constants.
- 16 registers for the "MRF hack".
That's a total of 152 registers, which is well over the 128 registers
the hardware supports.
Fortunately, the GLSL 1.50 spec allows us to reduce
gl_MaxGeometryInputComponents to 64. Doing that frees up 48
registers, brining the total down to 104 registers, leaving 24
registers available to do computation.
Fixes piglit test
spec/glsl-1.50/execution/geometry/max-input-components.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is similar to what we do for 16-wide vs 8-wide fragment shaders.
First we try compiling the geometry shader in DUAL_OBJECT mode. If we
can't do that without spilling, we fall back on DUAL_INSTANCED mode,
which should require less spilling (since it uses an interleaved
layout of payload registers).
In an ideal world we'd fall back to SINGLE mode, which would allow us
to interleave general-purpose registers too (resulting in even less
likelihood of spilling). But at the moment, the vec4 generator and
visitor classes don't have the infrastructure to interleave general
purpose registers, so DUAL_INSTANCED is the best we can do.
As a side benefit this paves the way for implementing instanced
geometry shaders (which are incompatible with DUAL_OBJECT mode).
Since most geometry shaders used in piglit testing are small,
DUAL_INSTANCED mode won't get exercised very much in a normal piglit
run. To force DUAL_INSTANCED mode to be used for all geometry
shaders, set INTEL_DEBUG=nodualobj.
Reviewed-by: Eric Anholt <[email protected]>
|