| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
The number of channels must be 4 for all RGBA components.
Fixes: 22d129601 ("tgsi: add support for image operations to tgsi_exec. (v2.1)")
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It was kind of overloaded, returning two different things. Now get
the index of the shadow reference src register with a new
tgsi_util_get_shadow_ref_src_index() function.
To verify the new code, I added some temp/debug code which looped
over all TGSI_TEXTURE_x values, calling the old function and new and
checking that the returned indexes matched.
Also tested piglit "shadow" tests with softpipe/llvmpipe.
No testing of ilo and radeonsi changes.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Should fix the assertion in piglit
spec@arb_gpu_shader5@texturegather@fs-r-none-shadow-2d when the
TXQ instruction specifies a 2D target but the sampler view was
declared as SHADOW2D.
Reviewed-by: Michel Dänzer <[email protected]>
Tested-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a null pointer dereference during the register allocation pass,
if a function had arguments.
Functions arguments get a definition from the function itself, a definition
which is therefore not linked to any instruction. If a value ends up having
a definition but no linked instruction, the register allocation pass doesn't
need to consider whether that value is generated by an instruction that
can only handle "short" registers (on nv50).
Signed-off-by: Pierre Moreau <[email protected]>
|
|
|
|
|
|
|
|
| |
The extension is identical to GL_OES_copy_image. But dEQP has tests that
want the EXT variant.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We require the full ARB_gpu_shader5 for now, but in the future some
other CAP could get exposed to indicate that only the multisample-related
behavior of ARB_gpu_shader5 is available.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
Ken did this earlier, and this is just me reimplementing his patch a
little differently.
Reviewed-by: Francisco Jerez <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of removing every instruction in add_insts_from_block(), just
move the instruction to its scheduled location. This is a step towards
doing both bottom-up and top-down scheduling without conflicts.
Note that this patch changes cycle counts for programs because it begins
including control flow instructions in the estimates.
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
| |
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I think when this code was written, basic blocks were always ended by a
control flow instruction or an end-of-thread message. That's no longer
the case, and removing this restriction actually helps things:
instructions in affected programs: 7267 -> 7244 (-0.32%)
helped: 4
total cycles in shared programs: 66559580 -> 66431900 (-0.19%)
cycles in affected programs: 28310152 -> 28182472 (-0.45%)
helped: 9577
HURT: 879
GAINED: 2
The addition of the is_control_flow() checks is not a functional change,
since the add_insts_from_block() does not put them in the list of
instructions to schedule. I plan to change this in a later patch.
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Missing this causes an assertion failure in the scheduler with the next
patch.
Additionally, this gives cmod propagation enough information to optimize
code better.
total instructions in shared programs: 7112991 -> 7112852 (-0.00%)
instructions in affected programs: 25704 -> 25565 (-0.54%)
helped: 139
total cycles in shared programs: 64812898 -> 64810674 (-0.00%)
cycles in affected programs: 127224 -> 125000 (-1.75%)
helped: 139
Acked-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit d0e1d6b7e27bf5f05436e47080d326d7daa63af2.
The change in the vec4 code is a mistake -- there's never an
FS_OPCODE_FB_WRITE in vec4 code.
The change in the fs code had the (harmless) effect of not recognizing
an FB_WRITE as a scheduling barrier even if it was marked EOT --
harmless because the scheduler marked the last instruction of a block as
a barrier, something I'm changing in the following patches.
This will be reimplemented later in the series.
|
|
|
|
|
|
|
| |
All of these were simply code for "architecture register file" (and in
the case of destinations, "not the null register").
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
| |
These printed the cycle count the last basic block (sched.time is set
per basic block!). We have accurate, full program, data printed
elsewhere.
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
| |
Failed to update state tracker with new buffer interface.
Reviewed-by: Timothy Arceri <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
| |
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
A latter patch will use XFB for buffers.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This enables in shader defined transform feedback mode even if the
only place xfb_stride is defined is on the global out.
We don't worry about xfb_buffer since Issue 22 c) in the spec says:
"If the shader has an "xfb_buffer" qualifier identifying a buffer,
but doesn't declare "xfb_offset" on anything associated with it,
what happens?
...
variables not qualified with "xfb_offset" are not captured, which
makes the associated "xfb_buffer" qualifier irrelevant."
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This will be used when checking if xfb should attempt to capture
a varying.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
When we move to the next buffer we need to reset the stream
so that we don't generate an error message about streams not
matching.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This moves the check until after we have done the stride
calculation and applies it to the xfb_* qualifiers.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the ARB_enhanced_layous spec:
"It is a compile-time or link-time error to have any *xfb_offset*
that overflows *xfb_stride*, whether stated on declarations before
or after the *xfb_stride*, or in different compilation units.
...
When no *xfb_stride* is specified for a buffer, the stride of a
buffer will be the smallest needed to hold the variable placed at
the highest offset, including any required padding."
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Here we use the built-in validation in
ast_layout_expression::process_qualifier_constant() to check for mismatching
global out strides on buffers in a single shader.
From the ARB_enhanced_layouts spec:
"While *xfb_stride* can be declared multiple times for the same buffer,
it is a compile-time or link-time error to have different values
specified for the stride for the same buffer."
For intrastage validation a new helper link_xfb_stride_layout_qualifiers()
is created. We also take this opportunity to make sure stride is at least
a multiple of 4, we will validate doubles at a later stage.
From the ARB_enhanced_layouts spec:
"If the buffer is capturing any double-typed outputs, the stride must
be a multiple of 8, otherwise it must be a multiple of 4, or a
compile-time or link-time error results."
Finally we update store_tfeedback_info() to apply the strides to
LinkedTransformFeedback and update the buffers bitmask to mark any global
buffers with a stride as active. For example a shader with:
layout (xfb_buffer = 0, xfb_offset = 0) out vec4 gs_fs;
layout (xfb_buffer = 1, xfb_stride = 64) out;
Is expected to have a buffer bound to both 0 and 1.
From the ARB_enhanced_layouts spec:
"A binding point requires a bound buffer object if and only if its
associated stride in the program object used for transform feedback
primitive capture is non-zero."
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This will be used in a following patch to implement interface
query support for TRANSFORM_FEEDBACK_BUFFER.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows us to print the correct binding point when not all
buffers declared in the shader are bound.
For example if we use a single buffer:
layout(xfb_buffer=2, offset=0) out vec4 v;
We now print '2' when the buffer is not bound rather than '0'.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
The existing transform feedback code expects to receive the list
of varyings in increasing buffer order.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This adds the initial infrastructure for enabling transform feedback
mode via in shader qualifiers and adds initial buffer support.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
We also apply any array/struct offsets.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This function checks for any xfb_* qualifiers which will enable
transform feedback mode and cause any API defined xfb varyings
to be ignored.
It also counts the number of varyings that have a xfb_offset
qualifier and finally it calls the create_xfb_varying_names()
helper to generate the names of varyings to be caputured.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This will be used to get a count of the number of varying name
strings we are required to generate for use with the query api.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we have an interface block like:
layout (xfb_buffer = 0, xfb_offset = 0) out Block {
vec4 var1;
layout (xfb_stride = 32) vec4 var2;
vec4 var3;
};
We take into account the stride of var2 when calculating the offset
for var3.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the ARB_enhanced_layouts spec:
"The *xfb_stride* qualifier specifies how many bytes are consumed
by each captured vertex. It applies to the transform feedback
buffer for that declaration, whether it is inherited or explicitly
declared. It can be applied to variables, blocks, block members,
or just the qualifier out. If the buffer is capturing any
double-typed outputs, the stride must be a multiple of 8, otherwise
it must be a multiple of 4, or a compile-time or link-time error
results.
...
The resulting stride (implicit or explicit) must be less than or
equal to the implementation-dependent constant
gl_MaxTransformFeedbackInterleavedComponents."
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
We also copy the qualifier values to the IR in this step.
Reviewed-by: Dave Airlie <[email protected]>
|