| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
This force surface allocated from ddx to be consider as height
aligned on 8 and fix 1D->2D tiling transition that result from
this.
Signed-off-by: Jerome Glisse <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
brw_emit_vertices contains special case logic to handle the case where
a vertex shader doesn't read any inputs. This special case logic was
incorrectly activating in the case were the only vertex input is
gl_VertexID. As a result, if a shader used gl_VertexID but used no
other inputs, then all vertices got a gl_VertexID of zero.
Fixes oglconform test "ubo-usage advanced.transform_feedback".
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The rather unweildy logic for determining this condition was repeated
in a large number of places. This patch consolidates it to a single
inline function.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements the following behaviours, which are mandated by
the GL 4.3 and GLES3 specs.
1. Regarding the GL_TRANSFORM_FEEDBACK_BUFFER_SIZE query: "If the
... size was not specified when the buffer object was bound
(e.g. if it was bound with BindBufferBase), ... zero is returned."
(GL 4.3 section 6.7.1 "Indexed Buffer Object Limits and Binding
Queries").
2. "BindBufferBase binds the entire buffer, even when the size of the
buffer is changed after the binding is established. It is
equivalent to calling BindBufferRange with offset zero, while size
is determined by the size of the bound buffer at the time the
binding is used." (GL 4.3 section 6.1.1 "Binding Buffer Objects to
Indexed Targets"). I interpret "at the time the binding is used"
to mean "at the time of the call to glBeginTransformFeedback".
3. "Regardless of the size specified with BindBufferRange, or
indirectly with BindBufferBase, the GL will never read or write
beyond the end of a bound buffer. In some cases this constraint may
result in visibly different behavior when a buffer overflow would
otherwise result, such as described for transform feedback
operations in section 13.2.2." (GL 4.3 section 6.1.1 "Binding
Buffer Objects to Indexed Targets").
Item 1 has been part of the spec all the way back to the inception of
the EXT_transform_feedback extension. Items 2 and 3 were added in GL
4.2 and GLES 3.
Prior to GL 4.2, in place of items 2 and 3, the spec simply said
"BindBufferBase is equivalent to calling BindBufferRange with offset
zero and size equal to the size of buffer." For transform feedback,
Mesa behaved as though this meant "...equal to the size of buffer at
the time of the call to BindBufferBase". However, this was
problematic because it left it ambiguous what to do if the buffer is
shrunk between the call to BindBuffer{Base,Range} and the call to
BeginTransformFeedback. Prior to this patch, Mesa's behaviour was to
try to write beyond the end of the buffer, likely resulting in memory
corruption. In light of this, I'm interpreting the spec change as a
clarification, not an intended behavioural change, so I'm making the
change apply regardless of API version.
Fixes GLES3 conformance test transform_feedback2_pause_resume.test.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In desktop GL, if a draw call would cause transform feedback buffers
to overflow, the draw call should succeed, and the extra primitives
should simply not be recorded in the transform feedback buffers.
In GLES3, however, if a draw call would cause transform feedback
buffers to overflow, the draw call is supposed to produce an
INVALID_OPERATION error and no drawing should occur.
This patch implements the GLES3-required behaviour.
Fixes GLES3 conformance test "transform_feedback_overflow.test".
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
| |
In GLES3, only glDrawArrays() and glDrawArraysInstanced() calls are
allowed when transform feedback is active.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the i965 driver contained code to compute the maximum
number of vertices that could be written without overflowing any
transform feedback buffers. This code wasn't driver-specific, and for
GLES3 support we're going to need to use it in core mesa. So this
patch moves the code into a core mesa function,
_mesa_compute_max_transform_feedback_vertices().
Reviewed-by: Ian Romanick <[email protected]>
v2: Eliminate C++-style variable declarations, since these won't work
with MSVC.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
No functional change--this simply paves the way to allow futures
patches to call vbo_count_tessellated_primitives() during error
checking, before the _mesa_prim struct has been constructed.
This will be needed for GLES3, which requires draw calls to fail if
there is not enough space available in transform feedback buffers to
accommodate the primitives to be drawn.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
Add support for TEX2, TXB2, TXL2, fix SHADOWCUBE
Signed-off-by: Vadim Girlin <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
Tested-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
| |
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Since the idea is to just expand or shrink the bit width but not otherwise do
conversion we also need to adjust the sign bit according to src, otherwise
the conversion code will incorrectly clamp the values. (Since this only works
for casting to ordinary floats the norm and fixed bits should always be fine.)
This fixes the remaining piglit attribs GL3 failures.
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
| |
Dave found some, but there were more.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=58039
|
|
|
|
|
|
|
|
| |
frees the object handle when a OpenVG
is destroyed.
Signed-off-by: Andreas Pokorny <[email protected]>
Signed-off-by: Brian Paul <[email protected]>
|
| |
|
| |
|
|
|
|
|
|
|
| |
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=58380
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
a460aea3f14222af46f88d1bc686f82180b8a872 wasn't entirely correct,
since all coords are already ints hence need to skip the iround.
Passes piglit texelFetch with sampler1DArray/sampler2DArray.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Make sure drivers initialize the version before:
* _mesa_initialize_exec_table is called
* _mesa_initialize_exec_table_vbo is called
* A context is made current
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
The driver should call _mesa_initialize_vbo_vtxfmt after
computing the context version.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Drivers must compute the context version, and then call
_mesa_initialize_exec_table themselves.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
In a future patch the exec functions will no longer set up
by _mesa_initialize_context and _vbo_CreateContext.
Therefore we must call _mesa_initialize_exec_table and
_mesa_initialize_exec_table_vbo.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
This change forces the context version to be computed before
initilizing the exec dispatch tables.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
This function initializes the exec/save dispatch tables
for VBO vtxfmt.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In glapi/gl_genexec.py:
* Remove _mesa_alloc_dispatch_table call
In glapi/gl_genexec.py and api_exec.h:
* Rename _mesa_create_exec_table to _mesa_initialize_exec_table
In context.c:
* Call _mesa_alloc_dispatch_table instead of _mesa_create_exec_table
* Call _mesa_initialize_exec_table (this is temporary)
Once all drivers have been modified to call
_mesa_initialize_exec_table, then the call to
_mesa_initialize_context can be removed from context.c.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
This allows the debug code to at least show the sign properly.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
util_blitter_blit_generic().
This is used by st_BlitFramebuffer() / r600_blit(), and ARB_fbo allows
overlapped blits, even though the result is undefined. No piglit regressions
on r600g / CYPRESS.
Signed-off-by: Henri Verbeet <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This should fix:
https://bugs.freedesktop.org/show_bug.cgi?id=58039
Tested-by: Darxus on bug 58039
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
These don't really belong in brw_structs.h.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
struct brw_instruction and the related instruction emitting code won't
be useful on Gen8+, as the instruction encoding changed. However, the
struct brw_reg code is still extremely valuable.
While we're at it, fix up some style points:
- s/GLuint/unsigned/g
- s/GLint/int/g
- s/GLshort/int16_t/g
- s/GLushort/uint16_t/g
- s/INLINE/inline/g
- Replace tabs with spaces
- Put return types on a separate line from the function name/parameters
- Remove trailing whitespace
- Remove extraneous whitespace around function parameters
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
This checks if the pipe driver can support RGB32 formats.
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This adds the extensions + the tex buffer support for checking
the formats.
There is a piglit test enhancement sent to that list.
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
Not sure what was going on here, but running piglit with debug builds
might be a good plan :-)
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <[email protected]>
Tested-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
No statistically significant performance difference on glbenchmark 2.7
(n=60). It reduces cycles spent in the vertex shader by 3.3% +/- 0.8%
(n=5), but that's only about .3% of all cycles spent according to the
fixed shader_time.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The way our visitor works, scalar expression/swizzle results that get
stored in channels other than .x will have an intermediate MOV from
their result in the .x channel to the real .y (or whatever) channel, and
similarly for vec2/vec3 results.
By knowing how to adjust DP4-type instructions for optimizing out a
swizzled MOV, we can reduce instructions in common matrix multiplication
cases.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The compute-to-mrf code is really twitchy, and it's hard to construct
GLSL testcases for it. This unit test is also really hard to work with
(for example, if your instruction is removed by dead code elimination,
you end up inspecting something irrelevant), but I did use it for
debugging some of the commits to follow.
I called it test_vec4_register_coalesce because the compute-to-mrf code
is about to morph into that.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
The final halt of the fragment shader turns off the remaining channels,
then jumps such that everything is turned back on. So, we can have our
last ENDIF of the shader point at that directly.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the Ivybridge PRM, Volume 4, Part 3, section 6.24 (page 172):
"The endif instruction is also used to hop out of nested conditionals by
jumping to the end of the next outer conditional block when all
channels are disabled."
Also:
"Pseudocode:
Evaluate(WrEn);
if ( WrEn == 0 ) { // all channels false
Jump(IP + JIP);
}"
First, ENDIF re-enables any channels that were disabled because they
didn't match the conditional. If any channels are active, it proceeds
to the next instruction (IP + 16). However, if they're all disabled,
there's no point in walking through all of the instructions that have no
effect---it can jump to the next instruction that might re-enable some
channels (an ELSE, ENDIF, or WHILE).
Previously, we always set JIP on ENDIF instructions to 2 (which is
measured in 8-byte units). This made it do Jump(IP + 16), which just
meant it would go to the next instruction even if all channels were off.
It turns out that walking over instructions while all the channels are
disabled like this is worse than just instruction dispatch overhead: if
there are texturing messages, it still costs a couple hundred cycles to
not-actually-read from the texture results.
This patch finds the next instruction that could re-enable channels and
sets JIP accordingly.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
V3: Put enable in an existing block rather than making a new
one for no good reason.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Caught by tex_grad-01.frag.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
The cube map array code adds another caller of emit_math(), which
needs this check.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
V2: Moved up into emit(ir_texture *) to avoid duplication and fix
ordering for Gen7; Gen6 math quirks moved into previous patches.
Tested on Gen6 only; passes all the cube_map_array piglits.
V3: Fixed weird whitespace
V4: Use sampler->type; otherwise broken on arrays of samplers.
v5: Minor style fixes (by anholt)
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
V4: Fix various style nits as pointed out by Eric, and expand IMM
operands on both Gen6 and Gen7.
v5: minor style nits (by anholt)
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
V3: Fixed weird whitespace
V4: Use sampler's type rather than variable's type; otherwise broken
with arrays of samplers. (Thanks Eric)
v5: Fix a couple more style nits (by anholt)
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This causes immediate values to get moved to a temp on gen7, which is needed
for an upcoming change but hadn't happened in the visitor until then.
v2: Drop gen > 7 checks (doesn't exist), and style-fix comments (changes by
anholt).
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
V4: Fixed style nits
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: Actually switch on the other math instructions mentioned in the
comment.
v3: Add timing data for textureSize(), and clean up some long comment
lines.
Testing shader_time of fs16 shaders on a few frames of various apps:
nexuiz improved by 2.9% +/- 1.5% (n=10)
no difference on GLB2.5 (n=36, outliers removed)
no difference on GLB2.7 (n=25)
etqw improved by 2.6% +/- 2.2% (n=25)
no difference on lightsmark (n=25)
Acked-by: Kenneth Graunke <[email protected]>
|