| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]
|
|
|
|
|
|
|
| |
This will be used in a following patch to implement interface
query support for TRANSFORM_FEEDBACK_BUFFER.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows us to print the correct binding point when not all
buffers declared in the shader are bound.
For example if we use a single buffer:
layout(xfb_buffer=2, offset=0) out vec4 v;
We now print '2' when the buffer is not bound rather than '0'.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
A while back, we moved to directly emitting the Gen7+ state when
constructing the binding tables. These flags are only used on
Gen4-6, which emit all the binding table pointers at once.
We gain nothing by having separate flags, so combine them.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now, we only use ctx->NewDriverState.
I used this bash & sed command in the i965 directory:
for file in *.[ch] *.[ch]pp; do
sed -i -e 's/state\.dirty\.brw/ctx.NewDriverState/g' $file
done
Followed by manual changes to brw_state_upload.c.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Acked-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Sandybridge requires the post-sync non-zero workaround in a ton of
places, and if you ever miss one, the GPU usually hangs.
Currently, we try to track exactly when a workaround flush is
necessary (via the brw->batch.need_workaround_flush flag). This is
tricky to get right, and we've botched it several times in the past.
This patch unconditionally performs the post-sync non-zero flush at the
start of each primitive's state upload (including BLORP). We drop the
needs_workaround_flush flag, and drop all the other callers, as the
flush has already been performed.
We have no data to indicate that simply flushing all the time will
hurt performance, and it has the potential to help stability.
v2: Add post-sync workaround to initial GPU state upload to be extra
cautious (suggested by Chad Versace).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rendering with a GS and then using transform feedback with a program that does
not have a GS can crash in gen6. The reason for this is that
brw_begin_transform_feedback checks brw->geometry_program to decide if there
is a GS program, but this is not correct: brw->geometry_program is updated when
issuing drawing commands, so after rendering with a GS it will be non-NULL
until we draw again with a program that does not have a GS. If the next
program uses TF, we will call glBegintransformFeedback before issuing
the drawing command and hence brw->geometry_program will be non-NULL if
the previous rendering used a GS. The right thing to do here is to check
ctx->_Shader->CurrentProgram[MESA_SHADER_GEOMETRY] instead. This is what the
gen7 code path does too.
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=87694
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It's been merged into brw_state_flags::brw for simplicity and
efficiency.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Most of the dirty flags were listed in some arbitrary order. Some used
bonus parenthesis. Some put multiple flags on one line, others put one
per line. Some used tabs instead of spaces...but only on some lines.
This patch settles on one flag per line, in alphabetical order, using
spaces instead of tabs, and sheds the unnecessary parentheses.
Sorting was mostly done with vim's visual block feature and !sort,
although I alphabetized short lists by hand; it was pretty manual.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For gen6 geometry shaders we use the first BRW_MAX_SOL_BINDINGS entries of the
binding table for transform feedback surfaces. However, vec4_visitor will
setup the binding table so that textures use the same space in the binding
table. This is done when calling assign_common_binding_table_offsets(0) as
part if its run() method.
To fix this clash we add a virtual method to the vec4_visitor hierarchy to
assign the binding table offsets, so that we can change this behavior
specifically for gen6 geometry shaders by mapping textures right after the
first BRW_MAX_SOL_BINDINGS entries.
Also, when there is no user-provided geometry shader, we only need to upload
the binding table if we have transform feedback, however, in the case of a
user-provided geometry shader, we can't only look into transform feedback
to make that decision.
This fixes multiple piglit tests for textureSize() and texelFetch() when these
functions are called from a geometry shader in gen6, like these:
bin/textureSize gs sampler2D -fbo -auto
bin/texelFetch gs usampler2D -fbo -auto
Acked-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Update gen6_gs_binding_table and gen6_sol_surface to use user-provided
geometry program information when present. This is necessary to implement
transform feedback support.
Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Reverts
* "i965: Modify state upload to allow 2 different sets of state atoms."
8e27a4d2b3e4e74e9a77446bce49607433d86be3
* "i965: Modify dirty bit handling to support 2 pipelines."
373143ed9187c4d4ce1e3c486b5dd0880d18ec8b
* "i965: Create a macro for checking a dirty bit."
c5bdf9be1eca190417998d548fd140c1eca37a54
Conflicts:
src/mesa/drivers/dri/i965/brw_context.h
* "i965: Create a macro for setting all dirty bits."
6f56e1424d923fd80c84090fbf4506c9eaaffea1
Conflicts:
src/mesa/drivers/dri/i965/brw_blorp.cpp
src/mesa/drivers/dri/i965/brw_state_cache.c
src/mesa/drivers/dri/i965/brw_state_upload.c
* "i965: Create a macro for setting a dirty bit."
88e3d404dad009d8cff5124cf8acee7daeaceb64
Signed-off-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
| |
This will make it easier to extend dirty bit handling to support
compute shaders.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Basically a sed but shaderapi.c and get.c.
get.c => GL_CURRENT_PROGAM always refer to the "old" UseProgram behavior
shaderapi.c => the old api stil update the Shader object directly
V2: formatting improvement
V3 (idr):
* Rebase fixes after a block of code was moved from ir_to_mesa.cpp to
shaderapi.c.
* Trivial reformatting.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
array.
These are replaced with
ctx->Shader.CurrentProgram[MESA_SHADER_{VERTEX,FRAGMENT,GEOMETRY}].
In patches to follow, this will allow us to replace a lot of ad-hoc
logic with a variable index into the array.
With the exception of the changes to mtypes.h, this patch was
generated entirely by the command:
find src -type f '(' -iname '*.c' -o -iname '*.cpp' ')' \
-print0 | xargs -0 sed -i \
-e 's/\.CurrentVertexProgram/.CurrentProgram[MESA_SHADER_VERTEX]/g' \
-e 's/\.CurrentGeometryProgram/.CurrentProgram[MESA_SHADER_GEOMETRY]/g' \
-e 's/\.CurrentFragmentProgram/.CurrentProgram[MESA_SHADER_FRAGMENT]/g'
Reviewed-by: Chris Forbes <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implementing the GetTransformFeedbackVertexCount() driver hook allows
the VBO module to call us with the right number of vertices.
The hardware doesn't directly count the number of vertices written by
SOL, so we instead use the SO_NUM_PRIMS_WRITTEN(n) counters and multiply
by the number of vertices per primitive.
Unfortunately, counting the number of primitives generated is tricky:
a program might pause a transform feedback operation, start a second one
with a different object, then switch back and resume. Both transform
feedback operations share the SO_NUM_PRIMS_WRITTEN counters.
To work around this, we save the counter values at Begin, Pause, Resume,
and End. This "bookends" each section where transform feedback is
active for the current object. Adding up differences of pairs gives
us the number of primitives generated. (This is similar to what we
do for occlusion queries on platforms without hardware contexts.)
v2: Fix missing parenthesis in assertion (caught by Eric Anholt).
v3: Reuse prim_count_bo rather than freeing it and immediately
allocating a new one (suggested by Topi Pohjolainen).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ARB_transform_feedback2 extension introduces the ability to pause
and resume transform feedback sessions. Although only one can be active
at a time, it's possible to switch between multiple transform feedback
objects while paused.
In order to facilitate this, we need to save/restore the SO_WRITE_OFFSET
registers so that after resuming, the GPU continues writing where it
left off.
This functionality also exists in ES 3.0, but somehow we completely
forgot to implement it.
v2: Reduce alignment from 4096 to 64 (it seemed excessive).
v3: Use I915_GEM_DOMAIN_INSTRUCTION instead of RENDER, for consistency
with other writes. It shouldn't matter on IVB+.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This adds the basic driver hooks to allocate/free the brw variant.
It doesn't contain any additional information yet, but it will soon.
v2: Use the new _mesa_init_transform_feedback_object helper function
(requested by Eric and Ian).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the comments above intel_emit_post_sync_nonzero_flush:
"[DevSNB-C+{W/A}] Before any depth stall flush (including those
produced by non-pipelined state commands), software needs to first
send a PIPE_CONTROL with no bits set except Post-Sync Operation != 0."
This suggests that every non-pipelined (0x79xx) command needs a
post-sync non-zero flush before it.
Signed-off-by: Kenneth Graunke <[email protected]>
Tested-by: Xinkai Chen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Cc: "9.2" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Defines that previously referred to VS now refer to VEC4, since they
will be shared by the user-programmable vertex shader and geometry
shader stages.
Defines that previously referred to the Gen6 geometry shader stage
(which is only used for transform feedback) are now renamed to
explicitly refer to Gen6, to avoid confusion with the Gen7
user-programmable geometry shader stage.
Based on work by Eric Anholt <[email protected]>.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
| |
"ff" is for "fixed function". This frees up the name "gs" to refer to
user-defined geometry shaders.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This makes brw_context inherit directly from gl_context; that was the
only thing left in intel_context.
Signed-off-by: Kenneth Graunke <[email protected]>
Acked-by: Chris Forbes <[email protected]>
Acked-by: Paul Berry <[email protected]>
Acked-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Most functions no longer use intel_context, so this patch additionally
removes the local "intel" variables to avoid compiler warnings.
Signed-off-by: Kenneth Graunke <[email protected]>
Acked-by: Chris Forbes <[email protected]>
Acked-by: Paul Berry <[email protected]>
Acked-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes brw_context available in every function that used
intel_context. This makes it possible to start migrating fields from
intel_context to brw_context.
Surprisingly, this actually removes some code, as functions that use
OUT_BATCH don't need to declare "intel"; they just use "brw."
Signed-off-by: Kenneth Graunke <[email protected]>
Acked-by: Chris Forbes <[email protected]>
Acked-by: Paul Berry <[email protected]>
Acked-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Most of the work in BeginTransformFeedback is only necessary on Gen6.
We may as well just skip it on Gen7+.
v2: Add an intel->gen == 6 assert.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we have hardware contexts, we don't need to continually
reprogram the GS_SVBI_INDEX registers. They're automatically saved and
restored with the context, so they can just increment over time. We
only need to reset them when starting transform feedback.
There's also no reason to delay until the next drawing operation; we can
just emit the packet immediately. However, this means we must drop the
initialization in brw_invariant_state, as BeginTransformFeedback may
occur before the first drawing in a context.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
| |
This was only used for the the non-hardware context code.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
| |
We can just do it ourselves with MI_LOAD_REGISTER_IMM.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
_NEW_TRANSFORM_FEEDBACK is not used by core Mesa, so it can be removed.
Instead, an new private flag is added to i965 to serve the same purpose.
If you're new to this:
* When creating a context. you can set private dirty flags
in gl_context::DriverFlags, eg.:
ctx->DriverFlags.NewStateX = BRW_NEW_STATE_X;
* When StateX is changed, core Mesa does:
ctx->NewDriverState |= ctx->DriverFlags.NewStateX;
* When you have to draw, read and clear ctx->NewDriverState.
* Pros: not touching NewState, the driver decides the mapping between
GL states and hw state groups, unlimited number of flags in core Mesa
(still limited number of flags in the driver though)
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The software-tracked transform feedback offsets (svbi_0_starting_index)
are incorrect in the presence of primitive restart, so we were actually
updating it with a bogus value if the batch wrapped and we emitted the
packet again during a single transform feedback. By reducing state
emission, we avoid the bug.
Fixes piglit OpenGL 3.1/primitive-restart-xfb flush
Reviewed-by: Paul Berry <[email protected]>
NOTE: This is a candidate for the 9.1 branch.
|
|
|
|
|
|
|
|
|
|
|
|
| |
The software-tracked transform feedback offsets (svbi_0_starting_index)
are incorrect in the presence of primitive restart, so we can't reliably
compute offsets for our buffer pointers after a batch flush. Thanks to HW
contexts, our transform feedback offsets are now saved, so we can just
keep using the ones from before the batch wrap.
Fixes piglit OpenGL 3.1/primitive-restart-xfb flush
Reviewed-by: Paul Berry <[email protected]>
NOTE: This is a candidate for the 9.1 branch.
|
|
|
|
|
|
|
|
|
|
| |
The rather unweildy logic for determining this condition was repeated
in a large number of places. This patch consolidates it to a single
inline function.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the i965 driver contained code to compute the maximum
number of vertices that could be written without overflowing any
transform feedback buffers. This code wasn't driver-specific, and for
GLES3 support we're going to need to use it in core mesa. So this
patch moves the code into a core mesa function,
_mesa_compute_max_transform_feedback_vertices().
Reviewed-by: Ian Romanick <[email protected]>
v2: Eliminate C++-style variable declarations, since these won't work
with MSVC.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Tested-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
In the gen6 GS case, we were under-counting and so other state would
get smashed. In the VS case, we were over-counting, so everything was
fine.
Reviewed-by: Kenneth Graunke <[email protected]>
Tested-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This was copy and paste from the VS where I had similar code. We're
only looking at things derived from BRW_NEW_VERTEX_PROGRAM in this
block.
Reviewed-by: Kenneth Graunke <[email protected]>
Tested-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Improves VS state change microbenchmark performance by 7.08729% +/-
1.22289% (n=10) on gen7, because we don't upload the 64 dwords of
unused binding table any more.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Fixes piglit EXT_transform_feedback/intervening-read
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When updating SOL indices, we were accidentally putting the starting
index in dword 1 and the SVBI number to increment in dword 2--these
should be reversed. Usually both of these values are zero, so we
didn't see any problem. However, if a transform feedback operation
spans multiple batch buffers, the starting index will be nonzero.
Fixes piglit test "EXT_transform_feedback/intervening-read output".
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
After creating new binding table entries for transform feedback, we
need to set the dirty flag BRW_NEW_SURFACES, so that a new binding
table pointer will be sent to the hardware. Otherwise the new binding
table entries will not take effect.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Although i965 gen6 does not yet support ARB_transform_feedback2 or
NV_transform_feedback2, it needs to support pause/resume functionality
so that meta-ops will work correctly.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We don't currently have kernel support for saving GPU registers on a
context switch, so if multiple processes are performing transform
feedback at the same time, their SVBI registers will interfere with
each other. To avoid this situation, we keep a software shadow of the
state of the SVBI 0 register (which is the only register we use), and
re-upload it on every new batch.
The function that updates the shadow state of SVBI 0 is called
brw_update_primitive_count, since it will also be used to update the
counters for the PRIMITIVES_GENERATED and
TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN queries.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A common use case for transform feedback is to perform one draw
operation that writes transform feedback output to a buffer, followed
by a second draw operation that consumes that buffer as vertex input.
Since vertex input is consumed at an earlier pipeline stage than
writing transform feedback output, we need to flush the pipeline to
ensure that the transform feedback output is completely written before
the data is consumed.
In an ideal world, we would do some dependency tracking, so that we
would only flush the pipeline if the next draw call was about to
consume data generated by a previous draw call in the same batch.
However, since we don't have that sort of dependency tracking
infrastructure right now, we just unconditionally flush the buffer
every time glEndTransformFeedback() is called. This will cause a
performance hit compared to the ideal case (since we will sometimes
flush the pipeline unnecessarily), but fortunately the performance hit
will be confined to circumstances where transform feedback is in use.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
This patch adds basic transform feedback capability for Gen6 hardware.
This consists of several related pieces of functionality:
(1) In gen6_sol.c, we set up binding table entries for use by
transform feedback. We use one binding table entry per transform
feedback varying (this allows us to avoid doing pointer arithmetic in
the shader, since we can set up the binding table entries with the
appropriate offsets and surface pitches to place each varying at the
correct address).
(2) In brw_context.c, we advertise the hardware capabilities, which
are as follows:
MAX_TRANSFORM_FEEDBACK_INTERLEAVED_COMPONENTS 64
MAX_TRANSFORM_FEEDBACK_SEPARATE_ATTRIBS 4
MAX_TRANSFORM_FEEDBACK_SEPARATE_COMPONENTS 16
OpenGL 3.0 requires these values to be at least 64, 4, and 4,
respectively. The reason we advertise a larger value than required
for MAX_TRANSFORM_FEEDBACK_SEPARATE_COMPONENTS is that we have already
set aside 64 binding table entries, so we might as well make them all
available in both separate attribs and interleaved modes.
(3) We set aside a single SVBI ("streamed vertex buffer index") for
use by transform feedback. The hardware supports four independent
SVBI's, but we only need one, since vertices are added to all
transform feedback buffers at the same rate. Note: at the moment this
index is reset to 0 only when the driver is initialized. It needs to
be reset to 0 whenever BeginTransformFeedback() is called, and
otherwise preserved.
(4) In brw_gs_emit.c and brw_gs.c, we modify the geometry shader
program to output transform feedback data as a side effect.
(5) In gen6_gs_state.c, we configure the geometry shader stage to
handle the SVBI pointer correctly.
Note: ordering of vertices is not yet correct for triangle strips
(alternate triangles are improperly oriented). This will be addressed
in a future patch.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|