| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Eventually we're going to use functions to set bits on an instruction.
Putting 'default' in the name of functions that alter default state will
help distinguins them.
This patch was generated entirely mechanically, by the following:
for file in brw*.{cpp,c,h}; do
sed -i \
-e 's/brw_set_mask_control/brw_set_default_mask_control/g' \
-e 's/brw_set_saturate/brw_set_default_saturate/g' \
-e 's/brw_set_access_mode/brw_set_default_access_mode/g' \
-e 's/brw_set_compression_control/brw_set_default_compression_control/g' \
-e 's/brw_set_predicate_control/brw_set_default_predicate_control/g' \
-e 's/brw_set_predicate_inverse/brw_set_default_predicate_inverse/g' \
-e 's/brw_set_flag_reg/brw_set_default_flag_reg/g' \
-e 's/brw_set_acc_write_control/brw_set_default_acc_write_control/g' \
$file;
done
No manual changes were done after running that command.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Will be used to print disassembly after jump targets are set and
instructions are compacted, while still retaining higher-level IR
annotations and basic block information.
An array of 'struct annotation' will live along side the generated
assembly. The generators will populate the array with their IR
annotations, and basic block pointers if the instructions began or ended
a basic block pointer.
We'll then update the instruction offset when we compact instructions
and then using the annotations print the disassembly.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Let's us avoid recompacting the SIMD8 instructions when we compact the
SIMD16 program.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
"Disassemble" is an accurate description of what this function does.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Looping over the instructions and calling brw_disasm doesn't handle
compacted instructions. In most cases, this hasn't been a problem since
we don't compact prior to Sandybridge.
However, Sandybridge's transform feedback GS program should already be
compacted, and so this ought to fix decoding of that.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
brw_disasm doesn't disassemble compacted instructions, so we uncompact
before disassembling them which would unset the compaction control bit.
Instead pass it as a separate argument.
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Basically a sed but shaderapi.c and get.c.
get.c => GL_CURRENT_PROGAM always refer to the "old" UseProgram behavior
shaderapi.c => the old api stil update the Shader object directly
V2: formatting improvement
V3 (idr):
* Rebase fixes after a block of code was moved from ir_to_mesa.cpp to
shaderapi.c.
* Trivial reformatting.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
array.
These are replaced with
ctx->Shader.CurrentProgram[MESA_SHADER_{VERTEX,FRAGMENT,GEOMETRY}].
In patches to follow, this will allow us to replace a lot of ad-hoc
logic with a variable index into the array.
With the exception of the changes to mtypes.h, this patch was
generated entirely by the command:
find src -type f '(' -iname '*.c' -o -iname '*.cpp' ')' \
-print0 | xargs -0 sed -i \
-e 's/\.CurrentVertexProgram/.CurrentProgram[MESA_SHADER_VERTEX]/g' \
-e 's/\.CurrentGeometryProgram/.CurrentProgram[MESA_SHADER_GEOMETRY]/g' \
-e 's/\.CurrentFragmentProgram/.CurrentProgram[MESA_SHADER_FRAGMENT]/g'
Reviewed-by: Chris Forbes <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Tungsten Graphics Inc. was acquired by VMware Inc. in 2008. Leaving the
old copyright name is creating unnecessary confusion, hence this change.
This was the sed script I used:
$ cat tg2vmw.sed
# Run as:
#
# git reset --hard HEAD && find include scons src -type f -not -name 'sed*' -print0 | xargs -0 sed -i -f tg2vmw.sed
#
# Rename copyrights
s/Tungsten Gra\(ph\|hp\)ics,\? [iI]nc\.\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./g
/Copyright/s/Tungsten Graphics\(,\? [iI]nc\.\)\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./
s/TUNGSTEN GRAPHICS/VMWARE/g
# Rename emails
s/[email protected]/[email protected]/
s/[email protected]/[email protected]/g
s/jrfonseca-at-tungstengraphics-dot-com/jfonseca-at-vmware-dot-com/
s/jrfonseca\[email protected]/[email protected]/g
s/keithw\[email protected]/[email protected]/g
s/[email protected]/[email protected]/g
s/thomas-at-tungstengraphics-dot-com/thellstom-at-vmware-dot-com/
s/[email protected]/[email protected]/
# Remove dead links
s@Tungsten Graphics (http://www.tungstengraphics.com)@Tungsten Graphics@g
# C string src/gallium/state_trackers/vega/api_misc.c
s/"Tungsten Graphics, Inc"/"VMware, Inc"/
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
Performed via:
$ for file in *; do sed -i 's/ *//g'; done
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
"ff" is for "fixed function". This frees up the name "gs" to refer to
user-defined geometry shaders.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This makes brw_context inherit directly from gl_context; that was the
only thing left in intel_context.
Signed-off-by: Kenneth Graunke <[email protected]>
Acked-by: Chris Forbes <[email protected]>
Acked-by: Paul Berry <[email protected]>
Acked-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Most functions no longer use intel_context, so this patch additionally
removes the local "intel" variables to avoid compiler warnings.
Signed-off-by: Kenneth Graunke <[email protected]>
Acked-by: Chris Forbes <[email protected]>
Acked-by: Paul Berry <[email protected]>
Acked-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This has more of a negative impact than the previous patch, as on Gen6
passing primitives through to the clipper means we actually have to make
the GS thread write them to the URB.
I don't see another good solution though, and rasterizer discard is not
the most common of cases, so hopefully it won't be too terrible.
v2: Add a perf_debug; resolve rebase conflicts on the brw dirty flags;
remove the rasterizer_discard field from brw_gs_prog_key.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]> [v1]
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
_NEW_TRANSFORM_FEEDBACK is not used by core Mesa, so it can be removed.
Instead, an new private flag is added to i965 to serve the same purpose.
If you're new to this:
* When creating a context. you can set private dirty flags
in gl_context::DriverFlags, eg.:
ctx->DriverFlags.NewStateX = BRW_NEW_STATE_X;
* When StateX is changed, core Mesa does:
ctx->NewDriverState |= ctx->DriverFlags.NewStateX;
* When you have to draw, read and clear ctx->NewDriverState.
* Pros: not touching NewState, the driver decides the mapping between
GL states and hw state groups, unlimited number of flags in core Mesa
(still limited number of flags in the driver though)
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This will allow the generic parts to be re-used for geometry shaders.
Reviewed-by: Jordan Justen <[email protected]>
v2: Put urb_read_length and urb_entry_size in the generic struct.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Future patches will allow for there to be separate VUE maps when both
a geometry shader and a vertex shader are in use. When this happens,
we will want to have correspondingly separate outputs_written
bitfields. Moving outputs_written into the VUE map will make this
easy.
For consistency with the terminology used in the VUE map, the bitfield
is renamed to "slots_valid" in the process.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
The new name clarifies that it represents *one more* than the maximum
possible brw_varying_slot value.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This patch makes the following search-and-replace changes:
gl_vert_result -> gl_varying_slot
VERT_RESULT_* -> VARYING_SLOT_*
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
brw_vs_prog_data::userclip hasn't been used since commit f0cecd4
(i965: Move VUE map computation to once at VS compile time).
brw_gs_prog_key::userclip_active hasn't been used since commit 9f3d321
(i965: Make the userclip flag for the VUE map come from VS prog data).
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The rather unweildy logic for determining this condition was repeated
in a large number of places. This patch consolidates it to a single
inline function.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Commit 774fb90db3e83d5e7326b7a72e05ce805c306b24 introduced a ralloc context to
each user of struct brw_compile, but for this one a NULL context was used,
causing the later ralloc_free(mem_ctx) to not do anything.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55175
NOTE: This is a candidate for the stable branches.
|
|
|
|
|
|
|
|
|
| |
This was a debug option during gen6 transform feedback bringup (and a
similar one existed during gen4 bringup). However, it looks like
we're done with that, and we don't anticipate it being used again,
either for geometry shaders or transform feedback.
Suggested by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
With this and the previous patch, 640x480 nexuiz is running 0.169118%
+/- 0.0863696% faster (n=121). On a VS state change microbenchmark,
performance is increased 8.28645% +/- 0.460478% (n=52).
v2: Fix CACHE_NEW_VS comment.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
This reduces recomputation of state based on non-clipping-related
transform changes, and is a step toward removing VUE map
recomputation.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Although i965 gen6 does not yet support ARB_transform_feedback2 or
NV_transform_feedback2, it needs to support pause/resume functionality
so that meta-ops will work correctly.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This makes it easier to keep track of which dirty bits correspond to
which pieces of context, since it makes _NEW_RASTERIZER_DISCARD
correspond with ctx->RasterDiscard.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we were storing the RasterDiscard flag (for
GL_RASTERIZER_DISCARD) in gl_context::TransformFeedback. This was
confusing, because we use the _NEW_TRANSFORM flag (not
_NEW_TRANSFORM_FEEDBACK) to track state updates to it, and because
rasterizer discard has effects even when transform feedback is not in
use.
This patch makes RasterDiscard a toplevel element in gl_context rather
than a subfield of gl_context::TransformFeedback.
Note: We can't put RasterDiscard inside gl_context::Transform, since
all items inside gl_context::Transform need to be pieces of state that
are saved and restored using PushAttrib and PopAttrib.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch enables rasterizer discard functionality (a part of
transform feedback) in Gen6, by generating an alternate GS program
when rasterizer discard is active. Instead of forwarding vertices
down the pipeline, the alternate GS program uses a URB Write message
to deallocate the URB entry that was allocated by FF sync and
terminate the thread.
Note: parts of the Sandy Bridge PRM seem to imply that we could do
this more efficiently, by clearing the GEN6_GS_RENDERING_ENABLE bit,
and not allocating a URB entry at all. However, it's not clear how we
are supposed to terminate the thread if we do that. Volume 2 part 1,
section 4.5.4, says "GS threads must terminate by sending a URB_WRITE
message with the EOT and Complete bits set.", and my experiments so
far confirm that.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds basic transform feedback capability for Gen6 hardware.
This consists of several related pieces of functionality:
(1) In gen6_sol.c, we set up binding table entries for use by
transform feedback. We use one binding table entry per transform
feedback varying (this allows us to avoid doing pointer arithmetic in
the shader, since we can set up the binding table entries with the
appropriate offsets and surface pitches to place each varying at the
correct address).
(2) In brw_context.c, we advertise the hardware capabilities, which
are as follows:
MAX_TRANSFORM_FEEDBACK_INTERLEAVED_COMPONENTS 64
MAX_TRANSFORM_FEEDBACK_SEPARATE_ATTRIBS 4
MAX_TRANSFORM_FEEDBACK_SEPARATE_COMPONENTS 16
OpenGL 3.0 requires these values to be at least 64, 4, and 4,
respectively. The reason we advertise a larger value than required
for MAX_TRANSFORM_FEEDBACK_SEPARATE_COMPONENTS is that we have already
set aside 64 binding table entries, so we might as well make them all
available in both separate attribs and interleaved modes.
(3) We set aside a single SVBI ("streamed vertex buffer index") for
use by transform feedback. The hardware supports four independent
SVBI's, but we only need one, since vertices are added to all
transform feedback buffers at the same rate. Note: at the moment this
index is reset to 0 only when the driver is initialized. It needs to
be reset to 0 whenever BeginTransformFeedback() is called, and
otherwise preserved.
(4) In brw_gs_emit.c and brw_gs.c, we modify the geometry shader
program to output transform feedback data as a side effect.
(5) In gen6_gs_state.c, we configure the geometry shader stage to
handle the SVBI pointer correctly.
Note: ordering of vertices is not yet correct for triangle strips
(alternate triangles are improperly oriented). This will be addressed
in a future patch.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch stores the geometry shader VUE map from a local variable in
compile_gs_prog() to a field in the brw_gs_compile struct, so that it
will be available while compiling the geometry shader. This is
necessary in order to support transform feedback on Gen6, because the
Gen6 geometry shader code that supports transform feedback needs to be
able to inspect the VUE map in order to find the correct vertex data
to output.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In Gen6, transform feedback is accomplished by having the geometry
shader send vertex data to the data port using "Streamed Vertex Buffer
Write" messages, while simultaneously passing vertices through to the
rest of the graphics pipeline (if rendering is enabled).
This patch adds a geometry shader program that simply passes vertices
through to the rest of the graphics pipeline. The rest of transform
feedback functionality will be added in future patches.
To make the new geometry shader easier to test, I've added an
environment variable "INTEL_FORCE_GS". If this environment variable
is enabled, then the pass-through geometry shader will always be used,
regardless of whether transform feedback is in effect.
On my Sandy Bridge laptop, I'm able to enable INTEL_FORCE_GS with no
Piglit regressions.
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, GS generation code contained a lookup table that mapped
primitive types POLYGON, TRISTRIP, and TRIFAN to TRILIST, mapped
LINESTRIP to LINELIST, and left all other primitives unchanged. This
was silly, because we never generate a GS program for those primitive
types anyhow.
This patch removes the unnecessary lookup table.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Only 4 other prepare() functions are left, which don't rely on this.
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I initially produced the patch using this bash command:
for file in {intel,i915,i965}/*.{c,cpp,h}; do [ ! -h $file ] && sed -i
's/GLboolean/bool/g' $file && sed -i 's/GL_TRUE/true/g' $file && sed -i
's/GL_FALSE/false/g' $file; done
Then I manually added #include <stdbool.h> to fix compilation errors,
and converted a few functions back to GLboolean that were used in core
Mesa's function pointer table to avoid "incompatible pointer" warnings.
Finally, I cleaned up some whitespace issues introduced by the change.
Signed-off-by: Kenneth Graunke <[email protected]>
Acked-by: Chad Versace <[email protected]>
Acked-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For example, GL_TRIANLGES is converted to _3DPRIM_TRILIST.
The conversion is necessary because HiZ and MSAA resolve operations emit
a 3DPRIM_RECTLIST, which cannot be conveyed by GLenum.
As a consequence, brw_gs_prog_key.primitive is also converted.
v2
----
- [anholt] Split brw_set_prim into brw/gen6 variants in previous commit,
since not much code is really shared between the two.
- [anholt] Replace switch statements with table lookups, since this is
a hot path.
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, brw_compute_vue_map required an argument indicating the
number of clip planes in use, but all it did with it was check if it
was nonzero.
This patch changes brw_compute_vue_map to take a boolean instead.
This allows us to avoid some unnecessary recompilation of the Gen4/5
GS and SF threads.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The i965 driver already had a function to count bits in a 64-bit uint
(brw_count_bits()), but it was buggy (it only counted the bottom 32
bits) and it was clumsy (it had a strange and broken fallback for
non-GCC-like compilers, which fortunately was never used). Since Mesa
already has a _mesa_bitcount() function, it seems better to just
create a _mesa_bitcount_64() function rather than special-case this in
the i965 driver.
This patch creates the new _mesa_bitcount_64() function and rewrites
all of the old brw_count_bits() calls to refer to it.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Since we now lay out the VUE the same way regardless of whether
two-sided color is enabled, brw_compute_vue_map() no longer needs to
know whether two-sided color is enabled. This allows the two-sided
color flag to be removed from the clip, GS, and VS keys, so that fewer
GPU programs need to be recompiled when turning two-sided color on and
off.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The previous computation had two bugs: (a) it used a formula based on
Gen5 for Gen6 and Gen7 as well. (b) it failed to account for the fact
that PSIZ is stored in the VUE header. Fortunately, both bugs caused
it to compute a URB size that was too large, which was benign. This
patch computes the URB size directly from the VUE map, so it gets the
result correct in all circumstances.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Replace each occurence of
#include "../glsl/*.h"
with
#include "glsl/*.h"
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
There will be a little bit of thrashing of the program cache BO as the
cache warms up, but once the application is in steady state, this
reduces relocations on gen5 and later.
On my T420 laptop, cairogl firefox-talos-gfx performance improves 2.6%
+/- 1.3% (n=6). No statistically significant performance difference
on nexuiz (n=5).
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This would be so much easier if we were using C++; we could simply use
constructors and destructors. Instead, we have to update all the
callers.
While we're at it, ralloc various brw_wm_compile fields rather than
explicitly calloc/free'ing them.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
On Sandybridge, we don't need to break down primitives. There's no need
to bother setting up brw_compile and such if it's not going to be used;
bail as early as possible.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|