| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For now, nothing else can get allocated over them. That may change at
some point in the future.
This also means that base_reg_count can be computed without knowing the
number of registers used for the payload, which is required if we want
to allocate the register set once at context creation time.
See commit 551e1cd44f6857f7e29ea4c8f892da5a97844377, which implemented
virtually identical code in the FS backend.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Arrays, structures, and matrices use large VGRFs of arbitrary sizes.
However, split_virtual_grfs() breaks those down into VGRFs of size 1.
For reference, commit 5d90b988791e51cfb6413109271ad102fd7a304c is the
analogous change to the FS backend.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(From a suggestion by Francisco Jerez)
If an enum represents a bitfield of flags, e.g.:
enum E {
A = 1,
B = 2,
C = 4,
D = 8,
};
then C++ normally prohibits statements like this:
enum E x = A | B;
because A and B are implicitly converted to ints before OR-ing them,
and an int can't be stored in an enum without a type cast. C, on the
other hand, allows an int to be implicitly converted to an enum
without casting.
In the past we've dealt with this situation by storing flag bitfields
as ints. This avoids ugly casting at the expense of some type safety
that C++ would normally have offered (e.g. we get no warning if we
accidentally use the wrong enum type).
However, we can get the best of both worlds if we override the |
operator. The ugly casting is confined to the operator overload, and
we still get the benefit of C++ making sure we don't use the wrong
enum type.
v2: Remove unnecessary comment and unnecessary use of "enum" keyword.
Use static_cast.
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We never noticed that this field was uninitialized because it is only
used in an error path that reports internal Mesa errors.
But it's silly to have it around anyway because &brw->ctx is
equivalent.
Should fix Coverity defect CID 1063351: Uninitialized pointer field
(UNINIT_CTOR) /src/mesa/drivers/dri/i965/brw_vec4_emit.cpp: 148
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If brwNewProgram is asked to create a program for an unrecognized
target, don't bother falling back on _mesa_new_program(). That just
hides bugs.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
v2: Use assert() rather than _mesa_problem().
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
This functionality will need to be reused by geometry shaders.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We will need access to this array in order to configure the geometry
shader.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch introduces the vec4_gs_visitor class, which translates
geometry shaders from GLSL IR to back-end opcodes.
This class is derived from vec4_visitor (which is also the base class
for vec4_vs_visitor), so as a result most of the back end code is
shared. The only parts that differ are:
- Geometry shaders use a different input payload organization, since
the inputs need to match up with the outputs of the previous
pipeline stage (vec4_gs_visitor::setup_payload() and
vec4_gs_visitor::setup_varying_inputs()).
- Geometry shader input array dereferences need a special stride
computation, since all geometry shader inputs are interleaved into
one giant array (vec4_gs_visitor::compute_array_stride()).
- There are no geometry shader system values
(vec4_gs_visitor::make_reg_for_system_value()).
- At the beginning of a geometry shader, extra data in R0 needs to be
zeroed out, and a vertex counter needs to be initialized
(vec4_gs_visitor::emit_prolog()).
- When EmitVertex() appears in the shader, the current contents of
output variables need to be emitted to the URB, and the vertex
counter needs to be incremented
(vec4_gs_visitor::visit(ir_emit_vertex *)).
- When generating a URB_WRITE message to output vertex data, the
current state of the vertex counter needs to be used to store a
write offset in the message header
(vec4_gs_visitor::emit_urb_write_header()).
- The URB_WRITE message that outputs vertex data needs to be sent
using GS_OPCODE_URB_WRITE, since VS_OPCODE_URB_WRITE would overwrite
the offsets in the message header
(vec4_gs_visitor::emit_urb_write_opcode()).
- At the end of a geometry shader, the final vertex count needs to be
delivered using a URB WRITE message
(vec4_gs_visitor::emit_thread_end()).
- EndPrimitive() functionality is not implemented yet
(vec4_gs_visitor::visit(ir_end_primitive *)).
- There is no support for assembly shaders
(vec4_gs_visitor::emit_program_code()).
v2: Make num_input_vertices const. Refer to registers as rN rather
than gN, for consistency with the PRM. Fix misspelling. Improve
comment in the ir_emit_vertex visitor explaining why we emit vertices
inside a conditional. Enclose the conditional code in the
ir_emit_vertex visitor between curly braces.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: Added a comment to vec4_generator::generate_gs_set_write_offset().
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This will be used by geometry shaders to implement the EmitVertex()
function, since it requires writing data to a dynamically-determined
offset within the geometry shader's URB entry.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The arguments to brw_urb_WRITE() were getting pretty unwieldy, and we
have to add more flags to support geometry shaders anyhow.
Also plumb these flags through brw_clip_emit_vue(),
brw_set_urb_message(), and the vec4_instruction class.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: Make id "unsigned" rather than "GLuint".
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When I initially generalized the vec4_visitor class in preparation for
geometry shaders, I assumed that the setup_attributes() function would
need to be different between vertex and geometry shaders, but its
caller, setup_payload(), could be shared. So I made
setup_attributes() a virtual function.
It turns out this isn't true; setup_payload() needs to be different
too, since the geometry shader payload sometimes includes an extra
register (primitive ID) that has to come before uniforms.
So setup_payload() needs to be the virtual function instead of
setup_attributes().
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Both 3DSTATE_VS and 3DSTATE_GS have a dispatch_grf_start_reg control,
which determines the register where the hardware delivers data sourced
from the URB (push constants followed by per-vertex input data).
For vertex shaders, we always set dispatch_grf_start_reg to 1, since
R1 is always the first register available for push constants in vertex
shaders.
For geometry shaders, we'll need the flexibility to set
dispatch_grf_start_reg to different values depending on the behvaiour
of the geometry shader; if it accesses gl_PrimitiveIDIn, we'll need to
set it to 2 to allow the primitive ID to be delivered to the thread in
R1.
This patch eliminates the assumption that dispatch_grf_start_reg is
always 1. In vec4_visitor, we record the regnum that was passed to
vec4_visitor::setup_uniforms() in prog_data for later use. In
vec4_generator, we consult this value when converting an abstract
UNIFORM register to a concrete hardware register. And in the code
that emits 3DSTATE_VS, we set dispatch_grf_start_reg based on the
value recorded in prog_data.
This will allow us to set dispatch_grf_start_reg to the appropriate
value when compiling geometry shaders. Vertex shaders will continue
to always use a dispatch_grf_start_reg of 1.
v2: Make dispatch_grf_start_reg "unsigned" rather than "GLuint".
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch moves the following things into brw_vec4.{cpp,h}:
- struct brw_vec4_compile
- struct brw_vec4_prog_key
- brw_vec4_prog_data_compare()
- brw_vec4_prog_data_free()
This will allow us to avoid having to include brw_vs.h in
geometry-shader-specific files.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The patch that follows will move the definition of struct
brw_vec4_prog_key from brw_vs.h to brw_vec4.h, making it necessary for
brw_vs.h to include brw_vec4.h (because brw_vs.h defines struct
brw_vs_prog_key, which contains brw_vec4_prog_key as a member). Since
brw_vs.h is included from C source files, that means that brw_vec4.h
will need to be safe to include from C. Same for brw_shader.h, since
it is included by brw_vec4.h.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is backwards from what we are going to want in the long term, which is:
- brw_vec4.h declares general-purpose vec4 infrastructure needed by
both VS and GS
- brw_vs.h includes brw_vec4.h and adds VS-specific parts.
- brw_gs.h includes brw_vec4.h and adds GS-specific parts.
Note that at the moment brw_vec.h contains a fair amount of
VS-specific declarations--I plan to address that in a later patch.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Otherwise any GS that requires lowering (e.g. one that uses
gl_ClipDistance as an input or output) will fail to work.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
_mesa_meta_begin() sets up an orthographic project and initializes the
viewport based on the current drawbuffer's width and height. This is
likely the window size, since it occurs before the meta operation binds
any temporary buffers.
decompress_texture_image needs the viewport to be the size of the image
it's trying to draw. Otherwise, it may only draw part of the image.
v2: Actually set the projection properly too.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68250
Signed-off-by: Kenneth Graunke <[email protected]>
Cc: Mak Nazecic-Andrlon <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes inconsistent failure of gles2conform/GL2Tests/glUniform/glUniform.test
under gnome-shell. What follows is a description of the bug and its fix.
When intel_update_renderbuffers() allocates a miptree for a winsys
renderbuffer, it propagates the renderbuffer's format to become also the
miptree's format.
If the winsys color buffer format is SARGB, then, in the first call to
eglMakeCurrent, intel_gles3_srgb_workaround() changes the renderbuffer's
format to ARGB. That is, it changes the format from sRGB to non-sRGB.
However, it changes the renderbuffer's format *after*
intel_update_renderbuffers() has allocated the renderbuffer's miptree.
Therefore, when eglMakeCurrent returns, the miptree format (SARGB)
differs from the renderbuffer format (ARGB).
If the X server reallocates the color buffer,
intel_update_renderbuffers() will create a new miptree for the
renderbuffer. The new miptree's format (ARGB) will differ from old
miptree's format (SARGB). This mismatch between old and new miptrees
causes bugs.
Fix the bug by moving intel_gles3_srgb_workaround() to occur *before*
intel_update_renderbuffers().
CC: "9.2" <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67934
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This was invaluable when debugging the global copy propagation
algorithm. We may as well commit it in case someone needs to print
out the sets in the future.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
| |
Cc: 9.2 <[email protected]>
Tested-by: Brian Paul <brianp at vmware.com>
Reviewed-by: Brian Paul <brianp at vmware.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
IVB/BYT also has the same L3 cacheability control in MOCS as HSW,
so let's make use of it.
pts/xonotic and pts/reaction @ 1920x1080 gain ~4% on my IVB GT2. Most
other things show less gains/no regressions, except furmark which
loses some 10 points.
I didn't have a BYT at hand for testing.
v2: Don't check (brw->gen == 7) in gen7 functions. (chadv)
Signed-off-by: Ville Syrjälä <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Just spotted these unpopulated MOCS fields when comparing the code
against BSpec. Set the MOCS to the same as everywhere else in Haswell:
L3-cacheable.
v2: Annotate state packet fields (chadv).
Signed-off-by: Ville Syrjälä <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we have the number of samplers available, we don't need to
iterate over all 16. This should be particularly helpful for vertex
shaders.
v2: Use the correct shader program (caught by Paul Berry).
This needs to initialize the exact same set of sampler swizzles as
the actual key setup, or else we end up doing recompiles due to some
being XYZW and others being 0.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "9.2" <[email protected]>
|
|
|
|
|
|
|
|
| |
For some reason, we didn't use this information even though the VS
backend has computed it (albeit poorly) for ages.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unlike the FS, the VS backend already computed the binding table size.
However, it did so poorly: after compilation, it looked to see if any
pull constants/textures/UBOs were in use, and set num_surfaces to the
maximum surface index for that category. If the VS only used a single
texture or UBO, this overcounted by quite a bit.
The shader time surface was also noted at state upload time (during
drawing), not at compile time, which is inefficient. I believe it also
had an off by one error.
This patch computes it accurately, while also simplifying the code.
It also renames num_surfaces to binding_table_size, since num_surfaces
wasn't actually the number of surfaces used. For example, a VS that
used one UBO and no other surfaces would have set num_surfaces to
SURF_INDEX_VS_UBO(1) == 18, rather than 1. A bit of a misnomer there.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
| |
This will be useful for the next commit.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Computing the minimum size was easy, and done at compile-time for no
extra overhead here. Making the binding table smaller wastes less batch
space.
Adding the CACHE_NEW_WM_PROG dirty bit isn't strictly necessary, since
other atoms depend on it and flag BRW_NEW_SURFACES. However, it's best
to add it for clarity and safety. It shouldn't add any new overhead.
v2: Use binding_table_size, rather than max_surface_index.
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By tracking the maximum surface index used by the shader, we know just
how small we can make the binding table.
Since it depends entirely on the shader program, we can just compute
it once at compile time, rather than at binding table emit time (which
happens during drawing).
v2: Store binding_table_size, rather than max_surface_index, for
consistency with the VS (which needs to be able to represent 0
surfaces).
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
SURF_INDEX_DRAW() has been the identity function since the dawn of time,
and both the shader code and binding table upload code relied on that,
simply using X rather than SURF_INDEX_DRAW(X).
Even if that continues to be true, using the macro clarifies the code.
The comment about draw buffers needing to be first in order for
headerless render target writes to work turned out to be wrong; with
this change, SURF_INDEX_DRAW can be changed to arbitrary indices and
everything continues working.
The confusion was over the "Render Target Index" field in the FB write
message header. If it were a binding table index, then RT 0 would have
to be at index 0 for headerless FB writes to work. However, it's
actually an index into the blend state table, so there's no problem.
Signed-off-by: Kenneth Graunke <[email protected]>
Cc: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Now that we have the number of samplers available, we don't need to
iterate over all 16. This should be particularly helpful for vertex
shaders.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Previously, we computed sampler counts when generating the SAMPLER_STATE
table. By computing it earlier, we should be able to shorten a bunch of
loops.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
| |
This allows us to avoid uploading the VS sampler state table if only the
fragment program changes.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Now, each shader stage has a sampler state table that only refers to the
samplers actually used by that problem. This should make the VS table
non-existant or very small.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
| |
This allows us to coalesce the brw_samplers and gen7_samplers atoms.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also upload separate sampler default/texture border color entries.
At the moment, this is completely idiotic: both tables contain exactly
the same contents, so we're simply wasting batch space and CPU time.
However, soon we'll only upload data for textures actually /used/ in
a particular stage, which will usually make the VS table empty and
very likely eliminate all redundancy. This is just a stepping stone.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Like the previous patch, this simply pushes direct access to brw->wm up
one level in the call chain. Rather than passing the whole array, we
just pass a pointer to the correct spot in the array, similar to what we
do for the actual sampler state structure.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
| |
When we begin uploading separate sampler state tables for VS and FS,
we won't be able to use &brw->wm.sdc_offset[ss_index]. By passing it in
as a parameter, we push the problem up to the caller.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, we only have a single sampler state table shared among all
stages, so we just copy wm.sampler_count into vs.sampler_count.
In the future, each shader stage will have its own SAMPLER_STATE table,
at which point we'll need these separate sampler counts.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|