| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
It's possible that nir_shader was cloned and it no longer contains
a pointer to the shader_info in gl_program. So we need to copy
shader_info back to gl_program if that is the case.
Fixes a regression with NIR_TEST_CLONE=true
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98840
|
|
|
|
|
|
|
|
| |
Since we started releasing GLSL IR after linking the only time we can
print GLSL IR is during linking. When regenerating variants only NIR
will be available.
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
These will be used by the on disk shader cache.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are two distinctly different uses of this struct. The first
is to store GL shader objects. The second is to store information
about a shader stage thats been linked.
The two uses actually share few fields and there is clearly confusion
about their use. For example the linked shaders map one to one with
a program so can simply be destroyed along with the program. However
previously we were calling reference counting on the linked shaders.
We were also creating linked shaders with a name even though it
is always 0 and called the driver version of the _mesa_new_shader()
function unnecessarily for GL shader objects.
Acked-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
| |
This way it's no longer part of libi965_compiler.la since it depends on
GLSL and ARB program stuff.
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GL_ARB_separate_shader_objects allows the application to mix-and-match
TCS and TES programs separately. This means that the interface between
the two stages isn't known until the final SSO pipeline is in place.
This isn't a great match for our hardware: the TCS and TES have to agree
on the Patch URB entry layout. Since we store data as per-patch slots
followed by per-vertex slots, changing the number of per-patch slots can
significantly alter the layout. This can easily happen with SSO.
To handle this, we store the [Patch]OutputsWritten and [Patch]InputsRead
bitfields in the TCS/TES program keys, introducing program recompiles.
brw_upload_programs() decides the layout for both TCS and TES, and
passes it to brw_upload_tcs/tes(), which store it in the key.
When creating the NIR for a shader specialization, we override
nir->info.inputs_read (and friends) to the program key's values.
Since everything uses those, no further compiler changes are needed.
This also replaces the hack in brw_create_nir().
To avoid recompiles, brw_precompile_tes() looks to see if there's a
TCS in the linked shader. If so, it accounts for the TCS outputs,
just as brw_upload_programs() would. This eliminates all recompiles
in the non-SSO case. In the SSO case, there should only be recompiles
when using a TCS and TES that have different input/output interfaces.
Fixes Piglit's mix-and-match-tcs-tes test.
v2: Pull the brw_upload_programs code into a brw_upload_tess_programs()
helper function (requested by Jordan Justen).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The TCS is the first tessellation shader stage, and the most
complicated. It has access to each of the control points in the input
patch, and computes a new output patch. There is one logical invocation
per output control point; all invocations run in parallel, and can
communicate by reading and writing output variables.
One of the main responsibilities of the TCS is to write the special
gl_TessLevelOuter[] and gl_TessLevelInner[] output variables which
control how much new geometry the hardware tessellation engine will
produce. Otherwise, it simply writes outputs that are passed along
to the TES.
We run in SIMD4x2 mode, handling two logical invocations per EU thread.
The hardware doesn't properly manage the dispatch mask for us; it always
initializes it to 0xFF. We wrap the whole program in an IF..ENDIF block
to handle an odd number of invocations, essentially falling back to
SIMD4x1 on the last thread.
v2: Update comments (requested by Jordan Justen).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The TES is essentially a post-tessellator VS, which has access to the
entire TCS output patch, and a special gl_TessCoord input. Otherwise,
they're very straightforward.
This patch implements SIMD8 tessellation evaluation shaders for Gen8+.
The tessellator can generate a lot of geometry, so operating in SIMD8
mode (8 vertices per thread) is more efficient than SIMD4x2 mode (only
2 vertices per thread). I have another patch which implements SIMD4x2
mode for older hardware (or via an environment variable override).
We currently handle all inputs via the pull model.
v2: Improve comments (suggested by Jordan Justen).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
| |
We were including it in headers, which then caused it to be included in
tons of places it wasn't needed.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
At this point, the compiler API has been substantially simplified. In the
spirit of Kristian's making a compiler library, this commit makes a single
header file that contains, more-or-less, the entire compiler API.
There's still a bit of cleanup to do particularly in the area of geometry
shaders. However, this gets us much closer to having a separate compiler.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
| |
They are no longer used.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Because we only support geometry shaders in core profile, we can safely
ignore any driver-extending of VS outputs.
Those are:
- Legacy userclipping (doesn't exist in core profile)
- Edgeflag copying (Gen4-5 only, no GS support)
- Point coord replacement (Gen4-5 only, no GS support)
- front/back color hacks (Gen4-5 only, no GS support)
v2: Rebase; leave a comment about why SSO works.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The legacy userclip fields are only used for the vertex shader, and at
that point there's only program_string_id and the tex struct, which are
common to all keys. So there's no need for a "VUE" key base class.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are two uses of this flag.
The primary use is checking whether we need to emit code to convert
legacy gl_ClipVertex/gl_Position clipping to clip distances. In this
case, we also have to upload the clip planes as uniforms, which means
setting nr_userclip_plane_consts to a positive value. Checking if it's
> 0 works for detecting this case.
Gen4-5 also wants to know whether we're doing clipping at all, so it can
emit user clip flags. Checking if output_reg[VARYING_SLOT_CLIP_DIST0]
is set to a real register suffices for this.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This living in brw_fs.{h,cpp} is a historical artifact of us supporting
texturing for fragment shaders before any other stages. It's kind of
awkward given that we use it for all stages.
This avoids having to include brw_fs.h in geometry shader code in order
to access this function.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
These structs aren't vec4 specific, they are shared by shader stages
operating on Vertex URB Entries (VUEs). VUEs are the data structures in
the URB that hold vertex data between the pipeline geometry stages.
Using vue in the name instead of vec4 makes a lot more sense, especially
when we add scalar vertex shader support.
Signed-off-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Vertex color clamping only applies to gl_[Secondary]{Front,Back}Color,
which are compatibility-only built-in varyings. We only support GS in
core profile, so they can't exist in geometry shaders.
We can drop several dirty bits from the GS program key - they're
unnecessary for a core profile implementation.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
| |
Requested by Matt Turner.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
This is really far removed from the API; we should just use C types.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With fs_visitor/fs_generator being reused for SIMD8 VS/GS programs,
we're running into weird #include patterns, where scalar code #includes
brw_vec4.h and such.
Program keys aren't really related to SIMD4X2/SIMD8 execution - they
mostly capture NOS for a particular shader stage. Consolidating them
all in one place that's vec4/scalar neutral should help avoid problems.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Just remove the parameter. Silences:
brw_program.c: In function 'brw_dump_ir':
brw_program.c:566:33: warning: unused parameter 'brw' [-Wunused-parameter]
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously the `high` 16 samplers on Haswell+ would not get sampler
workarounds applied.
Don't bother widening YUV fields, since they're ignored and going away
soon anyway.
Signed-off-by: Chris Forbes <[email protected]>
Cc: "10.1" <[email protected]>
Cc: Kenneth Graunke <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This was only going to get worse when tesselation shows up, and was
causing too much extra duplication in my stderr changes coming up.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
function.
This way it can be used anywhere. I need it from the visitor.
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
brw_stage_prog_data.
There doesn't seem to be any reason for nr_params, nr_pull_params,
param, and pull_param to be duplicated in the stage-specific
subclasses of brw_stage_prog_data. Moving their definition to the
common base class will allow some code sharing in a future commit, the
removal of brw_vec4_prog_data_compare and brw_*_prog_data_free, and
the simplification of the stage-specific brw_*_prog_data_compare.
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need to emit extra shader code in this case to sample the
MCS surface first; we can't just blindly do this all the time
since IVB will sometimes try to access the MCS surface even if
disabled.
V3: Use actual MSAA layout from the texture's mt, rather
then computing what would have been used based on the format.
This is simpler and less fragile - there's at least one case where
we might want to have a texture's MSAA layout change based on what
the app does (CMS SINT falling back to UMS if the app ever attempts
to render to it with a channel disabled.)
This also obsoletes V2's 1/10 -- compute_msaa_layout can now remain
an implementation detail of the miptree code.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
| |
V4: Only flag quirks if there are any uses of gather in the shader,
to avoid spurious recompiles just because someone happened to use
RG32F.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Now that we have the number of samplers available, we don't need to
iterate over all 16. This should be particularly helpful for vertex
shaders.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes brw_context available in every function that used
intel_context. This makes it possible to start migrating fields from
intel_context to brw_context.
Surprisingly, this actually removes some code, as functions that use
OUT_BATCH don't need to declare "intel"; they just use "brw."
Signed-off-by: Kenneth Graunke <[email protected]>
Acked-by: Chris Forbes <[email protected]>
Acked-by: Paul Berry <[email protected]>
Acked-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
| |
I tried to ensure that performance in the non-debug case doesn't change
(we still just check one condition up front), and I think the impact is
small enough in the debug context case to warrant including all of it.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If adding scale parameters during program compile caused a realloc of
ParameterValues, then the driver uniform storage set up by
_mesa_associate_uniform_storage() would point to potentially freed
memory.
Note that this uses TexturesUsed, which may change at runtime for GLSL
when sampler uniforms change. This is a flaw in our handling of texrect
in general, and not one I'm fixing currently.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Now that most things are based on the linker-assigned index, it makes
sense to convert the arrays in the VS/WM program key as well. It seems
silly to leave them indexed by texture unit.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The whole reason I avoided this was because it might operate on a
brw_vertex_program or a brw_fragment_program. However, that isn't a
problem: all we need is the gl_program base type.
This avoids awkwardly passing the loop counter 'i' as a parameter,
simplifies both callers, and also plumbs prog in place for future use.
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It had many problems:
- The shadow comparison was done post-filtering.
- It required state-dependent recompiles whenever the comparison
function changed.
- It didn't even work: many cases hit assertion failures.
- I never implemented it for the VS.
The new lowering pass which converts textureGrad to textureLod by
computing the LOD value works much better.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
The idea is to reuse this for the VS and (in the future) GS as well.
v2: Include yuvtex data since we're not dropping GL_MESA_ycbycr.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]> [v1]
Reviewed-by: Ian Romanick <[email protected]>
|