| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes GL45-CTS.gpu_shader_fp64.built_in_functions.
v2: use DDIV unconditionally (Roland)
Reviewed-by: Roland Scheidegger <[email protected]> (v1)
Reviewed-by: Marek Olšák <[email protected]> (v1)
Tested-by: Glenn Kennard <[email protected]>
Tested-by: James Harvey <[email protected]>
Cc: 17.0 <[email protected]>
(cherry picked from commit cfabbbcfd778cc404813c9f05a9ef79efe531980)
|
|
|
|
|
|
|
|
|
| |
This implements support for emitting FBFETCH ops, using the existing
lowering pass for advanced blend logic, and disabling hw blend when
advanced blending is enabled.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
This is so that we can differentiate between flushing any framebuffer
reading caches from regular sampler caches.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Annoyingly, SPIR-V lets you specify all of these fields in either the
TCS or TES, which means that we need to be able to store all of them
for either shader stage. Putting them in a union won't work.
Combining both is an easy solution, and given that the TCS struct only
had a single field, it's pretty inexpensive.
This patch renames the combined struct to "tess" to indicate that it's
for tessellation in general, not one of the two stages.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
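The idea of one combined "tess" struct, rather than a union of per-stage structs, might be sketched as follows. This is only an illustration: every field name, the `example_` prefix, and the merge helper are assumptions, since the commit message does not show the real layout. A union would not work because a field written for one stage would alias the storage of the other stage's fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: the commit message only says the TCS struct had a
 * single field and that the merged struct is named "tess". */
struct example_tess_info {
   uint32_t tcs_vertices_out; /* the lone former TCS field (assumed name) */
   uint32_t primitive_mode;   /* 0 means "not specified by this stage" */
   uint32_t spacing;          /* 0 means "not specified by this stage" */
   bool ccw;
   bool point_mode;
};

/* Combine what the TCS and the TES each declared. A value given by either
 * stage wins over "not specified"; the TCS value is preferred on conflict. */
static inline struct example_tess_info
example_tess_merge(const struct example_tess_info *tcs,
                   const struct example_tess_info *tes)
{
   assert(tcs && tes);

   struct example_tess_info out;
   out.tcs_vertices_out = tcs->tcs_vertices_out ? tcs->tcs_vertices_out
                                                : tes->tcs_vertices_out;
   out.primitive_mode = tcs->primitive_mode ? tcs->primitive_mode
                                            : tes->primitive_mode;
   out.spacing = tcs->spacing ? tcs->spacing : tes->spacing;
   out.ccw = tcs->ccw || tes->ccw;
   out.point_mode = tcs->point_mode || tes->point_mode;
   return out;
}
```

Because every stage carries the full set of fields, it does not matter which of the two stages a SPIR-V module used to declare a given value.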
|
|
|
|
|
|
| |
We no longer need anything from gl_linked_shader.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
We no longer need anything from gl_linked_shader.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
We now get everything we need from the gl_program param.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
This change also removes the now duplicate NumImages field.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
We no longer need to pass gl_shader_program.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
It feels weird using GL_* enums in a Vulkan driver.
v2: Fix the TESS_SPACING -> PIPE_TESS_SPACING conversion.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The vertex order is either clockwise or counterclockwise. We can just
store a "ccw" boolean rather than GLenum values. I don't want to use
GLenums in a Vulkan driver, and even in GL a simple boolean works fine.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
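Replacing the GLenum with a "ccw" boolean can be illustrated with a small conversion helper. The helper name and the `EXAMPLE_` constants are made up for this sketch; the numeric values are those of the OpenGL enums GL_CW and GL_CCW, defined locally so that no GL header is needed.

```c
#include <assert.h>
#include <stdbool.h>

/* Numeric values of GL_CW and GL_CCW, duplicated here for illustration. */
#define EXAMPLE_GL_CW  0x0900
#define EXAMPLE_GL_CCW 0x0901

/* Hypothetical helper: translate a GL vertex-order enum into the boolean
 * that API-neutral code would store instead. */
static inline bool
example_vertex_order_is_ccw(unsigned gl_vertex_order)
{
   /* The vertex order is either clockwise or counterclockwise. */
   assert(gl_vertex_order == EXAMPLE_GL_CW ||
          gl_vertex_order == EXAMPLE_GL_CCW);
   return gl_vertex_order == EXAMPLE_GL_CCW;
}
```

Only the GL front end would need such a conversion; a Vulkan driver could set the boolean directly without touching any GL_* value.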
|
|
|
|
|
|
|
|
| |
This will help allow us to simplify the handling of samplers by
storing them in a single location rather than duplicating them in
both gl_linked_shader and gl_program.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Set the flag via the _mesa_init_gl_program() and NewProgram()
helpers.
In i965 we currently check for the existence of gl_shader_program
to decide if this is an ARB assembly style program or not.
Adding a flag makes the code clearer and will help remove a
dependency on gl_shader_program in the i965 codegen functions.
This will also allow us to skip initialising sampler units for
linked shaders; we currently memset it to zero again during linking.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Having it here rather than in gl_linked_shader allows us to simplify
the code.
Also it is error prone to depend on the gl_linked_shader for programs
in current use because a failed linking attempt will free information
about the current program. In i965 we could be trying to recompile
a shader variant but may have lost some required fields.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
Here we also remove the duplicate field in gl_linked_shader and always
get the value from shader_info instead.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
This will help allow us to store pointers to gl_program structs in the
CurrentProgram array resulting in a bunch of code simplifications.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
This also removes the duplicate field in gl_linked_shader, and
gets num_ubos from shader_info instead.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Having it here rather than in gl_linked_shader allows us to simplify
the code.
Also it is error prone to depend on the gl_linked_shader for programs
in current use because a failed linking attempt will free information
about the current program. In i965 we could be trying to recompile
a shader variant but may have lost some required fields.
We drop the memset on ImageUnits because gl_program is already
created using rzalloc().
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
| |
It's redundant with the source modifier.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
It's redundant with the source modifier.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
Broken by:
st/mesa: get Version from gl_program rather than gl_shader_program
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GLSL compilation now takes 24% less time with the Gallium noop driver.
I used my shader-db for the measurement. The difference for the whole
radeonsi driver can be ~10%.
The generated TGSI is mostly the same. For example, the compilation success
rate with a TGSI->GCN bytecode converter without any optimizations is
the same. Note that glsl_to_tgsi does its own copy propagation and simple
register allocation.
shader-db GCN report:
- Talos spills fewer SGPRs.
- DOTA 2 spills more SGPRs.
- The average shader-db score is better, but it's just due to randomness.
29045 shaders in 17564 tests
Totals:
SGPRS: 1325929 -> 1325017 (-0.07 %)
VGPRS: 1010808 -> 1010172 (-0.06 %)
Spilled SGPRs: 1432 -> 1399 (-2.30 %)
Spilled VGPRs: 93 -> 92 (-1.08 %)
Private memory VGPRs: 688 -> 688 (0.00 %)
Scratch size: 2540 -> 2484 (-2.20 %) dwords per thread
Code Size: 39336732 -> 39342936 (0.02 %) bytes
Max Waves: 217937 -> 217969 (0.01 %)
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
so that backends don't have to run it manually
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
gl_shader_program
This will allow us to make the CurrentProgram array store gl_program which allows
us to do a bunch of simplifications.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This will help allow us to store gl_program in the CurrentProgram array rather
than gl_shader_program which will allow a bunch of simplifications.
Note that we make LinkedTransformFeedback a pointer so we don't waste
memory creating a struct for each stage. We also store a pointer to
the gl_program that will contain the pointer in gl_shader_program so
we can get easy access to the correct stage.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
git grep -l comparitor | xargs sed -i 's/comparitor/comparator/g'
Just happened to notice this in a patch that was sent and included one
of the tokens in question.
Signed-off-by: Ilia Mirkin <[email protected]>
Acked-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
Allow drivers to emit GS outputs in a smarter way.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This reduces the number of sampler states 3.6x in Batman Arkham: Origins.
(from ~7200 to ~2000)
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
This reverts commit 6bf63b011992dbbc899a28bde5692070dbcf965a.
A patch that adds a reference to gl_shader_program_data to gl_program
needs to land before this one.
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This is mostly just used during linking; however, the st uses it
when updating textures.
In order to store gl_program in the CurrentProgram array
rather than gl_shader_program we need to move this field to
the shared gl_shader_program_data struct.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This is required for reading directly from fragment shader stencil and depth
outputs.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
gl_shader_program
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
| |
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This will allow us to directly store metadata we want to retain in
gl_program. This metadata is currently stored in gl_linked_shader and
will be lost if relinking fails, even though the program will remain
in use and is still valid according to the spec.
"If a program object that is active for any shader stage is re-linked
unsuccessfully, the link status will be set to FALSE, but any existing
executables and associated state will remain part of the current
rendering state until a subsequent call to UseProgram,
UseProgramStages, or BindProgramPipeline removes them from use."
This change will also help avoid the double handling that happens in
_mesa_copy_linked_program_data().
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In i965 we were calling _mesa_reference_program() after creating
gl_program and then later calling it again with NULL as a param
to get the refcount back down to 1. This changes things to not
use _mesa_reference_program() at all and just have gl_linked_shader
take ownership of gl_program since refcount starts at 1.
The st and ir_to_mesa linkers were worse as they were both getting
into a state where the refcount would never get to 0 and we would leak
the program.
Reviewed-by: Emil Velikov <[email protected]>
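The ownership change described above can be sketched with a toy reference-counted object. All types and functions below are hypothetical stand-ins with an `example_` prefix, not the Mesa structs or `_mesa_reference_program()` itself; the sketch only shows why handing over the initial reference avoids the extra increment and decrement.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for gl_program and gl_linked_shader. */
struct example_program { int ref_count; };
struct example_linked_shader { struct example_program *program; };

/* A new object starts with ref_count == 1, held by the creator. */
static inline struct example_program *
example_program_create(void)
{
   struct example_program *p = calloc(1, sizeof(*p));
   if (p)
      p->ref_count = 1;
   return p;
}

/* Drop one reference; returns 1 if the object was freed. */
static inline int
example_program_unref(struct example_program *p)
{
   assert(p && p->ref_count > 0);
   if (--p->ref_count == 0) {
      free(p);
      return 1;
   }
   return 0;
}

/* Ownership transfer: the shader takes over the creator's reference, so
 * the count is deliberately NOT incremented here. */
static inline void
example_shader_take_program(struct example_linked_shader *sh,
                            struct example_program *p)
{
   sh->program = p;
}

/* Destroying the shader releases its single reference to the program. */
static inline int
example_shader_destroy(struct example_linked_shader *sh)
{
   int freed = example_program_unref(sh->program);
   sh->program = 0;
   return freed;
}
```

If the shader instead took a second reference and the creator never released its own, the count would stay above zero forever, which is the kind of leak the commit message describes.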
|
|
|
|
|
|
|
| |
Mark variables and static functions that only occur in assert()s as
MAYBE_UNUSED.
Reviewed-by: Ilia Mirkin <[email protected]>
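A minimal sketch of why assert-only variables need such an annotation follows. The `EXAMPLE_MAYBE_UNUSED` macro is a stand-in defined here for illustration (assuming a GCC-compatible compiler for the attribute); Mesa provides its own MAYBE_UNUSED macro, whose definition is not shown in the commit message.

```c
#include <assert.h>

/* Stand-in for illustration only. */
#if defined(__GNUC__)
#define EXAMPLE_MAYBE_UNUSED __attribute__((unused))
#else
#define EXAMPLE_MAYBE_UNUSED
#endif

/* Hypothetical helper that validates its input only in debug builds. */
static inline int
example_average(const int *values, int count)
{
   /* Read only by the assert() below. When NDEBUG is defined the assert
    * expands to nothing, and without the annotation the compiler may
    * warn that is_valid is unused. */
   EXAMPLE_MAYBE_UNUSED const int is_valid = values != 0 && count > 0;
   int sum = 0;

   assert(is_valid);

   for (int i = 0; i < count; i++)
      sum += values[i];
   return sum / count;
}
```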
|
|
|
|
|
|
|
| |
We called it immediately prior, so re-use the previously returned value.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
It's common for games to compile 2000 programs or more, so at
32 bits x 2000 programs x 22 fields x 2 (at least) stages
this should give us something like 352 kilobytes in savings
once we add some more GLSL-only fields.
Reviewed-by: Emil Velikov <[email protected]>
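The estimate above can be checked with a few lines of arithmetic. The function name is hypothetical; the inputs are the figures quoted in the commit message, with each 32-bit field counted as 4 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes saved if `fields` 32-bit fields are dropped from each of `stages`
 * per-stage structs in each of `programs` programs. */
static inline size_t
example_savings_bytes(size_t programs, size_t fields, size_t stages)
{
   const size_t bytes_per_field = 32 / 8;

   /* The estimate in the commit message assumes at least two stages. */
   assert(stages >= 2);
   return bytes_per_field * programs * fields * stages;
}
```

With 2000 programs, 22 fields and 2 stages this gives 4 x 2000 x 22 x 2 = 352,000 bytes, matching the quoted 352 kilobytes.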
|
|
|
|
|
|
|
| |
Since gl_program is now created with rzalloc() they should
already be initialised.
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A number of drivers report useful debug/perf information accessible
through GL_ARB_debug_output and with debug contexts (i.e. setting the
GLX_CONTEXT_DEBUG_BIT_ARB flag). But few applications actually use
the GL_ARB_debug_output extension.
This change lets one set the MESA_DEBUG env var to "context" to force-set
a debug context and report debug/perf messages to stderr (or whatever
file MESA_LOG_FILE is set to). This is a useful debugging tool.
The small change in st_api_create_context() is needed so that
st_update_debug_callback() gets called to hook up the driver debug
callbacks when ST_CONTEXT_FLAG_DEBUG was not set, but MESA_DEBUG=context.
v2: use %.*s format string instead of allocating temporary buffer.
Reviewed-by: Nicolai Hähnle <[email protected]>
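The v2 note refers to the standard C `%.*s` conversion, which prints at most a given number of characters from a buffer and therefore works on messages that are not NUL-terminated. A small sketch follows, assuming a hypothetical helper name and a made-up "debug: " prefix; the actual output format used by Mesa is not shown in the commit message.

```c
#include <assert.h>
#include <stdio.h>

/* Format `len` characters of `msg` into `out` without first copying the
 * message into a temporary NUL-terminated buffer. Returns the value of
 * snprintf(), i.e. the number of characters the full output needs. */
static inline int
example_format_debug(char *out, size_t out_size, const char *msg, int len)
{
   assert(out && msg && len >= 0);

   /* The precision argument of %.*s must be an int and caps how many
    * characters are read from msg. */
   return snprintf(out, out_size, "debug: %.*s\n", len, msg);
}
```

The same format string works with `fprintf(stderr, ...)`, so a message can be written straight to a log stream.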
|
|
|
|
|
|
|
| |
Fixes parts of GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels_pixelstore.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By using _mesa_image_address, the code becomes simpler _and_ fixes the bug
that GL_PACK_SKIP_IMAGES was applied even on non-3D textures.
Also, converting a whole slice at a time simplifies the format translation
fallback path.
Fixes parts of GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels_pixelstore.
v2: fix a silly mistake during code movement
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes use cases like glReadPixels from an RGBA8I framebuffer into
a PBO with type GL_INT by clamping values appropriately when they fall
outside the range of the destination format.
Fixes parts of GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels_pbo.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
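The clamping described above can be illustrated on the CPU with a small helper. This is only a sketch of the general idea (saturate a value to the range of a narrower signed integer format); the function name is hypothetical, and it is not the download shader code that the commit changes.

```c
#include <assert.h>
#include <stdint.h>

/* Clamp a signed 32-bit value to the range representable by a signed
 * two's-complement integer that is `bits` wide. */
static inline int32_t
example_clamp_to_signed_bits(int32_t value, unsigned bits)
{
   assert(bits >= 1 && bits <= 32);

   /* Compute the limits in 64 bits so that bits == 32 does not overflow. */
   const int64_t max = ((int64_t)1 << (bits - 1)) - 1;
   const int64_t min = -max - 1;

   if (value > max)
      return (int32_t)max;
   if (value < min)
      return (int32_t)min;
   return value;
}
```

For an 8-bit signed destination the limits are -128 and 127, so 300 saturates to 127 and -300 to -128, while in-range values pass through unchanged.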
|
|
|
|
|
|
|
| |
For consistency with st_pbo_get_download_fs.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|