| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
to indicate write usage per buffer.
This is just a hint (it will be used by radeonsi).
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Documentation for glDrawPixels with GL_COLOR_INDEX says:
"If the GL is in color index mode, and if GL_MAP_COLOR is true,
the index is replaced with the value that it references in
lookup table GL_PIXEL_MAP_I_TO_I"
We are always in RGBA mode and there is nothing in documentation
about GL_MAP_COLOR in RGBA mode for GL_COLOR_INDEX.
Scale and bias are also only applicable for RGBA format and not
mentioned for GL_COLOR_INDEX.
Thus the behaviour will be on par with i965.
Fixes: gl-1.0-drawpixels-color-index
Signed-off-by: Danylo Piliaiev <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
CID: 1444309
Fixes: 9ab1b1d0227 "st/nir: Move 64-bit lowering later"
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
| |
|
|
|
|
|
|
|
| |
I need this part of lower_all_io_to_temps but without the actual
lowering to temps part.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Fixes assertion failures in Piglit's "framebuffer-blit-levels
{draw,read} stencil" tests on iris. Also fixes assert failures in
frameretrace, which tries to ReadPixels the stencil values (only)
from a Z24S8 depth/stencil attachment.
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
i965 does this, and st's tgsi path does this. st/nir did not.
Cuts 138MB of memory from a DiRT Rally trace, which is about 44%
of the total GLSL IR memory.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We don't need to determine the number of uniform slots here, it's
already available as prog->Parameters->NumParameterValues. The way we
previously determined the number of slots was also broken for
PackedDriverUniformStorage, where we would add loc (in dwords) and
type_size() (in vec4s).
Signed-off-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we destroy a context, we need to temporarily make that context
the current one for the thread.
That's because during context tear-down we make many calls to
_mesa_reference_texobj(&texObj, NULL). Note there's no context
parameter. If the texture's refcount goes to zero and we need to
delete it, we use the thread's current context. But if that context
isn't the context we're tearing down, we get into trouble when
deallocating sampler views. See patch 593e36f956 ("st/mesa:
implement "zombie" sampler views (v2)") for background information.
Also, we need to release any sampler views attached to the fallback
textures.
Fixes a crash on exit with a glretrace of the Nobel Clinician
application.
v2: at end of st_destroy_context(), check if save_ctx == ctx and
unbind the context if so.
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Neha Bhende <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
These enums match but compiler warns about implicit conversion.
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
(warning: 'const' type qualifier on return type has no effect)
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For GLES2+ contexts, enable EXT_gpu_shader5 if the driver exposes a
sufficiently high ESSL feature level, even if the GLSL feature level
isn't high enough.
This allows drivers to support EXT_gpu_shader5 in GLES contexts before
they support all the additional features of ARB_gpu_shader5 in GL
contexts.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The glMemoryBarrier() function makes shader memory stores ordered with
respect to things specified by the given bits. Until now, st/mesa has
ignored GL_TEXTURE_UPDATE_BARRIER_BIT and GL_BUFFER_UPDATE_BARRIER_BIT,
saying that drivers should implicitly perform the needed flushing.
This seems like a pretty big assumption to make. Instead, this commit
opts to translate them to new PIPE_BARRIER bits, and adjusts existing
drivers to continue ignoring them (preserving the current behavior).
The i965 driver performs actions on these memory barriers. Shader
memory stores go through a "data cache" which is separate from the
render cache and other read caches (like the texture cache). All
memory barriers need to flush the data cache (to ensure shader memory
stores are visible), and possibly invalidate read caches (to ensure
stale data is no longer visible). The driver implicitly flushes for
most caches, but not for data cache, since ARB_shader_image_load_store
introduced MemoryBarrier() precisely to order these explicitly.
I would like to follow i965's approach in iris, flushing the data cache
on any MemoryBarrier() call, so I need st/mesa to actually call the
pipe->memory_barrier() callback.
Fixes KHR-GL45.shader_image_load_store.advanced-sync-textureUpdate
and Piglit's spec/arb_shader_image_load_store/host-mem-barrier on
the iris driver.
Roland said this looks reasonable to him.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In all instances here we can replace pipe_sampler_view_release(pipe,
view) with pipe_sampler_view_reference(view, NULL) because the views
in question are private to the state tracker context. So there's no
danger of freeing a sampler view with the wrong context.
Testing done: google chrome, misc GL demos, games
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Neha Bhende <[email protected]>
Reviewed-by: Mathias Fröhlich <[email protected]>
Reviewed-By: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As with the preceding patch for sampler views, this patch does
basically the same thing but for shaders. However, reference counting
isn't needed here (instead of calling cso_delete_XXX_shader() we call
st_save_zombie_shader().
The Redway3D Watch is one app/demo that needs this change. Otherwise,
the vmwgfx driver generates an error about trying to destroy a shader
ID that doesn't exist in the context.
Note that if PIPE_CAP_SHAREABLE_SHADERS = TRUE, then we can use/delete
any shader with any context and this mechanism is not used.
Tested with: google-chrome, google earth, Redway3D Watch/Turbine demos
and a few Linux games.
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Neha Bhende <[email protected]>
Reviewed-by: Mathias Fröhlich <[email protected]>
Reviewed-By: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When st_texture_release_all_sampler_views() is called the texture may
have sampler views belonging to several contexts. If we unreference a
sampler view and its refcount hits zero, we need to be sure to destroy
the sampler view with the same context which created it.
This was not the case with the previous code which used
pipe_sampler_view_release(). That function could end up freeing a
sampler view with a context different than the one which created it.
In the case of the VMware svga driver, we detected this but leaked the
sampler view. This led to a crash with google-chrome when the kernel
module had too many sampler views. VMware bug 2274734.
Alternately, if we try to delete a sampler view with the correct
context, we may be "reaching into" a context which is active on
another thread. That's not safe.
To fix these issues this patch adds a per-context list of "zombie"
sampler views. These are views which are to be freed at some point
when the context is active. Other contexts may safely add sampler
views to the zombie list at any time (it's mutex protected). This
avoids the context/view ownership mix-ups we had before.
Tested with: google-chrome, google earth, Redway3D Watch/Turbine demos
a few Linux games. If anyone can recomment some other multi-threaded,
multi-context GL apps to test, please let me know.
v2: avoid potential race issue by always adding sampler views to the
zombie list if the view's context doesn't match the current context,
ignoring the refcount.
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Neha Bhende <[email protected]>
Reviewed-by: Mathias Fröhlich <[email protected]>
Reviewed-By: Jose Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Starting a glxgears and closing it, I was seeing a lot of leaked TGSI for
the fixed function VPs.
v2: drop unused delete_ir() arg.
Fixes: 3b4929ec6e64 ("st/mesa: Copy VP TGSI tokens if they exist, even for NIR shaders.")
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
GLSL NIR gets freed on relink by _mesa_delete_program(), but for ARB
programs we need to free the old NIR when PSN is used to set up new NIR in
the same gl_program. Additionally, set the base .nir field so that it
will get freed by _mesa_delete_program().
Fixes: 3d7611e9a6c6 ("st/nir: use NIR for asm programs")
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
In preparation for the definition of a function to log a formatted
string.
Reviewed-by: Erik Faye-Lund <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a segfault when we try to access the array using a
-1 when the array wasn't allocated in the first place.
Before 7536af670b75 we would just access a pre-allocated array
that was also load/stored to/from the shader cache. But now the
cache will no longer allocate these arrays if they are empty.
The change resulted in tests such as the following segfaulting
when run with a warm shader cache.
tests/spec/arb_arrays_of_arrays/execution/sampler/fs-struct-const-index.shader_test
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rename st_texture_free_sampler_views() to
st_delete_texture_sampler_views() to align with
st_DeleteTextureObject(), its only caller.
Move the call to st_texture_release_all_sampler_views() from
st_DeleteTextureObject() to st_delete_texture_sampler_views()
so all the sampler view clean-up code is in one place.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
|
| |
To st_texture_release_context_sampler_view() to be more clear
that it's context-specific.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
| |
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
st_init_driver_functions() is only called in st_context.c so there's
no need for the prototype in st_context.h
To avoid a forward declaration of st_init_driver_functions() in
st_context.c, we need to move around several other functions.
No functional change.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To de-clutter st_context.h.
Clean up remaining function prototypes in st_context.h.
The st_vp_uses_current_values() helper is only used in st_context.c
so move it there.
The st_get_active_states() function is only used in st_context.c so
remove its prototype in st_context.h
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Since the compiler may not zero-out padding in the object.
Add a couple comments about this to prevent misunderstandings in
the future.
Fixes: 67d96816ff5 ("st/mesa: move, clean-up shader variant key decls/inits")
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Trivial.
|
|
|
|
|
|
|
| |
Move the variant key declarations inside the scope they're used.
Use designated initializers instead of memset() calls.
Reviewed-by: Neha Bhende <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The NIR and TGSI paths are currently intertwined which makes it
not only hard to follow but also makes it hard to take advantage
of the differences in IR.
Here we take the first step to splitting that path apart. With
this we take the opportunity to no longer call the GLSL IR
optimisation passes after the final lowering calls for NIR. We
can instead just use the NIR passes which can produce better code
and should also result in faster compile times.
The speed-up can be measured in some dolphin uber shaders due to
no longer calling lower_if_to_cond_assign() for example
dolphin/ubershaders/120.shader_test goes from ~1.63 -> ~1.53
seconds on my machine.
There are some code changes as a result of not calling
lower_if_to_cond_assign(), this is because it flattens ifs that
contain UBOs where as NIR's peephole select doesn't. This is
were most of the regressions in Max Waves happens with shader-db.
shader-db results (VEGA):
Totals from affected shaders:
SGPRS: 2349056 -> 2349640 (0.02 %)
VGPRS: 1322160 -> 1323300 (0.09 %)
Spilled SGPRs: 21190 -> 21527 (1.59 %)
Spilled VGPRs: 99 -> 99 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 72 -> 72 (0.00 %) dwords per thread
Code Size: 57260904 -> 57270932 (0.02 %) bytes
Compile Time: 1107186 -> 1022942 (-7.61 %) milliseconds
LDS: 786 -> 786 (0.00 %) blocks
Max Waves: 391932 -> 391619 (-0.08 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
glsl_to_nir() is still missing support for converting certain
functions to NIR, so for those we use the GLSL IR optimisations
to remove the functions.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we have a loop unrolling cost function and loop unrolling isn't
going to kill us the moment we have a 64-bit op in a loop, we can go
ahead and move 64-bit lowering later. This gives us the opportunity to
do more optimizations and actually let the full optimizer run even on
64-bit ops rather than hoping one round of opt_algebraic will fix
everything. This substantially reduces both fp64 shader compile times
and the resulting code size.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of trusting the caller to already have created a softfp64
function shader and added all its functions to our shader, we simply
take the softfp64 shader as an argument and do the function inlining
ouselves. This means that there's no more nasty functions lying around
that the caller needs to worry about cleaning up.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Replace done using:
find ./src -type f -exec sed -i -- \
's/glsl_type_is_struct(/glsl_type_is_struct_or_ifc(/g' {} \;
Acked-by: Karol Herbst <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Replace done using:
find ./src -type f -exec sed -i -- \
's/record_location_offset(/struct_location_offset(/g' {} \;
Acked-by: Karol Herbst <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Replace was done using:
find ./src -type f -exec sed -i -- \
's/is_record(/is_struct(/g' {} \;
Acked-by: Karol Herbst <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Note that locations can be set in different units, and the multiplier
argument caters to supporting these different units. For example,
st_glsl_to_nir uses dwords (4 bytes) so the multiplier should be 4,
while tgsi_to_nir uses bytes, so the multiplier should be 16.
Signed-Off-By: Timur Kristóf <[email protected]>
Tested-by: Andre Heider <[email protected]>
Tested-by: Rob Clark <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The nir_lower_uniforms_to_ubo function is useful outside of
mesa/state_tracker, and in fact is needed to produce NIR for
drivers that have the PIPE_CAP_PACKED_UNIFORMS capability.
Signed-Off-By: Timur Kristóf <[email protected]>
Tested-by: Andre Heider <[email protected]>
Tested-by: Rob Clark <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Optimize mulExtended to use 32x32->64 multiplication.
Drivers which are not based on NIR, they can set the
MUL64_TO_MUL_AND_MUL_HIGH lowering flag in order to have same old
behavior.
v2: Add missing condition check (Jason Ekstrand)
Signed-off-by: Sagar Ghuge <[email protected]>
Suggested-by: Matt Turner <Matt Turner <[email protected]>
Suggested-by: Jason Ekstrand <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Trivial.
|
|
|
|
| |
Trivial.
|
|
|
|
|
| |
Replace tabs w/ spaces. 80-column wrapping.
Trivial.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that the buffer object usage history tracks if it is
being used as vertex buffer object, we can restrict setting
the ST_NEW_VERTEX_ARRAYS bit to dirty on glBufferData calls to
buffers that are potentially used as vertex buffer object.
Also put a note that the same could be done for index arrays
used in indexed draws.
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Mathias Fröhlich <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This might enough for iris and possible r600 (when it gets NIR)
This appears to work for iris.
v2:
* change cap return so DOUBLES == 2 means sw emu
v3:
* Refactor using int64/doubles lowering options which were added
into nir options
* Remove DOUBLES == 2 added in v2
[jordan: Remove "2" value on PIPE_CAP_DOUBLES]
[jordan: Use lowering options added to nir options]
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
Usually the uniforms will be assigned locations and have their slots
counted automatically, but for builtin shaders the location assignment
is manual. So count them too otherwise we get num_uniforms == 0.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Remove trailing whitespace, replace tabs w/ spaces, etc.
Trivial.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since using bitmasks we can easily check if we have any
current value that is potentially uploaded on array setup.
So check for any potential vertex program input that is not
already a vao enabled array. Only flag array update if there is
a potential overlap.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Mathias Fröhlich <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This header has been unused since f8f2520e88c ("st/mesa: Remove
unnecessary headers"). And in the more than 8 years since, this
hasn't been useful. So let's just get rid of it.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|