| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
This patch adds a cleanup function to clean up sampler state at
context destruction time.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Fixes piglit spec/ext_transform_feedback/overflow-edge-cases segfaults
because the query's fence pointer was null.
Tested with Piglit, Sauerbraten, ETQW.
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We don't want to flush the command buffer or sync on the fence when ending
a query (that kind of defeats the whole purpose of async queries). Do that
instead in get_query_result().
Tested with Piglit, arbocclude, Sauerbraten game, Nobel Clinician Viewer,
ETQW.
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
| |
This patch avoid emitting redundant DXSetSamplers command.
Tested with Lightsmark2008, Heaven, MTT piglit, glretrace, viewperf.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
| |
Put all the clearing related functions in svga_init_clear_functions()
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
|
| |
define svga_init_clear_functions()
and svga_clear_texture as svga->pipe.clear_texture. This is part of
ARB_clear_texture extension
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
|
| |
To clear texture this function can be used. This is part of
ARB_clear_texture extension. Basically this extension allows you to
clear texture with given color values.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
| |
Saving all blitter states will be done in begin_blit() so that
begin_blit() can be used before performing any blit operation.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
| |
For opt build, add VMX86_STATS to the list of cpp defines.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
With this patch, guest statistic gathering interface is added to
svga winsys interface that can be used to gather svga driver
statistic. The winsys module can then share the statistic info with
the VMX host via the mksstats interface.
The statistic enums used in the svga driver are defined in
svga_stats_count and svga_stats_time in svga_winsys.h
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
If the shader has indirect access to non-indexable temporaries,
convert these non-indexable temporaries to indexable temporary array.
This works around a bug in the GLSL->TGSI translator.
Fixes glsl-1.20/execution/fs-const-array-of-struct-of-array.shader_test
on DX11Renderer.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
To silence unused var warning with MSVC, MinGW.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From about kernel 4.9, GTT mmaps are virtually unlimited. A new
parameter, I915_PARAM_MMAP_GTT_VERSION, is added to advertise the
feature so query it and use it to avoid limiting tiled allocations to
only fit within the mappable aperture.
A couple of caveats:
- fence support is still limited by stride to 262144 and the stride
needs to be a multiple of tile_width (as before, and same limitation as
the current 3D pipeline in hardware)
- the max_gtt_map_object_size forcing untiled may be hiding a few bugs
in handling of large objects, though none were spotted in piglits.
See kernel commit 4cc6907501ed ("drm/i915: Add I915_PARAM_MMAP_GTT_VERSION
to advertise unlimited mmaps").
v2: Include some commentary on mmap virtual space vs CPU addressable
space.
Signed-off-by: Chris Wilson <[email protected]>
Cc: Kenneth Graunke <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Daniel Vetter <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This was found by obs:
I: Program returns random data in a function
E: Mesa no-return-in-nonvoid-function main/program_resource.c:109
v2: Remove the ! on the string (Ian Romanick)
Signed-off-by: Tobias Klausmann <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
We always use a coherent read, and ignore the "opt out" enable flag.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This adds the extension enable (so drivers can advertise it) and the
extra boolean state flag, GL_BLEND_ADVANCED_COHERENT_KHR, which can
be set to request coherent blending.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
| |
We'll do blending in the shader in this case, so just disable the
hardware blending.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Many GPUs cannot handle GL_KHR_blend_equation_advanced natively, and
need to emulate it in the pixel shader. This lowering pass implements
all the necessary math for advanced blending. It fetches the existing
framebuffer value using the MESA_shader_framebuffer_fetch built-in
variables, and the previous commit's state var uniform to select
which equation to use.
This is done at the GLSL IR level to make it easy for all drivers to
implement the GL_KHR_blend_equation_advanced extension and share code.
Drivers need to hook up MESA_shader_framebuffer_fetch functionality:
1. Hook up the fb_fetch_output variable
2. Implement BlendBarrier()
Then to get KHR_blend_equation_advanced, they simply need to:
3. Disable hardware blending based on ctx->Color._AdvancedBlendEnabled
4. Call this lowering pass.
Very little driver specific code should be required.
v2: Handle multiple output variables per render target (which may exist
due to ARB_enhanced_layouts), and array variables (even with one
render target, we might have out vec4 color[1]), and non-vec4
variables (it's easier than finding spec text to justify not
handling it). Thanks to Francisco Jerez for the feedback.
v3: Lower main returns so that we have a single exit point where we
can add our blending epilogue (caught by Francisco Jerez).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This will be used for emulating GL_KHR_advanced_blend_equation features
in shader code. We'll pass in the blending mode that's in use, and use
that in (effectively) a switch statement in the shader.
v2: Use the new _AdvancedBlendMode field.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
| |
v2: Add null checks (requested by Curro).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
I'm about to add more error conditions to this function, so I wanted to
move the current spec citation above the code that checks it. Indenting
it required reformatting, so I tried to move it to our newer style.
While there, I also decided to drop some GL type usage, and drop the
unnecessary "_mesa_" prefix on a static function.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This will be useful for a number of things:
- Checking the current advanced blending mode against the shader's
blend_support_* qualifiers.
- Disabling hardware blending when emulating advanced blending.
- Uploading the current advanced blending mode as a state var.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
| |
Don't allow them in glBlendEquationSeparate[i], though, as required
by the spec.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
| |
Since each qualifier represents a blending mode the shader can be used
with, we take the union of all possible modes when linking.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
v2 (Ken): Add a BLEND_NONE enum value (no qualifiers in use).
v3 (Ken): Rename gl_blend_support_qualifier to gl_advanced_blend_mode.
v4 (Ken): Mark map[] as static const (Ilia).
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
v2 (Ken): Fix enum values, drop _mesa_BlendBarrierKHR stub as Curro has
already implemented it.
v3 (Ken): Rework for _mesa_BlendBarrierKHR -> _mesa_BlendBarrier rename.
Signed-off-by: Ilia Mirkin <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Note that _mesa_BlendBarrierMESA is not currently hooked up in the
glapi XML, so we can just rename it. We'll hook it up for the
KHR_blend_equation_advanced extension shortly.
We may as well use the ES 3.2 core name with no suffixes.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We want to insert code in each of the predecessors of the end block.
This code includes a nir_if, which would split the block, altering
the set. To avoid that, I emitted a dead constant at the end of each
block before splitting it, so that the set of predecessors remained
unchanged. This was admittedly ugly.
Connor suggested instead saving a copy of the set, so we can iterate
it safely. This is also a little ugly, but a much better plan.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We want to insert the code at the end of the program. Looping over
all the functions (of which there was only one) was the old way of doing
this, but now we have nir_shader_get_entrypoint(), so let's use it.
Suggested by Connor Abbott.
v2: Update for nir_shader_get_entrypoint API change.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Jason suggested adding an assert(function->impl) here. All callers
of this function actually want ->impl, so I decided just to change
the API.
We also change the nir_lower_io_to_temporaries API here. All but one
caller passed nir_shader_get_entrypoint(), and with the previous commit,
it now uses a nir_function_impl internally. Folding this change in
avoids the need to change it and change it back.
v2: Fix one call I missed in ir3_compiler (caught by Eric).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This changes the pass internals to work with a nir_function_impl
directly rather than a nir_function. The next patch will change
the API.
v2: Rebase after framebuffer fetch landed.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The reason why it was safe for the scheduler to ignore the side
effects of framebuffer write instructions was that its side effects
couldn't have had any influence on any other instruction in the
program, because we weren't doing framebuffer reads, and framebuffer
writes were always non-overlapping. We need actual memory dependency
analysis in order to determine whether a side-effectful instruction
can be reordered with respect to other instructions in the program.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We weren't checking the fs_inst::target field when comparing whether
two instructions are equal. For FB writes it doesn't matter because
they aren't CSE-able anyway, but this would have become a problem with
FB reads which are expression-like instructions.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
brw_set_dp_read_message() was setting the data cache as send message
SFID on Gen7+ hardware, ignoring the target cache specified by the
caller. Some of the callers were passing a bogus target cache value
as argument relying on brw_set_dp_read_message not to take it into
account. Fix them too.
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
hardware.
This is not enabled on the original Gen4 part because it lacks surface
state tile offsets so it may not be possible to sample from arbitrary
non-zero layers of the framebuffer depending on the miptree layout (it
should be possible to work around this by allocating a scratch surface
and doing the same hack currently used for render targets, but meh...).
On Gen9+ even though it should mostly work (feel free to force-enable
it in order to compare the coherent and non-coherent paths in terms of
performance), there are some corner cases like 1D array layered
framebuffers that cannot be handled easily by the non-coherent path
because of the incompatible layout in memory of 1D and 2D miptrees (it
should be possible to work around this too by doing state-dependent
recompiles, but it's hard to care enough since Gen9 has native support
for coherent render target reads...)
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is a no-op if the platform supports coherent framebuffer fetch,
-- If it doesn't we just need to flush the render cache and invalidate
the texture cache in order for previous rendering to be visible to
framebuffer fetch.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
This iterates over the list of attached render buffers and binds
appropriate surface state structures to the binding table block
allocated for shader framebuffer read.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
brw_emit_surface_state.
This allows the caller to bind a miptree using a texture target other
than the one it it was created with. The code should work even if the
memory layouts of the specified and original targets don't match, as
long as the caller only intends to access a single slice of the
miptree structure.
This will be exploited by the next commit in order to support
non-coherent framebuffer fetch of a single layer of a 3D texture
(since some generations lack the minimum array element control for 3D
textures bound to the sampler unit), and multiple layers of a 1D array
texture (since binding it as an actual 1D array texture would require
state-dependent recompiles because the same shader couldn't
simultaneously work for 1D and 2D array textures due to the different
texel fetch coordinate ordering).
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit does three different things in a single pass in order to
keep the amount of churn low: Remove the for_gather boolean argument
which was unused, pass the isl_view argument by value rather than by
reference since I'll have to modify it from within the function, and
add a target argument to allow callers to bind textures using a target
other than the original. The prototype of the function now looks
like:
void brw_emit_surface_state(struct brw_context *brw,
struct intel_mipmap_tree *mt,
GLenum target, struct isl_view view,
uint32_t mocs, uint32_t *surf_offset, int surf_index,
unsigned read_domains, unsigned write_domains);
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
structures.
This surface state control has been supported by all hardware
generations since G45.
Reviewed-by: Kenneth Graunke <[email protected]>
|