| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On SKL (also fast clear is used for level 0, layer 0):
Manhattan 3.0: 3.88434% +/- 0.814659%
Manhattan 3.0 off: 3.25542% +/- 0.101149%
Trex: 3.43501% +/- 0.31223%
Trex off: 4.13781% +/- 0.0993569%
ON BDW:
Manhattan 3.0: 1.37079% +/- 0.571208%
Manhattan 3.0 off: 1.74029% +/- 0.267499%
v2 (Ben, Matt): Fix rebase error by removing the perf warning
v3 (Topi): Rebased on top of revised eligibility logic
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
One can now also delete intel_get_non_msrt_mcs_alignment().
v2 (Jason): Do not leak aux buf but allocate only after getting
ISL surfaces.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Note that RESOLVED is not tracked in the map explicitly. Absence
of item implicitly means RESOLVED state.
v2: Added intel_resolve_map_clear() into intel_miptree_release()
v3 (Jason): Properly handle the assumption of resolve map not
containing any items with state RESOLVED. Removed
unnecessary intel_miptree_set_fast_clear_state() call
in brw_blorp_resolve_color() preventing
intel_miptree_set_fast_clear_state() from asserting
against RESOLVED.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
Status is still tracked per miptree. Next patch will switch to
resolve map per slice/level.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Until now fast clear has been supported only for non-layered and
non-mipmapped buffers. However, from gen8 onwards there is hardware
support also for layered/mipmapped. Once this is enabled, fast clear
operations target specific layer/level and call for the state to be
tracked in the same granularity. This is the first step providing
the details from callers to the state tracking.
Patch introduces new interface for reading and writing the state
hiding the upcoming bookkeeping changes in the call sites. There is
bunch of sanity checks added that will be relaxed per hardware
generation later on when the actual functionality is enabled.
v2: Rebased on top current master setting the state in
blorp_surf_for_miptree().
v3: Replace open-coded resolved check in surface state emission
with intel_miptree_has_color_unresolved().
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
This patch also introduces getter and setter for fast clear state
preparing for tracking the state per slice.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently the status bits for fast clear include the flag telling
if non-multisampled mcs buffer should be used at all. Once the
state tracking is changed to follow individual levels/layers one
still needs to have the mcs enabling information in the miptree.
Therefore simply split it out to its own boolean.
Possible follow-up work is to combine disable_aux_buffers and
no_ccs into single enum.
v2 (Jason): Changed no_msrt_mcs to no_ccs and updated comment
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: Make intel_miptree_resolve_color() take start layer and
layer count.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Upcoming patches will introduce fast clear in level/layer
granularity like the driver does already for depth/hiz. This patch
introduces equivalent full resolve option.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Essentially this moves fast clear state update away from surface
state setup into brw_postdraw_set_buffers_need_resolve() that gets
called just after draw submission.
Calling intel_miptree_used_for_rendering() can be drop for gen6
and earlier as it is no-op.
v2: Rebased on top current master setting the state in
blorp_surf_for_miptree().
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is an specific list of texture targets that can be used with
glGetTextureImage. From OpenGL 4.5 spec, section '8.11 Texture Queries',
page 234 of the PDF:
"An INVALID_ENUM error is generated if the effective target is
not one of TEXTURE_1D , TEXTURE_2D , TEXTURE_3D , TEXTURE_1D_-
ARRAY , TEXTURE_2D_ARRAY , TEXTURE_CUBE_MAP_ARRAY , TEXTURE_-
RECTANGLE , one of the targets from table 8.19 (for GetTexImage
and GetnTexImage only), or TEXTURE_CUBE_MAP (for GetTextureImage
only)."
We are currently not validating the target for glGetTextureImage. As
an example, calling this function on a texture with target
GL_TEXTURE_2D_MULTISAMPLE should return INVALID_ENUM, but instead it
hits an assertion down the road in the i965 driver.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
_mesa_lookup_texture_err() is not currently checking that the
texture-id can be zero, but _mesa_HashLookup() doesn't expect the key
to be zero, and will fail an assertion.
Considering that _mesa_lookup_texture_err() is called from
_mesa_GetTextureImage and _mesa_GetTextureSubImage with user provided
arguments, we must validate the texture-id before looking it up in the
hash-table.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes rendering in Dolphin on Vulkan since we enabled clip
distances. (Dolphin on GL has a similar bug because the linker
fails to eliminate unused clip distance built-in arrays, but it
isn't using SSO...so that needs more fixing.)
Also fixes a Piglit test:
spec/glsl-1.50/execution/geometry.clip-distance-vs-gs-out-sso
Signed-off-by: Kenneth Graunke <[email protected]>
Tested-by: Emmanuel Gil Peyrot <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Gen6-7.5 specify the user clip distance enable bitmask in 3DSTATE_CLIP.
Gen8+ normally uses the new internal signalling mechanism to select the
one specified in the last enabled shader stage (3DSTATE_VS, DS, or GS).
This is a pretty good fit for Vulkan, or even newer GL, where the
bitmask comes entirely from the shader. But with glClipPlane(),
this is dynamic state, and we have to listen to _NEW_TRASNFORM.
Clip plane enables are the only reason the VS/DS/GS atoms need to
listen to _NEW_TRANSFORM. 3DSTATE_CLIP already has to listen to it
in order to support ARB_clip_control settings.
Setting the "Use the 3DSTATE_CLIP bitmask" force enable bit allows
us to drop _NEW_TRANSFORM from all the shader stage atoms, so we can
re-emit them less often.
Improves performance of OglBatch7 (version 6) by 2.70773% +/- 0.491257%
(n = 38) at 1024x768 on Cherryview.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We can't render to 8x MSAA if the width is greater than 64 bits. (see
brw_render_target_supported)
Fixes ES31-CTS.sample_variables.mask.rgba32f.samples_8.mask_*
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2 (Jason):
- Use PRM citation for SKL now that it is available
- Also return false for gen < 8 mipmapped/arrayed
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
instead of in intel_miptree_init_mcs(). For lossless compression
the status is immediately overwritten in
intel_miptree_alloc_non_msrt_mcs() while the status for
non-compressed non-msaa miptrees is explicitly set in
do_blorp_clear().
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
This path is not yet taken for fast cleared or compressed buffers
but later patches will enable it.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
This is useful when checking if any slice is in unresolved state.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
And fix a mangled comment while at it.
v2 (Ben): Return the converted color.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was detected when examining CCS_E failures with piglit test:
"fbo-generatemipmap-formats". Test creates a 2D texture with
dimensions 293x277. It manually loops over all levels and calls
glTexImage2D(). Level one triggers creation of full miptree:
intel_alloc_texture_image_buffer() realizes that there is only one
level in the miptree and calls intel_miptree_create_for_teximage()
to re-allocate the miptree with all 9 levels. However, the end result
is a miptree with level zero dimensions of 292x276.
Related, and possibly calling for treatment of its own is mip-map
generation:
After calling glTexImage2D() against every level test continues by
replacing content for levels one to eight with data derived from level
zero by calling glGenerateMipmapEXT(). This results into the miptree
being allocated anew for every level:
Mip-map generation goes thru meta which ends up validating the texture
(brw_validate_textures()->intel_finalize_mipmap_tree()->
intel_miptree_match_image()) where one finds texture with base level
size 292:276. This results into new miptree being created for the npot
size 293:277. Only here intel_finalize_mipmap_tree() is asked for only
one level, and therefore such is created. Generation for level one in
turn finds right base level size but only one level when two is needed.
And the same goes on for all eight levels.
This patch prevents the shrink maintaining the NPOT size of 293x277.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In get_tex_memcpy, when copying texture data directly from source
to destination (when row strides match for both src and dst), the
copy size is currently calculated using the full texture height
instead of the sub-region height parameter that was passed.
This can cause a read past the end of the mapped buffer when y-offset
is greater than zero, leading to a segfault.
Fixes CTS test (from crash to pass):
* GL45-CTS/get_texture_sub_image/functional_test
v2: (Jason) Use the passed 'height' instead of copying til the
end of the buffer (tex-height - yoffset).
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
| |
There are actually 6 of them according to the GL_KHR_vulkan_glsl spec.
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
| |
for debugging
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
| |
for debugging
v2: wrap all checksums in #ifdef DEBUG
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The precision of our trig instructions appears to have been fixed on Kaby
Lake. Neither Ben nor I can find any documentation for this. However, the
dEQP precision tests now pass with INTEL_PRECISE_TRIG=0 where they fail on
Sky Lake.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
This has been unused since 943b69cddd
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The old approach works fine, and this approach isn't necessarily better.
But it at least has the advantage that Vulkan and GL use the same
approach. I originally wrote it to gain additional testing for the
new paths.
shader-db statistics show 0 instruction count changes.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ARB_enhanced_layouts only requires component qualifier support for
generic varyings, so this is all the vec4 backend knew how to handle.
This patch extends the backend to handle it for all varyings, so we
can use store_output intrinsics with a component set for things like
clip/cull distances. We may want to use that for other VUE header
fields in the future as well.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
We need to calculate the number of vec4 slots correctly.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes:
ES31-CTS.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_pot
ES31-CTS.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_npot
ES31-CTS.functional.texture.border_clamp.formats.depth32f_stencil8_sample_stencil.nearest_size_pot
ES31-CTS.functional.texture.border_clamp.formats.depth32f_stencil8_sample_stencil.nearest_size_npot
ES31-CTS.functional.texture.border_clamp.unused_channels.depth24_stencil8_sample_stencil
ES31-CTS.functional.texture.border_clamp.unused_channels.depth32f_stencil8_sample_stencil
Cc: "13.0" <[email protected]>
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
As seen a couple of lines above - there's no way for the assert to
trigger.
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
| |
Reviewed-by: Vinson Lee <[email protected]>
|
|
|
|
|
|
| |
Compile tested only.
Reviewed-by: Vinson Lee <[email protected]>
|
|
|
|
|
|
|
| |
And include "util/simple_list.h" where it is needed in r200_state.c
Compile tested only.
Reviewed-by: Vinson Lee <[email protected]>
|
|
|
|
|
|
| |
Compile tested only.
Reviewed-by: Vinson Lee <[email protected]>
|
|
|
|
| |
Reviewed-by: Vinson Lee <[email protected]>
|
|
|
|
| |
Reviewed-by: Vinson Lee <[email protected]>
|
|
|
|
|
|
|
| |
This isn't useful for legacy GL, but will be used in Vulkan.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
This also allows us to move it from a GL specific location to a
part of the compiler shared by both GL and Vulkan.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|