summaryrefslogtreecommitdiffstats
path: root/src/mesa
Commit message (Collapse)AuthorAgeFilesLines
* i965: Enable fast clears for multi-lodBen Widawsky2016-11-251-15/+0
| | | | | | | | | | | | | | | | | | | | On SKL (also fast clear is used for level 0, layer 0): Manhattan 3.0: 3.88434% +/- 0.814659% Manhattan 3.0 off: 3.25542% +/- 0.101149% Trex: 3.43501% +/- 0.31223% Trex off: 4.13781% +/- 0.0993569% ON BDW: Manhattan 3.0: 1.37079% +/- 0.571208% Manhattan 3.0 off: 1.74029% +/- 0.267499% v2 (Ben, Matt): Fix rebase error by removing the perf warning v3 (Topi): Rebased on top of revised eligibility logic Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Allow single-sampled miptree to be resolved and sharedTopi Pohjolainen2016-11-251-1/+1
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/gen8: Relax asserts prohibiting arrayed/mipmapped fast clearsTopi Pohjolainen2016-11-253-14/+18
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Use ISL for CCS layoutsTopi Pohjolainen2016-11-252-104/+38
| | | | | | | | | | One can now also delete intel_get_non_msrt_mcs_alignment(). v2 (Jason): Do not leak aux buf but allocate only after getting ISL surfaces. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Resolve non-compressed fast clears prior layered renderingTopi Pohjolainen2016-11-251-0/+13
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Restrict fast color clear on first slice onlyTopi Pohjolainen2016-11-251-0/+8
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Track fast color clear state in level/layer granularityTopi Pohjolainen2016-11-253-30/+68
| | | | | | | | | | | | | | | | Note that RESOLVED is not tracked in the map explicitly. Absence of item implicitly means RESOLVED state. v2: Added intel_resolve_map_clear() into intel_miptree_release() v3 (Jason): Properly handle the assumption of resolve map not containing any items with state RESOLVED. Removed unnecessary intel_miptree_set_fast_clear_state() call in brw_blorp_resolve_color() preventing intel_miptree_set_fast_clear_state() from asserting against RESOLVED. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Move fast clear state enumeration into resolve mapTopi Pohjolainen2016-11-253-65/+68
| | | | | | | | Status is still tracked per miptree. Next patch will switch to resolve map per slice/level. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Refactor check if color resolve is neededTopi Pohjolainen2016-11-251-15/+28
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Add plumbing for fast clear layer/level detailsTopi Pohjolainen2016-11-252-19/+34
| | | | | | | | | | | | | | | | | | | | | | Until now fast clear has been supported only for non-layered and non-mipmapped buffers. However, from gen8 onwards there is hardware support also for layered/mipmapped. Once this is enabled, fast clear operations target specific layer/level and call for the state to be tracked in the same granularity. This is the first step providing the details from callers to the state tracking. Patch introduces new interface for reading and writing the state hiding the upcoming bookkeeping changes in the call sites. There is bunch of sanity checks added that will be relaxed per hardware generation later on when the actual functionality is enabled. v2: Rebased on top current master setting the state in blorp_surf_for_miptree(). v3: Replace open-coded resolved check in surface state emission with intel_miptree_has_color_unresolved(). Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Add interface for checking multiple slices if any is unresolvedTopi Pohjolainen2016-11-252-0/+13
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Provide slice details to renderbuffer fast clear state trackerTopi Pohjolainen2016-11-254-16/+68
| | | | | | | | This patch also introduces getter and setter for fast clear state preparing for tracking the state per slice. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Split per miptree and per slice/level fast clear bitsTopi Pohjolainen2016-11-253-16/+16
| | | | | | | | | | | | | | | | Currently the status bits for fast clear include the flag telling if non-multisampled mcs buffer should be used at all. Once the state tracking is changed to follow individual levels/layers one still needs to have the mcs enabling information in the miptree. Therefore simply split it out to its own boolean. Possible follow-up work is to combine disable_aux_buffers and no_ccs into single enum. v2 (Jason): Changed no_msrt_mcs to no_ccs and updated comment Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Provide slice details to color resolverTopi Pohjolainen2016-11-256-18/+51
| | | | | | | | v2: Make intel_miptree_resolve_color() take start layer and layer count. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Add new interface for full color resolvesTopi Pohjolainen2016-11-258-11/+23
| | | | | | | | | Upcoming patches will introduce fast clear in level/layer granularity like the driver does already for depth/hiz. This patch introduces equivalent full resolve option. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Refactor lossless compression state trackingTopi Pohjolainen2016-11-254-15/+7
| | | | | | | | | | | | | | Essentially this moves fast clear state update away from surface state setup into brw_postdraw_set_buffers_need_resolve() that gets called just after draw submission. Calling intel_miptree_used_for_rendering() can be drop for gen6 and earlier as it is no-op. v2: Rebased on top current master setting the state in blorp_surf_for_miptree(). Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* mesa/getteximage: Add validation of target to glGetTextureImageEduardo Lima Mitev2016-11-241-0/+5
| | | | | | | | | | | | | | | | | | | | There is an specific list of texture targets that can be used with glGetTextureImage. From OpenGL 4.5 spec, section '8.11 Texture Queries', page 234 of the PDF: "An INVALID_ENUM error is generated if the effective target is not one of TEXTURE_1D , TEXTURE_2D , TEXTURE_3D , TEXTURE_1D_- ARRAY , TEXTURE_2D_ARRAY , TEXTURE_CUBE_MAP_ARRAY , TEXTURE_- RECTANGLE , one of the targets from table 8.19 (for GetTexImage and GetnTexImage only), or TEXTURE_CUBE_MAP (for GetTextureImage only)." We are currently not validating the target for glGetTextureImage. As an example, calling this function on a texture with target GL_TEXTURE_2D_MULTISAMPLE should return INVALID_ENUM, but instead it hits an assertion down the road in the i965 driver. Reviewed-by: Nicolai Hähnle <[email protected]>
* main/texobj: Check that texture id > 0 before looking it up in hash-tableEduardo Lima Mitev2016-11-241-2/+3
| | | | | | | | | | | | | _mesa_lookup_texture_err() is not currently checking that the texture-id can be zero, but _mesa_HashLookup() doesn't expect the key to be zero, and will fail an assertion. Considering that _mesa_lookup_texture_err() is called from _mesa_GetTextureImage and _mesa_GetTextureSubImage with user provided arguments, we must validate the texture-id before looking it up in the hash-table. Reviewed-by: Nicolai Hähnle <[email protected]>
* i965: Always reserve clip distance VUE slots in SSO mode.Kenneth Graunke2016-11-231-0/+13
| | | | | | | | | | | | | | This fixes rendering in Dolphin on Vulkan since we enabled clip distances. (Dolphin on GL has a similar bug because the linker fails to eliminate unused clip distance built-in arrays, but it isn't using SSO...so that needs more fixing.) Also fixes a Piglit test: spec/glsl-1.50/execution/geometry.clip-distance-vs-gs-out-sso Signed-off-by: Kenneth Graunke <[email protected]> Tested-by: Emmanuel Gil Peyrot <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Use 3DSTATE_CLIP's User Clip Distance Enable bitmask on Gen8+.Kenneth Graunke2016-11-235-18/+17
| | | | | | | | | | | | | | | | | | | | | | | | Gen6-7.5 specify the user clip distance enable bitmask in 3DSTATE_CLIP. Gen8+ normally uses the new internal signalling mechanism to select the one specified in the last enabled shader stage (3DSTATE_VS, DS, or GS). This is a pretty good fit for Vulkan, or even newer GL, where the bitmask comes entirely from the shader. But with glClipPlane(), this is dynamic state, and we have to listen to _NEW_TRASNFORM. Clip plane enables are the only reason the VS/DS/GS atoms need to listen to _NEW_TRANSFORM. 3DSTATE_CLIP already has to listen to it in order to support ARB_clip_control settings. Setting the "Use the 3DSTATE_CLIP bitmask" force enable bit allows us to drop _NEW_TRANSFORM from all the shader stage atoms, so we can re-emit them less often. Improves performance of OglBatch7 (version 6) by 2.70773% +/- 0.491257% (n = 38) at 1024x768 on Cherryview. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/gen7: Only advertise 4 samples for RGBA32F on GLESJordan Justen2016-11-231-3/+19
| | | | | | | | | | | We can't render to 8x MSAA if the width is greater than 64 bits. (see brw_render_target_supported) Fixes ES31-CTS.sample_variables.mask.rgba32f.samples_8.mask_* Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965: Restructure fast clear eligibility decisionBen Widawsky2016-11-231-14/+37
| | | | | | | | | v2 (Jason): - Use PRM citation for SKL now that it is available - Also return false for gen < 8 mipmapped/arrayed Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Set initial msaa fast clear status explicitlyTopi Pohjolainen2016-11-231-1/+1
| | | | | | | | | | | instead of in intel_miptree_init_mcs(). For lossless compression the status is immediately overwritten in intel_miptree_alloc_non_msrt_mcs() while the status for non-compressed non-msaa miptrees is explicitly set in do_blorp_clear(). Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Declare read-only input to level/layer check constTopi Pohjolainen2016-11-231-1/+1
| | | | | Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fbo: Prepare layer multiplier for render buffer compressionTopi Pohjolainen2016-11-231-1/+1
| | | | | | | | This path is not yet taken for fast cleared or compressed buffers but later patches will enable it. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Add multi-slice getter for resolve mapsTopi Pohjolainen2016-11-232-7/+27
| | | | | | | This is useful when checking if any slice is in unresolved state. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/meta: Split conversion of color and setting itTopi Pohjolainen2016-11-233-19/+36
| | | | | | | | | And fix a mangled comment while at it. v2 (Ben): Return the converted color. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/miptree: Don't shrink textures when augmenting for more levelsTopi Pohjolainen2016-11-231-4/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This was detected when examining CCS_E failures with piglit test: "fbo-generatemipmap-formats". Test creates a 2D texture with dimensions 293x277. It manually loops over all levels and calls glTexImage2D(). Level one triggers creation of full miptree: intel_alloc_texture_image_buffer() realizes that there is only one level in the miptree and calls intel_miptree_create_for_teximage() to re-allocate the miptree with all 9 levels. However, the end result is a miptree with level zero dimensions of 292x276. Related, and possibly calling for treatment of its own is mip-map generation: After calling glTexImage2D() against every level test continues by replacing content for levels one to eight with data derived from level zero by calling glGenerateMipmapEXT(). This results into the miptree being allocated anew for every level: Mip-map generation goes thru meta which ends up validating the texture (brw_validate_textures()->intel_finalize_mipmap_tree()-> intel_miptree_match_image()) where one finds texture with base level size 292:276. This results into new miptree being created for the npot size 293:277. Only here intel_finalize_mipmap_tree() is asked for only one level, and therefore such is created. Generation for level one in turn finds right base level size but only one level when two is needed. And the same goes on for all eight levels. This patch prevents the shrink maintaining the NPOT size of 293x277. Signed-off-by: Topi Pohjolainen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* main/getteximage: Use the height argument to calculate memcpy copy sizeEduardo Lima Mitev2016-11-231-1/+1
| | | | | | | | | | | | | | | | | | In get_tex_memcpy, when copying texture data directly from source to destination (when row strides match for both src and dst), the copy size is currently calculated using the full texture height instead of the sub-region height parameter that was passed. This can cause a read past the end of the mapped buffer when y-offset is greater than zero, leading to a segfault. Fixes CTS test (from crash to pass): * GL45-CTS/get_texture_sub_image/functional_test v2: (Jason) Use the passed 'height' instead of copying til the end of the buffer (tex-height - yoffset). Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Implement load_layer_id for fragment shadersJason Ekstrand2016-11-221-0/+5
| | | | Reviewed-by: Jordan Justen <[email protected]>
* compiler: Add the rest of the subpassInput typesJason Ekstrand2016-11-221-0/+1
| | | | | | | There are actually 6 of them according to the GL_KHR_vulkan_glsl spec. Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* mesa: use special checksums for unset checksums and fixed-func shadersMarek Olšák2016-11-222-0/+6
| | | | | | for debugging Reviewed-by: Timothy Arceri <[email protected]>
* glsl: add gl_linked_shader::SourceChecksumMarek Olšák2016-11-223-1/+17
| | | | | | | | for debugging v2: wrap all checksums in #ifdef DEBUG Reviewed-by: Timothy Arceri <[email protected]>
* mesa: use util_hash_crc32 instead of _mesa_str_checksumMarek Olšák2016-11-223-26/+2
| | | | Reviewed-by: Timothy Arceri <[email protected]>
* i965/compiler: Disable trig workarounds on KBL+Jason Ekstrand2016-11-222-4/+8
| | | | | | | | | | The precision of our trig instructions appears to have been fixed on Kaby Lake. Neither Ben nor I can find any documentation for this. However, the dEQP precision tests now pass with INTEL_PRECISE_TRIG=0 where they fail on Sky Lake. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* mesa/glsl: remove unused uses_builtin_functions fieldTimothy Arceri2016-11-232-2/+0
| | | | | | This has been unused since 943b69cddd Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* i965: Use NIR-based clip/cull lowering for OpenGL as well.Kenneth Graunke2016-11-222-1/+2
| | | | | | | | | | | | The old approach works fine, and this approach isn't necessarily better. But it at least has the advantage that Vulkan and GL use the same approach. I originally wrote it to gain additional testing for the new paths. shader-db statistics show 0 instruction count changes. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/vec4: Handle component qualifiers on non-generic varyings.Kenneth Graunke2016-11-225-73/+53
| | | | | | | | | | | | | ARB_enhanced_layouts only requires component qualifier support for generic varyings, so this is all the vec4 backend knew how to handle. This patch extends the backend to handle it for all varyings, so we can use store_output intrinsics with a component set for things like clip/cull distances. We may want to use that for other VUE header fields in the future as well. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* i965/fs: Handle compact outputs.Kenneth Graunke2016-11-221-1/+3
| | | | | | | We need to calculate the number of vec4 slots correctly. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/hsw: Set integer mode in sampling state for stencil texturingJordan Justen2016-11-212-18/+9
| | | | | | | | | | | | | | | Fixes: ES31-CTS.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_pot ES31-CTS.functional.texture.border_clamp.formats.depth24_stencil8_sample_stencil.nearest_size_npot ES31-CTS.functional.texture.border_clamp.formats.depth32f_stencil8_sample_stencil.nearest_size_pot ES31-CTS.functional.texture.border_clamp.formats.depth32f_stencil8_sample_stencil.nearest_size_npot ES31-CTS.functional.texture.border_clamp.unused_channels.depth24_stencil8_sample_stencil ES31-CTS.functional.texture.border_clamp.unused_channels.depth32f_stencil8_sample_stencil Cc: "13.0" <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* mesa: fold always true conditionalEmil Velikov2016-11-211-4/+2
| | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* mesa: drop unneeded assertEmil Velikov2016-11-211-1/+0
| | | | | | | | As seen a couple of lines above - there's no way for the assert to trigger. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* tnl: remove unneeded #include "util/simple_list.h"Brian Paul2016-11-202-2/+0
| | | | Reviewed-by: Vinson Lee <[email protected]>
* radeon: remove unneeded #include "util/simple_list.h"Brian Paul2016-11-205-5/+0
| | | | | | Compile tested only. Reviewed-by: Vinson Lee <[email protected]>
* r200: remove unneeded #include "util/simple_list.h"Brian Paul2016-11-205-5/+1
| | | | | | | And include "util/simple_list.h" where it is needed in r200_state.c Compile tested only. Reviewed-by: Vinson Lee <[email protected]>
* i915: remove unneeded #include "util/simple_list.h"Brian Paul2016-11-202-2/+0
| | | | | | Compile tested only. Reviewed-by: Vinson Lee <[email protected]>
* mesa: remove unneeded #includes in errors.cBrian Paul2016-11-201-6/+0
| | | | Reviewed-by: Vinson Lee <[email protected]>
* mesa: remove trailing whitespace in errors.cBrian Paul2016-11-201-6/+6
| | | | Reviewed-by: Vinson Lee <[email protected]>
* i965: Store a clip_distance_mask field similar to cull_distance_mask.Kenneth Graunke2016-11-194-0/+7
| | | | | | | This isn't useful for legacy GL, but will be used in Vulkan. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Use shader_info for brw_vue_prog_data::cull_distance_mask.Kenneth Graunke2016-11-196-12/+12
| | | | | | | | This also allows us to move it from a GL specific location to a part of the compiler shared by both GL and Vulkan. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>