summaryrefslogtreecommitdiffstats
path: root/src/mesa
Commit message (Collapse)AuthorAgeFilesLines
* compiler/lower_64bit_packing: rename the pass to be more genericIago Toral Quiroga2018-05-031-1/+1
| | | | | | It can do 32-bit packing too now. Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Always try to create a logical contextChris Wilson2018-05-031-15/+14
| | | | | | | | | | | | | | | | | Always enable use of HW logical contexts to preserve GPU state between batches when the kernel supports such constructs, continuing to enforce the required support for gen6+. At runtime, this effectively removes the BRW_NEW_CONTEXT flag (and the upload of invariant state) from the start of every batch for any kernel supporting contexts. So long as the older atoms are correctly listening to the right flag (NEW_CONTEXT rather than NEW_BATCH) this should eliminate a few redundant state uploads for the older platforms. No piglits were harmed on ctg and ilk, both with and without logical contexts. Reviewed-by: Kenneth Graunke <[email protected]>
* intel: decoder: limit to the number decoded lines from VBOLionel Landwerlin2018-05-021-0/+1
| | | | | | | | By default we set no limit, but the debug batch decoder in i965 sets it to 100. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Reuse batch decoder infrastructure rather than open coding it.Kenneth Graunke2018-05-023-223/+55
| | | | | | | | | With the new callback, Jason's newer batch decoder infrastructure should be able to do just as well as the old open coded INTEL_DEBUG=bat handling, with much less code. If there are any limitations, we'd like to improve the common code rather than doing one-off hacks here. Reviewed-by: Lionel Landwerlin <[email protected]>
* i965: Allocate shadow batches to explicitly be the BO size.Kenneth Graunke2018-05-021-7/+5
| | | | | | | | | | | | | | This unfortunately makes it malloc/realloc on every new batch, rather than once at startup. But it ensures that the shadow buffer's size will absolutely match the BO size. Otherwise, as we tune BATCH_SZ/STATE_SZ or bufmgr cache bucket sizes, we may get a BO size that's rounded up, and fail to allocate the shadow buffer large enough. This doesn't fix any bugs today, as BATCH_SZ/STATE_SZ are the size of a cache bucket, but it's better to be safe than sorry. Reported-by: James Xiong <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* intel: activate the gl_BaseVertex loweringAntia Puentes2018-05-023-11/+3
| | | | | | | | | | | | | | | | | | | | | | | | | Surplus code related to the basevertex is removed. The Vertex Elements contain now: * VE 1: <firstvertex, BaseInstance, VertexID, InstanceID> * VE 2: <DrawID, is_indexed_draw, 0, 0> Also fixes unreachable message. Fixes OpenGL CTS tests: * KHR-GL46.shader_draw_parameters_tests.ShaderDrawArraysInstancedParameters * KHR-GL46.shader_draw_parameters_tests.ShaderMultiDrawArraysParameters * KHR-GL46.shader_draw_parameters_tests.MultiDrawArraysIndirectCountParameters * KHR-GL46.shader_draw_parameters_tests.ShaderDrawArraysParameters * KHR-GL46.shader_draw_parameters_tests.ShaderMultiDrawArraysIndirectParameters Fixes Piglit tests: * arb_shader_draw_parameters-drawid-indirect baseinstance * arb_shader_draw_parameters-basevertex Reviewed-by: Jason Ekstrand <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102678
* intel: emit is_indexed_draw in the same VE than gl_DrawIDAntia Puentes2018-05-024-40/+59
| | | | | | | | | | | The Vertex Elements are now: * VE 1: <BaseVertex/firstvertex, BaseInstance, VertexID, InstanceID> * VE 2: <DrawID, is-indexed-draw, 0, 0> VE1 is it kept as it was before, VE2 additionally contains the new system value. Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Drop unused gen5 sampler default color struct.Kenneth Graunke2018-05-011-9/+0
| | | | Trivial.
* i965: Make brw_vs_outputs_written static.Kenneth Graunke2018-05-012-5/+1
| | | | Drop a prototype. Trivial.
* i965/tex_image: Avoid the ASTC LDR workaround on gen9lpNanley Chery2018-05-011-1/+1
| | | | | | | | | | | Both the internal documentation and the results of testing this in the CI suggest that this is unnecessary. Add the fixes tag because this reduces an internal benchmark's startup time by about 17 seconds (reported by Eero). Fixes: 710b1d2e665 "i965/tex_image: Flush certain subnormal ASTC channel values" Tested-by: Eero Tamminen <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* st/mesa: add support for nvidia conservative rasterization extensionsRhys Perry2018-04-303-0/+51
| | | | | Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* mesa: add support for nvidia conservative rasterization extensionsRhys Perry2018-04-3015-11/+478
| | | | | | | | Although the specs are written against compatibility GL 4.3 and allows core profile and GLES2+, it is exposed for GL 1.0+ and GLES1 and GLES2+. Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* nir: move GL specific passes to src/compiler/glslTimothy Arceri2018-05-013-5/+8
| | | | | | | With this we should have no passes in src/compiler/nir with any dependencies on headers from core GL Mesa. Reviewed-by: Alejandro Piñeiro <[email protected]>
* i965/tiled_memcpy: ytiled_to_linear a cache line at a timeScott D Phillips2018-04-301-6/+66
| | | | | | | | | | | | Similar to the transformation applied to linear_to_ytiled, also align each readback from the ytiled source to a cacheline (i.e. transfer a whole cacheline from the source before moving on to the next column). This will allow us to utilize movntqda (_mm_stream_si128) in a subsequent patch to obtain near WB readback performance when accessing the uncached ytiled memory, an order of magnitude improvement. Reviewed-by: Chris Wilson <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Record mipmap resolver for unmappingChris Wilson2018-04-302-17/+22
| | | | | | | | | | | | When mapping a region of the mipmap_tree, record which complementary method to use to unmap it afterwards. By doing so we can avoid duplicating the decision tree used when mapping and thereby eliminate trivial errors that can be introduced if the two if-chains become out of sync. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Scott D Phillips <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move unmap_depthstencil before map_depthstencilChris Wilson2018-04-301-57/+57
| | | | | | | Reorder code to avoid a forward declaration in the next patch. Signed-off-by: Chris Wilson <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* i965: Move unmap_etc before map_etcChris Wilson2018-04-301-21/+21
| | | | | | | Reorder code to avoid a forward declaration in the next patch. Signed-off-by: Chris Wilson <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* i965: Move unmap_s8 before map_s8Chris Wilson2018-04-301-30/+30
| | | | | | | Reorder code to avoid a forward declaration in the next patch. Signed-off-by: Chris Wilson <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* i965: Move unmap_movntdqa before map_movntdqaChris Wilson2018-04-301-12/+12
| | | | | | | Reorder code to avoid a forward declaration in the next patch. Signed-off-by: Chris Wilson <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* i965: Move unmap_blit before map_blitChris Wilson2018-04-301-22/+22
| | | | | | | Reorder code to avoid a forward declaration in the next patch. Signed-off-by: Chris Wilson <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* i965: Move unmap_gtt before map_gttChris Wilson2018-04-301-6/+6
| | | | | | | | Reorder code to avoid a forward declaration in the next patch. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* i965: Don't stomp initial kflags for program cache.Kenneth Graunke2018-04-301-2/+2
| | | | | | | | | We want to flag EXEC_OBJECT_CAPTURE, but we ought to preserve any existing kflags. Today, there are none (as the program cache doesn't support 48-bit addressing), but once we start using softpin, we'll need to preserve EXEC_OBJECT_PINNED. Reviewed-by: Lionel Landwerlin <[email protected]>
* i965: Let batchbuffers be placed anywhere in the 48-bit address space.Kenneth Graunke2018-04-301-1/+1
| | | | | | | | | We were trying to mark batch buffers with EXEC_OBJECT_CAPTURE, and accidentally stomped EXEC_OBJECT_SUPPORTS_48B_ADDRESS in the process. There's no reason to restrict batch buffers to the lower 4GB. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel: fix check for 48b ppgtt supportScott D Phillips2018-04-301-13/+15
| | | | | | | | | | | | | | | | | The previous logic of the supports_48b_addresses wasn't actually checking if i915.ko was running with full_48bit_ppgtt. The ENOENT it was checking for was actually coming from the invalid context id provided in the test execbuffer. There is no path in the kernel driver where the presence of EXEC_OBJECT_SUPPORTS_48B_ADDRESS leads to an error. Instead, check the default context's GTT_SIZE param for a value greater than 4 GiB v2 (Ken): Fix in i965 as well. v3 Check GTT_SIZE instead of HAS_ALIASING_PPGTT (Chris Wilson) Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: add TBO support for GL_EXT_texture_norm16Tapani Pälli2018-04-271-3/+3
| | | | | | | | Earlier plumbing missed interaction with texture buffer objects. Fixes: 7f467d4f73 "mesa: GL_EXT_texture_norm16 extension plumbing" Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: drop the buffer mode param from the DrawBuffer driver functionTimothy Arceri2018-04-278-11/+10
| | | | | | No drivers used it. Reviewed-by: Brian Paul <[email protected]>
* st: Choose a 2101010 format for GL_RGB/GL_RGBA with a 2_10_10_10 type.Eric Anholt2018-04-261-0/+13
| | | | | | | | | | | | GLES's GL_EXT_texture_type_2_10_10_10_REV allows uploading this type to an unsized internalformat, and it should be non-color-renderable. fbobject.c's implementation of the check for color-renderable is checks that the texture has a 2101010 mesa format, so make sure that we have chosen a 2101010 format so that check can do what it meant to. Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgb on vc5. Reviewed-by: Marek Olšák <[email protected]>
* st/mesa: fix missing setting of _ElementSize in new_draw_rasterpos_stageCharmaine Lee2018-04-261-0/+5
| | | | | | | | | With this patch, _ElementSize is initialized along with the rest of the vertex array attributes in new_draw_rasterpos_stage(). This fixes a crash in st_pipe_vertex_format() when running topogun-1.06-orc-84k-resize trace file with VMware svga driver. Reviewed-by: Brian Paul <[email protected]>
* radeon: Drop broken front_buffer_reading/drawing optimizationIan Romanick2018-04-263-46/+18
| | | | | Signed-off-by: Ian Romanick <[email protected]> Acked-by: Timothy Arceri <[email protected]>
* radeon: Use _mesa_is_front_buffer_drawingIan Romanick2018-04-264-25/+5
| | | | | Signed-off-by: Ian Romanick <[email protected]> Acked-by: Timothy Arceri <[email protected]>
* mesa: GL_EXT_texture_norm16 extension plumbingTapani Pälli2018-04-256-9/+78
| | | | | | | | | | | | | | | | | | | | | Patch enables use of short and unsigned short data for texture uploads, rendering and reading of framebuffers within the restrictions specified in GL_EXT_texture_norm16 spec. Patch also enables those 16bit format layout qualifiers listed in GL_NV_image_formats that depend on EXT_texture_norm16. v2: expose extension with dummy_true fix layout qualifier map changes (Ilia Mirkin) v3: use _mesa_has_EXT_texture_norm16, other fixes and cleanup (Ilia Mirkin) v4: fix rest of the issues found Signed-off-by: Tapani Pälli <[email protected]> Acked-by: Ilia Mirkin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: call DrawBufferAllocate driver hook in update_framebuffer for ↵Boyan Ding2018-04-251-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | windows-system FB When draw buffers are changed on a bound framebuffer, DrawBufferAllocate() hook should be called. However, it is missing in update_framebuffer with window-system framebuffer, in which FB's draw buffer state should match context state, potentially resulting in a change. Note: This is needed because gallium delays creating the front buffer, i965 works fine without this change. V2 (Timothy Arceri): - Rebased on merged/simplified DrawBuffer driver function - Move DrawBuffer call outside fb->ColorDrawBuffer[0] != ctx->Color.DrawBuffer[0] check to make piglit pass. v3 (Timothy Arceri): - Call new DrawBuffaerAllocate() driver function. Tested-by: Dieter Nützel <[email protected]> (v2) Reviewed-by: Brian Paul <[email protected]> (v2) Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99116
* st/mesa: add new driver function DrawBufferAllocateTimothy Arceri2018-04-253-7/+11
| | | | | | | | | Unlike some of the classic drivers the st was only using DrawBuffer() to allocated some buffers on-demand. Creating a separate function will allow us to call it from update_framebuffer() in the following patch without regressing some of the older classic drivers. Reviewed-by: Marek Olšák <[email protected]>
* mesa: some C99 tidy ups for framebuffer.cTimothy Arceri2018-04-251-13/+5
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* meson: remove dummy_cppDylan Baker2018-04-241-1/+1
| | | | | | | | | | meson has gotten pretty smart about tracking C and C++ dependencies (internal and external), and using the right linker. This wasn't always the case and we created empty c++ files to force the use of the c++ linker. We don't need that any more. Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* meson: don't build classic mesa tests without dri_driversDylan Baker2018-04-241-1/+1
| | | | | | | | | | Since mesa_classic is build-on-demand the tests will create a demand and add a bunch of extra compilation. Fixes: 43a6e84927e3b1290f6f211f5dfb184dfe5a719e ("meson: build mesa test.") Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/meta_util: Re-enable sRGB-encoded fast-clears on CNLNanley Chery2018-04-241-11/+0
| | | | | | | | | | | | | | | | | | | | | | | | The paths which sample with the clear color are now using a getter which performs the sRGB decode needed to enable this fast clear. This path can be exercised by fast-clearing a texture, then performing an operation which requires sRGB decoding. Test coverage for this feature is provided with the following tests: * Shader texture calls: - spec@ext_texture_srgb@tex-srgb * Shader texelfetch calls: - spec@arb_framebuffer_srgb@fbo-fast-clear - spec@arb_framebuffer_srgb@msaa-fast-clear * Blending: - spec@arb_framebuffer_srgb@arb_framebuffer_srgb-fast-clear-blend * Blitting: - spec@arb_framebuffer_srgb@blit texture srgb msaa enabled clear Reviewed-by: Jason Ekstrand <[email protected]>
* i965/miptree: Extend the sRGB-blending WA to future platformsNanley Chery2018-04-241-2/+2
| | | | | | The blending issue seems to be present on CNL as well. Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Add and use a getter for the clear colorNanley Chery2018-04-244-11/+51
| | | | | | | | | | | | | | | It returns both the inline clear color and a clear address which points to the indirect clear color buffer (or NULL if unused/non-existent). This getter allows CNL to sample from fast-cleared sRGB textures correctly by doing the needed sRGB-decode on the clear color (inline) and making the indirect clear color buffer unused. v2 (Rafael): * Have a more detailed commit message. * Add a comment on the sRGB conversion process. Reviewed-by: Rafael Antognolli <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/wm_surface_state: Use the clear address if clear_bo is non-NULLNanley Chery2018-04-241-11/+6
| | | | | | | | | | We want to add and use a getter that turns off the indirect path by returning zero for the clear color bo and offset. v2: Fix usage of "clear address" in commit message (Jason). Reviewed-by: Rafael Antognolli <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Add and use a single miptree aux_buf fieldNanley Chery2018-04-2410-114/+96
| | | | | | | | | | | | | We want to add and use a function that accesses the auxiliary buffer's clear_color_bo and doesn't care if it has an MCS or HiZ buffer specifically. v2 (Jason Ekstrand): * Drop intel_miptree_get_aux_buffer(). * Mention CCS in the aux_buf field. Reviewed-by: Rafael Antognolli <[email protected]> (v1) Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Add and use a getter for the miptree aux bufferNanley Chery2018-04-244-41/+25
| | | | | | | | | Make the next patch easier to read by eliminating most of the would-be duplicate field accesses now. v2: Update the HiZ comment instead of deleting it (Rafael). Reviewed-by: Rafael Antognolli <[email protected]>
* i965: expose MESA_FORMAT_R8G8B8A8_SRGB visualTapani Pälli2018-04-241-3/+7
| | | | | | | | | | | | Exposing the visual makes following dEQP tests pass on Android: dEQP-EGL.functional.wide_color.window_8888_colorspace_srgb dEQP-EGL.functional.wide_color.pbuffer_8888_colorspace_srgb Visual is exposed only when DRI_LOADER_CAP_RGBA_ORDERING is set. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* dri: Add __DRI_IMAGE_FORMAT_SABGR8Tapani Pälli2018-04-242-0/+5
| | | | | | | | | | Add format definition and required plumbing to create images. Note that there is no match to drm_fourcc definition, just like with existing _DRI_IMAGE_FOURCC_SARGB8888. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: rename api_validate.{c,h} -> draw_validate.{c,h}Timothy Arceri2018-04-249-10/+10
| | | | | Reviewed-by: Mathias Fröhlich <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65422
* i965: perf: enable GPA query statisticsLionel Landwerlin2018-04-233-1/+68
| | | | | | | | | | | | The combinaison of GPA/MDAPI components expects a particular name & layout for their pipeline statistics query. v2: Limit the query GPA/MDAPI statistics to gen7->9 (Lionel) v3: Add curly braces (Ken) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: perf: add support for raw queriesLionel Landwerlin2018-04-236-27/+474
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The INTEL_performance_query extension provides a list of queries that a user can select to monitor a particular workload. Each query reports different sets of counters (roughly looking at different parts of the hardware, i.e. caches/fixed functions/etc...). Each query has an associated configuration that we need to program into the hardware before using the query. Up to now, we provided predefined queries. This change allows the user to build its own query (and associated configuration) externally, and have the i965 driver use that configuration through a new query named : Intel_Raw_Hardware_Counters_Set_0_Query When this query is selected, the i965 driver will report raw counters deltas (meaning their values need to be interpreted by the user, as opposed to existing queries that provide human readable values). This change is also useful for debug purposes for building new pre-defined queries and verifying the underlying numbers make sense before writing equations for user readable output. This change's purpose is also to enable GPA. GPA uses a library called MDAPI that processes raw counter data. MDAPI expects raw data to have a certain layout (per generation which is a bit unfortunate...). This change also embeds the expected data layouts. v2: Enable raw queries on gen 7->11, v1 had 7->9 (Lionel) v3: Don't assert on cherryview for gen7... (Ken) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: perf: read slice/unslice frequencies from OA reportsLionel Landwerlin2018-04-232-0/+71
| | | | | | | | | | v2: Add comment breaking down where the frequency values come from (Ken) v3: More documentation (Ken/Lionel) Adjust clock ratio multiplier to reflect the divider's behavior (Lionel) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: perf: snapshot RPSTAT registerLionel Landwerlin2018-04-233-0/+67
| | | | | | | | | | | This register contains the current/previous frequency of the GT, it's one of the value GPA would like to have as part of their queries. v2: Don't use this register on baytrail/cherryview (Ken) Use GET_FIELD() macro (Ken) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: perf: extract utility functionsLionel Landwerlin2018-04-236-252/+294
| | | | | | | | | | | | We would like to reuse a number of the functions and structures in another file in a future commit. We also move the previous content of brw_performance_query.h into brw_performance_query_metrics.h to be included by generated metrics files. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>