summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* egl: polish EXT_image_dma_buf_import attr parsingEmil Velikov2017-07-121-29/+22
| | | | | | | | Simplify the existing if/else + temporary variable into if (foo) return X. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* egl: simplify EXT_image_dma_buf_import_modifiers attr parsingEmil Velikov2017-07-121-26/+4
| | | | | | | Move the common extension check at the top. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* egl: split _eglParseImageAttribList into per extension functionsEmil Velikov2017-07-121-186/+260
| | | | | | | | | | Will allow us to simplify existing code and make further improvements short and simple. No functional change intended. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* egl: call _eglError within _eglParseImageAttribListEmil Velikov2017-07-123-30/+18
| | | | | | | | | | | | | | | | | | | | | | | As per EGL_KHR_image_base: If an attribute specified in <attrib_list> is not one of the attributes listed in Table bbb, the error EGL_BAD_PARAMETER is generated. We should set the error as opposed to simply log it. Currently we have a partial solution, whereby only some of the callers call _eglError(). Since that has proven to be less robust, simply set the error by the function itself and change the return type to EGLBoolean, updating the callers. So now the code is slightly simpler. Plus the follow-up fixes will be easier to manage. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* egl: move eglCreateDRMImageMESA's malloc laterEmil Velikov2017-07-121-29/+23
| | | | | | | | | | | | | | | | Don't bother allocating any memory until we're finished parsing and sanitising all the attributes. As a nice side effect we now consistently set eglError when any of the attrib/values are not correct. Strangely enough the spec does not mention _anything_ about what error should be set where, even if the implementation already sets the odd one. Cc: Kristian Høgsberg <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* svga: fix texture swizzle writemaskingBrian Paul2017-07-111-0/+2
| | | | | | | | | | | | | | | Commit bfe1e7737a76e3b046 changed how texture swizzles are set up. This exposed a latent bug in the VMware driver: we were ignoring the texture instruction's writemask when applying the 0 and 1 swizzle terms. This wasn't caught by the Piglit texture swizzle test because it only exercises fixed function (no write masking). Fixes issues seen with ETQW apitrace. CC: <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* i965: Use VALGRIND_MAKE_MEM_x in place of MALLOCLIKE/FREELIKEChris Wilson2017-07-111-7/+27
| | | | | | | | | | | | | | | | | | | | | Valgrind doesn't actually implement VALGRIND_FREELIKE_BLOCK as the exact inverse of VALGRIND_MALLOCLIKE_BLOCK. It makes the block inaccessible, but still leaves it defined in its allocation tracker i.e. it will report the mmap as lost despite the call to FREELIKE! Instead of treating the mmap as an allocation, treat it as changing the access bits upon the memory, i.e. that it becomes defined (because of the buffer objects always contain valid content from the user's perspective) upon mmap and inaccessible upon munmap. This makes memcheck happy without leaving it thinking there is a very large leak. Finally for consistency, we treat all the mmap/munmap paths the same even though valgrind can intercept the regular mmap used for GTT. We could move this in the drm_mmap/drm_munmap macros, but that quickly looks ugly given the desire for those to support different OSes, but I didn't try that hard! Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix asynchronous mappings on !LLC platforms.Kenneth Graunke2017-07-111-2/+17
| | | | | | | | | | | | | | | | | | | | | | | | | When using a read-only CPU mapping, we may encounter stale buffer contents. For example, the Piglit primitive-restart test offers the following scenario: 1. Read data via a CPU map. 2. Destroy that buffer. 3. Create a new buffer - obtaining the same one via the BO cache. 4. Call BufferSubData, which does a GTT map with MAP_WRITE | MAP_ASYNC. (We avoid set_domain for async mappings, so no flushing occurs.) 5. Read data via a CPU map. (Without explicit clflushing, this will contain data from step 1!) Otherwise, everything ought to work, keeping in mind that we never use CPU maps for writing - just read-only CPU maps. This restores the performance gains after Matt's revert in commit 71651b3139c501f50e6547c21a1cdb816b0a9dde. v2: Do the invalidate later, and even when asking for a brand new map. v3: Add more comments from Chris. Reviewed-by: Chris Wilson <[email protected]>
* i965: Don't use PREAD for glGetBufferSubData().Kenneth Graunke2017-07-113-28/+10
| | | | | | | | | | | | | | | Just map the buffer and memcpy. This will do a CPU mmap, which should be reasonably efficient, and doing this gives us full control over the domains and caching instead of leaving it to the kernel. This prevents regressions on Braswell in the next commit. Specifically GL45-CTS.shader_atomic_counters.basic-buffer-operations. Because async maps start skipping set-domain, the pread thought everything was nicely still in the CPU domain, and returned stale data. v2: Use _mesa_error_no_memory() if the map fails instead of crashing. Reviewed-by: Chris Wilson <[email protected]>
* swr: build driver proper separate from rasterizerTim Rowley2017-07-115-39/+36
| | | | | | | | | | | | | | | | | | swr used to build and link the rasterizer to the driver, and to support multiple architectures we needed to have multiple versions of the driver/rasterizer combination, which needed to link in much of mesa. Changing to having one instance of the driver and just building architecture specific versions of the rasterizer gives a large reduction in disk space. libGL.so 6464 Kb -> 7000 Kb libswrAVX.so 10068 Kb -> 5432 Kb libswrAVX2.so 9828 Kb -> 5200 Kb Total 26360 Kb -> 17632 Kb Reviewed-by: Emil Velikov <[email protected]>
* swr: switch to using SwrGetInterface api tableTim Rowley2017-07-1110-65/+72
| | | | | | | | | Use the SWR rasterizer API through the table returned from SwrGetInterface rather than referencing the functions directly. This will allow us to move to a model of having the driver dynamically load the appropriate swr architecture library. Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: make SWR_VISIBLE attribute work for windowsGeorge Kyriazis2017-07-111-1/+1
| | | | | | Needed to expose SwrGetInterface Reviewed-by: Bruce Cherniak <[email protected]>
* i965: perf: use new subslices numbers from device infoLionel Landwerlin2017-07-111-32/+17
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Ben Widawsky <[email protected]>
* intel: add number of subslices to device infoLionel Landwerlin2017-07-112-8/+54
| | | | | | | | | | | We could have used a single integer to store that value, but Cannonlake has different number of subslices per slice depending on the GT. v2: Add CFL subslice numbers (Lionel) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Ben Widawsky <[email protected]>
* i965: Use already existing eu_totalBen Widawsky2017-07-111-8/+1
| | | | | | | | Reduces IOCTL calls by 1, and provides a centralized place to override such configurations if we have a need to do so. Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* i965: Resolve framebuffers before signaling the fenceChris Wilson2017-07-111-0/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From KHR_fence_sync: When the condition of the sync object is satisfied by the fence command, the sync is signaled by the associated client API context, causing any eglClientWaitSyncKHR commands (see below) blocking on <sync> to unblock. The only condition currently supported is EGL_SYNC_PRIOR_COMMANDS_COMPLETE_KHR, which is satisfied by completion of the fence command corresponding to the sync object, and all preceding commands in the associated client API context's command stream. The sync object will not be signaled until all effects from these commands on the client API's internal and framebuffer state are fully realized. No other state is affected by execution of the fence command. If clients are passing the fence fd (from EGL_ANDROID_native_fence_sync) to a compositor, that fence must only be signaled once the framebuffer is resolved and not before as is currently the case. v2: fixup assert to use GL_SYNC_GPU_COMMANDS_COMPLETE (Chad) Reported-by: Sergi Granell <[email protected]> Fixes: c636284ee8ee ("i965/sync: Implement DRI2_Fence extension") Signed-off-by: Chris Wilson <[email protected]> Cc: Sergi Granell <[email protected]> Cc: Rob Clark <[email protected]> Cc: Chad Versace <[email protected]> Cc: Daniel Stone <[email protected]> Cc: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* svga: s/unsigned/enum tgsi_texture_type/Brian Paul2017-07-111-3/+4
| | | | | Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: Neha Bhende <[email protected]>
* svga: s/unsigned/enum tgsi_swizzleBrian Paul2017-07-111-4/+4
| | | | | Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: Neha Bhende <[email protected]>
* svga: s/unsigned/enum tgsi_interpolate_mode/Brian Paul2017-07-111-1/+2
| | | | | | | And s/unsigned/enum tgsi_interpolate_loc/ Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: Neha Bhende <[email protected]>
* svga: s/unsigned/enum tgsi_file_type/Brian Paul2017-07-111-7/+7
| | | | | Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: Neha Bhende <[email protected]>
* svga: s/unsigned/enum tgsi_semantic/Brian Paul2017-07-114-8/+10
| | | | | | | Makes gdb debugging a little nicer. Reviewed-by: Charmaine Lee <[email protected]> Reviewed-by: Neha Bhende <[email protected]>
* i965: Assert that we don't use CPU write maps to non-coherent buffers.cros-mesa-17.1.1-r3-vanillachadv/cros-mesa-17.1.1-r3-vanillaKenneth Graunke2017-07-101-0/+6
| | | | | | | | | Using CPU maps of non-coherent buffers can get us in a lot of trouble, and WC maps are a reasonable alternative anyway. Guard against shooting ourselves in the foot by adding an assert, and comment. Reviewed-by: Chris Wilson <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Disable access to CPU mmap for async access on non-LLC machinesChris Wilson2017-07-101-4/+12
| | | | | | | | | | | | | | | | | If the user triggers an implicit batch flush while holding access to a CPU mapped buffer, that mmapping will be invalidated by the kernel for non-LLC devices. (The kernel when executing a batch will change the cache domain of the buffers in that batch, which for non-LLC CPU access will cause that buffer to be clflushed and any further CPU access to be discarded.) To prevent this, simply disallow any CPU async mmap access. The cases where async CPU access to a non-LLC buffer should continue to be allowed via their preferred snooping path. v2 (Ken): Reword the comment slightly. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Track when a bo is shared with an external clientChris Wilson2017-07-102-0/+9
| | | | | | | | | | | | | If the buffer is being shared with an external client, our own state tracking may be stale and in some cases we may wish to double check with the kernel/hw state. At the moment, this is synonymous with not being reusable, but the semantics between reusable and external are quite different and we will have more examples of non-reusable buffers in the near future. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel: Fix clflushing on modern (Baytrail+) Atom CPUs.Kenneth Graunke2017-07-101-0/+12
| | | | | | | | Thanks to Chris Wilson for pointing this out. Reviewed-by: Daniel Vetter <[email protected]> Reviewed-by: Matt Turner <[email protected]> Acked-by: Lionel Landwerlin <[email protected]>
* intel: Move clflush helpers from anv to common/gen_clflush.h.Kenneth Graunke2017-07-107-34/+63
| | | | | | | | | I want to use these in the OpenGL driver as well. v2: Add to COMMON_FILES in Makefile.sources (caught by Emil) Reviewed-by: Daniel Vetter <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* spirv: Fix reaching unreachable for compare exchange on imagesJames Legg2017-07-101-0/+1
| | | | | | | | | | | | | | | We were hitting the unreachable("Invalid image opcode") near the end of vtn_handle_image when parsing the SpvOpAtomicCompareExchange opcode. v2: Add stable CC. v3: Ignore SpvOpAtomicCompareExchangeWeak. It requires the Kernel capability which is not exposed in Vulkan, and spirv_to_nir is not used for OpenCL which does support it. Reviewed-by: Jason Ekstrand <[email protected]> CC: <[email protected]>
* gallium: use "ull" number suffix to keep the QtCreator parser happyMarek Olšák2017-07-1010-38/+38
| | | | | | | It can't parse "llu". Reviewed-by: Thomas Helland <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* i965: Use brw_bo_wait() for brw_bo_wait_rendering()Chris Wilson2017-07-105-8/+10
| | | | | | | | | | | | | | Currently, we use set_domain() to cause a stall on rendering. But the set-domain ioctl has the side-effect of changing the kernel's cache domain underneath the struct_mutex, which may perturb state if there was no rendering to wait upon and in general is much heavier than the lockless wait-ioctl. Historically libdrm used set-domain as we did not have an explicit wait-ioctl (and the patches to teach it to use wait if available were lost in the mists). Since mesa already depends upon a kernel support the wait-ioctl, we do not need to supply a fallback. Signed-off-by: Chris Wilson <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* svga: fix PIPE_CAP_MAX_TEXTURE_BUFFER_SIZE valueBrian Paul2017-07-101-1/+4
| | | | | | | | | | | | This query is supposed to return the max texture buffer size/width in texels, not size in bytes. Divide by 16 (the largest format size) to return texels. Fixes Piglit arb_texture_buffer_object-max-size test. Cc: [email protected] Reviewed-by :Charmaine Lee <[email protected]>
* svga: fix breakage in create_backed_surface_view()Brian Paul2017-07-101-4/+3
| | | | | | | | This fixes a regression in some piglit tests since commit 5e5d5f1a2eb. I think I mis-resolved the merge conflict when cherry-picking that commit to master. Reviewed-by: Charmaine Lee <[email protected]>
* anv: Stop setting domains to RENDER on EXEC_OBJECT_WRITEJason Ekstrand2017-07-101-5/+2
| | | | | | | | | | | | The reason we were doing this was to ensure that the kernel did the appropriate cross-ring synchronization and flushing. However, the kernel only looks at EXEC_OBJECT_WRITE to determine whether or not to insert a fence. It only cares about the domain for determining whether or not it needs to clflush the BO before using it for scanout but the domain automatically gets set to RENDER internally by the kernel if EXEC_OBJECT_WRITE is set. Reviewed-by: Chris Wilson <[email protected]>
* a5xx: fix condition for updating *_FS_OUTPUT_CNTLIlia Mirkin2017-07-091-1/+1
| | | | | | | | | | | The register values depend on the currently set program, so make sure to revalidate when the program changes. Fixes glsl-1.10-fragdepth as well as dEQP-GLES3.functional.shaders.fragdepth.compare.* Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* radv/ac: drop setting xnackDave Airlie2017-07-091-2/+1
| | | | | | | | | | | Since radv uses compute rings and we can't know when we are setting up the shaders what ring they are to be used on, we should just use the default xnack setting. This may be suboptimal in some places, but if we hit a problem, we likely should try and address this between llvm and mesa. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: add support for using addrlib max alignment.Dave Airlie2017-07-096-5/+15
| | | | | | | | Rather than using 64k, use what addrlib returns as the base alignment for vulkan allocations. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* nir: copy front interpolation when creating fake back color inputIlia Mirkin2017-07-081-2/+6
| | | | | | | | Fixes a bunch of gl_BackColor interpolation tests that had explicit interpolation specified on the fragment shader gl_Color. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* a5xx: remove no-longer-accurate border color layout commentIlia Mirkin2017-07-081-32/+1
| | | | | | | | Better to just point at the bcolor_entry struct which has our current understanding encoded into it. Also add an assert to ensure that the struct remains the expected size. Signed-off-by: Ilia Mirkin <[email protected]>
* a5xx: fix border color for depth formatsIlia Mirkin2017-07-081-1/+5
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* a5xx: add border color clamping, add packed border color formatsIlia Mirkin2017-07-081-11/+59
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* a5xx: fix border colors for swizzled texture formatsIlia Mirkin2017-07-081-14/+14
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* a5xx: fix integer texture border colorsIlia Mirkin2017-07-081-4/+2
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* a5xx: fix primitive restartIlia Mirkin2017-07-082-12/+23
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nir/spirv: Remove unnecessary comment.Andres Gomez2017-07-081-5/+0
| | | | | | | | | It should have been removed after 00c47e111c. Cc: Jason Ekstrand <[email protected]> Cc: Connor Abbott <[email protected]> Signed-off-by: Andres Gomez <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* radv: Add compute htile clear for combined depth+stencil surfaces.Bas Nieuwenhuizen2017-07-081-9/+7
| | | | | | | | Figured out the clear value when we have a combined depth stencil surface. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* draw: handle more TGSI_SEMANTIC_COLOR indicesRoland Scheidegger2017-07-083-10/+27
| | | | | | | | | | | | | It could only handle indices 0/1, otherwise what happened was bad (accessing array out of bounds, no crash but kind of random). This is enough for the gl state tracker (primary/secondary color) but not enough for some other state trackers (d3d9 has no limits on the number of color interpolants). The complexity with color semantics are all due to the front/back mapping (2 outputs in the vs map to one input in the fs) so this isn't extended to indices > 1 - d3d9 has no use for back colors, therefore this isn't needed and still only 2 back colors can be handled correctly. Reviewed-by: Brian Paul <[email protected]>
* st/mesa: Fix grabbing the wrong variant if glDrawPixels is calledMatias N. Goldberg2017-07-082-4/+22
| | | | | | | | | | | | | | | | | | | | | | | | | | | | By design pixel shaders can have up to 3 variants: * The standard one. * glDrawPixels variant. * glBitmap variant. However "shader_has_one_variant" ignores this fact, and therefore st_update_fp would select the wrong variant if glDrawPixels or glBitmap was ever called. This patch fixes the problem. If the standard variant has been created, calling glDrawPixels or glBitmap will append the variant to the second entry of the linked list, so that st_update_fp still selects the right one if shader_has_one_variant is set. If the standard variant hasn't been created yet and glDrawPixel/Bitmap has been called, st_update_fp will will see this and take the slow path instead. The standard variant will then be added at the front of the linked list, so that the next time the fast path is taken. Blender in particular is hit by this bug. v2: Marek - cosmetic changes Fixes https://bugs.freedesktop.org/show_bug.cgi?id=101596 Signed-off-by: Marek Olšák <[email protected]>
* Revert "intel/isl: Only create a CCS buffer if the image supports rendering"Nanley Chery2017-07-071-1/+1
| | | | | | | | | This reverts commit 8aaa13467dc289d35dc7900ab9fab9a7689c4178, which was based on an incorrect assumption. Unlike the restriction placed on image views in the Vulkan API, OpenGL allows you to render to texture views whose formats differ from the originals. Bugzilla: https://bugzilla.freedesktop.org/show_bug.cgi?id=101677
* mesa: finish implementing glPrimitiveRestartNV() for display listsBrian Paul2017-07-071-1/+21
| | | | | | | | | | | | | If we try to build a display list with just a glPrimitiveRestartNV() call, we'd crash because of a null GLvertexformat::PrimitiveRestartNV pointer. This change fixes that case. The previous patch fixed the case of calling glPrimitiveRestartNV() inside a glBegin/End pair. v2: minor clean-up in save_PrimitiveRestartNV(), per Charmaine. Reviewed-by: Charmaine Lee <[email protected]>
* vbo: fix glPrimitiveRestartNV crash inside a display listOlivier Lauffenburger2017-07-071-5/+15
| | | | | | | | | | | | | | | | | | | | glPrimitiveRestartNV crashes when it is called during the compilation of a display list. There are two reasons: - ctx->Driver.CurrentSavePrimitive is not set to the current primitive - save_PrimitiveRestartNV() calls _save_Begin() which only sets an OpenGL error, instead of calling vbo_save_NotifyBegin(). This patch correctly calls vbo_save_NotifyBegin() but it detects the current primitive mode by looking at the latest saved primitive. Additional work by Brian Paul Signed-off-by: Olivier Lauffenburger <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101464 Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* st/mesa: remove unused st_framebuffer::Private fieldBrian Paul2017-07-071-1/+0
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>