Commit message | Author | Age | Files | Lines
* egl: store the native surface pointer in struct _egl_surface | Paulo Zanoni | 2019-05-14 | 11 | -12/+24
    Each platform stores this in a different place:
    - platform_drm uses dri2_surf->gbm_surf->base
    - platform_android uses dri2_surf->window
    - platform_wayland uses dri2_surf->wl_win
    - platform_x11 uses dri2_surf->drawable
    - platform_x11_dri3 uses dri3_surf->loader_drawable.drawable
    - haiku doesn't even store it!

    We need access to the native surface since the specification asks us to refuse creating a new surface if there's already an EGLSurface associated with native_surface.

    An alternative to this patch would be to create a new API.GetNativeWindow callback that each platform would have to implement. While that's something we can definitely do, I prefer this approach.

    Reviewed-by: Tapani Pälli <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]>
    Signed-off-by: Paulo Zanoni <[email protected]>
* radv: add support for VK_KHR_uniform_buffer_standard_layout | Samuel Pitoiset | 2019-05-14 | 2 | -0/+7
    Nothing to do.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* softpipe/buffer: load only as many components as the buffer resource type provides | Gert Wollny | 2019-05-14 | 1 | -2/+5
    Otherwise we risk reading past the end of the buffer. In addition, change the loop counters to unsigned to be consistent with the types.

    Fixes: afa8707ba93a7d226a76319acda2a8dd89524db7 softpipe: add SSBO/shader atomics support.
    Signed-off-by: Gert Wollny <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* panfrost: ci: Reduce batch size to 3000 | Tomeu Vizoso | 2019-05-14 | 1 | -1/+1
    As with the previous value of 5000 we seemed to be reaching OOM in some circumstances.

    Signed-off-by: Tomeu Vizoso <[email protected]>
    Acked-by: Alyssa Rosenzweig <[email protected]>
* panfrost: ci: Update expectations | Tomeu Vizoso | 2019-05-14 | 1 | -2/+0
    Since last Friday, these two tests have been fixed:

    dEQP-GLES2.functional.shaders.functions.control_flow.return_in_nested_loop_fragment
    dEQP-GLES2.functional.shaders.linkage.varying_7

    Signed-off-by: Tomeu Vizoso <[email protected]>
    Acked-by: Alyssa Rosenzweig <[email protected]>
* freedreno: Fix warning on printing a uint64_t using %llx. | Eric Anholt | 2019-05-13 | 1 | -1/+1
    Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: Silence compiler warnings about "*" in boolean context. | Eric Anholt | 2019-05-13 | 2 | -2/+2
    It sure looks like we just want both of them to be nonzero, and && is probably going to be cheaper than * anyway.

    Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: Silence compiler warnings about uninit 'layers' | Eric Anholt | 2019-05-13 | 3 | -3/+3
    My gcc can't see that the uninitialized value from the PIPE_BUFFER case isn't used from the !PIPE_BUFFER cases later.

    Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: Quiet compiler warnings on 64-bit. | Eric Anholt | 2019-05-13 | 1 | -1/+1
    __u64 is a ulonglong on x86_64, not uint64_t, so my gcc was complaining about the wrong type being passed in.

    Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: Make emacs indent the way robclark's eclipse does. | Eric Anholt | 2019-05-13 | 2 | -0/+6
    The .editorconfig helps with the tabs, but we've got this two-tabs-from-previous-indentation line continuation style that requires whacking the c-file-offsets.

    This will throw emacs warnings when first opening a file in the directory; press '!' to shut it up for the future.

    Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: Make .editorconfig match .dir-locals.el. | Eric Anholt | 2019-05-13 | 2 | -0/+8
    The editorconfig takes precedence over dir-locals in emacs26 with editorconfig enabled, so the /.editorconfig was affecting these directories.

    Reviewed-by: Kristian H. Kristensen <[email protected]>
* anv: Implement VK_KHR_uniform_buffer_standard_layout | Jason Ekstrand | 2019-05-13 | 2 | -0/+8
    There's no real work to do here since we already support scalar block layout which is a direct superset of what this extension allows.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Acked-by: Lionel Landwerlin <[email protected]>
* vulkan: Update the XML and headers to 1.1.108 | Jason Ekstrand | 2019-05-13 | 2 | -57/+223
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Acked-by: Lionel Landwerlin <[email protected]>
* tu/entrypoints: Import copy | Jason Ekstrand | 2019-05-13 | 1 | -0/+1
    It's used without being imported
* nv50/ir/nir: make use of SYSTEM_VALUE_MAX when iterating read sysvals | Karol Herbst | 2019-05-13 | 1 | -1/+1
    Signed-off-by: Karol Herbst <[email protected]>
    Reviewed-by: Pierre Moreau <[email protected]>
* nv50/ir/nir: prefer to shift 1ull instead of 1ll | Karol Herbst | 2019-05-13 | 1 | -2/+2
    Signed-off-by: Karol Herbst <[email protected]>
    Suggested-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Pierre Moreau <[email protected]>
* radv: Clean up signalled and submitted fields from winsys fences. | Bas Nieuwenhuizen | 2019-05-13 | 6 | -41/+47
    Other types like syncobj do not need it, so let's make things a bit more uniform. Also reduce confusion about what the signalled/submitted fields referred to (especially with imported fences).

    Reviewed-by: Dave Airlie <[email protected]>
* radv: bump reported version to 1.1.107 | Samuel Pitoiset | 2019-05-13 | 2 | -50/+1
    VK_AMD_draw_indirect_count has been promoted with the suffix changed to KHR.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* v3d: Use driconf to expose non-MSAA texture limits for Xorg. | Eric Anholt | 2019-05-13 | 15 | -23/+85
    The V3D 4.2 HW has a limit to MSAA texture sizes of 4096. With non-MSAA, we can go up to 7680 (actually probably 8138, but that hasn't been validated by the HW team). Exposing 7680 in X11 will allow dual 4k displays.
* gallium: Redefine the max texture 2d cap from _LEVELS to _SIZE. | Eric Anholt | 2019-05-13 | 35 | -93/+91
    The _LEVELS cap assumes that the max is always a power of two. For V3D 4.2, we can support non-MSAA textures up to 7680, which is not a power of two, and which will let X11 support dual 4k displays on newer hardware.

    Reviewed-by: Marek Olšák <[email protected]>
* mesa: Replace MaxTextureLevels with MaxTextureSize. | Eric Anholt | 2019-05-13 | 15 | -34/+29
    In most places (glGetInteger, max_legal_texture_dimensions), we wanted the number of pixels, not the number of levels. Number of levels is easily recovered with util_next_power_of_two() and ffs().

    More importantly, for V3D we want to be able to expose a non-power-of-two maximum texture size to cover 2x4k displays on HW that can't quite do 8192 wide.

    Reviewed-by: Marek Olšák <[email protected]>
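The recovery of a level count from a pixel size mentioned above can be sketched as follows. This is only an illustration in Python, not Mesa's code: the helper names are stand-ins that mimic util_next_power_of_two() and C's ffs(), and combining them as ffs(next_power_of_two(size)) is an assumption drawn from the wording of the message.

```python
def next_power_of_two(x: int) -> int:
    """Smallest power of two >= x, for x >= 1 (stand-in for util_next_power_of_two())."""
    return 1 << (x - 1).bit_length()


def ffs(x: int) -> int:
    """1-based index of the lowest set bit, or 0 when x == 0 (stand-in for C ffs())."""
    return (x & -x).bit_length()


def max_levels_from_size(max_size: int) -> int:
    """Hypothetical helper: derive a mip level count from a maximum texture size."""
    # Round the size up to a power of two, then take the position of its single set bit.
    return ffs(next_power_of_two(max_size))


# A power-of-two size of 4096 maps back to the 13 levels the old cap would have stated.
print(max_levels_from_size(4096))
# A non-power-of-two size such as 7680 is rounded up to 8192 before counting.
print(max_levels_from_size(7680))
```

Storing the size and deriving the level count keeps the non-power-of-two limit expressible, which a level count alone cannot do.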
* mesa: Remove proxy image checks for maximum level. | Eric Anholt | 2019-05-13 | 1 | -18/+0
    We've already verified this by _mesa_legal_texture_dimensions() before this call.

    Reviewed-by: Marek Olšák <[email protected]>
* mesa: Reuse _mesa_max_texture_levels() instead of open-coding it. | Eric Anholt | 2019-05-13 | 3 | -29/+4
    The shared function has some extension presence checks, but other than that has the same switch statement contents.

    Reviewed-by: Marek Olšák <[email protected]>
* intel/tools: Fix build with glibc < 2.27. | Vinson Lee | 2019-05-13 | 1 | -0/+3
    glibc < 2.27 defines OVERFLOW in /usr/include/math.h.

    This patch fixes this build error.

      In file included from ../include/c99_math.h:37:0,
                       from ../src/util/u_math.h:44,
                       from ../src/mesa/main/macros.h:35,
                       from ../src/intel/compiler/brw_reg.h:47,
                       from ../src/intel/tools/i965_asm.h:32,
                       from ../src/intel/tools/i965_gram.y:29:
      src/intel/tools/i965_gram.tab.c:562:5: error: expected identifier before numeric constant
           OVERFLOW = 412,
           ^

    Fixes: 70308a5a8a80 ("intel/tools: New i965 instruction assembler tool")
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110656
    Signed-off-by: Vinson Lee <[email protected]>
    Acked-by: Eric Engestrom <[email protected]>
* st/mesa: enable the ST_DEBUG env var in release and debugoptimized builds | Marek Olšák | 2019-05-13 | 2 | -10/+0
    Useful for dumping shaders.

    Reviewed-by: Kenneth Graunke <[email protected]>
* radeonsi: overhaul the vertex fetch fixup mechanism | Nicolai Hähnle | 2019-05-13 | 8 | -280/+301
    The overall goal is to support unaligned loads from vertex buffers natively on SI.

    In the unaligned case, we fall back to the general case implementation in ac_build_opencoded_load_format. Since this function is fully general, we will also use it going forward for cases requiring fully manual format conversions of dwords anyway.

    This requires a different encoding of the fix_fetch array, which will now contain the entire format information if a fixup is required.

    Having to check the alignment of vertex buffers is awkward. To keep the impact on the fast path minimal, the si_context will keep track of which vertex buffers are (not) at least dword-aligned, while the si_vertex_elements will note which vertex buffers have some (at most dword) alignment requirement. Vertex buffers should be dword-aligned most of the time, which allows a fast early-out in almost all cases.

    Add the radeonsi_vs_fetch_always_opencode configuration variable for testing purposes. Note that it can only be used reliably on LLVM >= 9, because support for byte and short load is required.

    v2:
    - add a missing check to si_bind_vertex_elements

    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: store sctx->vertex_elements in a local in si_shader_selector_key_vs | Nicolai Hähnle | 2019-05-13 | 1 | -7/+6
    Purely as a shorthand in the remainder of the function.

    Reviewed-by: Marek Olšák <[email protected]>
* amd/common: add ac_build_opencoded_fetch_format | Nicolai Hähnle | 2019-05-13 | 2 | -0/+343
    Implement software emulation of buffer_load_format for all types required by vertex buffer fetches.

    Reviewed-by: Marek Olšák <[email protected]>
* nir/validate: Use a single set for SSA def validation | Jason Ekstrand | 2019-05-13 | 1 | -78/+50
    The current SSA def validation we do in nir_validate validates three things:

    1. That each SSA def is only ever used in the function in which it is defined.

    2. That an nir_src exists in an SSA def's use list if and only if it points to that SSA def.

    3. That each nir_src is in the correct use list (uses or if_uses) based on whether it's an if condition or not.

    The way we were doing this before was that we had a hash table which provided a map from SSA def to a small ssa_def_validate_state data structure which contained a pointer to the nir_function_impl and two hash sets, one for each use list. This meant piles of allocation and creating of little hash sets. It also meant one hash lookup for each SSA def plus one per use as well as two per src (because we have to look up the ssa_def_validate_state and then look up the use.) It also involved a second walk over the instructions as a post-validate step.

    This commit changes us to use a single low-collision hash set of SSA sources for all of this by being a bit more clever. We accomplish the objectives above as follows:

    1. The list is clear when we start validating a function. If the nir_src references an SSA def which is defined in a different function, it simply won't be in the set.

    2. When validating the SSA defs, we walk the uses and verify that they have is_ssa set and that the SSA def points to the SSA def we're validating. This catches the case of a nir_src being in the wrong list. We then put the nir_src in the set and, when we validate the nir_src, we assert that it's in the set. This takes care of any cases where a nir_src isn't in the use list. After checking that the nir_src is in the set, we remove it from the set and, at the end of nir_function_impl validation, we assert that the set is empty. This takes care of any cases where a nir_src is in a use list but the instruction is no longer in the shader.

    3. When we put a nir_src in the set, we set the bottom bit of the pointer to 1 if it's the condition of an if. This lets us detect whether or not a nir_src is in the right list.

    When running shader-db with an optimized debug build of mesa on my laptop, I get the following shader-db CPU times:

    With NIR_VALIDATE=0    3033.34 seconds
    Before this commit     20224.83 seconds
    After this commit      6255.50 seconds

    Assuming shader-db is a representative sampling of GLSL shaders, this means that making this change yields an 81% reduction in the time spent in nir_validate. It still isn't cheap but enabling validation now only increases compile times by 2x instead of 6.6x.

    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Thomas Helland <[email protected]>
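The 81% figure quoted above follows from the three shader-db timings once the NIR_VALIDATE=0 run is treated as the cost of everything except validation. A small worked check in Python, using only the numbers from the message (the variable names are illustrative):

```python
# shader-db CPU times quoted in the commit message, in seconds.
baseline = 3033.34   # NIR_VALIDATE=0: compile time without validation
before = 20224.83    # validation enabled, before the commit
after = 6255.50      # validation enabled, after the commit

# Time attributable to nir_validate is whatever exceeds the baseline.
validate_before = before - baseline
validate_after = after - baseline

# Fractional reduction in validation time; this comes out at roughly 0.81.
reduction = 1.0 - validate_after / validate_before

# Overall compile-time multipliers with validation on, relative to the baseline.
slowdown_before = before / baseline   # between 6.6x and 6.7x
slowdown_after = after / baseline     # a little over 2x

print(f"validation time reduced by {reduction:.1%}")
print(f"slowdown before: {slowdown_before:.2f}x, after: {slowdown_after:.2f}x")
```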
* util/set: Add a helper to resize a set | Jason Ekstrand | 2019-05-13 | 2 | -0/+16
    Often you don't know how big a set will be and you want the code to just grow it as needed. However, sometimes you do know and you can avoid a lot of rehashing if you just specify a size up-front.

    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Thomas Helland <[email protected]>
* util/set: Add a search_and_add function | Jason Ekstrand | 2019-05-13 | 2 | -5/+31
    This function is identical to _mesa_set_add except that it takes an extra out parameter that lets the caller detect if a replacement happened.

    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Thomas Helland <[email protected]>
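The idea of an add operation that also reports whether it replaced an existing entry can be sketched in Python as below. This is not the Mesa API: the function name, the dict-backed table, and the returned tuple (standing in for C's out parameter) are all assumptions made for illustration.

```python
def search_and_add(table: dict, key):
    """Hypothetical sketch: insert key and report whether an equal key was already present.

    Returns (stored_key, replaced), where replaced plays the role of the
    out parameter described in the commit message.
    """
    replaced = key in table
    # Store the key as its own value so the table behaves like a set.
    table[key] = key
    return table[key], replaced


seen = {}
_, replaced_first = search_and_add(seen, "ssa_1")   # first insertion of this key
_, replaced_again = search_and_add(seen, "ssa_1")   # same key added a second time
print(replaced_first, replaced_again, len(seen))
```

A caller that expects every insertion to be new can assert on the flag instead of doing a separate lookup first.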
* nir/validate: Use a ralloc context for our temporary data | Jason Ekstrand | 2019-05-13 | 1 | -16/+12
    All of our hash tables and sets are already using ralloc. There's really no good reason why we don't just make a ralloc context rather than try to remember to clean everything up manually.

    Reviewed-by: Eric Anholt <[email protected]>
    Reviewed-by: Thomas Helland <[email protected]>
* lima: add Allwinner H5 support | Patrick Lerda | 2019-05-13 | 1 | -2/+20
    The H5 hardware variant requires a specific plb_max_blk number. This value can't be probed at the hardware level.

    Signed-off-by: Patrick Lerda <[email protected]>
    Reviewed-by: Qiang Yu <[email protected]>
* lima: refactor plb_max_blk | Patrick Lerda | 2019-05-13 | 5 | -11/+34
    Move plb_max_blk to lima_screen, and add a new debug option: LIMA_PLB_MAX_BLK

    Signed-off-by: Patrick Lerda <[email protected]>
    Reviewed-by: Qiang Yu <[email protected]>
* radv: Do not use extra descriptor space for the 3rd plane. | Bas Nieuwenhuizen | 2019-05-12 | 3 | -7/+26
    While ImageFormatProperties returns the number of internal descriptors, it turns out that applications do not need to actually allocate more descriptors in the descriptor pool.

    So if we make descriptors with more planes larger, we have to be conservative and always allocate space for the larger descriptors, which is a waste given the low usage of this ext.

    So let us make use of the fact that 3plane formats all have the same formats & dimensions for the last two planes. This way we only need the first half of the descriptor of the 3rd plane and can share the second half of the second plane.

    This allows us to use 16 bytes for the descriptor, which nicely fits into the 16 bytes that are unused right next to the sampler.

    Fixes: 5564c38212a "radv: Update descriptor sets for multiple planes."
    Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Add support for icd loader interface v4. | Bas Nieuwenhuizen | 2019-05-13 | 3 | -2/+66
    Adds support for physical device functions unknown to the loader.

    Acked-by: Samuel Pitoiset <[email protected]>
* panfrost/midgard: Handle csel correctly | Alyssa Rosenzweig | 2019-05-12 | 5 | -152/+128
    We use an algebraic pass for the csel optimizations, and use proper vectorized csel ops (i/fcsel_v) for mixed, rather than lowering.

    To avoid regressions along the way, we fix an issue with the copy propagation pass (it should not attempt to propagate constants). Similarly, we take care to break bundles when using csel to fix some scheduler corner cases.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* iris: Implement ARB_indirect_parameters | Illia Iorin | 2019-05-11 | 5 | -23/+156
    iris_draw_vbo is divided into two functions to remove unnecessary operations from the loop.

    This implementation of ARB_indirect_parameters takes NV_conditional_render into account by saving MI_PREDICATE_RESULT at the start of a draw call and restoring it at the end. The result of NV_conditional_render is also taken into account when computing the predicates that limit draw calls for ARB_indirect_parameters, in a similar way to 1952fd8d in ANV.

    v2: Optimize indirect draws (suggested by Kenneth Graunke)
    v3: (by Kenneth Graunke)
    - Fix an issue where indirect draws wouldn't set patch information before updating the compiled TCS.
    - Move some code back to iris_draw_vbo to avoid duplicating it.
    - Fix minor indentation issues.

    Signed-off-by: Illia Iorin <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Split iris_update_draw_info into two functions. | Kenneth Graunke | 2019-05-11 | 1 | -0/+12
    Shader draw parameters need updating on each iteration of a multidraw loop, but the primitive based information only needs to be updated once.

    Also, patch information needs to be recorded before filling out the TCS program key, as it determines the number of HS instances.
* nir: Fix wrong sign in lower_rcp | Ruslan Kabatsayev | 2019-05-11 | 1 | -2/+2
    The nested fma calls were supposed to implement

      x_new = x + x * (1 - x*src),

    but instead current code is equivalent to

      x_new = x - x * (1 - x*src).

    The result is that Newton-Raphson steps don't improve precision at all. This patch fixes this problem.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110435
    Reviewed-by: Kenneth Graunke <[email protected]>
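The effect of the sign can be seen by iterating both update formulas from the message on an ordinary float. This is a plain Python illustration, not the NIR lowering itself; the function name, the starting estimate, and the step count are arbitrary choices.

```python
def refine_rcp(x: float, src: float, steps: int, sign: float = 1.0) -> float:
    """Apply Newton-Raphson style updates x_new = x + sign * x * (1 - x*src).

    sign=+1.0 is the intended formula for approximating 1/src;
    sign=-1.0 reproduces the formula the message describes as the bug.
    """
    for _ in range(steps):
        x = x + sign * x * (1.0 - x * src)
    return x


src = 3.0
estimate = 0.3          # rough initial guess for 1/3
exact = 1.0 / src

good = refine_rcp(estimate, src, steps=2, sign=+1.0)
bad = refine_rcp(estimate, src, steps=2, sign=-1.0)

# With the correct sign the relative error e = 1 - x*src is squared on each step.
print(abs(good - exact))
# With the wrong sign each step moves the estimate away from 1/src.
print(abs(bad - exact))
```

Substituting x = (1 - e)/src shows why: the correct update gives a new error of e squared, while the flipped sign gives 2e - e squared, which is larger than e for small positive e.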
* intel: drop misleading driver name from gen_get_device_info() | Mike Blumenkrantz | 2019-05-11 | 1 | -1/+1
* radv: clear vertex bindings while resetting command buffer | Józef Kucia | 2019-05-11 | 1 | -1/+2
    Only vertex inputs accessed by vertex shader must have valid buffers bound.

    Signed-off-by: Józef Kucia <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Fixes: 5010436e09f "radv: bail out when binding the same vertex buffers"
* st/mesa: fix 2 crashes in st_tgsi_lower_yuv | Marek Olšák | 2019-05-10 | 1 | -20/+28
    src/mesa/state_tracker/st_tgsi_lower_yuv.c:68: void reg_dst(struct tgsi_full_dst_register *, const struct tgsi_full_dst_register *, unsigned int): assertion "dst->Register.WriteMask" failed

    The second crash was due to insufficient allocated size for TGSI instructions.

    Cc: 19.0 19.1 <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* iris: Use full ways for L3 cache setup on Icelake. | Kenneth Graunke | 2019-05-10 | 1 | -0/+1
    Anuj fixed this in i965 and anv, but the fix never landed in iris. Fixes tessellation corruption on Icelake. Thanks to Rafael for bisecting this and tracking it down.

    Fixes: d0996d5fab6 iris: Emit default L3 config for the render pipeline
    Reviewed-by: Rafael Antognolli <[email protected]>
* anv: Fix limits when VK_EXT_descriptor_indexing is used | Caio Marcelo de Oliveira Filho | 2019-05-10 | 1 | -9/+14
    Update various limits in VkPhysicalDeviceDescriptorIndexingPropertiesEXT that were previously zero to their values from VkPhysicalDeviceLimits. When using VK_EXT_descriptor_indexing, the former limits will apply to all the descriptor layout sets -- not only those using the new feature bits.

    For the reference, VK_EXT_descriptor_indexing says

      "There are new descriptor set layout and descriptor pool creation flags that are required to opt in to the update-after-bind functionality, and there are separate maxPerStage* and maxDescriptorSet* limits that apply to these descriptor set layouts which may be much higher than the pre-existing limits. The old limits only count descriptors in non-updateAfterBind descriptor set layouts, and the new limits count descriptors in all descriptor set layouts in the pipeline layout."

    Fixes: 6e230d7607f "anv: Implement VK_EXT_descriptor_indexing"
    Reviewed-by: Jason Ekstrand <[email protected]>
* vulkan/overlay: keep allocating draw data until it can be reused | Lionel Landwerlin | 2019-05-10 | 1 | -113/+135
    The original implementation assumed that we could allocate the same amount of command buffers as the number of images in the swapchain. But the application could potentially render much faster and rerender into images that have been submitted for presentation but not yet presented.

    This change keeps on allocating command buffers, vertex buffer, vertex indices as well as a semaphore and a fence for as long as we can't reuse a previously submitted one.

    This fixes rendering issues in the overlay at high frame rates.

    v2: Don't recreate semaphores constantly (Józef)
    v3: Drop useless surface & FreeCommandBuffers (Józef)

    Signed-off-by: Lionel Landwerlin <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110655
    Cc: 19.1 <[email protected]>
    Reviewed-by: Józef Kucia <[email protected]>
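The "allocate until something can be reused" policy described above can be sketched as a small pool. This is a toy model in Python, not the overlay layer's code: the class, the field names, and the boolean that stands in for a signaled Vulkan fence are all hypothetical.

```python
class DrawDataPool:
    """Hypothetical pool that reuses an entry only once its fence has signaled."""

    def __init__(self):
        self.entries = []

    def acquire(self) -> dict:
        # Prefer a previously submitted entry whose GPU work has completed.
        for entry in self.entries:
            if entry["fence_signaled"]:
                entry["fence_signaled"] = False  # it is about to be submitted again
                return entry
        # Nothing is reusable yet, so grow the pool instead of assuming
        # one entry per swapchain image is enough.
        entry = {"id": len(self.entries), "fence_signaled": False}
        self.entries.append(entry)
        return entry


pool = DrawDataPool()
first = pool.acquire()
second = pool.acquire()          # first is still in flight, so a new entry is made
first["fence_signaled"] = True   # pretend the GPU finished with the first entry
third = pool.acquire()           # the finished entry is handed out again
print(first["id"], second["id"], third["id"], len(pool.entries))
```

The pool grows only while the application outpaces presentation, and settles at whatever size the frame rate actually needs.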
* vulkan/overlay: fix truncating error on 32bit platforms | Lionel Landwerlin | 2019-05-10 | 1 | -40/+36
    Non dispatchable handles can be uint64_t. When compiling the layer on a 32bit platform, this will lead to casting uint64_t into (void *) which is 32bit, leading to incorrect handles being mapped internally in the layer.

    v2: Use more HKEY() (Eric)

    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reported-by: Józef Kucia <[email protected]>
    Fixes: 2d2927938f074f ("vulkan/overlay-layer: fix cast errors")
    Reviewed-by: Józef Kucia <[email protected]>
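The truncation problem described above can be modeled with integer masking. This Python sketch only imitates what a cast from uint64_t to a 32-bit pointer does to the value; the helper name and the example handle values are made up for illustration.

```python
def truncate_to_pointer(handle: int, pointer_bits: int) -> int:
    """Keep only the low pointer_bits bits, as a narrowing cast to a pointer would."""
    return handle & ((1 << pointer_bits) - 1)


# Two distinct 64-bit handles that differ only in their upper 32 bits.
handle_a = 0x0000_0001_0000_1000
handle_b = 0x0000_0002_0000_1000

# On a 64-bit platform the pointer-sized keys stay distinct.
keys_64 = {truncate_to_pointer(h, 64) for h in (handle_a, handle_b)}

# On a 32-bit platform both handles collapse to the same key, so a map
# keyed by the cast pointer would confuse the two objects.
keys_32 = {truncate_to_pointer(h, 32) for h in (handle_a, handle_b)}

print(len(keys_64), len(keys_32))
```

Keying the internal map on the full 64-bit value avoids the collision regardless of pointer width.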
* i965: Fix memory leaks in brw_upload_cs_work_groups_surface(). | Kenneth Graunke | 2019-05-10 | 1 | -0/+5
    This was taking a reference to the 64kB upload buffer and never returning it, leaking a reference each time this atom triggered. This leaked lots of 64kB upload BOs, eventually running us out of VMA space. This would usually happen when using mpv to watch a movie, after 20-40 minutes.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110134
    Fixes: 63d7b33f516 i965/cs: Setup surface binding for gl_NumWorkGroups
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* st/va: set the visible image dimensions in vlVaDeriveImage | Julien Isorce | 2019-05-10 | 1 | -2/+4
    This fixes video being rendered incorrectly. The user wants a height of 360, but internally the pipe_video_buffer's height is 368 in the test below.

    Test:
      GST_GL_PLATFORM=egl gst-launch-1.0 videotestsrc ! video/x-raw, width=868, height=360, format=NV12 ! vaapipostproc ! glimagesink

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110443
    Signed-off-by: Julien Isorce <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Leo Liu <[email protected]>
* swrast: Rename blend_func->swrast_blend_func | Alyssa Rosenzweig | 2019-05-10 | 1 | -5/+5
    This avoids a conflict with the new (driver-agnostic) blend_func enum in shader_enum.h, which broke the build of swrast (and i965 by extension). My apologies :(

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
    Fixes: f41be53a ("compiler: Add enums for blend state")
    Cc: Caio Marcelo de Oliveira Filho <[email protected]>
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>