summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* draw: improve vertex fetch (v2)Roland Scheidegger2016-10-193-86/+134
| | | | | | | | | | | | | | | | | | | | | | | The per-element fetch has quite some calculations which are constant, these can be moved outside both the per-element as well as the main shader loop (llvm can figure out it's constant mostly on its own, however this can have a significant compile time cost). Similarly, it looks easier swapping the fetch loops (outer loop per attrib, inner loop filling up the per vertex elements - this way the aos->soa conversion also can be done per attrib and not just at the end though again this doesn't really make much of a difference in the generated code). (This would also make it possible to vectorize the calculations leading to the fetches.) There's also some minimal change simplifying the overflow math slightly. All in all, the generated code seems to look slightly simpler (depending on the actual vs), but more importantly I've seen a significant reduction in compile times for some vs (albeit with old (3.3) llvm version, and the time reduction is only really for the optimizations run on the IR). v2: adapt to other draw change. No changes with piglit. Reviewed-by: Jose Fonseca <[email protected]>
* draw: improved handling of undefined inputsRoland Scheidegger2016-10-191-21/+32
| | | | | | | | | | | | | | | Previous attempts to zero initialize all inputs were not really optimal (though no performance impact was measurable). In fact this is not really necessary, since we know the max number of inputs used. Instead, just generate fetch for up to max inputs used by the shader, directly replacing inputs for which there was no vertex element by zero. This also cleans up key generation, which previously would have stored some garbage for these elements. And also drop the assertion which indicates such bogus usage by a debug_printf (the whole point of initializing the undefined inputs was to make this case safe to handle). Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: print out time for jitting functions with GALLIVM_DEBUG=perfRoland Scheidegger2016-10-191-0/+11
| | | | | | | | Compilation to actual machine code can easily take as much time as the optimization passes on the IR if not more, so print this out too. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: Use native packs and unpacks for the lerpsRoland Scheidegger2016-10-193-13/+156
| | | | | | | | | | | | | | | | | | | | | | | | For the texturing packs, things looked pretty terrible. For every lerp, we were repacking the values, and while those look sort of cheap with 128bit, with 256bit we end up with 2 of them instead of just 1 but worse, plus 2 extracts too (the unpack, however, works fine with a single instruction, albeit only with llvm 3.8 - the vpmovzxbw). Ideally we'd use more clever pack for llvmpipe backend conversion too since we actually use the "wrong" shuffle (which is more work) when doing the fs twiddle just so we end up with the wrong order for being able to do native pack when converting from 2x8f -> 1x16b. But this requires some refactoring, since the untwiddle is separate from conversion. This is only used for avx2 256bit pack/unpack for now. Improves openarena scores by 8% or so, though overall it's still pretty disappointing how much faster 256bit vectors are even with avx2 (or rather, aren't...). And, of course, eliminating the needless packs/unpacks in the first place would eliminate most of that advantage (not quite all) from this patch. Reviewed-by: Jose Fonseca <[email protected]>
* anv: drop pointless struct decl.Dave Airlie2016-10-191-2/+0
| | | | | Acked-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: drop pointless struct decl.Dave Airlie2016-10-191-2/+0
| | | | | Acked-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: move to using shared vk_alloc inlines.Dave Airlie2016-10-1912-131/+88
| | | | | | | | This moves to the shared vk_alloc inlines for vulkan memory allocations. Acked-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* anv: move to using vk_alloc helpers.Dave Airlie2016-10-1918-147/+103
| | | | | | | This moves all the alloc/free in anv to the generic helpers. Acked-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* vulkan: add vk_alloc.h shared allocation inlines.Dave Airlie2016-10-192-1/+77
| | | | | | | | vulkan allocation allows for overriding the allocator used, add some macros for anv/radv to share for this. Acked-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* anv: drop local MIN/MAX macros.Dave Airlie2016-10-192-5/+2
| | | | | | | Use the ones from mesa, most places already did. Acked-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: drop local MIN/MAX macros.Dave Airlie2016-10-194-7/+4
| | | | | | | Use the ones in macros.h instead. Acked-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* util: move min/max/clamp macros to util macros.hDave Airlie2016-10-192-13/+13
| | | | | | | | | Although the vulkan drivers include mesa macros.h, for radv I'd like to move away from that. Reviewed-by: Nicolai Hähnle <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: make use of shared vector helper.Dave Airlie2016-10-193-135/+8
| | | | | | | | This removes the vector code from radv in favour of sharing code with anv. Acked-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* anv: port to using new u_vector shared helper.Dave Airlie2016-10-195-154/+35
| | | | | | | This just removes the anv vector code and uses the new helper. Acked-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* util: add vector util code.Dave Airlie2016-10-193-1/+193
| | | | | | | | This is ported from anv, both anv and radv can share this. Reviewed-by: Nicolai Hähnle <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* svga: minor code improvements in svga_validate_pipe_sampler_view()Brian Paul2016-10-181-8/+8
| | | | | | | Use the 'texture' local var in more places. Rename 'pFormat' to 'viewFormat'. Reviewed-by: Charmaine Lee <[email protected]>
* intel: genxml: add SAMPLER_BORDER_COLOR_STATE structuresLionel Landwerlin2016-10-185-0/+90
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* st/va: force to flush the last p frame in idr periodBoyuan Zhang2016-10-181-0/+3
| | | | | | | | | | | | | | During dual instance encoding submission, if the second encode task and first encode task have no reference dependency, e.g. p following with idr-frame, there is a chance the second task will use for its reconstructed picture buffer the same buffer used by first task for its reference/reconstructed picture. In this case, buffer corruption may occur depending on encoding speed. Fix is to force flush these two tasks separately to avoid race condition Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=98005 Signed-off-by: Boyuan Zhang <[email protected]> Reviewed-by: Christian König <[email protected]>
* egl/surfaceless: Fix segfault in eglSwapBuffersChad Versace2016-10-181-0/+12
| | | | | | | | | | | Since commit 63c5d5c6c46c8472ee7a8241a0f80f13d79cb8cd, the surfaceless platform has allowed creation of pbuffer surfaces. But the vtable entry for eglSwapBuffers has remained NULL. Discovered by running a little pbuffer test. Cc: Gurchetan Singh <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* radeonsi: rename prefixes from radeon to siMarek Olšák2016-10-184-157/+157
| | | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Acked-by: Edward O'Callaghan <[email protected]>
* radeonsi: merge radeon_llvm_context and si_shader_contextMarek Olšák2016-10-184-317/+290
| | | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Acked-by: Edward O'Callaghan <[email protected]>
* radeonsi: import all TGSI->LLVM code from gallium/radeonMarek Olšák2016-10-1811-462/+346
| | | | | | Acked-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Acked-by: Edward O'Callaghan <[email protected]>
* gallium/radeon: simplify initialization of 64-bit gallivm buildersMarek Olšák2016-10-181-18/+4
| | | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Acked-by: Edward O'Callaghan <[email protected]>
* gallium/radeon: remove unused radeon_llvm_reg_index_soaMarek Olšák2016-10-182-7/+0
| | | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Acked-by: Edward O'Callaghan <[email protected]>
* radeonsi: move LLVM ALU codegen into radeonsiMarek Olšák2016-10-186-992/+1056
| | | | | | Acked-by: Nicolai Hähnle <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Acked-by: Edward O'Callaghan <[email protected]>
* genxml: add generated headers to EXTRA_DISTJonathan Gray2016-10-181-0/+4
| | | | | | | | | Building the Mesa 12.0.3 distfile failed on a system without python as generated files were not included in the distfile. Cc: "12.0" <[email protected]> Signed-off-by: Jonathan Gray <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* mesa: automake: include mesa_glinterop.h in distfileJonathan Gray2016-10-181-1/+2
| | | | | | | | | | Add mesa_glinterop.h to the list of headers that will get included in the distfile as it is required to build Mesa itself. Corrects a regression introduced in a89faa2022fd995af2019c886b152b49a01f9392. Signed-off-by: Jonathan Gray <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* egl: remove docs directory from EXTRA_DISTJonathan Gray2016-10-181-1/+0
| | | | | | | | | | | | The egl docs directory no longer exists as of 88b5c36fe1a1546bf633ee161a6715efc593acbd. Remove it from EXTRA_DIST to unbreak 'make dist' Signed-off-by: Jonathan Gray <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Tested-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* genxml: avoid using a GNU make pattern ruleJonathan Gray2016-10-181-1/+4
| | | | | | | | | | | % pattern rules are a GNU extension. Convert the use of one to a inference rule to allow this to build on OpenBSD. This is a related change to the one made in e3d43dc5eae5271e2c87bab702aa7409d3dd0b23 Signed-off-by: Jonathan Gray <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* configure.ac: use a single require_libdrm helperEmil Velikov2016-10-181-25/+24
| | | | | | | | | Rather than having 4-5 places which do the explicit check/message just polish the gallium helper and use it everywhere. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Axel Davy <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* configure.ac: remove no longer needed *_pci_id logicEmil Velikov2016-10-181-41/+0
| | | | | | | | | | | | | | | | | | Previously it was used to differentiate between the different codepaths in the loader. Although strictly speaking the (core) of the loader is only used when a hardware device is available. The latter of which in itself requires libdrm (one of the codepaths available). That said, all the configure toggles which relate to enabling/using hw device should attribute and require libdrm, so there's no need to keep this code around. With this gallium_require_drm_loader becomes an empty stub, so nuke that one as well. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Axel Davy <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* loader: cleanup copyright sectionEmil Velikov2016-10-181-40/+2
| | | | | | | | | | | With previous patches nearly all the original code (as seen in the various loaders) is gone. Update the copyright/license section to reflect that. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Axel Davy <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* loader: remove loader_get_driver_for_fd() driver_typeEmil Velikov2016-10-1812-32/+22
| | | | | | | | | | | | | | Reminiscent from the pre-loader days, were we had multiple instances of the loader logic in separate places and one could build a "GALLIUM_ONLY" version. Since that is no longer the case and the loaders (glx/egl/gbm) do not (and should not) require to know any classic/gallium specific we can drop the argument and the related code. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Axel Davy <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* loader: remove final sysfs codepath in loader_get_device_name_for_fd()Emil Velikov2016-10-182-62/+5
| | | | | | | | | | | | | | Effectively everyone with actual hardware and/or requesting the "device_name" requires a working libdrm. Thus they could/should already be using the (now only) codepath. Apart from the code simplification, we can slim down our configure.ac even further. But that will be done in separate patch(es). Cc: Gary Wong <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Axel Davy <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* travis: remove no longer needed libudev-dev dependencyEmil Velikov2016-10-181-1/+0
| | | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Axel Davy <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* scons: remove all libudev referencesEmil Velikov2016-10-182-5/+0
| | | | | | | | Analogous to previous automake/autoconf commit. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Axel Davy <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* scons: loader: use libdrm when availableEmil Velikov2016-10-181-0/+4
| | | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Axel Davy <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gbm: remove superfluous/incorrect udev commentEmil Velikov2016-10-181-1/+0
| | | | | | | | | | The gbm_device_get_backend_name() provides an (somewhat) internal name of the implementation/backend used. Is has nothing to do with the udev, one cannot and should not attempt to derive the name from it. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Axel Davy <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* automake: remove all the libudev referencesEmil Velikov2016-10-182-20/+7
| | | | | | | | As of last commit nothing in mesa depends on libudev. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Axel Davy <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* loader: remove libudev_get_device_name_for_fd and related codeEmil Velikov2016-10-181-126/+0
| | | | | | | | With this all the libudev related code is now gone. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Axel Davy <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* loader: reimplement loader_get_user_preferred_fd via libdrmEmil Velikov2016-10-181-141/+106
| | | | | | | | | | | | | | | | Currently not everyone has libudev and with follow-up patches we'll completely remove the divergent codepaths. Use the libdrm drm device API to construct the required ID_PATH_TAG-like string, to preserve the current functionality for libudev users and allow others to benefit from it as well. v2: Drop ranty comments, pick the correct device v3: \n -> \0 in PCI_ID_PATH_TAG_LENGTH comment (Axel). v4: Use snprintf (Nicolai) Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Axel Davy <[email protected]>
* loader: annotate __driConfigOptionsLoader as staticEmil Velikov2016-10-181-1/+1
| | | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Axel Davy <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* loader: separate USE_DRICONF code into separate functionEmil Velikov2016-10-181-12/+18
| | | | | | | | Improves readability and allows us to do further cleanups a lot easier. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Axel Davy <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* loader: slim down loader_get_pci_id_for_fd implementation(s)Emil Velikov2016-10-181-156/+16
| | | | | | | | | | | | | | | | Currently mesa has three code paths in the loader - libudev, manual sysfs and drm ioctl one. Considering the issues we had with libudev - strip those down in favour of the libdrm drm device API. The latter can be implemented in any way depending on the platform and can be reused by others. v2: Use correct message on drmGetDevice failure. (Nicolai) Cc: Jonathan Gray <[email protected]> Cc: Jean-Sébastien Pédron <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Axel Davy <[email protected]>
* configure.ac: mark libdrm as have_pci_id providerEmil Velikov2016-10-181-4/+8
| | | | | | | | | | With follow on work, we'll untangle and simplify all the different codepaths in loader. Then again, we forget to set have_pci_id when libdrm is present (one of the codepaths available). Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Axel Davy <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gm107/ir: fix bit offset of tex lod setting for indirect texturingIlia Mirkin2016-10-181-1/+1
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: [email protected]
* gm107/ir: fix texturing with indirect samplersIlia Mirkin2016-10-181-0/+10
| | | | | | | | | | The indirect handle has to come right after the coordinates, so if there was a sample/bias/depth compare/offset, everything would end up being shifted by one argument position. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: [email protected]
* gallium/tgsi: add missing #includeMarek Olšák2016-10-181-0/+2
| | | | | Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* st/va: set default rt formats when calling vaCreateConfigJulien Isorce2016-10-182-2/+12
| | | | | | | | As specified in va.h, default value should be set on attributes not present in the input list. Signed-off-by: Julien Isorce <[email protected]> Reviewed-by: Mark Thompson <[email protected]>
* i965: Fix gl_InvocationID in dual object GS where invocations == 1.Kenneth Graunke2016-10-171-1/+4
| | | | | | | | | | | | | | | | | | | | dEQP-GLES31.functional.geometry_shading.instanced.geometry_1_invocations draws using a geometry shader that specifies layout(points, invocations = 1) in; and then uses gl_InvocationID. According to the Haswell PRM, the "GS Instance ID 0" (and 1) thread payload fields are undefined in dual object mode: "If 'dispatch mode' is DUAL_OBJECT this field is not valid." But there's no point in using them - if there's only one invocation, the ID will be 0. So just load a constant. Cc: [email protected] Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>