Commit message (Author, Age, Files, Lines)
* u_queue: export util_queue_fence_signal (Nicolai Hähnle, 2017-11-09, 2 files, -1/+2)
| | | | Reviewed-by: Marek Olšák <[email protected]>
* u_queue: group fence functions together (Nicolai Hähnle, 2017-11-09, 1 file, -9/+10)
| | | | Reviewed-by: Marek Olšák <[email protected]>
* util/u_atomic: add p_atomic_xchg (Nicolai Hähnle, 2017-11-09, 1 file, -1/+31)
| | | | | | | | | The closest to it in the old-style gcc builtins is __sync_lock_test_and_set, however, that is only guaranteed to work with values 0 and 1 and only provides an acquire barrier. I also don't know about other OSes, so we provide a simple & stupid emulation via p_atomic_cmpxchg. Reviewed-by: Marek Olšák <[email protected]>
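The emulation mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: sketch_cmpxchg_i32 and sketch_xchg_i32 are hypothetical names, and the GCC/Clang builtin __sync_val_compare_and_swap is assumed as a stand-in for p_atomic_cmpxchg.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for p_atomic_cmpxchg: stores new_val only if *v still
 * equals old, and returns the value that was in *v before the attempt. */
static inline int32_t sketch_cmpxchg_i32(int32_t *v, int32_t old, int32_t new_val)
{
   return __sync_val_compare_and_swap(v, old, new_val);
}

/* Exchange emulated with a compare-and-swap retry loop: keep guessing the
 * current value until the swap succeeds, then return the value we replaced. */
static inline int32_t sketch_xchg_i32(int32_t *v, int32_t new_val)
{
   assert(v != NULL);
   int32_t old = *(volatile int32_t *)v;   /* initial guess */
   for (;;) {
      int32_t seen = sketch_cmpxchg_i32(v, old, new_val);
      if (seen == old)
         return old;                       /* swap happened */
      old = seen;                          /* someone else raced us; retry */
   }
}
```

The loop is simple rather than fast: under contention it may retry several times, which is the price of building an unconditional exchange from a conditional one.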
* util: move futex helpers into futex.h (Nicolai Hähnle, 2017-11-09, 4 files, -21/+57)
| | | | | | v2: style fixes Reviewed-by: Marek Olšák <[email protected]> (v1)
* glsl: Make #pragma STDGL invariant(all) only modify outputs. (Kenneth Graunke, 2017-11-08, 1 file, -24/+2)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | According to the GLSL ES 3.20, GLSL 4.50, and GLSL 1.20 specs: "To force all output variables to be invariant, use the pragma #pragma STDGL invariant(all) before all declarations in a shader." Notably, this is only supposed to affect output variables. Furthermore, "Only variables output from a shader can be candidates for invariance." It looks like this has been wrong since we first supported the pragma in 2011 (commit 86b4398cd158024f6be9fa830554a11c2a7ebe0c). Fixes dEQP-GLES2.functional.shaders.preprocessor.pragmas.pragma_fragment. v2: Now that all cases are identical (other than compute shaders, which have no output variables anyway), we can drop the switch statement entirely. We also don't need the current_function == NULL check; this was a holdover from when we had a single var_mode_out for both function parameters and shader varyings, in the bad old days. Reviewed-by: Iago Toral Quiroga <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* i965: expose SRGB visuals and turn on EGL_KHR_gl_colorspace (Tapani Pälli, 2017-11-09, 3 files, -7/+26)
| | | | | | | | | | | | | | | | | | | | | | | | | | | Patch exposes sRGB visuals and adds DRI integer query support for __DRI2_RENDERER_HAS_FRAMEBUFFER_SRGB. Further changes make sure that we mark whether the app explicitly wanted sRGB, and for these framebuffers we don't turn sRGB off in intel_gles3_srgb_workaround. This way we keep compatibility for existing applications relying on default sRGB and only add more visual support. With this change, the following dEQP tests start to pass: dEQP-EGL.functional.wide_color.window_8888_colorspace_srgb dEQP-EGL.functional.wide_color.pbuffer_8888_colorspace_srgb v2: some code cleanup (Emil Velikov) update num_formats correctly (reported by [email protected]) v3: cleanup, remove redundant is_srgb rename explicit_srgb as 'need_srgb' to follow style better Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Emil Velikov <[email protected]> (v2) Reviewed-by: Kenneth Graunke <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102264 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102354 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102503
* glsl: Transform fb buffers are only active if a variable uses them (Neil Roberts, 2017-11-09, 1 file, -9/+15)
| | | | | | | | | | | | | | | The GL spec will soon be revised to clarify that a buffer binding for a transform feedback buffer is only required if a variable is actually defined to use the buffer binding point. Previously a declaration for the default transform buffer would make it require a binding even if nothing was declared to use the default buffer. Affects: KHR-GL44/45.enhanced_layouts.xfb_stride_of_empty_list KHR-GL44/45.enhanced_layouts.xfb_stride_of_empty_list_and_api Reviewed-by: Nicolai Hähnle <[email protected]> Cc: [email protected]
* intel/nir: Use the correct indirect lowering masks in link_shaders (Jason Ekstrand, 2017-11-08, 1 file, -6/+4)
| | | | | | | | | Previously, if we were linking a vec4 VS with a SIMD8/16 FS, we wouldn't lower indirects on the fragment shader which is wrong. Instead of using a single indirect mask, take advantage of our new little helper. Reviewed-by: Timothy Arceri <tarceri at itsqueeze.com> Cc: [email protected]
* r600g: use SIMPLE_FLOAT for blending to enable some optimizations (Ilia Mirkin, 2017-11-08, 2 files, -0/+2)
| | | | | | | | | | | Radeonsi also sets this flag. Seems to avoid pulling up the destination RT value when the dst blend factor is zero if it's not otherwise being loaded. Among other things, it allows blending to overwrite infinity/NaN values in the destination RT. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* nv50: make blending work so that zero wins in a multiplication (Ilia Mirkin, 2017-11-08, 1 file, -0/+5)
| | | | | | | This matches nvc0 behavior, tested with the fbo-float-nan piglit. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tobias Klausmann <[email protected]>
* glsl: Minor cleanups after previous commit (Ian Romanick, 2017-11-08, 1 file, -18/+11)
| | | | | | | | | | | | I think it's more clear to only call emit_access once. The only difference between the two calls is the value of size_mul used for the offset parameter... but you really have to look at it to be sure. The s/is_64bit/is_double/ change is because there are no int64_t or uint64_t matrix types. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Thomas Helland <[email protected]>
* glsl: Use more link_calculate_matrix_stride in lower_buffer_access (Ian Romanick, 2017-11-08, 1 file, -20/+2)
| | | | | | | | I was going to squash this with the previous commit, but there's a lot of churn in that commit. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Thomas Helland <[email protected]>
* glsl: Use link_calculate_matrix_stride in lower_buffer_access and friends (Ian Romanick, 2017-11-08, 4 files, -70/+42)
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Thomas Helland <[email protected]>
* glsl: Refactor matrix stride calculation into a utility function (Ian Romanick, 2017-11-08, 2 files, -11/+50)
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Thomas Helland <[email protected]>
* glsl/linker: Optimize swizzles again after linking (Ian Romanick, 2017-11-08, 1 file, -0/+10)
| | | | | | | | | | | | | | Without this, the SPIR-V generator has to deal with a bunch of junk like: (swiz z (swiz xxx (swiz x (var_ref packed:binormal.z,light_dir)))) It seems better to cull that stuff out than to add code to deal with it. The problem is the way swizzles to and from scalars have to be handled in SPIR-V. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Thomas Helland <[email protected]>
* glsl: Combine nop-swizzle optimization with swizzle-swizzle optimization (Ian Romanick, 2017-11-08, 7 files, -118/+52)
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: <[email protected]>
* glsl: Make the swizzle-swizzle optimization greedy (Ian Romanick, 2017-11-08, 1 file, -30/+29)
| | | | | | | | If there is a long sequence of swizzled swizzles, compact all of them down to a single swizzle. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: <[email protected]>
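The compaction described above amounts to composing component indices. The sketch below only models that index arithmetic; struct swz, swz_compose and swz_collapse are hypothetical names and are not Mesa's ir_swizzle or the optimization pass itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified swizzle: up to four component indices (x=0 .. w=3). */
struct swz {
   unsigned num;      /* number of components produced */
   unsigned comp[4];  /* comp[i] = source component read for output i */
};

/* Fold outer(inner(v)) into a single swizzle applied directly to v. */
static struct swz swz_compose(struct swz outer, struct swz inner)
{
   struct swz r = { outer.num, {0, 0, 0, 0} };
   for (unsigned i = 0; i < outer.num; i++) {
      assert(outer.comp[i] < inner.num);   /* outer may only read what inner produces */
      r.comp[i] = inner.comp[outer.comp[i]];
   }
   return r;
}

/* Greedy variant: chain[0] is the outermost swizzle, chain[n-1] the innermost.
 * The whole chain collapses to one swizzle in a single pass. */
static struct swz swz_collapse(const struct swz *chain, size_t n)
{
   assert(chain != NULL && n >= 1);
   struct swz r = chain[0];
   for (size_t i = 1; i < n; i++)
      r = swz_compose(r, chain[i]);
   return r;
}
```

Applied to a chain shaped like (swiz z (swiz xxx (swiz x v))), the three swizzles reduce to a single read of component x of v.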
* glsl: Remove program_resource_visitor::visit_field(const glsl_struct_field *) (Ian Romanick, 2017-11-08, 2 files, -18/+0)
| | | | | | | I could not find any remaining users. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: Silence unused parameter warning (Ian Romanick, 2017-11-08, 1 file, -1/+1)
| | | | | | | | | | | | | | glsl/lower_shared_reference.cpp: In member function ‘virtual void {anonymous}::lower_shared_reference_visitor::insert_buffer_access(void*, ir_dereference*, const glsl_type*, ir_rvalue*, unsigned int, int)’: glsl/lower_shared_reference.cpp:244:58: warning: unused parameter ‘channel’ [-Wunused-parameter] int channel) ^~~~~~~ Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* ac/nir: add support for all intrinsics. (v2) (Dave Airlie, 2017-11-09, 1 file, -1/+31)
| | | | | | | | | | | This is derived from tgsi/radeonsi code from the GLSL intrinsics. This should pre-fix radv for the upcoming spirv patches. v2: actually use wait_cnt, sleep deprived dad time! (Bas) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* amdgpu: use simple mtx (Timothy Arceri, 2017-11-09, 5 files, -44/+45)
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa: use simple mtx in core mesa (Timothy Arceri, 2017-11-09, 14 files, -88/+89)
| | | | | | | | | Results from x11perf -copywinwin10 on Eric's SKL: 4.33338% ± 0.905054% (n=40) Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Tested-by: Yogesh Marathe <[email protected]>
* mesa: Add new fast mtx_t mutex type for basic use cases (Timothy Arceri, 2017-11-09, 6 files, -24/+162)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | While modern pthread mutexes are very fast, they still incur a call to an external DSO and the overhead of the generality and features of pthread mutexes. Most mutexes in mesa only need lock/unlock, and the idea here is that we can inline the atomic operation and make the fast case just two instructions. Mutexes are subtle and finicky to implement, so we carefully copy the implementation from Ulrich Drepper's well-written and well-reviewed paper: "Futexes Are Tricky" http://www.akkadia.org/drepper/futex.pdf We implement "mutex3", which gives us a mutex that has no syscalls on uncontended lock or unlock. Further, the uncontended lock boils down to a cmpxchg and an untaken branch and the uncontended unlock is just a locked decr and an untaken branch. We use __builtin_expect() to indicate that contention is unlikely so that gcc will put the contention code out of the main code flow. A fast mutex only supports lock/unlock, can't be recursive or used with condition variables. We keep the pthread mutex implementation around for the few places where we use condition variables or recursive locking. For platforms or compilers where futex and atomics aren't available, simple_mtx_t falls back to the pthread mutex. The pthread mutex lock/unlock overhead shows up on benchmarks for CPU bound applications. Most CPU bound cases are helped and some of our internal bind_buffer_object heavy benchmarks gain up to 10%. Signed-off-by: Kristian Høgsberg <[email protected]> Signed-off-by: Timothy Arceri <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
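The lock and unlock paths described above can be sketched roughly as below. This is an illustration in the spirit of the paper's three-state mutex, not the simple_mtx_t code from the commit: the sketch_* names are hypothetical, the GCC/Clang __atomic builtins are assumed, and the futex calls are assumed to be Linux-only (on other platforms this sketch simply degrades to spinning instead of falling back to pthreads).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* for syscall() on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__linux__)
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>
#endif

/* States: 0 = unlocked, 1 = locked, 2 = locked with possible waiters. */
typedef struct { uint32_t val; } sketch_mtx;

static inline void sketch_futex_wait(uint32_t *addr, uint32_t expected)
{
#if defined(__linux__)
   /* Sleeps only if *addr still equals expected. */
   syscall(SYS_futex, addr, FUTEX_WAIT, expected, NULL, NULL, 0);
#else
   (void)addr; (void)expected;   /* assumption: no futex, so just spin */
#endif
}

static inline void sketch_futex_wake(uint32_t *addr, int count)
{
#if defined(__linux__)
   syscall(SYS_futex, addr, FUTEX_WAKE, count, NULL, NULL, 0);
#else
   (void)addr; (void)count;
#endif
}

static inline void sketch_mtx_lock(sketch_mtx *m)
{
   uint32_t c = 0;
   /* Fast path: one compare-and-swap 0 -> 1 and an untaken branch. */
   if (__builtin_expect(__atomic_compare_exchange_n(&m->val, &c, 1, 0,
                                                    __ATOMIC_ACQUIRE,
                                                    __ATOMIC_RELAXED), 1))
      return;
   /* Contended: c holds the observed state. Mark "has waiters" and sleep
    * until an exchange finally observes the unlocked state. */
   if (c != 2)
      c = __atomic_exchange_n(&m->val, 2, __ATOMIC_ACQUIRE);
   while (c != 0) {
      sketch_futex_wait(&m->val, 2);
      c = __atomic_exchange_n(&m->val, 2, __ATOMIC_ACQUIRE);
   }
}

static inline void sketch_mtx_unlock(sketch_mtx *m)
{
   /* Fast path: one atomic decrement 1 -> 0 and an untaken branch. */
   uint32_t c = __atomic_fetch_sub(&m->val, 1, __ATOMIC_RELEASE);
   assert(c != 0);   /* unlocking an unlocked mutex is a caller bug */
   if (__builtin_expect(c != 1, 0)) {
      /* State was 2: there may be sleepers, so reset and wake one. */
      __atomic_store_n(&m->val, 0, __ATOMIC_RELEASE);
      sketch_futex_wake(&m->val, 1);
   }
}
```

Neither fast path enters the kernel; the futex syscalls are reached only after the state has been observed as contended, which is why __builtin_expect marks those branches as unlikely.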
* mesa: rework how we free gl_shader_program_data (Timothy Arceri, 2017-11-09, 3 files, -42/+18)
| | | | | | | | | | | | | | | | When I introduced gl_shader_program_data one of the intentions was to fix a bug where a failed linking attempt freed data required by a currently active program. However I seem to have failed to finish hooking up the final steps required to have the data hang around. Here we create a fresh instance of gl_shader_program_data every time we link. gl_program has a reference to gl_shader_program_data so it will be freed once the program is no longer active. Cc: "17.2 17.3" <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Neil Roberts <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102177
* glsl: use the correct parent when allocating program data members (Timothy Arceri, 2017-11-09, 4 files, -8/+8)
| | | | | | Cc: "17.2 17.3" <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: drop cache_fallback (Timothy Arceri, 2017-11-09, 5 files, -77/+55)
| | | | | | | | | | This turned out to be a dead end; it is much easier and less error-prone to just cache the IR used by the driver's backend, e.g. TGSI or NIR. Cc: "17.2 17.3" <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: properly initialize brw->cs.base.stage to MESA_SHADER_COMPUTE (Kenneth Graunke, 2017-11-08, 1 file, -0/+1)
| | | | | | | | | | | | | | | | | | | | | | | | | This has a bit of a surprising effect: For the render pipeline, the upload_sampler_state_table atom emits 3DSTATE_BINDING_TABLE_POINTERS_XS. It tries to avoid this for compute: if (GEN_GEN >= 7 && stage_state->stage != MESA_SHADER_COMPUTE) { /* Emit a 3DSTATE_SAMPLER_STATE_POINTERS_XS packet. */ genX(emit_sampler_state_pointers_xs)(brw, stage_state); } ... However, we were failing to initialize brw->cs.base.stage, so it was left as 0 (MESA_SHADER_VERTEX), causing this condition to break. We then emitted 3DSTATE_SAMPLER_STATE_POINTERS_VS in GPGPU mode, when trying to upload CS samplers. Nothing good can come of this. Found by inspection while debugging a GPU hang. Jordan believes this helps the Deus Ex: Mankind Divided benchmark mode's stability when running with shader cache. Cc: [email protected] Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
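The failure mode above comes down to a zero-filled enum field reading as the vertex stage. A minimal sketch follows, using hypothetical stand-in names (SKETCH_STAGE_*, struct sketch_stage_state) rather than the driver's types; only "0 means vertex" is taken from the message, and the compute value is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the shader stage enum. Only VERTEX == 0 is taken from the
 * commit message; the value chosen for COMPUTE is illustrative. */
enum sketch_stage { SKETCH_STAGE_VERTEX = 0, SKETCH_STAGE_COMPUTE = 5 };

struct sketch_stage_state { enum sketch_stage stage; };

/* Mirrors the shape of the quoted condition: render-pipeline sampler packets
 * are skipped only when the stage says "compute". */
static int sketch_emits_render_packets(const struct sketch_stage_state *s, int gen)
{
   return gen >= 7 && s->stage != SKETCH_STAGE_COMPUTE;
}

/* The fix amounts to setting the field explicitly during initialization. */
static void sketch_init_compute_state(struct sketch_stage_state *cs)
{
   assert(cs != NULL);
   cs->stage = SKETCH_STAGE_COMPUTE;
}
```

A zero-filled state passes the "not compute" test and takes the render path; after the explicit initialization it no longer does.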
* intel/nir: Break the linking code into a helper in brw_nir.c (Jason Ekstrand, 2017-11-08, 3 files, -34/+40)
| | | | | Reviewed-by: Timothy Arceri <tarceri at itsqueeze.com> Cc: [email protected]
* intel/nir: Add a helper for getting the NoIndirect mask (Jason Ekstrand, 2017-11-08, 1 file, -14/+19)
| | | | | Reviewed-by: Timothy Arceri <tarceri at itsqueeze.com> Cc: [email protected]
* nir: Don't print swizzles when there are more than 4 components (Matt Turner, 2017-11-08, 1 file, -1/+1)
| | | | | | | | | ... as can happen with various types like mat4, or else we'll smash the stack writing past the end of components_local[]. Fixes: 5a0d3e1129b7 ("nir: Print the components referenced for split or packed shader in/outs.") Reviewed-by: Jason Ekstrand <[email protected]>
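The kind of guard described above can be illustrated with a small bounds-checked formatter. This is a hypothetical helper (sketch_format_swizzle), not the NIR printer: it writes a ".xyzw"-style suffix only when the component range fits within four components and within the caller's buffer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: writes a swizzle suffix such as ".yz" for `count`
 * components starting at `first`. Returns the number of characters written,
 * or 0 (leaving an empty string) when no swizzle can be printed safely. */
static size_t sketch_format_swizzle(char *out, size_t out_size,
                                    unsigned first, unsigned count)
{
   static const char names[4] = { 'x', 'y', 'z', 'w' };
   assert(out != NULL && out_size > 0);
   out[0] = '\0';

   /* Types such as mat4 span more than four components: skip the swizzle
    * instead of writing past the end of the buffer. */
   if (count == 0 || count > 4 || first > 4 - count || count + 2 > out_size)
      return 0;

   size_t n = 0;
   out[n++] = '.';
   for (unsigned i = 0; i < count; i++)
      out[n++] = names[first + i];
   out[n] = '\0';
   return n;
}
```

Checking the component count before touching the buffer keeps the write bounded no matter what type reaches the printer.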
* meson: Add threads dependencies to glsl_compiler executable (Dylan Baker, 2017-11-08, 1 file, -1/+1)
| | | | | | | | Fixes compiling the optional standalone glsl compiler. Reported-by: DrNick (on irc) Signed-off-by: Dylan Baker <[email protected]> Reviewed-and-Tested-by: Eric Engestrom <[email protected]>
* glsl: Fix typo fragement -> fragment (Andreas Boll, 2017-11-08, 1 file, -1/+1)
| | | | | | | | | | Fixes: 94d669b0d2f ("glsl: enforce fragment shader input restrictions in GLSL ES 3.10") Signed-off-by: Andreas Boll <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* broadcom/vc5: Remove unused v3d_compiler.c (Andreas Boll, 2017-11-08, 1 file, -43/+0)
| | | | | | | | | | | Unused since original import of VC5. Fixes: ade416d0236 ("broadcom: Add VC5 NIR compiler.") Signed-off-by: Andreas Boll <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* broadcom/vc5: Add vc5_drm.h to the release tarball (Andreas Boll, 2017-11-08, 1 file, -0/+1)
| | | | | | | | | | | Fixes: 45bb8f29571 ("broadcom: Add V3D 3.3 gallium driver called "vc5", for BCM7268.") Cc: 17.3 <[email protected]> Signed-off-by: Andreas Boll <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* clover: use the unified check for c++11 instead of the gcc version number (Gert Wollny, 2017-11-08, 2 files, -5/+5)
| | | | | | | | So far clover based its test for compiler support on the version of gcc, while in reality support for c++11 is required. This patch replaces the version check by the check unified for all modules that require c++11. Reviewed-by: Emil Velikov <[email protected]>
* swr: Replace the check for c++11 by the unified version (Gert Wollny, 2017-11-08, 2 files, -6/+5)
| | | | Reviewed-by: Emil Velikov <[email protected]>
* configure: check for -std=c++11 support and enable st/mesa test accordingly (Gert Wollny, 2017-11-08, 2 files, -1/+59)
| | | | | | | | | | | | | | | | | | Add a check that tests whether the c++ compiler supports c++11, either by default, by adding the compiler flag -std=c++11, or by adding a compiler flag that the user has specified via the environment variable CXX11_CXXFLAGS. The test only does a very shallow check of c++11 support, i.e. it tests whether the define __cplusplus >= 201103L to confirm language support by the compiler, and it checks whether the header <tuple> is available to test the availability of the c++11 standard library. A make file conditional HAVE_STD_CXX11 is provided that is used in this patch to enable the test in st/mesa if C++11 support is available. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102665 Acked-by: Emil Velikov <[email protected]>
* configure.ac: append to existing initializer override flags (Emil Velikov, 2017-11-08, 1 file, -2/+2)
| | | | | | | | | | Currently we were overwriting the existing warning flags, instead of adding new [as applicable]. Fixes c5d2e2d43f6 ("configure: Test for -Wno-initializer-overrides") Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* configure.ac: append to existing MSVC compat flags (Emil Velikov, 2017-11-08, 1 file, -4/+4)
| | | | | | | | | | | | | Currently we were overwriting the existing warning flags, instead of adding new [as applicable]. v2: Add missing space before -Werror (Eric) Fixes e4b2b69e828 ("configure: Add and use AX_CHECK_COMPILE_FLAG") Cc: Matt Turner <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Matt Turner <[email protected]> (v1) Reviewed-by: Eric Engestrom <[email protected]>
* meson: Allow building glvnd with EGL and non-dri based GLX (Dylan Baker, 2017-11-08, 1 file, -2/+6)
| | | | | | | | | | | Because meson mirrors the autotools logic, it needs the same changes to allow building glvnd based egl. v2: - change if to elif (Eric) Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Acked-by: Emil Velikov <[email protected]>
* configure.ac: require xcb* for the omx/va/... when using x11 platform (Emil Velikov, 2017-11-08, 1 file, -1/+3)
| | | | | | | | | | | | Targets such as omx and va can work w/o anything X related. Mandate the xcb* dependencies only when the X11 platform is selected. Reported-by: Lukas Rusak <[email protected]> Fixes: 63e11ac2b5c ("configure: error out if building VA w/o supported platform") Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Tested-by: Lukas Rusak <[email protected]> (v1)
* configure.ac: loosen --enable-glvnd check to honour egl (Emil Velikov, 2017-11-08, 1 file, -8/+4)
| | | | | | | | | | | | | | | Currently we error out when building GLVND w/o GLX. That was the original premise before we had EGL. As the commit says, that error should be reworked to honour both - do so. v2: Drop noop *);; (Eric) Reported-by: Lukas Rusak <[email protected]> Fixes: ce562f9e3fa ("EGL: Implement the libglvnd interface for EGL (v3)") Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Tested-by: Lukas Rusak <[email protected]> (v1)
* egl/android: add a note about .swap_buffers_with_damage (Emil Velikov, 2017-11-08, 1 file, -1/+1)
| | | | | | | | | | | | | Android implements the API and does the native damage handling itself. At the same time it a) does call the vendor's eglSwapBuffersWithDamageKHR b) does not implement eglSetDamageRegionKHR There's something strange happening here. For now simply note about the 'lack' of eglSwapBuffersWithDamageKHR support. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* wayland-drm: static inline wayland_drm_buffer_get (Emil Velikov, 2017-11-08, 5 files, -43/+39)
| | | | | | | | | | | | | | | | | | | | The function is effectively a direct function call into libwayland-server.so. Thus GBM no longer depends on the wayland-drm static library, making the build more straightforward. And the resulting binary is a bit smaller. Note: we need to move struct wayland_drm_callbacks further up, otherwise we'll get an error since the type is incomplete. v2: Rebase, beef-up commit message, update meson, move struct wayland_drm_callbacks. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Daniel Stone <[email protected]> (v1) Reviewed-by: Eric Engestrom <[email protected]> # meson bit only Acked-by: Eric Engestrom <[email protected]> # for the rest Reviewed-by: Dylan Baker <[email protected]> # meson
* automake: intel: correctly append to the LIBADD variable (Emil Velikov, 2017-11-08, 1 file, -1/+1)
| | | | | | | | | | | | | | Commit 05fc62d89f5 sets the variable, yet it forgot to update the existing reference to append (instead of assign). Thus as-is the expat library was discarded from the link chain when building with Android. Fixes: 05fc62d89f5 ("automake: intel: move expat handling where it's used") Cc: Hongxu Jia <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* configure: enable the OpenCL ICD by default (Emil Velikov, 2017-11-08, 2 files, -2/+3)
| | | | | | | | | | | | | | | | | | | | Nearly all the distributions* that build Mesa OpenCL enable the ICD, since building a non-ICD driver has the chance of conflicting with an existing OpenCL binary (libOpenCL.so). Furthermore, some applications expect the library to provide annotated/versioned symbols. https://lists.freedesktop.org/archives/mesa-dev/2017-September/171093.html *Fedora, Suse, Arch, Debian, Ubuntu, FreeBSD use the ICD. Gentoo manages the conflicting files via eselect. Cc: Matt Turner <[email protected]> Cc: Jan Vesely <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> Reviewed-By: Aaron Watry <[email protected]>
* targets/opencl: don't hardcode the icd file install to /etc/... (Emil Velikov, 2017-11-08, 1 file, -1/+1)
| | | | | | | | | | | | | | | | Use $(sysconfdir) instead of hardcoding /etc. While the OpenCL spec expects the file in /etc, people building their stack can override that, esp. !Linux users. Furthermore this removes a fundamental violation, which results in the system file being overwritten even as one explicitly sets --prefix and/or DESTDIR. Cc: [email protected] Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> Reviewed-By: Aaron Watry <[email protected]>
* amd: add amdgpu_asic_addr.h to the sources list (Emil Velikov, 2017-11-08, 1 file, -0/+1)
| | | | | | | Otherwise it will be missing from the release tarball Fixes: 7f33e94e43a ("amd/addrlib: update to latest version") Signed-off-by: Emil Velikov <[email protected]>
* gallivm: Use new LLVM fast-math-flags API (Tobias Droste, 2017-11-08, 1 file, -0/+4)
| | | | | | | | | | | | | LLVM 6 changed the API on the fast-math-flags: https://reviews.llvm.org/rL317488 NOTE: This also enables the new flag 'ApproxFunc' to allow for approximations for library functions (sin, cos, ...). I'm not completely convinced that this is something mesa should do. Signed-off-by: Tobias Droste <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-and-Tested-by: Michel Dänzer <[email protected]>
* glsl: add varying resources for arrays of complex types (Juan A. Suarez Romero, 2017-11-08, 1 file, -4/+59)
| | | | | | | | | | | | | This patch is mostly a patch done by Ilia Mirkin. It fixes KHR-GL45.enhanced_layouts.varying_structure_locations. v2: fix locations for TCS/TES/GS inputs and outputs (Ilia) CC: Ilia Mirkin <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103098 Reviewed-by: Nicolai Hähnle <[email protected]> Signed-off-by: Juan A. Suarez Romero <[email protected]>