aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/llvmpipe
Commit message (Collapse)AuthorAgeFilesLines
* gallium: fix windows build from params change.Dave Airlie2019-07-251-1/+2
| | | | | | | | This is why we can't have nice things. I'm sure there's someway to do this with {0} but I really don't have time for that. Fixes: 2631fd3b0bf ("gallivm: rework lp_build_tgsi_soa to take a struct") Reviewed-by: Timothy Arceri <[email protected]>
* gallivm: rework lp_build_tgsi_soa to take a structDave Airlie2019-07-241-5/+17
| | | | | | | The parameters were getting messy and I have to add a few more for compute shaders, so clean it up before proceeding. Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: switch boolean -> bool at the interface definitionsIlia Mirkin2019-07-224-30/+30
| | | | | | | | | | | | | | | | | | This is a relatively minimal change to adjust all the gallium interfaces to use bool instead of boolean. I tried to avoid making unrelated changes inside of drivers to flip boolean -> bool to reduce the risk of regressions (the compiler will much more easily allow "dirty" values inside a char-based boolean than a C99 _Bool). This has been build-tested on amd64 with: Gallium drivers: nouveau r300 r600 radeonsi freedreno swrast etnaviv v3d vc4 i915 svga virgl swr panfrost iris lima kmsro Gallium st: mesa xa xvmc xvmc vdpau va Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Acked-by: Alyssa Rosenzweig <[email protected]>
* util: use standard name for snprintf()Eric Engestrom2019-07-197-15/+15
| | | | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* gallium: get rid of PIPE_CAP_SM3Erik Faye-Lund2019-07-101-1/+3
| | | | | | | | | | | | | | | | | | | | | PIPE_CAP_SM3 has always been an odd one out of all our caps. While most other caps are fine-grained and single-purpose, this cap encode several features in one. And since OpenGL cares more about single features, it'd be nice to get rid of this one. As it turns, this is now relatively simple. We only really care about three features using this cap, and those already got their own caps. So we can remove it, and make sure all current drivers just give the same response to all of them. The only place we *really* care about SM3 is in nine, and there we can instead just re-construct the information based on the finer-grained caps. This avoids DX9 semantics from needlessly leaking into all of the drivers, most of who doesn't care a whole lot about DX9 specifically. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Acked-by: Alyssa Rosenzweig <[email protected]>
* llvmpipe: enable ARB_shader_storage_buffer_objectDave Airlie2019-07-071-3/+3
| | | | Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: add support for shader buffer binding.Dave Airlie2019-07-078-2/+100
| | | | | | | This add support for setting shader buffers and passing them to draw or binding them to the fragment shader jit. Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: add support for ssbo to the fragment shader jit.Dave Airlie2019-07-073-2/+25
| | | | | | This just adds the ssbo ptrs to the jit fragment shader api. Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: add ssbo pointers to the soa build api.Dave Airlie2019-07-071-1/+1
| | | | | | Need to pass ssbo + ssbo size pointers just like constants. Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: Add CAP for opcode DIVGert Wollny2019-06-301-0/+1
| | | | | | | | Not all drivers support TGSI_OPCODE_DIV, so we should have a cap to be able to check this. Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* llvmpipe: make remove_shader_variant static.Dave Airlie2019-06-212-5/+1
| | | | | | | this isn't used outside this file. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: Don't use u_ringbuffer for lp_scene_queueCaio Marcelo de Oliveira Filho2019-06-171-36/+48
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Inline the ring buffer and signal logic into lp_scene_queue instead of using a u_ringbuffer. The code ends up simpler since there's no need to handle serializing data from / to packets. This fixes a crash when compiling Mesa with LTO, that happened because of util_ringbuffer_dequeue() was writing data after the "header packet", as shown below struct scene_packet { struct util_packet header; struct lp_scene *scene; }; /* Snippet of old lp_scene_deque(). */ packet.scene = NULL; ret = util_ringbuffer_dequeue(queue->ring, &packet.header, sizeof packet / 4, return packet.scene; but due to the way aliasing analysis work the compiler didn't considered the "&packet->header" to alias with "packet->scene". With the aggressive inlining done by LTO, this would end up always returning NULL instead of the content read by util_ringbuffer_dequeue(). Issue found by Marco Simental and iThiago Macieira. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110884 Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: Change PIPE_CAP_TGSI_FS_FBFETCH bool to PIPE_CAP_FBFETCH countKenneth Graunke2019-05-231-1/+1
| | | | | | | | | | | | | | TGSI's FBFETCH instruction currently only supports reading from a single render target, but NIR intrinsics can support multiple render targets. radeonsi can only support fetching from RT 0, but other drivers may be able to support fetching from any render target. To express this, this patch renames PIPE_CAP_TGSI_FS_FBFETCH to simply PIPE_CAP_FBFETCH, and converts it from a boolean "is FBFETCH supported?" to an integer number of render targets which can be fetched. Reviewed-by: Marek Olšák <[email protected]>
* gallium: Redefine the max texture 2d cap from _LEVELS to _SIZE.Eric Anholt2019-05-131-2/+2
| | | | | | | | The _LEVELS assumes that the max is always power of two. For V3D 4.2, we can support up to 7680 non-power-of-two MSAA textures, which will let X11 support dual 4k displays on newer hardware. Reviewed-by: Marek Olšák <[email protected]>
* llvmpipe: pass stream-out targets to draw-module earlyErik Faye-Lund2019-05-062-11/+8
| | | | | | | | | | | | | | | | | We currently set this state in the draw-module twice on each draw, but which trashes this state. So far that's not a problem, because we don't really do much from that function. But it turns out, we're going to have to do more; namely flush when the state changes. This will incur a large performance penalty due to the excessive setting. Instead, let's rely on the CSO caching making sure that llvmpipe_set_so_targets doesn't get called needlessly, and setup the state directly there instead. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* delete autotools .gitignore filesEric Engestrom2019-04-291-5/+0
| | | | | | | | One special case, `src/util/xmlpool/.gitignore` is not entirely deleted, as `xmlpool.pot` still gets generated (eg. by `ninja xmlpool-pot`). Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* llvmpipe: Always return some fence in flush (v2)Tomasz Figa2019-04-261-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If there is no last fence, due to no rendering happening yet, just create a new signaled fence and return it, to match the expectations of the EGL sync fence API. Fixes random "Could not create sync fence 0x3003" assertion failures from Skia on Android, coming from the following code: https://android.googlesource.com/platform/frameworks/base/+/master/libs/hwui/pipeline/skia/SkiaOpenGLPipeline.cpp#427 Reproducible especially with thread count >= 4. One could make the driver always keep the reference to the last fence, but: - the driver seems to explicitly destroy the fence whenever a rendering pass completes and changing that would require a significant functional change to the code. (Specifically, in lp_scene_end_rasterization().) - it still wouldn't solve the problem of an EGL sync fence being created and waited on without any rendering happening at all, which is also likely to happen with Android code pointed to in the commit. Therefore, the simple approach of always creating a fence is taken, similarly to other drivers, such as radeonsi. Tested with piglit llvmpipe suite with no regressions and following tests fixed: egl_khr_fence_sync conformance eglclientwaitsynckhr_flag_sync_flush eglclientwaitsynckhr_nonzero_timeout eglclientwaitsynckhr_zero_timeout eglcreatesynckhr_default_attributes eglgetsyncattribkhr_invalid_attrib eglgetsyncattribkhr_sync_status v2: - remove the useless lp_fence_reference() dance (Nicolai), - explain why creating the dummy fence is the right approach. Signed-off-by: Tomasz Figa <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: correctly handle waiting in llvmpipe_fence_finishEmil Velikov2019-04-261-1/+6
| | | | | | | | | | | Currently if the timeout differs from 0, we'll end up with infinite wait... even if the user is perfectly clear they don't want that. Use the new lp_fence_timedwait() helper guarding both waits in an !lp_fence_signalled block like the rest of llvmpipe. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: add lp_fence_timedwait() helperEmil Velikov2019-04-262-0/+32
| | | | | | | | | | | | | | | | The function is analogous to lp_fence_wait() while taking at timeout (ns) parameter, as needed for EGL fence/sync. v2: - use absolute UTC time, as per spec (Gustaw) - bail out on cnd_timedwait() failure (Gustaw) v3: - check count/rank under mutex (Gustaw) Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> (v1) Reviewed-by: Gustaw Smolarczyk <[email protected]>
* llvmpipe, softpipe: no support for ATC texturesJonathan Marek2019-04-232-4/+6
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* Delete autotoolsDylan Baker2019-04-152-87/+0
| | | | | | | | | | Acked-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Acked-by: Marek Olšák <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Matt Turner <[email protected]>
* llvmpipe: fix undefined shift 1 << 31.Dave Airlie2019-04-121-1/+1
| | | | | | | Pointed out by coverity. Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* draw: add stream member to stats callbackDave Airlie2019-04-091-1/+1
| | | | | | | | This just adds space for the member to the callback, doesn't change anything else. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* llvmpipe: stop using pipe_sampler_view_release()Brian Paul2019-03-171-6/+0
| | | | | | | | | | | This was used to avoid freeing a sampler view which was created by a context that was already deleted. But the state tracker does not allow that. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Neha Bhende <[email protected]> Reviewed-by: Mathias Fröhlich <[email protected]> Reviewed-By: Jose Fonseca <[email protected]>
* gallium: add PIPE_CAP_MAX_VARYINGSKarol Herbst2019-02-071-0/+2
| | | | | | | | | | | | | | | | | Some NVIDIA hardware can accept 128 fragment shader input components, but only have up to 124 varying-interpolated input components. We add a new cap to express this cleanly. For most drivers, this will have the same value as PIPE_SHADER_CAP_MAX_INPUTS for the fragment shader. Fixes KHR-GL45.limits.max_fragment_input_components Signed-off-by: Karol Herbst <[email protected]> [imirkin: rebased, improved docs/commit message] Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Eric Anholt <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Cc: 19.0 <[email protected]>
* gallivm: Return true from arch_rounding_available() if NEON is availableMatt Turner2019-01-241-1/+2
| | | | | | | | LLVM uses the single instruction "FRINTI" to implement llvm.nearbyint. Fixes the rounding tests of lp_test_arit. Bug: https://bugs.gentoo.org/665570 Reviewed-by: Roland Scheidegger <[email protected]>
* Revert "llvmpipe: Always return some fence in flush (v2)"Roland Scheidegger2019-01-091-2/+0
| | | | | | | | | This reverts commit f6a6da8131383d8eeee07cd59326a70f4b15866b. With this commit we see massive amounts of asserts triggering in lp_fence_wait(), assert(f->issued), for instance with libgl_xlib state tracker and piglit. Not entirely sure if the assert could just be removed.
* llvmpipe: Always return some fence in flush (v2)Tomasz Figa2019-01-091-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If there is no last fence, due to no rendering happening yet, just create a new signaled fence and return it, to match the expectations of the EGL sync fence API. Fixes random "Could not create sync fence 0x3003" assertion failures from Skia on Android, coming from the following code: https://android.googlesource.com/platform/frameworks/base/+/master/libs/hwui/pipeline/skia/SkiaOpenGLPipeline.cpp#427 Reproducible especially with thread count >= 4. One could make the driver always keep the reference to the last fence, but: - the driver seems to explicitly destroy the fence whenever a rendering pass completes and changing that would require a significant functional change to the code. (Specifically, in lp_scene_end_rasterization().) - it still wouldn't solve the problem of an EGL sync fence being created and waited on without any rendering happening at all, which is also likely to happen with Android code pointed to in the commit. Therefore, the simple approach of always creating a fence is taken, similarly to other drivers, such as radeonsi. Tested with piglit llvmpipe suite with no regressions and following tests fixed: egl_khr_fence_sync conformance eglclientwaitsynckhr_flag_sync_flush eglclientwaitsynckhr_nonzero_timeout eglclientwaitsynckhr_zero_timeout eglcreatesynckhr_default_attributes eglgetsyncattribkhr_invalid_attrib eglgetsyncattribkhr_sync_status v2: - remove the useless lp_fence_reference() dance (Nicolai), - explain why creating the dummy fence is the right approach. Signed-off-by: Tomasz Figa <[email protected]>
* gallivm: don't use pavg.b intrinsic on llvm >= 6.0Roland Scheidegger2018-12-211-42/+49
| | | | | | | | | | | | | | | | | | | | | This intrinsic disppeared with llvm 6.0, using it ends up in segfaults (due to llvm issuing call to NULL address in the jited shaders). Add code doing the same thing as the autoupgrade code in llvm so it can be matched and replaced back with a pavgb. While here, also improve lp_test_format, so it tests both with and without cache (as it was, it tested the cache versions only, whereas cache is actually disabled in llvmpipe, and in any case even with it enabled vertex and geometry shaders wouldn't use it). (Although at least for the unorm8 uncached fetch, the code is still quite different to what llvmpipe is using, since that would use unorm8x16 type, whereas the test code is using unorm8x4 type, hence disabling some intrinsic paths.) Fixes: 6f4083143bb8 ("gallivm: use llvm jit code for decoding s3tc") Reviewed-by: Jose Fonseca <[email protected]> Tested-by: Michel Dänzer <[email protected]>
* meson: Add tests to suitesDylan Baker2018-11-201-1/+2
| | | | | | | | | | | | | | | | Meson test has a concepts of suites, which allow tests to be grouped together. This allows for a subtest of tests to be run only (say only the tests for nir). A test can be added to more than one suite, but for the most part I've only added a test to a single suite, though I've added a compiler group that includes nir, glsl, and glcpp tests. To use this you'll need to invoke meson test directly, instead of ninja test (which always runs all targets). it can be invoked as: `meson test -C builddir --suite $suitename` (meson test has addition options that are pretty useful). Tested-By: Gert Wollny <[email protected]> Acked-by: Eric Engestrom <[email protected]>
* util: Move os_misc to utilDylan Baker2018-10-301-1/+1
| | | | | | | this is needed by u_debug Tested-by: Brian Paul <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium: add PIPE_CAP_MAX_TEXTURE_UPLOAD_MEMORY_BUDGETMarek Olšák2018-09-071-0/+1
|
* gallium: enable GL_AMD_depth_clamp_separate on r600, radeonsiMarek Olšák2018-09-061-0/+1
|
* gallium: split depth_clip into depth_clip_near & depth_clip_farMarek Olšák2018-09-061-1/+1
| | | | for AMD_depth_clamp_separate.
* gallium: Add a helper for implementing PIPE_CAP_* default values.Eric Anholt2018-09-041-3/+4
| | | | | | | | | | | | | | | | | | One of the pains of implementing a gallium driver is filling in a million pipe caps you don't know about yet when you're just starting out. One of the pains of working on gallium is copy-and-pasting your new PIPE_CAP into each driver. We can fix both of these by having each driver call into the default helper from their default case, so that both sides can ignore each other until they need to. v2: fix i915g build, revert swr change to avoid breaking scons build (https://travis-ci.org/anholt/mesa/jobs/419739857) v3: Rebase on 3 new gallium caps. Reviewed-by: Marek Olšák <[email protected]> (v1) Cc: Bruce Cherniak <[email protected]> Cc: George Kyriazis <[email protected]> Cc: Kenneth Graunke <[email protected]>
* gallium: Split out PIPE_CAP_TEXTURE_MIRROR_CLAMP_TO_EDGE.Kenneth Graunke2018-08-241-0/+1
| | | | | | | | | | | | | Some hardware can do PIPE_TEX_WRAP_MIRROR_REPEAT but not PIPE_TEX_WRAP_MIRROR_CLAMP and PIPE_TEX_WRAP_MIRROR_CLAMP_TO_BORDER. Drivers for such hardware would like to advertise support for ARB_texture_mirror_clamp_to_edge but not EXT_texture_mirror_clamp. This commit adds a new PIPE_CAP_TEXTURE_MIRROR_CLAMP_TO_EDGE bit, changes the extension enable to be based on that, and enables it in all upstream drivers which supported PIPE_CAP_TEXTURE_MIRROR_CLAMP (so they continue supporting this mode).
* gallium: add PIPE_CAP_MAX_SHADER_BUFFER_SIZEMarek Olšák2018-08-231-0/+2
| | | | Tested-by: Dieter Nützel <[email protected]>
* gallium: add PIPE_CAP_MAX_GS_INVOCATIONSMarek Olšák2018-08-231-0/+2
| | | | Tested-by: Dieter Nützel <[email protected]>
* llvmpipe: add cc clobber to inline asmGrazvydas Ignotas2018-08-231-1/+2
| | | | | | | The bsr instruction modifies flags, so that needs to be indicated to the compiler. No effect on generated code, but still needed for correctness. Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: add storage_sample_count parameter into is_format_supportedMarek Olšák2018-07-311-0/+4
| | | | Tested-by: Dieter Nützel <[email protected]>
* gallium: add PIPE_CAP_FRAMEBUFFER_MSAA_CONSTRAINTSMarek Olšák2018-07-311-0/+1
| | | | Tested-by: Dieter Nützel <[email protected]>
* gallium/llvmpipe: Enable support bptc format.Denis Pauk2018-07-012-4/+2
| | | | | | | | | | | v2: none v3: none Signed-off-by: Denis Pauk <[email protected]> CC: Marek Olšák <[email protected]> CC: Rhys Perry <[email protected]> CC: Matt Turner <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* gallium: add support for programmable sample locationsRhys Perry2018-06-141-0/+1
| | | | | | Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Brian Paul <[email protected]> (v2) Reviewed-by: Marek Olšák <[email protected]> (v2)
* gallium: add PIPE_CAP_GLSL_FEATURE_LEVEL_COMPATIBILITYMarek Olšák2018-05-291-0/+2
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* llvmpipe: improve rasterization discard logicRoland Scheidegger2018-05-2315-89/+118
| | | | | | | | | | | | | | | | | | | | | | This unifies the explicit rasterization discard as well as the implicit rasterization disabled logic (which we need for another state tracker), which really should do the exact same thing. We'll now toss out the prims early on in setup with (implicit or explicit) discard, rather than do setup and binning with them, which was entirely pointless. (We should eventually get rid of implicit discard, which should also enable us to discard stuff already in draw, hence draw would be able to skip the pointless clip and fallback stages in this case.) We still need separate logic for only null ps - this is not the same as rasterization discard. But simplify the logic there and don't count primitives simply when there's an empty fs, regardless of depth/stencil tests, which seems perfectly acceptable by d3d10. While here, also fix statistics for primitives if face culling is enabled. No piglit changes. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: fix check for a no-op shaderBrian Paul2018-05-181-2/+4
| | | | | | | | | | The tgsi_info.num_tokens fix broke llvmpipe's detection of no-op shaders. Fix the code to check for num_instructions <= 1 instead. Fixes: 8fde9429c36b75 ("tgsi: fix incorrect tgsi_shader_info::num_tokens computation") Tested-by: Roland Scheidegger <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: Fix random number generation for unit testsRoland Scheidegger2018-05-142-2/+19
| | | | | | | | | | | | | | | | | | | | | | | We were never producing negative numbers for signed types. Also fix only producing half the valid range for uint32, and properly clamp signed values. Because this now also properly tests snorm with actually negative values, need to increase eps for such conversions. I believe these cannot actually be hit in ordinary operation (e.g. if a snorm texture is sampled and output to snorm RT, it will still go through snorm->float and float->snorm conversion), so don't bother to do anything to fix the bad accuracy (might be quite complex). Basically, the issue is for something like snorm16->snorm8 that in the end this will just use a 8 bit arithmetic right shift. But the math behind it says we should actually do a division by 32767 / 127, which is ~258, not 256. So the result can be one bit off (values have too large magnitude), and furthermore, the shift has incorrect rounding (always rounds down). For positive numbers, these errors have different direction, but for negative ones they have the same, hence for some values the error will be 2 bit in the end. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=106232
* gallium: add initial support for conservative rasterizationRhys Perry2018-04-301-0/+12
| | | | | | Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* util: Move util_is_power_of_two to bitscan.h and rename to ↵Ian Romanick2018-03-292-2/+2
| | | | | | | | | | | util_is_power_of_two_or_zero The new name make the zero-input behavior more obvious. The next patch adds a new function with different zero-input behavior. Signed-off-by: Ian Romanick <[email protected]> Suggested-by: Matt Turner <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]>
* gallium: add packed uniform CAPTimothy Arceri2018-03-201-0/+1
| | | | Reviewed-by: Marek Olšák <[email protected]>