aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/virgl/virgl_resource.c
Commit message (Collapse)AuthorAgeFilesLines
* virgl: Add override for BGRA format to use swizzled SRGB formatGert Wollny2019-06-201-0/+10
| | | | | | | | | | | Tie in the check whether the host supports tweaks and whether this tweak is enabled. v2: Add comment about the emulated formats not being used directly in the guest (Gurchetan) Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
* virgl: fix sync issue regarding discard/unsync transfersChia-I Wu2019-06-181-5/+15
| | | | | | | | | | | | | | | | | | | | | | GL_MAP_INVALIDATE_BUFFER_BIT cannot be treated as GL_MAP_INVALIDATE_RANGE_BIT naively. When we run into ptr = glMapBufferRange(buf, 0, size, GL_WRITE_BIT|GL_MAP_INVALIDATE_BUFFER_BIT); memcpy(ptr, data1, size); glUnmapBuffer(buf); ptr = glMapBufferRange(buf, size, size, GL_WRITE_BIT|GL_MAP_UNSYNCHRONIZED_BIT); memcpy(ptr, data2, size); glUnmapBuffer(buf); we never want data1 to be copy_transfer'ed. Because that would mean that data2 might overwrite valid data. Signed-off-by: Chia-I Wu <[email protected]> Reviewed-by: Alexandros Frantzis [email protected] Fixes: a22c5df0794 ("virgl: Use buffer copy transfers to avoid waiting when mapping") Reviewed-by: Emil Velikov <[email protected]>
* virgl: better support for PIPE_TRANSFER_DISCARD_WHOLE_RESOURCEChia-I Wu2019-06-171-25/+70
| | | | | | | | | | | | | | | | | | | | | | | When the resource to be mapped is busy and the backing storage can be discarded, reallocate the backing storage to avoid waiting. In this new path, we allocate a new buffer, emit a state change, write, and add the transfer to the queue . In the PIPE_TRANSFER_DISCARD_RANGE path, we suballocate a staging buffer, write, and emit a copy_transfer (which may allocate, memcpy, and blit internally). The win might not always be clear. But another win comes from that the new path clears res->valid_buffer_range and does not clear res->clean_mask. This makes it much more preferable in scenarios such as access = enough_space ? GL_MAP_UNSYNCHRONIZED_BIT : GL_MAP_INVALIDATE_BUFFER_BIT; glMapBufferRange(..., GL_MAP_WRITE_BIT | access); memcpy(...); // append new data glUnmapBuffer(...); Signed-off-by: Chia-I Wu <[email protected]> Reviewed-by: Alexandros Frantzis <[email protected]>
* virgl: save virgl_hw_res in virgl_transferChia-I Wu2019-06-171-0/+7
| | | | | | | | | | When PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE is properly supported, virgl_transfer might refer to a different virgl_hw_res than virgl_resource does. We need to save the virgl_hw_res and use the saved one. Signed-off-by: Chia-I Wu <[email protected]> Reviewed-by: Alexandros Frantzis <[email protected]>
* virgl: add resource_reference to virgl_winsysChia-I Wu2019-06-171-1/+1
| | | | | | | | It works similar to pipe_resource_reference but is for virgl_hw_res. It can also replace resource_unref. Signed-off-by: Chia-I Wu <[email protected]> Reviewed-by: Alexandros Frantzis <[email protected]>
* virgl: virgl_transfer should own its virgl_resourceChia-I Wu2019-06-121-1/+5
| | | | | | | | We should avoid having potentially dangling pointers to pipe_resources in general. Signed-off-by: Chia-I Wu <[email protected]> Reviewed-by: Alexandros Frantzis <[email protected]>
* virgl: pass virgl_context to transfer create/destroyChia-I Wu2019-06-121-4/+4
| | | | | | | | A pipe_transfer is a context object. It is fine for the constructor/destructor to have access to the context. Signed-off-by: Chia-I Wu <[email protected]> Reviewed-by: Alexandros Frantzis <[email protected]>
* virgl: Work around possible memory exhaustionAlexandros Frantzis2019-06-071-3/+14
| | | | | | | | | | | | | | | | | | | | | Since we don't normally flush before performing copy transfers, it's possible in some scenarios to use too much memory for staging resources and start failing. This can happen either because we exhaust the total available memory (including system memory virtio-gpu swaps out to), or, more commonly, because the total size of resources in a command buffer doesn't fit in virtio-gpu video memory. To reduce the chances of this happening, force a flush before a copy transfer if the total size of queued staging resources exceeds a certain limit. Since after a flush any queued staging resources will be eventually released, this ensures both that each command buffer doesn't require too much video memory, and that we don't end up consuming too much memory for staging resources in total. Fixes kernel errors reported when running texture_upload tests in glbench. Signed-off-by: Alexandros Frantzis <[email protected]> Reviewed-by: Chia-I Wu <[email protected]>
* virgl: Remove incorrect resource wait conditionAlexandros Frantzis2019-06-071-13/+0
| | | | | | | | | | Now that we have copy transfers in place, we can remove the incorrect resource wait condition. Copy transfers and other optimizations minimize the performance impact of this removal, while providing the correct behavior. Signed-off-by: Alexandros Frantzis <[email protected]> Reviewed-by: Chia-I Wu <[email protected]>
* virgl: Use copy transfers for texturesAlexandros Frantzis2019-06-071-4/+52
| | | | | | | | | | Extend copy transfers to also be used for busy textures. Performance results: Unigine Valley, qemu before: 22.7 FPS after: 23.1 FPS Signed-off-by: Alexandros Frantzis <[email protected]> Reviewed-by: Chia-I Wu <[email protected]>
* virgl: Use buffer copy transfers to avoid waiting when mappingAlexandros Frantzis2019-06-071-2/+82
| | | | | | | | | | | | | | | | | | | | We typically need to wait for a buffer to become ready before mapping, so that we don't write new contents while the host is still using the old contents. However, if we are allowed to discard the contents of the mapped buffer range, then we can avoid waiting by using a staging buffer range which we guarantee to never be busy, copying from the staging buffer range to the target buffer in the host. This commit implements this optimization by utilizing a dedicated u_upload_mgr for the staging buffer. Performance results: Twilight Struggle (Steam/Proton), qemu before: 7 FPS after: 25 FPS glmark2 ubo, qemu before: 38 FPS after: 331 FPS Signed-off-by: Alexandros Frantzis <[email protected]> Suggested-by: Gurchetan Singh <[email protected]> Reviewed-by: Chia-I Wu <[email protected]>
* virgl: Support copy transfersAlexandros Frantzis2019-06-071-0/+3
| | | | | | | | | | | | | | | Support transfers that use a different resource as the source of data to transfer. This will be used in upcoming commits to send data to host buffers through a transfer upload buffer, in order to avoid waiting when the buffer resource is busy. Note that we don't support queueing copy transfers in the transfer queue. Copy transfers should be emitted directly in the command queue, allowing us to avoid flushes before them and leads to better performance. Signed-off-by: Alexandros Frantzis <[email protected]> Reviewed-by: Chia-I Wu <[email protected]>
* virgl: Support VIRGL_BIND_STAGINGAlexandros Frantzis2019-06-071-1/+1
| | | | | | | | | Support a new virgl bind type for staging buffers which don't require dedicated host-side storage. These will be used to implement copy transfers. Signed-off-by: Alexandros Frantzis <[email protected]> Reviewed-by: Chia-I Wu <[email protected]>
* virgl: Avoid unfinished transfer_get with PIPE_TRANSFER_DONTBLOCKAlexandros Frantzis2019-06-071-9/+12
| | | | | | | | | | | | | If we are not allowed to block, and we know that we will have to wait, either because the resource is busy, or because it will become busy due to a readback, return early to avoid performing an incomplete transfer_get. Such an incomplete transfer_get may finish at any time, during which another unsynchronized map could write to the resource contents, leaving the contents in an undefined state. Signed-off-by: Alexandros Frantzis <[email protected]> Suggested-by: Chia-I Wu <[email protected]> Reviewed-by: Chia-I Wu <[email protected]>
* virgl: fix readback with pending transfersChia-I Wu2019-05-291-6/+26
| | | | | | | | | | When readback is true, and there are pending writes in the transfer queue, we should flush to avoid reading back outdated data. This fixes piglit arb_copy_buffer/dlist and a subtest of arb_copy_buffer/data-sync. Signed-off-by: Chia-I Wu <[email protected]> Reviewed-by: Alexandros Frantzis <[email protected]>
* virgl: remove an incorrect check in virgl_res_needs_flushChia-I Wu2019-05-241-15/+0
| | | | | | | | | | | | | | | | | | | | | Imagine this resource_copy_region(ctx, dst, ..., src, ...); transfer_map(ctx, src, 0, PIPE_TRANSFER_WRITE, ...); at the beginning of a cmdbuf. We need to flush in transfer_map so that the transfer is not reordered before the resource copy. The check for "vctx->num_draws == 0 && vctx->num_compute == 0" is not enough. Removing the optimization entirely. Because of the more precise resource tracking in the previous commit, I hope the performance impact is minimized. We will have to go with perfect resource tracking, or attempt a more limited optimization, if there are specific cases we really need to optimize for. Signed-off-by: Chia-I Wu <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
* virgl: track valid buffer range for transfer syncChia-I Wu2019-05-221-19/+31
| | | | | | | | | | | | | | | | | | | | | | | | | | virgl_transfer_queue_is_queued was used to avoid flushing. That fails when the resource is being accessed by previous cmdbufs but not the current one. The new approach of tracking the valid buffer range does not apply to textures however. But hopefully it is fine because the goal is to avoid waiting for this scenario glBufferSubData(..., offset, size, data1); glDrawArrays(...); // append new vertex data glBufferSubData(..., offset+size, size, data2); glDrawArrays(...); If glTex(Sub)Image* turns out to be an issue, we will need to track valid level/layer ranges as well. v2: update virgl_buffer_transfer_extend as well v3: do not remove virgl_transfer_queue_is_queued Signed-off-by: Chia-I Wu <[email protected]> Reviewed-by: Alexandros Frantzis <[email protected]> (v1) Reviewed-by: Gurchetan Singh <[email protected]> (v2)
* virgl: handle DONT_BLOCK and MAP_DIRECTLYChia-I Wu2019-05-151-2/+17
| | | | | | | | | | Handle PIPE_TRANSFER_DONT_BLOCK and PIPE_TRANSFER_MAP_DIRECTLY. Make virgl_resource_transfer_prepare return an enum instead of a bool for extensibility (e.g., instruct the callers to map differently). Signed-off-by: Chia-I Wu <[email protected]> Reviewed-by: Alexandros Frantzis <[email protected]>
* virgl: add virgl_resource_transfer_prepareChia-I Wu2019-05-151-5/+44
| | | | | | | | | | | | virgl_resource_transfer_prepare should be called before mapping to prepare the resource. It does flush, readback, and wait as needed. virgl_res_needs_flush and virgl_res_needs_readback become internal helpers to the new function. There should be no externally visible change. Signed-off-by: Chia-I Wu <[email protected]> Reviewed-by: Alexandros Frantzis <[email protected]>
* virgl: honor DISCARD_WHOLE_RESOURCE in virgl_res_needs_readbackChia-I Wu2019-05-151-1/+2
| | | | | Signed-off-by: Chia-I Wu <[email protected]> Reviewed-by: Alexandros Frantzis <[email protected]>
* virgl: clean up virgl_res_needs_readbackChia-I Wu2019-05-151-5/+16
| | | | | | | Add comments and follow the coding style of virgl_res_needs_flush. Signed-off-by: Chia-I Wu <[email protected]> Reviewed-by: Alexandros Frantzis <[email protected]>
* virgl: clean up virgl_res_needs_flushChia-I Wu2019-05-141-2/+34
| | | | | | | | | | | Add comments and some minor cleanups. v2: document the function Signed-off-by: Chia-I Wu <[email protected]> Reviewed-by: Alexandros Frantzis <[email protected]> (v1) Reviewed-by: Gurchetan Singh <[email protected]> Signed-off-by: Chia-I Wu <[email protected]>
* virgl: do not skip readback because of explicit flushChia-I Wu2019-05-141-3/+0
| | | | | | | | | | Both apps and we (see virgl_buffer_transfer_flush_region) might flush regions that are unmodified. We have to read back for those flushes. Signed-off-by: Chia-I Wu <[email protected]> Reviewed-by: Alexandros Frantzis <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
* virgl: do not use inline writes for subdataChia-I Wu2019-05-061-4/+7
| | | | | | | | | | | | | | Inline writes skip transfer map/unamp at the cost of an extra copy on the data during execbuffer. That is generally a win for small transfers. But the heuristic to use inline writes based on buffer sizes rather than transfer sizes makes little sense. More importantly, inline writes miss optimizations that are done for buffer transfers. Let's just use transfers. Signed-off-by: Chia-I Wu <[email protected]> Reviewed-By: Gert Wollny <[email protected]>
* virgl: Re-use and extend queue transfers for intersecting buffer subdatas.David Riley2019-05-011-0/+46
| | | | | | | | Small buffer subdatas which are essentially doing a memcpy were getting bogged down by all the overhead of creating new transfers. Signed-off-by: David Riley <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
* virgl: add support for missing command buffer binding.Dave Airlie2019-04-091-1/+1
| | | | | | | When I added indirect support I forgot this, however to use it now we need to check for a new enough capability on the host side. Reviewed-By: Gert Wollny <[email protected]>
* virgl: use uint16_t mask instead of separate booleansGurchetan Singh2019-03-131-8/+7
| | | | | | | This should save some space. Suggested-by: Erik Faye-Lund <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* virgl: use virgl_transfer_inline_write even lessGurchetan Singh2019-02-151-1/+1
| | | | | | | | | | | | | | We've noticed the Team Fortress 2 engine seems to do many small calls to glSubData(..). Let's pick our heuristic based on the resource base width, not the size of a particular upload. This will cause transfers to be batched together in the transfer queue. Revelant glbench microbenchmark -- Before: buffer_upload_dynamic_element_array_131072 = 131.17 mbytes_sec After: buffer_upload_dynamic_element_array_131072 = 6828.24 mbytes_sec Reviewed-by: Gert Wollny <[email protected]>
* virgl: use transfer queueGurchetan Singh2019-02-151-0/+2
| | | | | | | | | | This improves Unigine Valley benchmark by 3 to 10 fps (depending on the scene). It also improves the Team Fortress 2 benchmark from 6 fps to 13 fps (host: 20 fps). Reviewed-by: Gert Wollny <[email protected]>
* virgl: add extra checks in virgl_res_needs_flush_waitGurchetan Singh2019-02-151-4/+9
| | | | | | | | | | | | | | | | | | | | | | This is motivated by the following scenario: glSubBufferData(GL_ARRAY_BUFFER, ...) glFlush(..) glSubBufferData(GL_ARRAY_BUFFER, ...) glSubBufferData(GL_ARRAY_BUFFER, ...) glSubBufferData(GL_ARRAY_BUFFER, ...) This increases @davidriley's Team Fortress 2 apitrace from 1 fps to 6 fps and helps with the Chromium glbench microbenchmarks: Before: texture_update_rgba_texsubimage2d_2048 = 554.96 mtexel_sec buffer_upload_dynamic_array_12 = 0.02 mbytes_sec buffer_upload_dynamic_array_576 = 1.07 mbytes_sec After: texture_update_rgba_texsubimage2d_2048 = 612.29 mtexel_sec buffer_upload_dynamic_array_12 = 2.22 mbytes_sec buffer_upload_dynamic_array_576 = 164.89 mbytes_sec Reviewed-by: Gert Wollny <[email protected]>
* virgl: pass virgl transfer to virgl_res_needs_flush_waitGurchetan Singh2019-02-151-3/+4
| | | | Reviewed-by: Gert Wollny <[email protected]>
* virgl: when creating / freeing transfers, pass slab pool directlyGurchetan Singh2019-02-151-5/+4
| | | | | | | This will allow us to destroy transfers w/o having a pointer to the context. Reviewed-by: Gert Wollny <[email protected]>
* virgl: track level cleanliness rather than resource cleanlinessGurchetan Singh2019-02-151-4/+8
| | | | | | This allows a minor optimization for texture upload. Reviewed-by: Gert Wollny <[email protected]>
* virgl: use virgl_resource_dirty helperGurchetan Singh2019-02-151-0/+6
| | | | Reviewed-by: Gert Wollny <[email protected]>
* virgl: add ability to do finer grain dirty trackingGurchetan Singh2019-02-151-2/+4
| | | | | | There are levels to cleanliness. Reviewed-by: Gert Wollny <[email protected]>
* virgl: move resource creation / import / destruction to common codeGurchetan Singh2018-12-191-10/+74
| | | | | | We can remove some duplicated code. Reviewed-by: Elie Tournier <[email protected]>
* virgl: make transfer code with PIPE_BUFFER targetsGurchetan Singh2018-12-191-2/+4
| | | | | | | util_format_get_blocksize returns 1 for R8 formats (all PIPE_BUFFERs are R8). Reviewed-by: Elie Tournier <[email protected]>
* virgl: consolidate transfer codeGurchetan Singh2018-12-191-8/+48
| | | | | | | | We could allocate and destroy transfers in one place. v2: Keep l_stride around. Reviewed-by: Elie Tournier <[email protected]>
* virgl: store layer_stride in metadataGurchetan Singh2018-12-191-6/+5
| | | | Reviewed-by: Elie Tournier <[email protected]>
* virgl: move vrend_get_tex_image_offset to common codeGurchetan Singh2018-12-191-0/+24
| | | | | | Will be reused. Reviewed-by: Elie Tournier <[email protected]>
* virgl: move virgl_resource_layout to common codeGurchetan Singh2018-12-191-0/+37
| | | | | | Will be reused. Reviewed-by: Elie Tournier <[email protected]>
* virgl: avoid large inline transfersGurchetan Singh2018-11-301-1/+5
| | | | | | | | | | | | | | | | | | | | We flush everytime the command buffer (16 kB) is full, which is quite costly. This improves dEQP-GLES3.performance.buffer.data_upload.function_call.buffer_data.new_buffer.usage_stream_draw from 111.16 MB/s to 1930.36 MB/s. In addition, I made the benchmark produce buffers from 0 --> VIRGL_MAX_CMDBUF_DWORDS * 4, and tried ((VIRGL_MAX_CMDBUF_DWORDS * 4) / 2), ((VIRGL_MAX_CMDBUF_DWORDS * 4) / 4), etc. I didn't notice any clear differences, so let's just go with the most obvious heuristic. Tested-By: Gert Wollny <[email protected]> Reviewed-by: Erik Faye-Lund <[email protected]>
* gallium/util: don't modify usage in pipe_buffer_writeMarek Olšák2016-07-231-0/+5
| | | | | | All drivers were already doing it except virgl. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: split transfer_inline_write into buffer and texture callbacksMarek Olšák2016-07-231-1/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | to reduce the call indirections with u_resource_vtbl. The worst call tree you could get was: - u_transfer_inline_write_vtbl - u_default_transfer_inline_write - u_transfer_map_vtbl - driver_transfer_map - u_transfer_unmap_vtbl - driver_transfer_unmap That's 6 indirect calls. Some drivers only had 5. The goal is to have 1 indirect call for drivers that care. The resource type can be determined statically at most call sites. The new interface is: pipe_context::buffer_subdata(ctx, resource, usage, offset, size, data) pipe_context::texture_subdata(ctx, resource, level, usage, box, data, stride, layer_stride) v2: fix whitespace, correct ilo's behavior Reviewed-by: Nicolai Hähnle <[email protected]> Acked-by: Roland Scheidegger <[email protected]>
* gallium: add external usage flags to resource_from(get)_handle (v2)Marek Olšák2016-03-091-1/+2
| | | | | | | | | This will allow drivers to make better decisions about texture sharing for DRI2, DRI3, Wayland, and OpenCL. v2: add read/write flags, take advantage of __DRI_IMAGE_USE_BACKBUFFER Reviewed-by: Axel Davy <[email protected]>
* virgl: unwrap the includesEmil Velikov2015-10-301-1/+2
| | | | | | | | Include what you want, rather than relying on a header foo.h N levels down the include chain, to provide something that you need. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* virgl: use virgl_screen/surface upcast wrappersEmil Velikov2015-10-301-2/+2
| | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* virgl: add driver for virtio-gpu 3D (v2)Dave Airlie2015-10-231-0/+89
virgl is the 3D acceleration backend for the virtio-gpu shipping with qemu. The 3D acceleration is designed around gallium and TGSI as the virtualisation layer. The backend renderer translates the virgl interface into OpenGL currently. This is the initial import of the driver to mesa. The kernel driver portions are lined up for drm-next. Currently this driver supports up to GL3.3 and some misc extensions if the host driver exposes it. It is planned to iterate the virgl API to new GL levels as mesa host drivers gain features. v2: fix resource tracking across flushes to avoid ->bind hack in mapping. consolidate mapping and waiting code for transfers. use u_range for dirt tracking. handle larger shaders in protocol. include virtgpu_drm.h in mesa for now. add translation layer for gallium tgsi to virgl tgsi. Signed-off-by: Dave Airlie <[email protected]>