aboutsummaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* i965: move OA accumulation code to intel/perfLionel Landwerlin2019-04-175-199/+229
| | | | | | | We'll want to reuse this in our Vulkan extension. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Mark Janes <[email protected]>
* i965: move mdapi data structure to intel/perfLionel Landwerlin2019-04-173-97/+128
| | | | | | | We'll want to reuse those structures later on. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Mark Janes <[email protected]>
* i965: extract performance query metricsLionel Landwerlin2019-04-1731-866/+1098
| | | | | | | | | | We would like to reuse performance query metrics in other APIs. Let's make the query code dealing with the processing of raw counters into human readable values API agnostic. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Mark Janes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: store device revision in gen_device_infoLionel Landwerlin2019-04-174-6/+5
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler/icl: Use tcs barrier id bits 24:30 instead of 24:27Topi Pohjolainen2019-04-171-7/+17
| | | | | | | | | Similarly to 1cc17fb731466c68586915acbb916586457b19bc Fixes gpu hangs with dEQP-VK.tessellation.shader_input_output.barrier Reviewed-by: Anuj Phogat <[email protected]> Signed-off-by: Topi Pohjolainen <[email protected]>
* virgl: document potentially failing blitErik Faye-Lund2019-04-171-0/+6
| | | | | | | | | This blit can fail, but this is not new; in the old version we didn't even try to blit in this case. So let's just document the limitation for now, and leave this for another day. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
* virgl: do color-conversion during when mapping transferErik Faye-Lund2019-04-171-10/+70
| | | | | | | | | | | | | When running on OpenGL ES, we can't just map any format for reading, because of limitations on glReadPixels. So let's fall back to the blit code-path, and translate the pixels to the correct format in the end. This fixes the remaining failures of KHR-GL32.packed_pixels.* apart from the sRGB tests. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
* virgl: only blit if resource is readErik Faye-Lund2019-04-171-2/+5
| | | | | Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
* virgl: get readback-formats from hostErik Faye-Lund2019-04-173-0/+44
| | | | | Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
* gallium/util: support translating between uint and sint formatsErik Faye-Lund2019-04-171-0/+62
| | | | | | | | Without this, we can't for instance convert between r8_sint and r8g8b8a8_sint. But that's pretty useful, so let's support it as well. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
* virgl: make sure bind is set for non-buffersErik Faye-Lund2019-04-171-0/+3
| | | | | | | Otherwise, virglrenderer will reject the resource. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
* virgl: support write-back with staged transfersErik Faye-Lund2019-04-172-22/+49
| | | | | | | | | | | | | | | | We currently don't support writing to resources that uses a temporary staging-resource to resolve the pixels. If a write-bit was set, we forgot to perform a blit back to the old resource, followed by trying to update the wrong resource, which lacks backing-storage. The end-result would be that nothing useful happened. This approach also fixes a few smaller bugs, like using the wrong box (without x y and z zeroed out), which means a partial update of a multisampled texture could result in the wrong part of the texture being updated. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
* virgl: use pipe_box for blit dst-rectErik Faye-Lund2019-04-171-5/+12
| | | | | Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
* virgl: rewrite core of virgl_texture_transfer_mapErik Faye-Lund2019-04-171-36/+58
| | | | | Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
* virgl: return error if allocating resolve_tmp failsErik Faye-Lund2019-04-171-0/+4
| | | | | Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
* virgl: wait for the right resourceErik Faye-Lund2019-04-171-1/+1
| | | | | | | | In case we're resolving, we need to wait for the resolved resource instead of the original one. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
* virgl: check for readback on correct resourceErik Faye-Lund2019-04-171-1/+1
| | | | | Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
* virgl: make unmap queuing a bit more straight-forwardErik Faye-Lund2019-04-171-5/+7
| | | | | | | | | It's hard to read the code that decides if we want to queue up an unmap or destroy the transfer right away. So let's make it a bit simpler, by setting a bool in case we want to queue it. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
* virgl: simplify virgl_texture_transfer_unmap logicErik Faye-Lund2019-04-171-13/+9
| | | | | | | | There's no reason to keep an extra indentation level here, let's merge the two if-conditions. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
* virgl: track full virgl_resource instead of just virgl_hw_resErik Faye-Lund2019-04-171-5/+5
| | | | | Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
* virgl: tmp_resource -> templErik Faye-Lund2019-04-171-4/+3
| | | | | | | | This isn't the temporary resource itself, it's the template that we'll create the resource from. So let's name it appropriately. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
* virgl: remove pointless transfer-counterErik Faye-Lund2019-04-174-4/+2
| | | | | | | This is only written to, never read. Let's just get rid of it. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
* radeonsi/nir: fix scanning of bindless imagesTimothy Arceri2019-04-171-38/+37
| | | | Fixes: d62d434fe920 ("ac/nir_to_llvm: add image bindless support")
* iris: Add texture cache flushing hacks for blit and resource_copy_regionKenneth Graunke2019-04-161-0/+36
| | | | | | | | | | | | | This is a port of Jason's 8379bff6c4456f8a77041eee225dcd44e5e00a76 from i965 to iris. We can't find anything relevant in the documentation and no one we've talked to has been able to help us pin down a solution. Unfortunately, we have to put the hack in both iris_blit() and iris_copy_region(). st/mesa's CopyImage() implementation sometimes chooses to use pipe->blit() instead of pipe->resource_copy_region(). For blits, we only do the hack if the blit source format doesn't match the underlying resource (i.e. it's reinterpreting the bits). Hopefully this should not be too common.
* v3d: Always set up the qregs for CSD payload.Eric Anholt2019-04-161-10/+2
| | | | | | | | We were failing to set up payload[1] for use by LocalInvocationIndex/ID and shared variable accesses if gl_WorkGroupID/gl_GlobalInvocationID wasn't used (possibly because you only have one workgroup). You're always going to use payload[1], and payload[0] is common enough and we have DCE in the backend to clean it up if it happens to not be used.
* v3d: Only look up the 3rd texture gather offset for non-arrays.Eric Anholt2019-04-161-1/+1
| | | | | | | Fixes assertion failures in the CTS since Karol's cleanup when NIR started noticing that we were reading an invalid component. Fixes: 5450f1c9fb09 ("v3d: prefer using nir_src_comp_as_int over nir_src_as_const_value")
* spirv: Tell which opcode or value is unhandled when failingCaio Marcelo de Oliveira Filho2019-04-164-42/+61
| | | | | | | | | | | | | | | v2: When available, include the opcode name too. (Karol) v3: Use more to_string helpers. (Karol) Include the wrong bit_size in those failures. Include the capability number in spv_check_supported. Provide vtn_fail_with_* macros to avoid noise in the call sites. v4: Provide macros only for opcode and decoration, which have enough usages to justify them. (Jason) Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Karol Herbst <[email protected]>
* spirv: Add more to_string helpersCaio Marcelo de Oliveira Filho2019-04-162-3/+15
| | | | | | | | Also, use a set to identify repeated values. The previous arrangement worked when the repetitions were one after another, but in some of the new cases they are not. Reviewed-by: Karol Herbst <[email protected]>
* intel/mi_builder: Disable mem_mem tests on IVBJason Ekstrand2019-04-161-0/+3
| | | | Tested-by: Clayton Craft <[email protected]>
* iris: Change vendor and renderer stringsKenneth Graunke2019-04-161-1/+4
| | | | | | | | | | | | | | | | | This patch changes the GL_VENDOR string from "Mesa Project" to "Intel". This makes GLX_MESA_query_renderer report "Vendor: Intel (0x8086)" instead of "Vendor: Mesa Project (0x8086)" which is arguably wrong. We now also use a consistent vendor string across Windows and Linux. It also prepends "Mesa" to the GL_RENDERER string, both to credit the community and have a distinguishing mark between the two drivers. We drop "DRI" compared to i965, as it's not really that important. Improves performance in Portal by 1.8x. Iris is now 3.86% faster than i965 at the portal-d1.dem timedemo on my Kabylake laptop. One change is that Portal selects the MapBufferRange path based on the vendor string, and iris's BufferSubData path is still missing the storage invalidation optimization.
* intel/mi_builder: Re-order an initializerJason Ekstrand2019-04-161-2/+2
| | | | | | The order doesn't matter in C99 but some C++ compilers seem to care. Tested-by: Clayton Craft <[email protected]>
* nir/algebraic: Use a cache to avoid re-emitting structsJason Ekstrand2019-04-161-17/+36
| | | | | | | | | | | | | | | | | | | This takes the stupid simplest and most reliable approach to reducing redundancy that I could come up with: Just use the struct declaration as the cach key. This cuts the size of the generated C file to about half and takes about 50 KiB off the .data section. size before (release build): text data bss dec hex filename 5363833 336880 13584 5714297 573179 _install/lib64/libvulkan_intel.so size after (release build): text data bss dec hex filename 5229017 285264 13584 5527865 545939 _install/lib64/libvulkan_intel.so Reviewed-by: Connor Abbott <[email protected]>
* nir/algebraic: Move the template closer to the render functionJason Ekstrand2019-04-161-19/+19
| | | | Reviewed-by: Connor Abbott <[email protected]>
* iris: Move iris_debug_recompile calls before uploading.Kenneth Graunke2019-04-161-33/+33
| | | | | | | | Order of operations is important, otherwise we'll find the program we just uploaded as the "old" compile and get confused why nothing is different between the two keys. Reviewed-by: Jordan Justen <[email protected]>
* iris: Print the reason for shader recompiles.Kenneth Graunke2019-04-161-6/+30
| | | | | | | I was lazy earlier and hadn't bothered typing / refactoring this. Now I'm hitting some extra recompiles and would like to see why. Reviewed-by: Jordan Justen <[email protected]>
* i965: Move program key debugging to the compiler.Kenneth Graunke2019-04-1613-283/+275
| | | | | | | | | | | | | | | | | | | The i965 driver has a bunch of code to compare two sets of program keys and print out the differences. This can be useful for debugging why a shader needed to be recompiled on the fly due to non-orthogonal state dependencies. anv doesn't do recompiles, so we didn't need to share this in the past - but I'd like to use it in iris. This moves the bulk of the code to the compiler where it can be reused. To make that possible, we need to decouple it from i965 - we can't get at the brw program cache directly, nor use brw_context to print things. Instead, we use compiler->shader_perf_log(), and simply pass in keys. We put all of this debugging code in brw_debug_recompile.c, and only export a single function, for simplicity. I also tidied the code a bit while moving it, now that it all lives in one file. Reviewed-by: Jordan Justen <[email protected]>
* winsys/amdgpu: don't set GTT with GDS & OA placements on APUsMarek Olšák2019-04-161-9/+11
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* nir: optimize gl_SampleMaskIn to gl_HelperInvocation for radeonsi when possibleMarek Olšák2019-04-164-2/+48
| | | | Acked-by: Timothy Arceri <[email protected]>
* st/va/enc: Add support for frame_cropping_flag of ↵suresh guttula2019-04-161-0/+8
| | | | | | | | | | | | VAEncSequenceParameterBufferH264 This patch will add support for frame_cropping when the input size is not matched with aligned size. Currently vaapi driver ignores frame cropping values provided by client. This change will update SPS nalu with proper cropping values. Signed-off-by: Satyajit Sahu <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* radeon/vce:Add support for frame_cropping_flag of ↵suresh guttula2019-04-161-2/+9
| | | | | | | | | | | | | | VAEncSequenceParameterBufferH264 This patch will add support for frame_cropping when the input size is not matched with aligned size. Currently vaapi driver ignores frame cropping values provided by client. This change will update SPS nalu with proper cropping values. v2: Moving default crop setting to else when enc_frame_cropping_flag is not set. Signed-off-by: Satyajit Sahu <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* vl: Add cropping flags for H264suresh guttula2019-04-161-0/+5
| | | | | | | This patch adds cropping flags for H264 in pipe_h264_enc_pic_control. Signed-off-by: Satyajit Sahu <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* compiler/glsl: handle case where we have multiple users for typesTapani Pälli2019-04-168-11/+66
| | | | | | | | | | | | | | | | | | Both Vulkan and OpenGL might be using glsl_types simultaneously or we can also have multiple concurrent Vulkan instances using glsl_types. Patch adds a one time init to track number of users and will release types only when last user calls _glsl_type_singleton_decref(). This change fixes glsl_type memory leaks we have with anv driver. v2: reuse hash_mutex, cleanup, apply fix also to radv driver and rename helper functions (Jason) v3: move init, destroy to happen on GL context init and destroy Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/compiler: Do not reswizzle dst if instruction writes to flag registerDanylo Piliaiev2019-04-161-0/+6
| | | | | | | | | | If we write to the flag register changing the swizzle would change what channels are written to the flag register. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110201 Fixes: 4cd1a0be Signed-off-by: Danylo Piliaiev <[email protected]> Reviewed-by: <[email protected]>
* radv: sort the shader capabilities alphabeticallySamuel Pitoiset2019-04-161-3/+3
| | | | Signed-off-by: Samuel Pitoiset <[email protected]>
* iris: Make shader_perf_log print to stderr if INTEL_DEBUG=perf is setKenneth Graunke2019-04-151-4/+11
| | | | | This matches i965's behavior, and makes sure that shader compiler messages are visible when setting INTEL_DEBUG=perf.
* radv: enable shaderInt8 on SI and CIKSamuel Pitoiset2019-04-162-4/+3
| | | | | | | No CTS failures. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* virgl: fix fence fd version checkChia-I Wu2019-04-151-2/+2
| | | | | | | Fixes: d1a1c21e762 ("virgl: native fence fd support") Signed-off-by: Chia-I Wu <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* virgl: introduce virgl_drm_fenceChia-I Wu2019-04-152-45/+115
| | | | | | | | | virgl_drm_fence can wrap either a fence fd or a virgl_hw_res. Because a fence fd is cheaper than a virgl_hw_res, we use it whenever it is available. Signed-off-by: Chia-I Wu <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* virgl: hide fence internals from the driverChia-I Wu2019-04-154-41/+44
| | | | | | | | | | | | | | Fence fds are cheaper than resources. We want to let winsys make the decision and use fence fds whenever they are supported. This commit prepares the work. For the moment, we create a resource _and_ a fence fd when supports_fences is true. This will be fixed such that we create a resource _or_ a fence fd. (And because of a version check bug that we will fix later, supports_fences is actually never true). Signed-off-by: Chia-I Wu <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* virgl: handle fence_server_sync in winsysChia-I Wu2019-04-156-20/+28
| | | | | | | | It does not need help from the driver. This also fixes one issue where the fence is ignored when the transfer queue is full. Signed-off-by: Chia-I Wu <[email protected]> Reviewed-by: Emil Velikov <[email protected]>