summaryrefslogtreecommitdiffstats
path: root/src/amd/vulkan
Commit message (Collapse)AuthorAgeFilesLines
* radv: add some cxxflags for new c++ fileDave Airlie2018-07-101-0/+4
| | | | | | | Looks like I broke intel CI compiles. Fixes: 6f3aee40f9 (radv: using tls to store llvm related info and speed up compiles (v10)) Tested-by: Clayton Craft <[email protected]>
* anv,radv: Add support for VK_KHR_get_display_properties2Jason Ekstrand2018-07-092-0/+58
| | | | Reviewed-by: Keith Packard <[email protected]>
* radv: using tls to store llvm related info and speed up compiles (v10)Dave Airlie2018-07-108-28/+199
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This uses the common compiler passes abstraction to help radv avoid fixed cost compiler overheads. This uses a linked list per thread stored in thread local storage, with an entry in the list for each target machine. This should remove all the fixed overheads setup costs of creating the pass manager each time. This takes a demo app time to compile the radv meta shaders on nocache and exit from 1.7s to 1s. It also has been reported to take the startup time of uncached shaders on RoTR from 12m24s to 11m35s (Alex) v2: fix llvm6 build, inline emit function, handle multiple targets in one thread v3: rebase and port onto new structure v4: rename some vars (Bas) v5: drag all code into radv for now, we can refactor it out later for radeonsi if we make it shareable v6: use a bit more C++ in the wrapper v7: logic bugs fixed so it actually runs again. v8: rebase on top of radeonsi changes. v9: drop some C++ headers, cleanup list entry v10: use pop_back (didn't have enough caffeine) Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add the trace BO to the list when starting a new cmdbufSamuel Pitoiset2018-07-091-4/+7
| | | | | | | | That might reduce CPU overhead a little bit when using RADV_TRACE_FILE. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: reduce CPU overhead in radv_flush_descriptors()Samuel Pitoiset2018-07-093-11/+8
| | | | | | | | The number of enabled descriptors for a given pipeline stage can be computed at compile time. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: winsys/amdgpu: include missing pthread.h headerMauro Rossi2018-07-071-0/+1
| | | | | | | | | | | | | | | | | | pthread types are used in some files without explicitely including pthread.h. This leads to compile errors on Android 7.x nougat-x86 e.g. in src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h In file included from external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c:31: In file included from external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.h:32: external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h:52:2: error: unknown type name 'pthread_mutex_t' pthread_mutex_t global_bo_list_lock; ^ 1 error generated. Including pthread.h explicitely solves the building error Signed-off-by: Mauro Rossi <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* python: Use the print functionMathieu Bridon2018-07-061-46/+47
| | | | | | | | | | | | In Python 2, `print` was a statement, but it became a function in Python 3. Using print functions everywhere makes the script compatible with Python versions >= 2.6, including Python 3. Signed-off-by: Mathieu Bridon <[email protected]> Acked-by: Eric Engestrom <[email protected]> Acked-by: Dylan Baker <[email protected]>
* radv: fix emitting the view index on GFX9Samuel Pitoiset2018-07-061-1/+2
| | | | | | | | For merged shaders, VS as HS for example. Cc: <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/winsys: make use of radeon_emit()Samuel Pitoiset2018-07-051-11/+12
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: only flush CB meta in pipeline image barriers when neededSamuel Pitoiset2018-07-052-2/+15
| | | | | | | | If the given image doesn't enable CMASK, FMASK or DCC that's useless to flush CB metadata. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: only flush DB meta in pipeline image barriers when neededSamuel Pitoiset2018-07-051-7/+15
| | | | | | | | If the given image doesn't have HTILE, that's useless to flush DB metadata. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix "error: initializer element is not constant" build errorSamuel Pitoiset2018-07-051-2/+2
| | | | | | | | GCC 4.8 fails to compile with "static const", while GCC 8.1 fails to compile with only "static". Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* python: Specify the JSON separatorsMathieu Bridon2018-07-051-1/+1
| | | | | | | | | | | | | | | On Python 2, the default JSON separators are ', ' for items and ': ' for dicts. On Python 3, the default is the same when no indent is specified, but if one is (and we do specify one) then the default items separator becomes ',' (the dict separator remains unchanged). This change explicitly specifies the Python 3 default, which helps ensuring that the output is identical, whether it was generated by Python 2 or 3. Reviewed-by: Eric Engestrom <[email protected]>
* radv: optimize vkCmd{Set,Reset}Event() a little bitSamuel Pitoiset2018-07-051-8/+38
| | | | | | | | | | | | Always emitting a bottom-of-pipe event is quite dumb. Instead, start to optimize these functions by syncing PFP for the top-of-pipe and syncing ME for the post-index-fetch event. This can still be improved by emitting EOS events for syncing PS and CS stages. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: optimize radv_CmdWaitEvents()Samuel Pitoiset2018-07-051-43/+60
| | | | | | | | | | | | This introduces radv_barrier() (same as the draw/dispatch codepath). This helper is used for merging the code from CmdWaitEvents() and CmdPipelineBarrier because it's quite similar. We do ignore the source stage mask for CmdWaitEvents because it's irrelevant when event objects are used. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: fold LLVMContext creation into ac_llvm_context_initMarek Olšák2018-07-041-6/+4
| | | | Reviewed-by: Dave Airlie <[email protected]>
* ac/radv: move llvm compiler info to struct and init in one placeDave Airlie2018-07-043-29/+22
| | | | | | | | | | | | This ports radv to the shared code, however due to a bug in LLVM version prior to 7, radv cannot add target info at this stage, as it would leak one for every shader compile, however I'd prefer to keep this llvm damage in the shared code, since it isn't the driver at fault here. We just add a flag to denote if the driver can support leaking the target info or not, and the common code does the right thing depending on the llvm version. Reviewed-by: Marek Olšák <[email protected]>
* radv/radeonsi: add a check ir tm optionsDave Airlie2018-07-041-1/+3
| | | | | | | This doesn't do much yet, but it makes it easier to move the code to a common shared code base. Reviewed-by: Marek Olšák <[email protected]>
* radv: create/destroy passmgr at the higher level.Dave Airlie2018-07-043-10/+14
| | | | | | | This is prep work for moving this to a per-thread struct Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radv: port to use common passmgr code.Dave Airlie2018-07-041-23/+3
| | | | | | | | This adds a inline always pass, but otherwise should work the same. Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: drop copy of ac_create_target_machine.Dave Airlie2018-07-041-31/+1
| | | | | | Once we split the init once stuff out, this can be shared again. Reviewed-by: Marek Olšák <[email protected]>
* ac/radv: split the non-common init_once code from the common target code. (v2)Dave Airlie2018-07-041-16/+4
| | | | | | | | This just splits out the non-shared code and reuses ac_get_llvm_target in radv. v2: rebase on Marek's patch - fixup brace position/whitespace Reviewed-by: Marek Olšák <[email protected]>
* ac: move all LLVM module initialization into ac_create_moduleMarek Olšák2018-07-021-10/+2
| | | | | | This removes some ugly code around module initialization. Reviewed-by: Dave Airlie <[email protected]>
* radv: reset the image's predicate after a color decompression passSamuel Pitoiset2018-07-021-0/+5
| | | | | | | | | | | After performing a fast-clear eliminate, a FMASK decompress, or a DCC decompress, we can reset the predicate to FALSE. With that, the GPU should be able to skip unnecessary color decompression passes. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: enable/disable predication for the DCC decompression passSamuel Pitoiset2018-07-021-2/+2
| | | | | | | | Performing a DCC decompression pass is currently pretty rare, but using predication allows the GPU to skip unnecessary passes. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: add padding for the UMR disassemblerSamuel Pitoiset2018-07-021-2/+18
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: use separate bind points for the dynamic buffersSamuel Pitoiset2018-06-272-3/+11
| | | | | | | | | | | | | The Vulkan spec says: "pipelineBindPoint is a VkPipelineBindPoint indicating whether the descriptors will be used by graphics pipelines or compute pipelines. There is a separate set of bind points for each of graphics and compute, so binding one does not disturb the other." CC: <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove unused 'predicated' parameter from some functionsSamuel Pitoiset2018-06-274-25/+15
| | | | | | | It's always false. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: emit PIPELINESTAT_{START,STOP} events for pipeline stats queriesSamuel Pitoiset2018-06-264-2/+29
| | | | | | | | | Ported from RadeonSI. This appears to fix some random fails with: dEQP-VK.query_pool.statistics_query.* Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: enable VK_EXT_shader_stencil_exportSamuel Pitoiset2018-06-262-0/+2
| | | | | | | | | | The driver already supports exporting the stencil value. The following CTS test now pass: dEQP-VK.pipeline.shader_stencil_export.op_replace Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: ignore pInheritanceInfo for primary command buffersSamuel Pitoiset2018-06-261-1/+2
| | | | | | | | | From the Vulkan spec: "If this is a primary command buffer, then this value is ignored." CC: <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/surface: move cmask_size/alignment into radeon_surfMarek Olšák2018-06-251-2/+2
| | | | | | cmask_size is changed to uint32_t because it can't be greater than 4GB. Reviewed-by: Timothy Arceri <[email protected]>
* radv: fix HTILE metadata initialization in presence of subpass clearsSamuel Pitoiset2018-06-251-8/+1
| | | | | | | | | | | | If the driver ends up by performing a slow depthstencil clear, the HTILE metadata won't be initialized correctly. This fixes random VM faults on Polaris while running CTS with Bas's runner. This doesn't seem to regress performance. CC: <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add VK_EXT_display_control to radv driver [v5]Keith Packard2018-06-234-15/+158
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This extension provides fences and frame count information to direct display contexts. It uses new kernel ioctls to provide 64-bits of vblank sequence and nanosecond resolution. v2: Rework fence integration into the driver so that waiting for any of a mixture of fence types (wsi, driver or syncobjs) causes the driver to poll, while a list of just syncobjs or just driver fences will block. When we get syncobjs for wsi fences, we'll adapt to use them. v3: Adopt Jason Ekstrand's coding conventions Declare variables at first use, eliminate extra whitespace between types and names. Wrap lines to 80 columns. Suggested-by: Jason Ekstrand <[email protected]> v4: Adapt to WSI fence API change. It now returns VkResult and no longer has an option for relative timeouts. v5: wsi_register_display_event and wsi_register_device_event now use the default allocator when NULL is provided, so remove the computation of 'alloc' here. Signed-off-by: Keith Packard <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Enable lower_io_to_temporaries after deref changes.Bas Nieuwenhuizen2018-06-221-3/+0
| | | | | | Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* radv: Remove deref chain support in radv shader info pass.Bas Nieuwenhuizen2018-06-221-97/+9
| | | | | | Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]>
* nir: convert lower_io_arrays_to_elements to deref instructionsRob Clark2018-06-221-2/+0
| | | | | | | | Signed-off-by: Rob Clark <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: convert lower_io_to_scalar to deref instructionsRob Clark2018-06-221-2/+2
| | | | | | | | Signed-off-by: Rob Clark <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* move lower_deref_instrsRob Clark2018-06-222-4/+2
| | | | | | | | Signed-off-by: Rob Clark <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* radv: Disable lower_io_to_temporaries during deref changes.Bas Nieuwenhuizen2018-06-221-0/+3
| | | | | | Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* radv: Remove image_var stores.Bas Nieuwenhuizen2018-06-223-30/+30
| | | | | | | Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* radv: Use deref instructions for tex derefs in meta shaders.Bas Nieuwenhuizen2018-06-225-38/+62
| | | | | | | Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* radv: Gather info for deref instr based load/store.Bas Nieuwenhuizen2018-06-221-5/+55
| | | | | | | Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* radv: Add shader info support for image deref instructions.Bas Nieuwenhuizen2018-06-221-3/+37
| | | | | | | Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* ac/nir: Support deref instructions in tex instructions.Bas Nieuwenhuizen2018-06-221-0/+13
| | | | | | | Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir,spirv: Rework function callsJason Ekstrand2018-06-221-2/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | This commit completely reworks function calls in NIR. Instead of having a set of variables for the parameters and return value, nir_call_instr now has simply has a number of sources which get mapped to load_param intrinsics inside the functions. It's up to the client API to build an ABI on top of that. In SPIR-V, out parameters are handled by passing the result of a deref through as an SSA value and storing to it. This virtue of this approach can be seen by how much it allows us to delete from core NIR. In particular, nir_inline_functions gets halved and goes from a fairly difficult pass to understand in detail to almost trivial. It also simplifies spirv_to_nir somewhat because NIR functions never were a good fit for SPIR-V. Unfortunately, there is no good way to do this without a mega-commit. Core NIR and SPIR-V have to be changed at the same time. This also requires changes to anv and radv because nir_inline_functions couldn't handle deref instructions before this change and can't work without them after this change. Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* spirv: Use NIR per-member splittingJason Ekstrand2018-06-221-0/+7
| | | | | | | | | | | | Before, we were doing structure splitting in spirv_to_nir. Unfortunately, this doesn't really work when you think about passing struct pointers into functions. Doing it later in NIR is a much better plan. Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* anv,i965,radv,st,ir3: Call nir_lower_deref_instrsJason Ekstrand2018-06-221-0/+4
| | | | | | | | | | | This inserts a call to nir_lower_deref_instrs at every call site of glsl_to_nir, spirv_to_nir, and prog_to_nir. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* radv: always check the return error when submitting a CSSamuel Pitoiset2018-06-221-5/+11
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: check the return values of radv_signal_fence()Samuel Pitoiset2018-06-221-2/+7
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>