path: root/src/amd
Commit message (Author, Age, Files, Lines)
* radv: use only one descriptor in the fmask expand pass (Samuel Pitoiset, 2019-06-05, 1 file, -24/+3)
| | | | | | | | This removes one useless SMEM load operation, which pointed to the same descriptor anyway. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-By: Bas Nieuwenhuizen <[email protected]>
* radv: set ACCESS_NON_READABLE on the fmask expand pass output image (Samuel Pitoiset, 2019-06-05, 1 file, -0/+1)
| | | | | | | The driver will emit GLC=1. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-By: Bas Nieuwenhuizen <[email protected]>
* radv: remove one useless image type in the fmask expand shader (Samuel Pitoiset, 2019-06-05, 1 file, -6/+3)
| | | | | | | Both input and output images use the same type. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-By: Bas Nieuwenhuizen <[email protected]>
* ac: rename LLVM <= 7 helpers for readability (Marek Olšák, 2019-06-04, 1 file, -37/+37)
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: fix a typo in ac_build_wg_scan_bottom (Marek Olšák, 2019-06-04, 1 file, -1/+1)
| | | | | Cc: 19.1 <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Use bo metadata for imported image tiling on Android. (Bas Nieuwenhuizen, 2019-06-04, 3 files, -14/+61)
| | | | | | This way we handle linear images etc. correctly. Acked-by: Samuel Pitoiset <[email protected]>
* ac/nir: mark some texture intrinsics as convergent (Rhys Perry, 2019-06-04, 1 file, -0/+18)
| | | | | | | | | | | | Otherwise LLVM can sink them and their texture coordinate calculations into divergent branches. v2: simplify the conditions on which the intrinsic is marked as convergent v3: only mark as convergent in FS and CS with derivative groups Cc: <[email protected]> Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radv: fix some compiler warnings (Rhys Perry, 2019-06-04, 1 file, -4/+4)
| | | | | | | | | Fixes -Woverflow warnings with GCC 9.1.1. v2: use a cast instead of a bitwise and. Signed-off-by: Rhys Perry <[email protected]> Reviewed-By: Bas Nieuwenhuizen <[email protected]>
* radv: do not use gfx fast depth clears for layered depth/stencil images (Samuel Pitoiset, 2019-06-04, 1 file, -0/+1)
| | | | | | | | | | The driver should only perform fast depth clears with the graphics path when the view covers all image layers, otherwise this might corrupt layers when HTILE is enabled. Cc: 19.0 19.1 [email protected] Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac,radv: do not emit vec3 for raw load/store on SI (Samuel Pitoiset, 2019-06-04, 4 files, -8/+20)
| | | | | | | | It's unsupported; only format load/stores with vec3 are supported. Fixes: 6970a9a6ca9 ("ac,radv: remove the vec3 restriction with LLVM 9+") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac/registers: don't use the si, cik, vi names, use gfxN (Marek Olšák, 2019-06-03, 6 files, -1405/+1405)
| | | | trivial
* amd/common: use generated register header (Nicolai Hähnle, 2019-06-03, 16 files, -16351/+19)
|
* amd/common: use SH{0,1}_CU_EN definitions only of COMPUTE_STATIC_THREAD_MGMT_SE0 (Nicolai Hähnle, 2019-06-03, 1 file, -5/+5)
| | | | | | | The automatic header generation unifies identical registers in a series and only emits definitions for the first one. This is mostly to avoid emitting excessive definitions for CB registers, but special-casing an exception for this family of registers doesn't seem worth it.
* amd/common: unify PITCH_GFX6 and PITCH_GFX9 (Nicolai Hähnle, 2019-06-03, 3 files, -13/+13)
| | | | | | | | | | | The definition of the fields differs, but PITCH_GFX9 is a mere extension of PITCH_GFX6 that does not conflict with any other fields. This aligns the definitions with what will be generated from the register JSON. The information about how large the fields really are is preserved in the register database.
* amd/common: rename R_3F2_CONTROL to IB_CONTROL for disambiguation (Nicolai Hähnle, 2019-06-03, 2 files, -2/+2)
| | | | | | | This "register" name collides with R_370_CONTROL. This aligns the definitions with what will be generated from the register JSON.
* amd/common: cleanup DATA_FORMAT/NUM_FORMAT field names (Nicolai Hähnle, 2019-06-03, 4 files, -17/+17)
| | | | | | | | | | The field layout wasn't actually changed in gfx9, so having the suffix isn't very useful. The field *contents* were changed, but this is reflected in the V_xxx_xxx definitions and is taken into account by the ac_debug logic based on the register JSON. This aligns the definitions with what will be generated from the register JSON.
* amd/common: derive ac_debug tables from register JSON (Nicolai Hähnle, 2019-06-03, 4 files, -177/+131)
|
* amd/registers: add JSON description of packet3 fields (Nicolai Hähnle, 2019-06-03, 1 file, -0/+338)
|
* amd/registers: add JSON descriptions of registers (Nicolai Hähnle, 2019-06-03, 1 file, -0/+15985)
| | | | | The descriptions are mostly derived from parsing the existing register headers.
* amd/registers: scripts for processing register descriptions in JSON (Nicolai Hähnle, 2019-06-03, 5 files, -0/+1631)
| | | | | | | | | | We will derive both the debugging tables and (the majority of) the register headers from descriptions in JSON, instead of deriving the debugging tables from an awkward parsing of the register headers. Some of the scripts are useful for maintaining the register database itself. The scripts are designed to output reasonably readable JSON by default.
* ac: use amdgpu-flat-work-group-size (Marek Olšák, 2019-06-03, 3 files, -5/+13)
| | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: flush pending query reset caches before copying results (Samuel Pitoiset, 2019-06-03, 1 file, -15/+25)
| | | | | | | | | | From the Vulkan spec 1.1.108: "vkCmdCopyQueryPoolResults is guaranteed to see the effect of previous uses of vkCmdResetQueryPool in the same queue, without any additional synchronization." Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac,radv: remove the vec3 restriction with LLVM 9+ (Samuel Pitoiset, 2019-06-03, 4 files, -11/+18)
| This change requires LLVM r356755.
|
| 32706 shaders in 16744 tests
| Totals:
| SGPRS: 1448848 -> 1455984 (0.49 %)
| VGPRS: 1016684 -> 1016220 (-0.05 %)
| Spilled SGPRs: 25871 -> 25815 (-0.22 %)
| Spilled VGPRs: 122 -> 122 (0.00 %)
| Scratch size: 11964 -> 11956 (-0.07 %) dwords per thread
| Code Size: 55324500 -> 55301152 (-0.04 %) bytes
| Max Waves: 235660 -> 235586 (-0.03 %)
|
| Totals from affected shaders:
| SGPRS: 293704 -> 300840 (2.43 %)
| VGPRS: 246716 -> 246252 (-0.19 %)
| Spilled SGPRs: 159 -> 103 (-35.22 %)
| Scratch size: 188 -> 180 (-4.26 %) dwords per thread
| Code Size: 8653664 -> 8630316 (-0.27 %) bytes
| Max Waves: 60811 -> 60737 (-0.12 %)
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
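The percentages in the shader statistics above appear to be plain relative changes, (after - before) / before. A minimal Python sketch that recomputes a few of them from the quoted figures; `percent_change` is a hypothetical helper written for this illustration, not code taken from the tool that produced the report.

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent."""
    if before == 0:
        # Assumption: 0 -> 0 counts as no change; the quoted report
        # does not show how a zero baseline is handled.
        return 0.0 if after == 0 else float("inf")
    return (after - before) / before * 100.0


# Before/after pairs copied from the commit message above.
samples = {
    "SGPRS (totals)": (1448848, 1455984),
    "VGPRS (totals)": (1016684, 1016220),
    "Spilled VGPRs (totals)": (122, 122),
    "SGPRS (affected shaders)": (293704, 300840),
    "Spilled SGPRs (affected shaders)": (159, 103),
}

for name, (before, after) in samples.items():
    change = percent_change(before, after)
    print(f"{name}: {before} -> {after} ({change:.2f} %)")
```

Note that the totals move by well under 1 % while the affected-shader subset shows larger swings, since the same absolute differences are divided by a much smaller baseline.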
* radv: use RADV_CMD_DIRTY_DYNAMIC_* when restoring viewport/scissor (Samuel Pitoiset, 2019-05-31, 1 file, -2/+2)
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: use CmdPushConstants when restoring constants after meta operations (Samuel Pitoiset, 2019-05-31, 1 file, -6/+8)
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: enable transformFeedbackStreamsLinesTriangles (Samuel Pitoiset, 2019-05-30, 1 file, -1/+1)
| | | | | | | The driver should already support this without any changes. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: implement VK_EXT_sample_locations and disable it (Samuel Pitoiset, 2019-05-30, 6 files, -7/+315)
| | | | | | | | | | | | | | Basically, this extension allows applications to use custom sample locations. It doesn't support variable sample locations during subpass. Note that we don't have to upload the user sample locations because the spec doesn't allow this. The extension is currently disabled because the driver needs to support variable sample locations during layout transitions. The depth decompress needs to know them and that's a bit invasive. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* spirv: Change spirv_to_nir() to return a nir_shader (Caio Marcelo de Oliveira Filho, 2019-05-29, 1 file, -5/+4)
| | | | | | | | | | | | | | | spirv_to_nir() returned the nir_function corresponding to the entrypoint, as a way to identify it. There's now a bool is_entrypoint in nir_function and also a helper function to get the entry_point from a nir_shader. The return type reflects better what the function name suggests. It also helps drivers avoid the mistake of reusing internal shader references after running NIR_PASS on it. When using NIR_TEST_CLONE or NIR_TEST_SERIALIZE, those would be invalidated right in the first pass executed. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Don't re-use entry_point pointer from spirv_to_nir (Caio Marcelo de Oliveira Filho, 2019-05-29, 1 file, -10/+8)
| | | | | | | | | | Replace its uses with checking for is_entrypoint and calling nir_shader_get_entrypoint(). This is a preparation to change spirv_to_nir() return type. Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: use view format when selecting the resolve path for subpasses (Samuel Pitoiset, 2019-05-29, 1 file, -8/+13)
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: always use view format when performing subpass resolves (Samuel Pitoiset, 2019-05-29, 3 files, -12/+21)
| | | | | | | | | | | | It makes sense to use the image view formats when resolving inside subpasses, while we have to use the image formats for normal resolves. Original patch by Philip Rebohle. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110348 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: sync before resetting a pool if there is active pending queries (Samuel Pitoiset, 2019-05-29, 4 files, -0/+27)
| | | | | | | | | | | Make sure to sync all previous work if the given command buffer has pending active queries. Otherwise the GPU might write query data after the reset operation. This fixes a bunch of new dEQP-VK.query_pool.* CTS failures. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: allocate more space in the CS when emitting events (Samuel Pitoiset, 2019-05-28, 1 file, -1/+1)
| | | | | | | | | | | If the driver waits for CP DMA to be idle and emits an EOP event, it needs more space. This fixes a crash with Quake Champions. Cc: <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add radv_get_resolve_pipeline() in the compute path (Samuel Pitoiset, 2019-05-28, 1 file, -20/+36)
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: cleanup the compute resolve path for subpass (Samuel Pitoiset, 2019-05-28, 1 file, -56/+29)
| | | | | | | | This makes use of radv_meta_resolve_compute_image() by filling a VkImageResolve region instead of duplicating code. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: treat Mullins as Kabini, remove the enum (Marek Olšák, 2019-05-27, 6 files, -12/+1)
| | | | it's the same design
* radv: ignore the loadOp if the first use of an attachment is a resolve (Samuel Pitoiset, 2019-05-27, 1 file, -9/+3)
| | | | | | | Based on ANV. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: always dirty the framebuffer when restoring a subpass (Samuel Pitoiset, 2019-05-27, 2 files, -2/+4)
| | | | | | | | | | | The old code was not wrong because the transitions performed after the resolves should re-emit the framebuffer if needed. This change is mostly a no-op but it improves consistency regarding other meta operations that need to save/restore subpasses. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add radv_clear_htile() helper (Samuel Pitoiset, 2019-05-27, 3 files, -6/+16)
| | | | | | | | This helper will be useful for clearing HTILE after some depth/stencil resolves. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: tidy up GetQueryPoolResults for occlusion queries (Samuel Pitoiset, 2019-05-27, 1 file, -7/+5)
| | | | | | | | Just move the block that checks the availability bit into the switch like other query types. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: Drop imov/fmov in favor of one mov instruction (Jason Ekstrand, 2019-05-24, 1 file, -2/+1)
| | | | | | | | | | | | | | | | The difference between imov and fmov has been a constant source of confusion in NIR for years. No one really knows why we have two or when to use one vs. the other. The real reason is that they do different things in the presence of source and destination modifiers. However, without modifiers (which many back-ends don't have), they are identical. Now that we've reworked nir_lower_to_source_mods to leave one abs/neg instruction in place rather than replacing them with imov or fmov instructions, we don't need two different instructions at all anymore. Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]> Acked-by: Rob Clark <[email protected]>
* nir/builder: Remove the use_fmov parameter from nir_swizzle (Jason Ekstrand, 2019-05-24, 2 files, -4/+4)
| | | | | | | | | | This flag has caused more confusion than good in most cases. You can validly use imov for floats or fmov for integers because, without source modifiers, neither modifies its input in any way. Using imov for floats is more reliable so we go that direction. Reviewed-by: Kristian H. Kristensen <[email protected]> Acked-by: Alyssa Rosenzweig <[email protected]>
* vulkan: fix build dependency issue with generated files (Lionel Landwerlin, 2019-05-22, 1 file, -4/+3)
| | | | | | | | | | | | | On machines with many cores, you can run into this issue: ../mesa-9999/src/vulkan/overlay-layer/overlay.cpp:42:10: fatal error: vk_enum_to_str.h: No such file or directory. v2: Move declare_dependency around (Eric) Signed-off-by: Lionel Landwerlin <[email protected]> Reported-by: Jan Ziak Cc: <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* radv: do not reset query pool during creation (Samuel Pitoiset, 2019-05-22, 1 file, -3/+0)
| | | | | | | | | | | From the Vulkan spec 1.1.108: "After query pool creation, each query must be reset before it is used." So, the driver doesn't need to do this at creation time. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix the sample max distance value for 8x (Samuel Pitoiset, 2019-05-22, 1 file, -1/+1)
| | | | | | | It should be 7, not 8. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: emit correct centroid priority based on the number of samples (Samuel Pitoiset, 2019-05-22, 1 file, -3/+16)
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: clean up the sample locations codebase (Samuel Pitoiset, 2019-05-22, 4 files, -98/+76)
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove remaining code related to 16 samples (Samuel Pitoiset, 2019-05-22, 2 files, -51/+0)
| | | | | | | The driver only supports up to 8 samples. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* spirv, radv, anv: Replace ptr_type with addr_format (Caio Marcelo de Oliveira Filho, 2019-05-20, 1 file, -5/+5)
| | | | | | | | | Instead of setting the glsl types of the pointers for each resource, set the nir_address_format, from which we can derive the glsl_type, and in the future the bit pattern representing a NULL pointer. Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: decompress FMASK before performing a MSAA decompress using FMASK (Samuel Pitoiset, 2019-05-20, 1 file, -2/+13)
| | | | | | | This fixes some CTS failures related to VK_EXT_sample_locations. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>