summaryrefslogtreecommitdiffstats
path: root/src/amd
Commit message (Collapse)AuthorAgeFilesLines
* radv: start conditionalising vertex inputs. (v2)Dave Airlie2017-04-194-14/+63
| | | | | | | | | | | In practice this will probably just drop draw id in a few places. v2: just do draw_id for now. (Bas) it might be possible to do something more if we need it in the future. (nha) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: add initial pre-pass for shader info gatheringDave Airlie2017-04-196-9/+116
| | | | | | | | | | | | There is some radv specific info we need to gather from shaders before we get into converting nir->llvm, so we can make better decisions especially around user sgpr allocation. This is just an initial placeholder to gather if sample positions are required in the frag shader. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: enable timestampComputeAndGraphicsGrazvydas Ignotas2017-04-171-1/+1
| | | | | | | | | | | | Commit bfee9866 "radv: Use RELEASE_MEM packet for MEC timestamp query." added WriteTimestamp handling for compute queues but forgot to flip the flag. Tested with DOOM (by me) and CTS (by Bas), but without verification that these tests actually use timestamps on compute queues. Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* android: amd/addrlib: trivial fix for gfx9 supportMauro Rossi2017-04-171-0/+2
| | | | | | | | | | | | Fixes the following build error: external/mesa/src/amd/addrlib/gfx9/gfx9addrlib.cpp:36:10: fatal error: 'gfx9_gb_reg.h' file not found ^ 1 error generated. Fixes: 7f160ef "amd/addrlib: import gfx9 support" Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radv: remove the temp descriptor set infrastructureFredrik Höglund2017-04-142-76/+28
| | | | | | | It is no longer used. Signed-off-by: Fredrik Höglund <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: use push descriptors in metaFredrik Höglund2017-04-146-416/+301
| | | | | | | Use push descriptors instead of temp descriptor sets. Signed-off-by: Fredrik Höglund <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add private push descriptors for metaFredrik Höglund2017-04-142-0/+41
| | | | | | | | | | | | This allows meta to use push descriptors without disturbing user push descriptors. radv_meta_push_descriptor_set differs from vkCmdPushDescriptorSetKHR in that partial updates are not supported; all descriptors used in subsequent draw commands must be pushed at the same time. Signed-off-by: Fredrik Höglund <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove irrelevant commentGrazvydas Ignotas2017-04-141-1/+1
| | | | | | | A leftover from anv. Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: report timestampPeriod correctlyGrazvydas Ignotas2017-04-142-2/+2
| | | | | | | | | | | | The kernel returns frequency in kHz, so to convert to nanosecond interval that Vulkan uses the dividend should be 1000000.0 and not 100000.0. This fixes the GPU graph in DOOM and matches the amdgpu-pro blob. Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver" Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: make sizes & offsets 32 bit in radv_descriptor_update_template_entry.Bas Nieuwenhuizen2017-04-142-7/+7
| | | | | | | v2: Also convert the calculations. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Fredrik Höglund <[email protected]>
* radv: Set descriptor set limits.Bas Nieuwenhuizen2017-04-131-15/+29
| | | | | | | Properly and with comments this time. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: Increase integer sizes in descriptor sets.Bas Nieuwenhuizen2017-04-131-8/+8
| | | | | | | | Needed if we want to allow them taking more than 64 KiB. The calculations of these already used 32 bits. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: support S8_UINT as a depth/stencil format.Dave Airlie2017-04-141-1/+1
| | | | | | | This enables a bunch of NotSupported CTS tests. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: bump maxGeometryShaderInvocations.Dave Airlie2017-04-141-1/+1
| | | | | | | | | This bumps it to the same level as amdgpu-pro, it also moves a bunch of dEQP-VK.geometry.instanced.* from NotSupported to Pass. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: Add more trace points.Bas Nieuwenhuizen2017-04-132-0/+3
| | | | | | | | | | | Most trace points happen after an operation, so add a trace point at the start of the command buffer. Furthermore, add one after a CmdUpdateBuffer using CP_DMA as that didn't emit one yet. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: Ignore CmdUpdateBuffer with size 0.Bas Nieuwenhuizen2017-04-131-0/+3
| | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: Enable query inheritance.Bas Nieuwenhuizen2017-04-131-1/+1
| | | | | | | | | | | | | | timestamp and pipeline_statistics only do something on begin & end, so they don't need any action. Occlusion queries only do something to enable/disable and that register is set nowhere else so that doesn't need extra support either. (We technically should fix it to update the reg with the number of samples, but that hasn't happened yet, so we only change it to enable/disable counting) Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: enable variableMultisampleRate.Bas Nieuwenhuizen2017-04-131-1/+1
| | | | | | | | | This is only relevant with 0 attachments. In that case we do nothing on subpass switch already, and the pipeline is the authoritative source of the number of samples, so this shouldn't change anything. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: fix stencil regression since new addrlib importDave Airlie2017-04-132-1/+9
| | | | | | | | | | The addrlib import meant we'd return after we attempted to setup the no stencil bits for an S8_UINT, now we break and use the stencil level info when creating stencil DB info. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: allocate thin textures as linear.Dave Airlie2017-04-131-0/+7
| | | | | | | | | | This is ported from radeonsi, and avoids the bug in the addrlib code. This should probably be something addrlib does for us, but for now this fixes the regression without changing addrlib and aligns us with radeonsi. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: Disable primitive restart for non-indexed drawsAlex Smith2017-04-122-22/+34
| | | | | | | | | | | | | | | | According to the Vulkan spec, VkPipelineInputAssemblyStateCreateInfo's primitiveRestartEnable flag should only apply to indexed draws, however it was being enabled regardless of the type of draw. This could cause problems for non-indexed draws with >=65535 vertices if the previous indexed draw used 16-bit indices. Fixes corruption of the credits text in Mad Max. v2: Reset primitive restart state after executing a secondary command buffer. Signed-off-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Hash the immutable samplers.Bas Nieuwenhuizen2017-04-121-0/+3
| | | | | | | Since the shader code can include them. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: Use an offset instead of pointers for immutable samplers.Bas Nieuwenhuizen2017-04-124-27/+39
| | | | | | | Makes more sense when we hash the layout for the pipeline cache. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: Stop shadowing the result in radv_GetQueryPoolResults.Bas Nieuwenhuizen2017-04-121-4/+4
| | | | | | | The outer result was referred to, which meant bugs. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: Return VK_NOT_READY if the query results are not available.Bas Nieuwenhuizen2017-04-121-0/+6
| | | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Fixes: 8475a14302e ("radv: Implement pipeline statistics queries.") Reviewed-by: Fredrik Höglund <[email protected]>
* radv: Set query availability bit even if we don't wait.Bas Nieuwenhuizen2017-04-121-3/+4
| | | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Fixes: 8475a14302e ("radv: Implement pipeline statistics queries.") Reviewed-by: Fredrik Höglund <[email protected]>
* radv: Implement pipeline statistics queries.Bas Nieuwenhuizen2017-04-113-27/+394
| | | | | | | | | | | The devil is in the shader again, otherwise this is fairly straightforward. The CTS contains no pipeline statistics copy to buffer testcases, so I did a basic smoketest. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: Let count be dynamic in radv_break_on_count.Bas Nieuwenhuizen2017-04-111-3/+3
| | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: Rename query pipeline/set layout.Bas Nieuwenhuizen2017-04-112-13/+13
| | | | | | | For using them with both occlusion and pipeline statistics queries. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: Use VK_WHOLE_SIZE for the query buffer bindings.Bas Nieuwenhuizen2017-04-111-2/+2
| | | | | | | | The buffer sizes are specified just a few lines earlier, so don't repeat ourselves. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: Use a shader for occlusion CmdCopyQueryPoolResults.Bas Nieuwenhuizen2017-04-111-74/+64
| | | | | | | | | | | | | | | | | | Use the new occlusion query copy shader. We don't use the shader for the waiting as a polling loop ineracts badly with having caching enabled. I noticed on my GPU (Tonga) that the values are written out in order, so I just use a WAIT_REG_MEM on the last value. If it turns out other chips don't do that we may need to look a bit more into this. Having 8 WAIT_REG_MEM packets per query doesn't sound ideal. This also restricts the availability word in the pool to timestamp queries only, as occlusion queries don't use it, and pipeline statistic queries likely won't either. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: Add occlusion query shader.Bas Nieuwenhuizen2017-04-114-0/+435
| | | | | | | | | Adds a shader for writing occlusion query results to a buffer, as the CP packet isn't support on SI or secondary buffers, and doesn't handle the availability bit (or partial results) nor truncation to 32-bit. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* ac: add unreachable() in ac_build_image_opcode()Samuel Pitoiset2017-04-101-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | To silent the following compiler warning: common/ac_llvm_build.c: In function ‘ac_build_image_opcode’: common/ac_llvm_build.c:1080:3: warning: ‘name’ may be used uninitialized in this function [-Wmaybe-uninitialized] snprintf(intr_name, sizeof(intr_name), "%s%s%s%s.v4f32.%s.v8i32", ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ name, ~~~~~ a->compare ? ".c" : "", ~~~~~~~~~~~~~~~~~~~~~~~ a->bias ? ".b" : ~~~~~~~~~~~~~~~~ a->lod ? ".l" : ~~~~~~~~~~~~~~~ a->deriv ? ".d" : ~~~~~~~~~~~~~~~~~ a->level_zero ? ".lz" : "", ~~~~~~~~~~~~~~~~~~~~~~~~~~~ a->offset ? ".o" : "", ~~~~~~~~~~~~~~~~~~~~~~ type); ~~~~~ Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* amd/addrlib: use correct variable name in headerThomas Hindoe Paaboel Andersen2017-04-101-1/+1
| | | | | | | | Since the inclusion in 7f160efcde41b52ad78e562316384373dab419e3 the header used x_biased, while the implementation used y_biased. This changes the header to macth the implementation since the uses of the function seems to expect y_biased. Reviewed-by: Marek Olšák <[email protected]>
* radv: don't call radeon_check_space in radv_BindDescriptorSetsFredrik Höglund2017-04-071-5/+0
| | | | | | | | This appears to be a leftover from an earlier version of this function. Nothing is emitted into the CS. Signed-off-by: Fredrik Höglund <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: implement VK_KHR_descriptor_update_templateFredrik Höglund2017-04-075-0/+231
| | | | | | | | | | | All offsets and strides are precomputed by radv_CreateDescriptorUpdateTemplateKHR and stored in the template. v2: Move the new struct declarations from radv_descriptor_set.h to radv_private.h (Bas) Signed-off-by: Fredrik Höglund <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: implement VK_KHR_push_descriptorFredrik Höglund2017-04-076-2/+128
| | | | | Signed-off-by: Fredrik Höglund <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: replace an assertion with a conditionalFredrik Höglund2017-04-071-3/+3
| | | | | | | | | | | | | | | | Replace the !binding_layout->immutable_samplers assertion in radv_update_descriptor_sets with a conditional. The Vulkan specification does not say that it is illegal to update a sampler descriptor when it is immutable; only that pImageInfo is ignored. This change is also needed for push descriptors, because valid descriptors must be pushed for all bindings accessed by shaders, including immutable sampler descriptors. Signed-off-by: Fredrik Höglund <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: refactor radv_UpdateDescriptorSetsFredrik Höglund2017-04-072-12/+52
| | | | | | | | | | | | | Move the implementation into a separate function that takes a cmd_buffer and a dstSetOverride parameter. When cmd_buffer is not NULL, radv_update_descriptor_sets calls cs_add_buffer directly instead of updating the buffer list. This will be used to implement VK_KHR_push_descriptor. Signed-off-by: Fredrik Höglund <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* amd/addrlib: automake: add all headers to the tarballEmil Velikov2017-04-051-0/+2
| | | | | Fixes: 7f160efcde4 ("amd/addrlib: import gfx9 support") Signed-off-by: Emil Velikov <[email protected]>
* amd/addrlib: second update for Vega10 + bug fixesMarek Olšák2017-04-0417-2132/+3298
| | | | | | | | | | | | | | | | | | | | | | | Highlights: - Display needs tiled pitch alignment to be at least 32 pixels - Implement Addr2ComputeDccAddrFromCoord(). - Macro-pixel packed formats don't support Z swizzle modes - Pad pitch and base alignment of PRT + TEX1D to 64KB. - Fix support for multimedia formats - Fix a case "PRT" entries are not selected on SI. - Fix wrong upper bits in equations for 3D resource. - We can't support 2d array slice rotation in gfx8 swizzle pattern - Set base alignment for PRT + non-xor swizzle mode resource to 64KB. - Bug workaround for Z16 4x/8x and Z32 2x/4x/8x MSAA depth texture - Add stereo support - Optimize swizzle mode selection - Report pitch and height in pixels for each mip - Adjust bpp/expandX for format ADDR_FMT_GB_GR/ADDR_FMT_BG_RG - Correct tcCompatible flag output for mipmap surface - Other fixes and cleanups Acked-by: Alex Deucher <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* radv: Increase descriptor limits.Bas Nieuwenhuizen2017-04-041-14/+14
| | | | | | | | We supported more generally. Decreased the dynamic buffers though, as we only support 16 for uniform+storage. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]>
* radv: Rework guard band calculation.Bas Nieuwenhuizen2017-04-031-40/+15
| | | | | | | | | | | | | | | | | | We want the guardband_x/y to be the largerst scalars such that each viewport scaled by that amount is still a subrange of [-32767, 32767]. The old code has a couple of issues: 1) It used scissor instead of viewport_scissor, potentially taking into account a viewport that is too small and therefore selecting a scale that is too large. 2) Merging the viewports isn't ideal, as for example viewports with boundaries [0,1] and [1000, 1001] would allow a guardband scale of ~30k, while their union [0, 1001] only allows a scale of ~32. The new code just determines the guardband per viewport and takes the minimum. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]>
* radv: Enable VK_KHR_incremental_present.Bas Nieuwenhuizen2017-04-033-1/+15
| | | | | | | Just enabling the driver-independent implementation that Jason did. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* vulkan/wsi: Plumb present regions through the common codeJason Ekstrand2017-04-031-1/+2
| | | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Daniel Stone <[email protected]> Acked-by: Dave Airlie <[email protected]>
* radv: fix interp at sample code.Dave Airlie2017-04-041-3/+1
| | | | | | | | | | Interp at sample needs to use the center, since the sample positions it retrieves are relative to the center. This fixes a bunch of CTS tests with multisample_interpolation. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: overhaul fragment shader sample positions.Dave Airlie2017-04-045-51/+87
| | | | | | | | | | | | | | | | | | The current code was broken, and I decided to redesign it instead. This puts the sample positions for all samples into the queue constant descriptor buffer after all the spill/ring descriptors. It then uses a single offset register to point how far into the samples the samples for num_samples are. This saves one user sgpr and means we only generate the sample position data in the rare single case where we need it currently. This doesn't fix the failing CTS tests without the followup fix. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: fix texture derivative orderingDave Airlie2017-04-041-2/+2
| | | | | | | | | The ordering NIR gives us is correct for the hw, this fixes: dEQP-VK.glsl.texture_functions.texturegrad.* (mainly trigged on isampler/usampler 3d textures.). Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: round cube array coordinate before fixup.Dave Airlie2017-04-041-1/+5
| | | | | | | | This fixes: dEQP-VK.glsl.texture_functions.texture.samplercubearray* Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: move to using common buffer load format.Dave Airlie2017-04-041-8/+5
| | | | | | | Get rid of usage of SI.vs.load.input. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>