summaryrefslogtreecommitdiffstats
path: root/src/amd
Commit message (Collapse)AuthorAgeFilesLines
* radv: Correctly detect changed shaders for vertex descriptors.Bas Nieuwenhuizen2017-10-211-6/+6
| | | | | | | | As they were emitted after the new pipeline, the changed pipeline detection was not working anymore. Fixes: 341529dbee5 'radv: use optimal packet order for draws' Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: Set larged wrokgroup size for GS on GFX9.Bas Nieuwenhuizen2017-10-211-1/+1
| | | | | | | They don't take a single wave anymore and we need the barriers. Fixes: 6bc42855f92 'radv: enable GS on GFX9' Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Take the max workgroup size of all provided shaders.Bas Nieuwenhuizen2017-10-211-1/+6
| | | | | Fixes: ffaf4d608a1 'radv: Enable tessellation shaders for GFX9.' Reviewed-by: Dave Airlie <[email protected]>
* radv: Fix pipeline cache locking issuesAlex Smith2017-10-211-7/+23
| | | | | | | | | | | | Need to lock around the whole process of retrieving cached shaders, and around GetPipelineCacheData. This fixes GPU hangs observed when creating multiple pipelines in parallel, which appeared to be due to invalid shader code being pulled from the cache. Signed-off-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: disable implicit sync for radv allocated bos v3Andres Rodriguez2017-10-214-1/+8
| | | | | | | | | | | | | | | | | Implicit sync kicks in when a buffer is used by two different amdgpu contexts simultaneously. Jobs that use explicit synchronization mechanisms end up needlessly waiting to be scheduled for long periods of time in order to achieve serialized execution. This patch disables implicit synchronization for all radv allocations except for wsi bos. The only systems that require implicit synchronization are DRI2/3 and PRIME. v2: mark wsi bos as RADV_MEM_IMPLICIT_SYNC v3: Add drm version check (Bas) Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: factor out radv_alloc_memoryAndres Rodriguez2017-10-212-5/+25
| | | | | | | | | This allows us to pass extra parameters to the memory allocation operation that are not defined in the vulkan spec. This is useful for internal usage. Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Expose VK_EXT_global_priorityAndres Rodriguez2017-10-214-0/+5
| | | | | | | Expose the extension string as supported Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: don't skip PS/VS partial flushAndres Rodriguez2017-10-211-8/+6
| | | | | | | | | | | | This patch helps lower high priority compute latency. Found by bisecting a perf regression on computeparticles with high priority compute queues enabled. Reverting this micro-optimization doesn't seem to have any negative effect on performance on Dota2 or ssao. Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Implement VK_EXT_global_priorityAndres Rodriguez2017-10-214-8/+62
| | | | | | | | | This extension allows the caller to change a queue's system wide priority. This is useful for applications with specific latency constraints. Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: hardcode shader WAVE_LIMIT to the maximum valueAndres Rodriguez2017-10-211-9/+18
| | | | | | | | | | | | | | When WAVE_LIMIT is set, a submission will opt-in for SPI based resource scheduling. Because this mechanism is cooperative, we must ensure that all submissions have this field set, otherwise they will bypass resource arbitration. We always hardcode the field to its maximum value, instead of attempting to calculate an approximate usage. In testing, there were no benefits to using anything other than the maximum. Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: Get rid of nir_shader::stageJason Ekstrand2017-10-203-26/+26
| | | | | | | | It's redundant with nir_shader::info::stage. Acked-by: Timothy Arceri <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* radv: use optimal packet order for drawsSamuel Pitoiset2017-10-201-17/+79
| | | | | | | | | Ported from RadeonSI. The time where shaders are idle should be shorter now. This can give a little boost, like +6% with the dynamicubo Vulkan demo. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add radv_emit_shaders_prefetch()Samuel Pitoiset2017-10-201-12/+19
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add radv_emit_shader_prefetch()Samuel Pitoiset2017-10-201-25/+23
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: don't flush the VS when srcStageMask == TOP_OF_PIPE_BITFredrik Höglund2017-10-201-2/+1
| | | | | | | | | | | | The Vulkan specification says: "... an execution dependency with only VK_PIPELINE_STAGE_TOP_OF_- PIPE_BIT in the source stage mask will effectively not wait for any prior commands to complete." Signed-off-by: Fredrik Höglund <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: mark total_count as MAYBE_UNUSED in CmdSet{Viewport,Scissor}Samuel Pitoiset2017-10-201-2/+2
| | | | | | Fixes two compilation warnings in release build. Trivial. Signed-off-by: Samuel Pitoiset <[email protected]>
* radv: rename radv_cmd_buffer_flush_state() to radv_draw()Samuel Pitoiset2017-10-201-59/+51
| | | | | | | Similar to the dispatch codepath. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: emit primitive restart from radv_emit_draw_registers()Samuel Pitoiset2017-10-201-29/+30
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: add radv_emit_draw_registers()Samuel Pitoiset2017-10-201-12/+34
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: refactor indirect draws (+count buffer) with radv_draw_infoSamuel Pitoiset2017-10-201-103/+48
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: refactor indirect draws with radv_draw_infoSamuel Pitoiset2017-10-201-75/+133
| | | | | | | | Indirect draws with a count buffer will be refactored in a separate patch. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: refactor simple and indexed draws with radv_draw_infoSamuel Pitoiset2017-10-201-68/+118
| | | | | | | | | Similar to the dispatch compute logic but for draw calls. For convenience, indirect draws will be converted in a separate patch. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: re-emit VGT_INDEX_TYPE because non-indexed draws overwrite itSamuel Pitoiset2017-10-201-2/+11
| | | | | | | | | Only on CIK and later. We should only update VGT_INDEX_TYPE but it seems easier to re-emit all the index buffer packets. Fixes: 966d66f28f (radv: do not re-emit the index buffer for every draw call) Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: clear the dirty flags in the corresponding emit helpersSamuel Pitoiset2017-10-201-2/+8
| | | | | | | This will allow us to fix the VGT_INDEX_TYPE issue properly. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: rename RADV_CMD_DIRTY_RENDER_TARGETS to RADV_CMD_DIRTY_FRAMEBUFFERSamuel Pitoiset2017-10-202-3/+3
| | | | | | | To be consistent with the emit function name. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: move DB_COUNT_CONTROL initialization to si_emit_config()Samuel Pitoiset2017-10-202-1/+5
| | | | | | | CLEAR_STATE will initialize DB_COUNT_CONTROL to 0 for CIK+. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: enable GS on GFX9Bas Nieuwenhuizen2017-10-201-3/+1
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radv: calculate and emit GFX9 GS registers to pipeline state.Bas Nieuwenhuizen2017-10-204-7/+158
| | | | Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Fix up GS input vgprs.Bas Nieuwenhuizen2017-10-201-0/+15
| | | | Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Add loading from LDS for merged GS.Bas Nieuwenhuizen2017-10-201-15/+21
| | | | Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Add ES output to LDS for GFX9.Bas Nieuwenhuizen2017-10-201-8/+49
| | | | Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Add merged GS function.Bas Nieuwenhuizen2017-10-201-17/+63
| | | | | | [airlied: merged fixup + and fixed up a couple more bits]. Reviewed-by: Dave Airlie <[email protected]>
* radv: Only emit TES when it exists.Bas Nieuwenhuizen2017-10-201-4/+6
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radv: Use control shader presence for detecting tess.Bas Nieuwenhuizen2017-10-201-1/+1
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radv: fixup tess eval shader when combined.Dave Airlie2017-10-202-6/+23
| | | | | | | | | | This fixes some access to the tess eval shader when it's combined with geometry on gfx9. This is a review of Bas's commit: radv: Prevent crashing by accessing TES for VGT reuse depth. Signed-off-by: Dave Airlie <[email protected]>
* radv: Set VGT_GS_MODE properly for gfx9Bas Nieuwenhuizen2017-10-201-4/+7
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radv: ensure correct outinfo is picked.Dave Airlie2017-10-201-13/+14
| | | | | | | | | | | | | This struct used to rely on being in a union, it isn't anymore, so we have to pick the correct outinfo struct now. This should fix a regression since the union became a struct. dEQP-VK.tessellation.geometry_interaction.point_size.vertex_set_geometry_set Fixes: 6078a3bd51 (ac/nir: Allow ac_shader_variant_info to contain info about multiple stages.) Reviewed-by: Timothy Arceri <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: Enable tessellation shaders for GFX9.Bas Nieuwenhuizen2017-10-201-1/+1
| | | | | | It mostly works now. Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: init full exec mask for merged shaders.Dave Airlie2017-10-203-0/+12
| | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: drop unused r600_htile_info.Dave Airlie2017-10-201-9/+0
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: fix CLEAR_STATE packet length.Dave Airlie2017-10-191-1/+1
| | | | | | | | | | Looking at shader traces I noticed some registers were missing, one of them was being eaten by the wrong clear state length. Fixes: 4f42ea4dc (radv: use CLEAR_STATE for initializing some registers) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: copy indirect lowering settings from radeonsiTimothy Arceri2017-10-201-1/+26
| | | | | | | | | | | | | | It looks the original indirect mask was probably copied from ANV. Sascha Willems demo results: tessellation ~4000 -> ~4200 fps V2: continue lowering local indirects due to llvm deficiencies. Tested-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: stop redundant setting of active_stagesTimothy Arceri2017-10-201-2/+0
| | | | | | We already set it when above in the nir compilation loop. Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: move some code out of loop in store_tcs_output()Timothy Arceri2017-10-201-5/+5
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Modify rsrc1/rsrc2 generation for merged tess.Bas Nieuwenhuizen2017-10-191-7/+16
| | | | | | | No OC_LDS_EN for HS, and the included LS vgpr_comp_cnt is at a different offset. Reviewed-by: Dave Airlie <[email protected]>
* radv: Set correct registers for merged shader rings.Bas Nieuwenhuizen2017-10-191-12/+24
| | | | | | We need different regs to end up in s0/s1. Reviewed-by: Dave Airlie <[email protected]>
* radv: Add GFX9 HS emitting code.Bas Nieuwenhuizen2017-10-191-5/+16
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radv: Remove remaining hard coded references to VS.Bas Nieuwenhuizen2017-10-193-7/+28
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radv: Update GFX9 user data regs for GS/tess.Bas Nieuwenhuizen2017-10-194-14/+25
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radv: Add code to compile merged shaders.Bas Nieuwenhuizen2017-10-194-13/+39
| | | | Reviewed-by: Dave Airlie <[email protected]>