path: root/src/gallium/drivers/radeon/r600_pipe_common.c
Commit message | Author | Age | Files | Lines
* gallium/radeon: use r600_gfx_write_event_eop everywhere | Marek Olšák | 2016-10-26 | 1 | -1/+3
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: make r600_gfx_write_fence more generic | Marek Olšák | 2016-10-26 | 1 | -10/+25
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: fix a ZPASS comment, EVENT_WRITE_EOP fixups | Marek Olšák | 2016-10-26 | 1 | -2/+2
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: support ARB_compute_variable_group_size | Nicolai Hähnle | 2016-10-10 | 1 | -1/+9
| | | | | | | | Not sure if it's possible to avoid programming the block size twice (once for the userdata and once for the dispatch). Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium: add PIPE_COMPUTE_CAP_MAX_VARIABLE_THREADS_PER_BLOCK | Samuel Pitoiset | 2016-10-07 | 1 | -0/+2
| v3:
| - use a new case statement in r600_pipe_common.c
| - fix compilation of softpipe...
|
| Signed-off-by: Samuel Pitoiset <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
| Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: implement set_device_reset_callback | Nicolai Hähnle | 2016-10-05 | 1 | -0/+32
| | | | | | | | | Check for device reset on flush. It would be nicer if the kernel just reported this as an error on the submit ioctl (and similarly for fences), but this will do for now. Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
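The "check for device reset on flush" idea can be pictured roughly as below. This is only a sketch under stated assumptions: every `toy_*` name is a hypothetical stand-in rather than a Mesa type, and the reset counter passed in stands for whatever the winsys would actually report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All toy_* names are hypothetical stand-ins, not Mesa's real types. */
enum toy_reset_status { TOY_NO_RESET, TOY_RESET_DETECTED };

struct toy_reset_callback {
    void (*reset)(void *data, enum toy_reset_status status);
    void *data;
};

struct toy_context {
    struct toy_reset_callback reset_cb;
    unsigned reset_counter_seen; /* last reset counter value observed */
};

/* Install (or clear, when cb is NULL) the application's callback. */
static void toy_set_device_reset_callback(struct toy_context *ctx,
                                          const struct toy_reset_callback *cb)
{
    assert(ctx != NULL);
    if (cb) {
        ctx->reset_cb = *cb;
    } else {
        ctx->reset_cb.reset = NULL;
        ctx->reset_cb.data = NULL;
    }
}

/* Meant to be called from the flush path. Returns true and notifies the
 * callback when the counter changed since the previous check. */
static bool toy_check_device_reset(struct toy_context *ctx,
                                   unsigned device_reset_counter)
{
    if (device_reset_counter == ctx->reset_counter_seen)
        return false;

    ctx->reset_counter_seen = device_reset_counter;
    if (ctx->reset_cb.reset)
        ctx->reset_cb.reset(ctx->reset_cb.data, TOY_RESET_DETECTED);
    return true;
}

/* Example callback: counts how many resets were reported. */
static void toy_count_resets(void *data, enum toy_reset_status status)
{
    (void)status;
    ++*(unsigned *)data;
}
```

Polling at flush time keeps the check off the draw path, at the cost of noticing a reset only at the next flush, which matches the "this will do for now" caveat in the message.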
* gallium/radeon: use the new parent/child pools for transfers | Nicolai Hähnle | 2016-10-05 | 1 | -3/+6
| | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97894 Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: optionally run the LLVM IR verifier pass | Nicolai Hähnle | 2016-10-04 | 1 | -0/+7
| This is enabled automatically if shader printing is enabled, or separately by R600_DEBUG=checkir. Catch malformed IR before it crashes in a later pass.
|
| Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: move r600_common_context::texture_buffers to r600g | Marek Olšák | 2016-10-04 | 1 | -2/+0
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* gallium/radeon: inline r600_context_add_resource_size | Marek Olšák | 2016-10-04 | 1 | -20/+0
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* gallium/radeon: emit relocations for query fences | Nicolai Hähnle | 2016-09-30 | 1 | -1/+6
| | | | | | | | | This is only needed for r600 which doesn't have ARB_query_buffer_object and therefore wouldn't really need the fences, but let's be optimistic about filling in this feature gap eventually. Cc: Dieter Nützel <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: implement get_query_result_resource (v2) | Nicolai Hähnle | 2016-09-29 | 1 | -0/+3
| | | | | | | v2: fix a comment (Gustaw Smolarczyk) Acked-by: Edward O'Callaghan <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: add r600_gfx_{write,wait}_fence | Nicolai Hähnle | 2016-09-29 | 1 | -0/+52
| | | | | | | For bottom-of-pipe fences inside the gfx command stream. Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/compute: Use the HSA abi for non-TGSI compute shaders v3 | Tom Stellard | 2016-09-16 | 1 | -1/+5
| This patch switches non-TGSI compute shaders over to using the HSA ABI described here:
|
| https://github.com/RadeonOpenCompute/ROCm-Docs/blob/master/AMDGPU-ABI.md
|
| The HSA ABI provides a much cleaner interface for compute shaders and allows us to share more code in the compiler with the HSA stack.
|
| The main changes in this patch are:
| - We now pass the scratch buffer resource into the shader via user sgprs rather than using relocations.
| - Grid/Block sizes are now passed to the shader via the dispatch packet rather than at the beginning of the kernel arguments.
|
| Typically for HSA, the CP firmware will create the dispatch packet and set up the user sgprs automatically. However, in Mesa we let the driver do this work. The main reason for this is that I haven't researched how to get the CP to do all these things, and I'm not sure if it is supported for all GPUs.
|
| v2:
| - Add comments explaining why we are setting certain bits of the scratch resource descriptor.
|
| v3:
| - Use amdgcn-mesa-mesa3d triple instead of amdgcn--mesa3d.
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: page alignment for buffers is unnecessary | Nicolai Hähnle | 2016-09-12 | 1 | -4/+1
| | | | | | In some places (e.g. shader program pointers) we require 256 bytes alignment. Reviewed-by: Marek Olšák <[email protected]>
* gallium: switch drivers to the slab allocator in src/util | Marek Olšák | 2016-09-06 | 1 | -4/+3
* radeonsi: return correct eviction stats for NVX_gpu_memory_info | Marek Olšák | 2016-09-05 | 1 | -2/+7
| | | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: add cap to export device pointer size | Jan Vesely | 2016-08-29 | 1 | -0/+8
| v2: document the new cap
| v3: fix 80 char limit in screen.rst
|
| Signed-off-by: Jan Vesely <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
| Acked-by: Ilia Mirkin <[email protected]>
* gallium/radeon: unify and simplify checking for an empty gfx IB | Marek Olšák | 2016-08-25 | 1 | -10/+19
| | | | | | | We can take advantage of the fact that multi_fence does the obvious thing with NULL fences. This fixes unflushed fences that can get stuck due to empty IBs.
* gallium/radeon: use unflushed fences for deferred flushes (v2) | Marek Olšák | 2016-08-10 | 1 | -1/+43
| +23% Bioshock Infinite performance.
|
| v2:
| - use the new fence_finish interface
| - allow deferred fences with multiple contexts
| - clear the ctx pointer after a deferred flush
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: add a pipe_context parameter to fence_finish | Marek Olšák | 2016-08-10 | 1 | -0/+1
| | | | | | | | required by glClientWaitSync (GL 4.5 Core spec) that can optionally flush the context Reviewed-by: Rob Clark <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* winsys/amdgpu: query ME/PFP/CE firmware versions | Nicolai Hähnle | 2016-08-08 | 1 | -0/+3
| | | | | | | The radeon kernel module doesn't have the firmware query interface, so the corresponding values will remain 0. Reviewed-by: Marek Olšák <[email protected]>
* Revert "gallium/radeon: count contexts" | Marek Olšák | 2016-08-06 | 1 | -3/+0
| | | | | | This reverts commit b403eb338533894ee012a96bf55653996c92ec7c. Not needed.
* gallium/radeon: count contexts | Marek Olšák | 2016-08-06 | 1 | -0/+3
| | | | | | We don't wanna use unflushed fences when we have multiple contexts. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: move radeon_winsys::cs_memory_below_limit to drivers | Marek Olšák | 2016-08-06 | 1 | -1/+1
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: inline radeon_winsys::query_memory_usage | Marek Olšák | 2016-08-06 | 1 | -1/+1
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: add r600_resource::vram_usage and gart_usage | Marek Olšák | 2016-08-06 | 1 | -12/+6
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: move last_gfx_fence from radeonsi to common code | Marek Olšák | 2016-08-03 | 1 | -0/+1
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: implement buffer_subdata without indirect calls | Marek Olšák | 2016-07-23 | 1 | -2/+11
| | | | | | There is less noise in CPU profile data now. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: split transfer_inline_write into buffer and texture callbacks | Marek Olšák | 2016-07-23 | 1 | -2/+3
| to reduce the call indirections with u_resource_vtbl.
|
| The worst call tree you could get was:
| - u_transfer_inline_write_vtbl
| - u_default_transfer_inline_write
| - u_transfer_map_vtbl
| - driver_transfer_map
| - u_transfer_unmap_vtbl
| - driver_transfer_unmap
|
| That's 6 indirect calls. Some drivers only had 5. The goal is to have 1 indirect call for drivers that care. The resource type can be determined statically at most call sites.
|
| The new interface is:
| pipe_context::buffer_subdata(ctx, resource, usage, offset, size, data)
| pipe_context::texture_subdata(ctx, resource, level, usage, box, data, stride, layer_stride)
|
| v2: fix whitespace, correct ilo's behavior
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
| Acked-by: Roland Scheidegger <[email protected]>
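A rough picture of the "1 indirect call" goal for the buffer path, as a sketch only: the `toy_*` types are simplified hypothetical stand-ins, not Gallium's real `pipe_context` or `pipe_resource`, and only the parameter order of `buffer_subdata` is taken from the message above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-ins for the Gallium types. */
struct toy_resource {
    unsigned char storage[64];
};

struct toy_context {
    /* One function pointer per resource kind, so a buffer upload costs a
     * single indirect call instead of a chain through a vtable. */
    void (*buffer_subdata)(struct toy_context *ctx, struct toy_resource *res,
                           unsigned usage, unsigned offset, unsigned size,
                           const void *data);
};

/* Driver-side implementation: copy straight into the buffer storage. */
static void toy_buffer_subdata(struct toy_context *ctx,
                               struct toy_resource *res, unsigned usage,
                               unsigned offset, unsigned size,
                               const void *data)
{
    (void)ctx;
    (void)usage;
    assert(offset <= sizeof(res->storage));
    assert(size <= sizeof(res->storage) - offset);
    memcpy(res->storage + offset, data, size);
}

/* Context creation installs the callback once. */
static void toy_context_init(struct toy_context *ctx)
{
    ctx->buffer_subdata = toy_buffer_subdata;
}
```

A caller that already knows it holds a buffer would write `ctx.buffer_subdata(&ctx, &res, 0, offset, size, data)`, with no map/unmap round trip through further function pointers.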
* gallium/radeon: make deferred flushes asynchronous | Marek Olšák | 2016-07-22 | 1 | -0/+2
| | | | Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: just save buffer sizes instead of buffers while recording IBs | Marek Olšák | 2016-07-13 | 1 | -5/+0
| | | | | | whole buffer objects are not needed Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: add and use radeon_info::max_alloc_size (v2) | Marek Olšák | 2016-07-05 | 1 | -6/+5
| v2:
| - squashed the patches
| - use INT_MAX
| - clamp max_const_buffer_size
| - check the DRM version in radeon
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
| Reviewed-by: Vedran Miletić <[email protected]>
* radeonsi: print LLVM IRs to ddebug logs | Marek Olšák | 2016-07-05 | 1 | -0/+1
| Getting LLVM IRs of hanging shaders has never been easier.
|
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
| Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: R600_DEBUG=nodccfb disables separate DCC | Marek Olšák | 2016-06-29 | 1 | -0/+1
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: add and use r600_texture_reference | Marek Olšák | 2016-06-29 | 1 | -2/+1
| | | | | | Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Vedran Miletić <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: add a heuristic enabling DCC for scanout surfaces (v2) | Marek Olšák | 2016-06-29 | 1 | -0/+15
| DCC for displayable surfaces is allocated in a separate buffer and is enabled or disabled based on PS invocations from 2 frames ago (to let queries go idle) and the number of slow clears from the current frame.
|
| At least an equivalent of 5 fullscreen draws or slow clears must be done to enable DCC. (PS invocations / (width * height) + num_slow_clears >= 5)
|
| Pipeline statistic queries are always active if a color buffer that can have separate DCC is bound, even if separate DCC is disabled. That means the window color buffer is always monitored and DCC is enabled only when the situation is right.
|
| The tracking of per-texture queries in r600_common_context is quite ugly, but I don't see a better way.
|
| The first fast clear always enables DCC. DCC decompression can disable it. A later fast clear can enable it again. Enable/disable typically happens only once per frame.
|
| The impact is expected to be negligible because games usually don't have a high level of overdraw. DCC usually activates when too much blending is happening (smoke rendering) or when testing glClear performance and CMASK isn't supported (Stoney).
|
| v2: rename stuff, add assertions
|
| Reviewed-by: Nicolai Hähnle <[email protected]>
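The enable condition quoted in the message can be sketched as a small predicate. This is an illustration under assumptions, not the driver's code: the function name and parameters are hypothetical, and the comparison is rearranged to avoid division, which assumes the formula is meant over real numbers rather than with integer truncation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not the driver's actual function.
 *
 * Evaluates: ps_invocations / (width * height) + num_slow_clears >= 5
 * rewritten as:
 *   ps_invocations + num_slow_clears * pixels >= 5 * pixels
 * so that no precision is lost to division (an assumption about the
 * intended arithmetic). */
static bool toy_dcc_should_enable(uint64_t ps_invocations,
                                  unsigned width, unsigned height,
                                  unsigned num_slow_clears)
{
    assert(width > 0 && height > 0);

    uint64_t pixels = (uint64_t)width * height;
    return ps_invocations + (uint64_t)num_slow_clears * pixels >= 5 * pixels;
}
```

For a 1920x1080 surface, three fullscreen draws' worth of PS invocations plus two slow clears reaches the threshold, while four fullscreen draws alone do not.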
* gallium/radeon: boolean -> bool, TRUE -> true, FALSE -> false | Marek Olšák | 2016-06-25 | 1 | -1/+1
| | | | | | Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Vedran Miletić <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeon: check VM faults from DMA flush | Nicolai Hähnle | 2016-06-24 | 1 | -2/+24
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: extract IB and bo list saving into separate functions | Nicolai Hähnle | 2016-06-24 | 1 | -0/+53
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add a debug flag for unsafe math LLVM optimizations | Marek Olšák | 2016-06-21 | 1 | -0/+1
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: rename allocator_so_filled_size -> allocator_zeroed_memory | Marek Olšák | 2016-06-04 | 1 | -4/+4
| | | | | | Reviewed-by: Alex Deucher <[email protected]> Tested-by: Grazvydas Ignotas <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* gallium/u_suballoc: allow different alignment for each allocation | Marek Olšák | 2016-06-04 | 1 | -1/+1
| | | | | | | | | Just move the alignment parameter from u_suballocator_create to u_suballocator_alloc. Reviewed-by: Alex Deucher <[email protected]> Tested-by: Grazvydas Ignotas <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeon/winsys: introduce radeon_winsys_cs_chunk | Nicolai Hähnle | 2016-06-01 | 1 | -1/+1
| | | | | | | We will chain multiple chunks together and will keep pointers to the older chunks to support IB dumping. Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: use cs_check_space throughout | Nicolai Hähnle | 2016-06-01 | 1 | -1/+1
| | | | Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: add the kernel version into the renderer string | Marek Olšák | 2016-05-26 | 1 | -3/+9
| Example: Gallium 0.4 on AMD TONGA (DRM 3.2.0 / 4.5.0, LLVM 3.9.0)
|
| My kernel version is pretty long already (4.5.0-amd-01025-g32791c1) and adding "kernel" into the string would make it too long for glxinfo to display.
|
| Reviewed-by: Michel Dänzer <[email protected]>
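One way the example string could be assembled is sketched below. The function name and parameters are hypothetical, and cutting the kernel release at the first '-' is an assumption made only to reproduce the example above; the message does not say how the driver obtains or shortens the version.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not the driver's actual function.
 * Writes the renderer string into buf and returns snprintf's result. */
static int toy_format_renderer_string(char *buf, size_t size,
                                      const char *chip_name,
                                      int drm_major, int drm_minor,
                                      int drm_patch,
                                      const char *kernel_release,
                                      int llvm_major, int llvm_minor,
                                      int llvm_patch)
{
    assert(buf != NULL && chip_name != NULL && kernel_release != NULL);

    /* Assumption: keep only the part of the release before the first '-',
     * e.g. "4.5.0-amd-01025-g32791c1" becomes "4.5.0". */
    int kernel_len = (int)strcspn(kernel_release, "-");

    return snprintf(buf, size,
                    "Gallium 0.4 on AMD %s (DRM %d.%d.%d / %.*s, LLVM %d.%d.%d)",
                    chip_name, drm_major, drm_minor, drm_patch,
                    kernel_len, kernel_release,
                    llvm_major, llvm_minor, llvm_patch);
}
```

Called with "TONGA", DRM 3.2.0, the kernel release quoted in the message, and LLVM 3.9.0, this yields the example string shown above.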
* gallium/radeon: add radeon_emitted to check for non-trivial IBs | Nicolai Hähnle | 2016-05-17 | 1 | -3/+3
| | | | Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: don't flush the GFX IB if DMA doesn't depend on it | Marek Olšák | 2016-05-10 | 1 | -2/+8
| | | | | Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: consolidate radeon_add_to_buffer_list calls for DMA | Marek Olšák | 2016-05-10 | 1 | -0/+14
| | | | | Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: add a heuristic for better (S)DMA performance | Marek Olšák | 2016-05-10 | 1 | -0/+14
| | | | | Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>