path: root/src/gallium/drivers/radeonsi
Commit message | Author | Age | Files | Lines
* radeonsi/gfx9: don't ever flush the TC metadata cache | Marek Olšák | 2017-06-22 | 1 | -10/+3
  The closed Vulkan driver doesn't do it either. Also remove some old comments that aren't useful.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: use TC L2 for fast color clear with CP DMA | Marek Olšák | 2017-06-22 | 1 | -2/+5
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't emit partial flushes at the end of IBs (v2) | Marek Olšák | 2017-06-22 | 1 | -5/+9
  The kernel sort of does the same thing with fences.
  v2: do emit partial flushes on SI
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use the correct LLVMTargetMachineRef in si_build_shader_variant | Nicolai Hähnle | 2017-06-22 | 1 | -6/+22
  si_build_shader_variant can actually be called directly from one of normal-priority compiler threads. In that case, the thread_index is only valid for the normal tm array.
  v2:
  - use the correct sel/shader->compiler_ctx_state
  Fixes: 86cc8097266c ("radeonsi: use a compiler queue with a low priority for optimized shaders")
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/gfx9: keep reusing the same buffer/address for the gfx9 flush fence | Marek Olšák | 2017-06-22 | 3 | -8/+28
  instead of using a monotonic suballocator
  v2: initialize the memory at context creation
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: enable the constant engine | Marek Olšák | 2017-06-22 | 1 | -4/+1
  I think this kernel commit fixes it:
    drm/amdgpu:use FRAME_CNTL for new GFX ucode
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: indirect buffers and all CP packets use TC L2 | Marek Olšák | 2017-06-22 | 4 | -13/+21
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: flush CB after MSAA only when transitioning from CB to textures | Marek Olšák | 2017-06-22 | 2 | -14/+60
  The main flush before texturing is done after the FMASK decompress pass. CB after MSAA rendering is not flushed in set_framebuffer_state and also not in memory_barrier if the current color buffer is MSAA. We fully rely on the FMASK decompress pass for the flushing.
  Some CB decompress and resolve passes need an explicit flush before and after.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: unify CB_RESOLVE blitter invocation code | Marek Olšák | 2017-06-22 | 1 | -17/+18
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: flush DB caches only when transitioning from DB to texturing | Marek Olšák | 2017-06-22 | 5 | -25/+56
  Use the mechanism of si_decompress_textures, but instead of doing the actual decompression, just flag the DB cache flush there. This removes a lot of unnecessary DB cache flushes.
  Reviewed-by: Nicolai Hähnle <[email protected]>
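The "flag the flush only at the DB-to-texturing transition" idea can be illustrated with a small sketch. This is not the driver's code: `struct ctx`, `struct tex`, `FLAG_FLUSH_DB`, `mark_db_write()` and `prepare_for_texturing()` are hypothetical names chosen for the illustration, and a real implementation would track more state.

```c
#include <stdbool.h>

/* Hypothetical flush flag; the real driver uses its own flag names. */
#define FLAG_FLUSH_DB (1u << 0)

struct tex {
    bool dirty_in_db; /* written through the depth block since the last flush */
};

struct ctx {
    unsigned flags;          /* cache flushes pending for the next draw */
    unsigned num_db_flushes; /* counter, e.g. for a HUD query */
};

/* Called when a texture is rendered to as a depth buffer. */
static void mark_db_write(struct tex *t)
{
    t->dirty_in_db = true;
}

/* Called for each texture about to be sampled. Nothing is decompressed
 * here; the DB flush is only flagged, and only if it is actually needed. */
static void prepare_for_texturing(struct ctx *c, struct tex *t)
{
    if (!t->dirty_in_db)
        return;

    if (!(c->flags & FLAG_FLUSH_DB)) {
        c->flags |= FLAG_FLUSH_DB;
        c->num_db_flushes++;
    }
    t->dirty_in_db = false;
}
```

Under these assumptions, a texture that was never bound as a depth buffer costs no flush at all when it is sampled.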
* radeonsi: add separate HUD counters for CB and DB cache flushes | Marek Olšák | 2017-06-22 | 1 | -3/+4
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set correct usage flag according to image access type | Samuel Pitoiset | 2017-06-20 | 1 | -1/+3
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: update all resident texture descriptors when needed | Samuel Pitoiset | 2017-06-20 | 1 | -57/+104
  To avoid useless DCC fetches when DCC is disabled, descriptors have to be updated in order to reflect this change. This is quite similar to how we update descriptors of bound textures.
  As a side effect, this should also prevent VM faults when bindless textures are invalidated, because the VA in the descriptor has to be updated accordingly as well.
  I don't see any performance improvements with DOW3.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: keep track of the sampler state for texture handles | Samuel Pitoiset | 2017-06-20 | 2 | -0/+2
  Needed for updating all resident texture descriptors when dirty_tex_counter changes.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix dumping shader descriptors into ddebug logs | Marek Olšák | 2017-06-19 | 1 | -35/+41
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add a workaround for inexact SNORM8 blitting again | Marek Olšák | 2017-06-19 | 1 | -0/+37
  GFX9 is affected. We only have tests for GL_x_SNORM where x is R8, RG8, RGB8, and RGBA8.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: fix TC-compatible stencil compression | Marek Olšák | 2017-06-19 | 1 | -0/+6
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: fix TXF_LZ with 1D textures | Marek Olšák | 2017-06-19 | 1 | -1/+2
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi/gfx9: disable sparse buffers | Marek Olšák | 2017-06-19 | 1 | -0/+3
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: reduce overhead for resident textures which need color decompression | Samuel Pitoiset | 2017-06-18 | 4 | -34/+58
  This is done by introducing a separate list. si_decompress_textures() is now 5x faster.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
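The "separate list" approach can be sketched as follows: instead of scanning every resident handle before each draw, handles that may need decompression are also recorded in a second, shorter list. All names here (`struct tex_handle`, `struct resident_lists`, `make_resident()`, `decompress_resident()`) and the fixed capacity are assumptions made for the illustration, not the driver's data structures.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_HANDLES 64 /* arbitrary capacity for this sketch */

struct tex_handle {
    int id;
    bool needs_color_decompress;
};

struct resident_lists {
    struct tex_handle *resident[MAX_HANDLES];         /* every resident handle */
    size_t num_resident;
    struct tex_handle *needs_decompress[MAX_HANDLES]; /* subset only */
    size_t num_needs_decompress;
};

/* Register a handle; it also goes on the short list if it may need
 * a decompression pass later. Returns false when the list is full. */
static bool make_resident(struct resident_lists *l, struct tex_handle *h)
{
    if (l->num_resident == MAX_HANDLES)
        return false;

    l->resident[l->num_resident++] = h;
    if (h->needs_color_decompress)
        l->needs_decompress[l->num_needs_decompress++] = h;
    return true;
}

/* Per-draw work: walk only the short list and return how many handles
 * were visited. The actual decompression pass is left out. */
static size_t decompress_resident(const struct resident_lists *l)
{
    size_t visited = 0;

    for (size_t i = 0; i < l->num_needs_decompress; i++)
        visited++; /* a real driver would decompress needs_decompress[i] here */
    return visited;
}
```

The per-draw cost then scales with the number of compressed resident textures rather than with the total number of resident handles.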
* radeonsi: reduce overhead for resident textures which need depth decompression | Samuel Pitoiset | 2017-06-18 | 4 | -8/+29
  This is done by introducing a separate list.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: use util_dynarray_foreach for bindless resources | Samuel Pitoiset | 2017-06-18 | 2 | -129/+46
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
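For readers unfamiliar with the foreach-over-a-dynamic-array idiom, here is a self-contained sketch modeled on it. `struct dynarray` and `dynarray_foreach` are simplified stand-ins written for this example, not Mesa's `u_dynarray.h`; `sum_ids()` is a hypothetical consumer.

```c
#include <stddef.h>

/* Simplified stand-in for a byte-sized dynamic array. */
struct dynarray {
    void *data;
    size_t size; /* used size in bytes */
};

/* Iterate over the array as a sequence of 'type' elements;
 * 'elem' is a pointer to the current element. */
#define dynarray_foreach(buf, type, elem)                       \
    for (type *elem = (type *)(buf)->data;                      \
         elem < (type *)((char *)(buf)->data + (buf)->size);    \
         elem++)

/* Hypothetical consumer: add up all handle IDs stored in the array. */
static unsigned sum_ids(const struct dynarray *handles)
{
    unsigned sum = 0;

    dynarray_foreach(handles, unsigned, id)
        sum += *id;
    return sum;
}
```

A macro of this shape replaces hand-written index arithmetic at every loop site, which is consistent with the large net line reduction reported for this commit.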
* gallium/radeon: add a new HUD query for the number of resident handles | Samuel Pitoiset | 2017-06-18 | 1 | -0/+3
  Useful for debugging performance issues when ARB_bindless_texture is enabled. This query doesn't make a distinction between texture and image handles.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: include ac_binary.h for struct ac_shader_binary | Emil Velikov | 2017-06-17 | 1 | -2/+2
  The header embeds the struct so it needs the header inclusion instead of the dummy forward declaration.
  Cc: Nicolai Hähnle <[email protected]>
  Cc: Marek Olšák <[email protected]>
  Cc: Tom Stellard <[email protected]>
  Fixes: 32206c5e560 ("radeonsi: Add radeon_shader_binary member to struct si_shader")
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Tested-by: Bas Nieuwenhuizen <[email protected]>
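The C rule behind this fix: a forward declaration yields an incomplete type, which is enough for a pointer member but not for a member embedded by value. A minimal sketch, with hypothetical struct names standing in for the real ones:

```c
#include <stddef.h>

/* Forward declaration only: the type is incomplete, its size is unknown. */
struct binary_fwd;

struct holder_by_pointer {
    struct binary_fwd *binary; /* fine: a pointer has a known size */
};

/* Full definition, as a header such as ac_binary.h would provide
 * (the fields here are made up for the example). */
struct binary_full {
    unsigned code_size;
    unsigned char *code;
};

struct holder_by_value {
    struct binary_full binary; /* needs the complete type to be laid out */
    int other;
};
```

If `struct binary_full` were only forward-declared, `struct holder_by_value` would be rejected by the compiler because the field would have an incomplete type.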
* radeonsi: enable ARB_bindless_texture | Samuel Pitoiset | 2017-06-14 | 1 | -1/+3
  This has only been tested on RX480.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add support for loading bindless images | Samuel Pitoiset | 2017-06-14 | 1 | -7/+21
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add support for loading bindless samplers | Samuel Pitoiset | 2017-06-14 | 1 | -3/+12
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: invalidate buffers which are made resident if needed | Samuel Pitoiset | 2017-06-14 | 1 | -0/+34
  When a buffer becomes resident, check if it has been invalidated, if so update the descriptor and the dirty flag.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: upload new descriptors when resident buffers are invalidated | Samuel Pitoiset | 2017-06-14 | 2 | -0/+148
  When texture buffers are invalidated the addr in the resident descriptor has to be updated but we can't create a new descriptor because the resident handle has to be the same.
  Instead, use the WRITE_DATA packet which allows to update memory directly but graphics/compute have to be idle in case the GPU is reading the descriptor.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: only decompress resident textures/images when used | Samuel Pitoiset | 2017-06-14 | 1 | -2/+11
  When the current bound shaders don't use any bindless textures or images, it's useless to decompress the resident resources.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: track use of bindless samplers/images from tgsi_shader_info | Samuel Pitoiset | 2017-06-14 | 5 | -5/+46
  This adds some new helper functions to know if the current draw call (or dispatch compute) is using bindless samplers/images, based on TGSI analysis.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
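The two commits above combine into a simple guard: record per shader whether it uses bindless samplers or images, then skip the resident-resource decompression when no bound stage does. The sketch below assumes hypothetical names (`struct shader_info`, `struct draw_ctx`, `any_stage_uses_bindless()`, `before_draw()`) and an arbitrary stage count; it is not the driver's helper API.

```c
#include <stdbool.h>
#include <stddef.h>

#define NUM_STAGES 6 /* arbitrary number of shader stages for this sketch */

/* Per-shader facts gathered once by shader analysis. */
struct shader_info {
    bool uses_bindless_samplers;
    bool uses_bindless_images;
};

struct draw_ctx {
    const struct shader_info *stages[NUM_STAGES]; /* NULL = stage unbound */
    unsigned num_resident;       /* resident texture/image handles */
    unsigned decompress_passes;  /* how often the resident list was walked */
};

/* True if any currently bound stage uses bindless samplers or images. */
static bool any_stage_uses_bindless(const struct draw_ctx *c)
{
    for (size_t i = 0; i < NUM_STAGES; i++) {
        const struct shader_info *info = c->stages[i];

        if (info && (info->uses_bindless_samplers || info->uses_bindless_images))
            return true;
    }
    return false;
}

/* Called before each draw: the resident list is only processed when
 * it is non-empty and some bound shader can actually reach it. */
static void before_draw(struct draw_ctx *c)
{
    if (c->num_resident && any_stage_uses_bindless(c))
        c->decompress_passes++; /* stand-in for the decompression loop */
}
```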
* radeonsi: decompress resident textures/images before graphics/compute | Samuel Pitoiset | 2017-06-14 | 3 | -0/+114
  Similar to the existing decompression code path except that it loops over the list of resident textures/images.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: decompress DCC for resident textures/images | Samuel Pitoiset | 2017-06-14 | 2 | -0/+83
  Analogous to bound textures/images. We should also update the resident descriptors and disable COMPRESSION_EN for avoiding useless DCC fetches, but I postpone this optimization for a separate series.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: only add descriptors in presence of resident handles | Samuel Pitoiset | 2017-06-14 | 1 | -0/+6
  This won't help much except for applications that use a ton of resident handles. Though, this will reduce the winsys overhead a little bit.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add all resident buffers to the current CS | Samuel Pitoiset | 2017-06-14 | 3 | -0/+52
  Resident buffers have to be added to every new command stream. Though, this could be slightly improved when current shaders don't use any bindless textures/images but usually applications tend to use bindless for almost every draw call, and the winsys thread might help when buffers are added early.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: implement ARB_bindless_texture | Samuel Pitoiset | 2017-06-14 | 3 | -0/+285
  This implements the Gallium interface. Decompression of resident textures/images will follow in the next patches.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add a slab allocator for bindless descriptors | Samuel Pitoiset | 2017-06-14 | 4 | -0/+119
  For each texture/image handles, we need to allocate a new buffer for the bindless descriptor. But when the number of buffers added to the current CS becomes high, the overhead in the winsys (and in the kernel) is important. To reduce this bottleneck, the idea is to suballocate the bindless descriptors using a slab similar to the one used in the winsys.
  Currently, a buffer can hold 1024 bindless descriptors but this limit is arbitrary and could be changed in the future for some reasons. Once a slab is allocated the "base" buffer is added to a per-context list.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
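The suballocation scheme described in this commit can be sketched in plain C. Only the figure of 1024 descriptors per buffer and the per-context list come from the commit message; the structures, the linear free-slot search and the function names (`desc_alloc()`, `desc_free()`, `desc_allocator_destroy()`) are assumptions made for the illustration, with host memory standing in for GPU buffers.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

#define SLOTS_PER_SLAB 1024 /* descriptors per base buffer, per the commit */

struct slab {
    struct slab *next;
    uint32_t num_used;
    bool used[SLOTS_PER_SLAB]; /* one flag per descriptor slot */
};

/* Per-context list of slabs ("base" buffers). */
struct desc_allocator {
    struct slab *slabs;
    unsigned num_slabs;
};

struct desc_slot {
    struct slab *slab;
    uint32_t index;
};

/* Hand out one descriptor slot, creating a new slab only when every
 * existing slab is full. Returns false on out-of-memory. */
static bool desc_alloc(struct desc_allocator *a, struct desc_slot *out)
{
    struct slab *s;

    for (s = a->slabs; s; s = s->next)
        if (s->num_used < SLOTS_PER_SLAB)
            break;

    if (!s) {
        s = calloc(1, sizeof(*s));
        if (!s)
            return false;
        s->next = a->slabs;
        a->slabs = s;
        a->num_slabs++;
    }

    for (uint32_t i = 0; i < SLOTS_PER_SLAB; i++) {
        if (!s->used[i]) {
            s->used[i] = true;
            s->num_used++;
            out->slab = s;
            out->index = i;
            return true;
        }
    }
    return false; /* not reached: num_used < SLOTS_PER_SLAB implies a free slot */
}

/* Return a slot to its slab so a later allocation can reuse it. */
static void desc_free(struct desc_slot slot)
{
    slot.slab->used[slot.index] = false;
    slot.slab->num_used--;
}

/* Release every slab owned by the allocator. */
static void desc_allocator_destroy(struct desc_allocator *a)
{
    while (a->slabs) {
        struct slab *next = a->slabs->next;

        free(a->slabs);
        a->slabs = next;
    }
    a->num_slabs = 0;
}
```

With this layout, the number of buffers that must be added to a command stream grows with the number of slabs, not with the number of handles.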
* radeonsi: add si_set_shader_image_desc() helper | Samuel Pitoiset | 2017-06-14 | 1 | -32/+47
  To share some common code between bound and bindless images.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add si_set_sampler_view_desc() helper | Samuel Pitoiset | 2017-06-14 | 1 | -43/+52
  To share some common code between bound and bindless textures.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add si_init_descriptor_list() helper | Samuel Pitoiset | 2017-06-14 | 1 | -0/+15
  This will be used in order to initialize resident descriptors for bindless textures/images.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* gallium: add PIPE_CAP_BINDLESS_TEXTURE | Samuel Pitoiset | 2017-06-14 | 1 | -0/+1
  Whether bindless texture operations are supported by the underlying driver.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: pack si_context better | Marek Olšák | 2017-06-12 | 1 | -18/+18
  there isn't much to gain here
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: pack si_framebuffer better | Marek Olšák | 2017-06-12 | 1 | -6/+6
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: pack si_sampler_view better | Marek Olšák | 2017-06-12 | 1 | -2/+2
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: pack si_buffer_resources better | Marek Olšák | 2017-06-12 | 1 | -4/+5
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: pack struct si_descriptors better | Marek Olšák | 2017-06-12 | 1 | -15/+15
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: pack struct si_vertex_elements better | Marek Olšák | 2017-06-12 | 1 | -9/+10
  Reviewed-by: Nicolai Hähnle <[email protected]>
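"Packing a struct better" usually means reordering fields so the compiler inserts less alignment padding. The sketch below uses two made-up structs with the same four fields; it does not reproduce any of the driver structs named above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Small fields interleaved with large ones: padding is inserted
 * before 'b' and before 'd' to satisfy their alignment. */
struct loose {
    bool a;
    uint64_t b;
    bool c;
    uint32_t d;
};

/* Same fields ordered from largest to smallest alignment. */
struct packed {
    uint64_t b;
    uint32_t d;
    bool a;
    bool c;
};

/* Bytes saved per instance by the reordering. */
static size_t bytes_saved(void)
{
    return sizeof(struct loose) - sizeof(struct packed);
}
```

On a typical x86-64 ABI the first layout occupies 24 bytes and the second 16. The exact numbers depend on the ABI, so treat them as an illustration rather than a guarantee.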
* radeonsi: replace si_vertex_elements::elements with separate fields | Marek Olšák | 2017-06-12 | 4 | -14/+14
  It makes si_vertex_elements a little smaller.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: rename si_vertex_element -> si_vertex_elements | Marek Olšák | 2017-06-12 | 4 | -6/+6
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: allocate si_state_rasterizer::pm4_poly_offset only when needed | Marek Olšák | 2017-06-12 | 2 | -2/+14
  Each element has over 700 bytes.
  Reviewed-by: Nicolai Hähnle <[email protected]>
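Allocating a large sub-state only when it is needed can be sketched as below. The names (`struct big_state`, `struct rasterizer`, `rasterizer_create()`, `rasterizer_destroy()`) are hypothetical, and the 700-byte size is a placeholder echoing the figure in the commit message rather than the real layout.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Placeholder for a large precomputed state block. */
struct big_state {
    unsigned char data[700];
};

struct rasterizer {
    bool uses_poly_offset;
    struct big_state *poly_offset; /* NULL unless polygon offset is enabled */
};

/* Create a rasterizer state; the large block is only allocated when
 * the state will actually use it. Returns NULL on out-of-memory. */
static struct rasterizer *rasterizer_create(bool uses_poly_offset)
{
    struct rasterizer *rs = calloc(1, sizeof(*rs));

    if (!rs)
        return NULL;

    rs->uses_poly_offset = uses_poly_offset;
    if (uses_poly_offset) {
        rs->poly_offset = calloc(1, sizeof(*rs->poly_offset));
        if (!rs->poly_offset) {
            free(rs);
            return NULL;
        }
    }
    return rs;
}

/* Free the state and its optional block; free(NULL) is a no-op. */
static void rasterizer_destroy(struct rasterizer *rs)
{
    if (!rs)
        return;
    free(rs->poly_offset);
    free(rs);
}
```

States that never enable the feature then pay only for one pointer instead of the full block.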