summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/radeonsi
Commit message (Collapse)AuthorAgeFilesLines
* radeonsi: add a workaround for a bug in LLVM <= 3.8Marek Olšák2016-05-191-0/+7
| | | | | | This is not directly applicable to stable and needs to be backported. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: Mixed colorbuffer formats are unsupportedAxel Davy2016-05-181-0/+10
| | | | | | | | | | Besides depth/stencil, the hardware doesn't support mixed formats. The GL state tracker doesn't make use of them. Signed-off-by: Axel Davy <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Change default behaviour for undefined COLOR0Axel Davy2016-05-181-0/+3
| | | | | | | | | d3d 9 needs COLOR0 to be 1.0 on all channels when undefined. 0.0 for the others is fine. GL behaviour is undefined. Signed-off-by: Axel Davy <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: force level zero on image instructions in non-fragment shaders (v2)Nicolai Hähnle2016-05-171-0/+5
| | | | | | | | | | | | | | | | | | | | | | | Section 8.9 (Texture Functions) of the OpenGL Shading Language 4.5 specification: However, automatic level of detail is computed only for fragment shaders. Other shaders operate as though the base level of detail were computed as zero. and Section 8.9.3 (Texture Gather Functions): When performing a texture gather operation, the minification and magnification filters are ignored, and the rules for LINEAR filtering in the OpenGL Specification are applied to the base level of the texture image to identify the four texels i_0 j_1, i_1 j_1, i_1 j_0, and i_0 j_0. Of course, explicit LOD or derivative variants work in all shader types. This fixes several GL4x-CTS.texture_gather.* tests. v2: TG4 is always level zero (thanks, Ilia) Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: emit TXQ in separate functionsNicolai Hähnle2016-05-171-52/+78
| | | | | | | | | TXQ is sufficiently different that having in it in the same code path as texture sampling/fetching opcodes doesn't make much sense. v2: guard against NULL pointer dereferences Reviewed-by: Marek Olšák <[email protected]> (v1)
* gallium/radeon: add radeon_emitted to check for non-trivial IBsNicolai Hähnle2016-05-171-2/+2
| | | | Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: use radeon_emitNicolai Hähnle2016-05-172-25/+25
| | | | | | | Mostly generated using a sed-script, with manual fix-up for multi-line statements. Reviewed-by: Marek Olšák <[email protected]>
* Treewide: Remove Elements() macroJan Vesely2016-05-175-11/+11
| | | | | Signed-off-by: Jan Vesely <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallium: Add a pipe cap for arb_cull_distanceTobias Klausmann2016-05-141-0/+1
| | | | | | | | | This lets us safely enable or disable the extension as needed Signed-off-by: Tobias Klausmann <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radeonsi/sid_tables: rename reg_table to sid_reg_tableNicolai Hähnle2016-05-132-3/+3
| | | | | | | | | This is purely cosmetic, making it easier to assign blame for space used in the binary in case somebody else makes a similar cleanup effort in the future. Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi/sid_tables: store offset into global fields table instead of pointerNicolai Hähnle2016-05-132-9/+16
| | | | | | | This avoids relocations in the final binary. Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi/sid_tables: store strings by offset instead of by pointerNicolai Hähnle2016-05-132-28/+141
| | | | | | | This saves some space and avoids the need for relocations. Acked-by: Marek Olšák <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: Set declared tessellation LDS size to hardware size.Bas Nieuwenhuizen2016-05-101-16/+2
| | | | | | | | | | | | | | | The calculated limit gave problems on SI as it was > 32 KiB and the hardware LDS size on SI is only 32 KiB. It isn't correct anyway when processing multiple patches in a threadgroup. As we potentially have any number of patches such that the used LDS is at most the hardware LDS size, and exact size per patch is not known at compile time, this seems like the only valid bound. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: consolidate radeon_add_to_buffer_list calls for DMAMarek Olšák2016-05-102-33/+0
| | | | | Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: flush if DMA IB memory usage is too highMarek Olšák2016-05-102-6/+6
| | | | | | | | This prevents IB rejections due to insane memory usage from many concecutive texture uploads. Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add new SDMA texture copy codeMarek Olšák2016-05-101-0/+438
| | | | | | | | | | | | | | | This implements: - Linear-to-linear partial copies. (unaligned) - Tiled-to-linear and linear-to-tiled partial copies. (unaligned except 1-2 Bpp) - Tiled-to-tiled partial copies aligned to 8x8. v2: Extend the SDMA L2T VM fault workaround to T2L. - Same algorithm, just applied to T2L. (and using a 0-based address and surface.bo_size instead of buf->size) Reviewed-by: Alex Deucher <[email protected]> (v1) Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: fix (S)DMA read-after-write hazardsMarek Olšák2016-05-102-0/+3
| | | | | Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: raise the max size for SDMA buffer copiesMarek Olšák2016-05-102-3/+3
| | | | | Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove SDMA texture copy codeMarek Olšák2016-05-101-215/+2
| | | | | | | | | Most of this has never worked according to the new test. The new code will be radically different. Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: only expose *_init_*dma_functions from (S)DMA filesMarek Olšák2016-05-105-34/+31
| | | | | | | just normalizing the interfaces Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: implement randomized SDMA texture copy testing (v2)Marek Olšák2016-05-101-0/+3
| | | | | | | | | | | v2: - adjustments for exercising all important SDMA code paths - decrease the probability of getting huge sizes (faster testing) - increase the probability of getting power-of-two dimensions - change the memory cap to 128MB (faster testing) - better detect which engine has been used Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: use a common function for DMA blit preparationMarek Olšák2016-05-102-22/+5
| | | | | | | this is more robust and probably fixes some bugs already Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: use gart_page_size instead of hardcoded 4096Marek Olšák2016-05-101-1/+1
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: workaround for tesselation on SINicolai Hähnle2016-05-091-0/+8
| | | | | | | | | | | | | | | | | | We request more than 32KB of LDS here, which SI doesn't have. Since LLVM recently started checking the size of declared LDS allocations, all shaders involved in tesselation fail to compile on SI. Note that the entire calculation here seems wrong, given how we calculate indices for generic attributes, so the number ends up wrong on CI+ as well. A proper solution is clearly needed, but this patch should serve as a band-aid for SI in the meantime. Also note that the real size of the LDS allocation in hardware is independent from what we tell LLVM, so this is really more of a "cosmetic" change. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95198 Cc: "11.2" <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: always allocate export memory for pixel shadersNicolai Hähnle2016-05-091-5/+10
| | | | | | | | Experiments with framebuffer-no-attachments type draw calls have shown that NULL exports stall terribly unless we ensure that export memory is allocated by the SPI. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: expose performance counters as 64 bitNicolai Hähnle2016-05-091-5/+8
| | | | | | | This is useful for shader-related counters, since they tend to quickly exceed 32 bits. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix undefined behavior (memcpy arguments must be non-NULL)Nicolai Hähnle2016-05-071-1/+3
| | | | | Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix some reported undefined left-shiftsNicolai Hähnle2016-05-071-3/+3
| | | | | | | | One of these is an unsigned bitfield, which I suspect is a false positive, but gcc 5.3.1 complains about it with -fsanitize=undefined. Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: clean left-shift undefined behaviorNicolai Hähnle2016-05-071-2061/+2061
| | | | | | | | | | | | | | Shifting into the sign bit of a signed int is undefined behavior. Unfortunately, there are potentially many places where this happens using the register macros. This commit is the result of running sed -ie "s/(((\(\w\+\)) & 0x\(\w\+\)) << \(\w\+\))/(((unsigned)(\1) \& 0x\2) << \3)/g" on all header files in gallium/{r600,radeon,radeonsi}. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Compute correct LDS size for fragment shaders.Bas Nieuwenhuizen2016-05-061-3/+6
| | | | | | | | No sure where the 36 came from, but we clearly need at least 48 bytes per attribute per primitive. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: set DECOMPRESS_Z_ON_FLUSH if nr_samples >= 4Marek Olšák2016-05-061-1/+2
| | | | | | | | Vulkan always sets this. It only affects in-place Z decompression. This is recommended for performance, but what app uses MSAA depth texturing? Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: mark descriptor loads as using dynamically uniform indicesNicolai Hähnle2016-05-051-5/+17
| | | | | | | | This tells LLVM to always use SMEM loads for descriptors. It fixes a regression in piglit's arb_shader_storage_buffer_object/execution/indirect.shader_test that was caused by LLVM r268259 (but the proper fix is really here in Mesa). Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: remove unused tile mode gettersMarek Olšák2016-05-022-157/+2
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: just read tile mode arrays in SDMA setupMarek Olšák2016-05-021-51/+28
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: just read tile mode arrays in SI DMA setupMarek Olšák2016-05-021-33/+21
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: just read tile mode arrays in DB setupMarek Olšák2016-05-021-36/+19
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: when parsing dmesg, skip empty linesMarek Olšák2016-05-021-0/+3
| | | | | Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use the hw MSAA resolving if formats are compatibleMarek Olšák2016-05-021-1/+2
| | | | | | | This allows resolving RGBA into RGBX. This should improve HL2 Lost Coast performance. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix PIPE_FORMAT_R11G11B10_FLOAT handlingNicolai Hähnle2016-05-021-8/+10
| | | | | | | | | | That format has first_non_void < 0. This fixes a regression in piglit arb_shader_image_load_store-semantics that was introduced by commit 76b8c5cc602, while hopefully still shutting Coverity up (and failing in a more obvious way if a similar error should re-appear). Reviewed-by: Jakob Sinclair <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: correct NULL-pointer check in si_upload_const_bufferNicolai Hähnle2016-05-021-1/+1
| | | | | | Cc: "11.1 11.2" <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix synchronization of shader imagesMarek Olšák2016-04-301-7/+11
| | | | | | | This fixes the winsys->cs_is_buffer_referenced query, which is used for synchronization before buffers are mapped. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: drop support for LINEAR_GENERAL layoutMarek Olšák2016-04-283-31/+7
| | | | | | | Unused. All texture imports use LINEAR_ALIGNED regardless of what the DDX does. Reviewed-by: Alex Deucher <[email protected]>
* radeonsi: rework clear_buffer flagsMarek Olšák2016-04-284-21/+28
| | | | | | | | | Changes: - don't flush DB for fast color clears - don't flush any caches for initial clears - remove the flag from si_copy_buffer, always assume shader coherency Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove needless cache flushes at the end of CP DMA operationsMarek Olšák2016-04-281-8/+0
| | | | | | not needed AFAIK Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove flushes at the beginning and end of IBs done by the kernelMarek Olšák2016-04-281-12/+12
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: check if value is negativeJakob Sinclair2016-04-281-1/+4
| | | | | | | | | | | Fixes a Coverity defect by adding checks to see if a value is negative before using it to index an array. By checking the value first it makes the code a bit safer but overall should not have a big impact. CID: 1355598 Signed-off-by: Jakob Sinclair <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* gallium/radeon: remove use_reusable_pool parameter from r600_init_resourceNicolai Hähnle2016-04-271-1/+1
| | | | | | All callers set it to true. Reviewed-by: Marek Olšák <[email protected]>
* radeon/video: always use the reusable buffer poolNicolai Hähnle2016-04-271-1/+1
| | | | | | | | | | | A semantic error was introduced in a past refactoring that caused the bind parameter to be passed into the use_reusable_pool parameter of buffer_create. Since this clearly makes no sense, and there is no clear reason why the cache _shouldn't_ be used, just use the cache always. Cc: Christian König <[email protected]> Reviewed-by: Christian König <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: work around an MSAA fast stencil clear problemNicolai Hähnle2016-04-271-3/+15
| | | | | | | | A piglit test (arb_texture_multisample-stencil-clear) has been sent. This problem was discovered analyzing Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93767 Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: expclear must be disabled on first Z/S clearNicolai Hähnle2016-04-271-2/+2
| | | | | | The documentation and the HW team say so. Reviewed-by: Marek Olšák <[email protected]>