path: root/src/gallium/drivers/radeon
Commit message | Author | Age | Files | Lines
* radeonsi: implement buffer_subdata without indirect calls | Marek Olšák | 2016-07-23 | 3 | -3/+39
  There is less noise in CPU profile data now.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: split transfer_inline_write into buffer and texture callbacks | Marek Olšák | 2016-07-23 | 3 | -4/+3
  to reduce the call indirections with u_resource_vtbl.

  The worst call tree you could get was:
  - u_transfer_inline_write_vtbl
  - u_default_transfer_inline_write
  - u_transfer_map_vtbl
  - driver_transfer_map
  - u_transfer_unmap_vtbl
  - driver_transfer_unmap

  That's 6 indirect calls. Some drivers only had 5. The goal is to have 1 indirect call for drivers that care. The resource type can be determined statically at most call sites.

  The new interface is:
    pipe_context::buffer_subdata(ctx, resource, usage, offset, size, data)
    pipe_context::texture_subdata(ctx, resource, level, usage, box, data, stride, layer_stride)

  v2: fix whitespace, correct ilo's behavior
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Acked-by: Roland Scheidegger <[email protected]>
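The single-indirect-call goal described above can be illustrated with a minimal sketch. The argument list follows the buffer_subdata interface quoted in the commit message; every type and function name prefixed with sketch_ is hypothetical and is not Mesa's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a buffer resource. */
struct sketch_resource {
    unsigned char storage[64];
    size_t size;
};

/* Hypothetical context: the callback is the one indirect call. */
struct sketch_context {
    void (*buffer_subdata)(struct sketch_context *ctx,
                           struct sketch_resource *res,
                           unsigned usage, unsigned offset,
                           unsigned size, const void *data);
    unsigned num_uploads;
};

/* Driver-side implementation: writes directly into the buffer
 * instead of going through separate map and unmap callbacks. */
static void sketch_driver_buffer_subdata(struct sketch_context *ctx,
                                         struct sketch_resource *res,
                                         unsigned usage, unsigned offset,
                                         unsigned size, const void *data)
{
    (void)usage; /* usage flags are ignored in this sketch */
    assert((size_t)offset + size <= res->size);
    memcpy(res->storage + offset, data, size);
    ctx->num_uploads++;
}

/* Installs the driver callback when the context is created. */
static void sketch_context_init(struct sketch_context *ctx)
{
    ctx->buffer_subdata = sketch_driver_buffer_subdata;
    ctx->num_uploads = 0;
}
```

A caller that knows statically that it has a buffer calls ctx->buffer_subdata(...) once, rather than walking a chain of vtable helpers.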
* gallium/radeon: make deferred flushes asynchronous | Marek Olšák | 2016-07-22 | 1 | -0/+2
  Reviewed-by: Edward O'Callaghan <[email protected]>
* gallium/radeon: remove RADEON_FLUSH_KEEP_TILING_FLAGS flag | Marek Olšák | 2016-07-19 | 1 | -2/+1
  always set
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeon/uvd: add session context buffer for polaris 10/11 v2 | Christian König | 2016-07-18 | 2 | -0/+21
  This way we have unlimited UVD sessions.
  v2: only enable it when kernel supports it as well.
  Signed-off-by: Christian König <[email protected]>
  Reviewed-by: Leo Liu <[email protected]>
* Revert "radeon/llvm: Use alloca instructions for larger arrays" | Marek Olšák | 2016-07-14 | 2 | -149/+25
  This reverts commit 513fccdfb68e6a71180e21827f071617c93fd09b.
  Bioshock Infinite hangs with that.
* radeon/uvd: fail to create a decoder if RUVD_MSG_CREATE submission fails | Marek Olšák | 2016-07-14 | 1 | -6/+9
  This is the bare minimum for reporting the error to the user.
  Reviewed-by: Christian König <[email protected]>
* gallium/radeon: add a return value to cs_flush | Marek Olšák | 2016-07-14 | 1 | -3/+5
  Required by our UVD code.
  Reviewed-by: Christian König <[email protected]>
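The two commits above fit together: once the flush reports a result, decoder creation can fail cleanly when the create message cannot be submitted. The following is a rough sketch of that pattern only. All names are hypothetical, and the convention of returning 0 on success and a negative value on failure is an assumption, not taken from the log.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical command stream; fail_next_flush simulates a
 * submission error for demonstration purposes. */
struct sketch_cs {
    int fail_next_flush;
};

/* Assumed convention: 0 on success, negative on failure. */
static int sketch_cs_flush(struct sketch_cs *cs)
{
    return cs->fail_next_flush ? -1 : 0;
}

struct sketch_decoder {
    struct sketch_cs *cs;
};

/* Returns NULL if allocation fails or if the flush that stands in
 * for submitting the create message reports an error. */
static struct sketch_decoder *sketch_create_decoder(struct sketch_cs *cs)
{
    struct sketch_decoder *dec;

    assert(cs != NULL);
    dec = calloc(1, sizeof(*dec));
    if (!dec)
        return NULL;
    dec->cs = cs;

    if (sketch_cs_flush(cs) != 0) {
        free(dec);
        return NULL;
    }
    return dec;
}
```

Without a return value from the flush, the creation function would have no way to notice the failed submission and would hand back a decoder that can never work.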
* radeon/vce: handle newly added parameters | Boyuan Zhang | 2016-07-14 | 1 | -13/+20
  Replace the previous hardcoded value with newly defined parameters
  Signed-off-by: Boyuan Zhang <[email protected]>
  Reviewed-by: Christian König <[email protected]>
* gallium/radeon: normalize the code style | Marek Olšák | 2016-07-13 | 2 | -338/+286
  no change in behavior
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: just save buffer sizes instead of buffers while recording IBs | Marek Olšák | 2016-07-13 | 2 | -6/+1
  whole buffer objects are not needed
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeon/uvd: simplify sending context buffer message | Christian König | 2016-07-08 | 1 | -4/+1
  Just send it whenever it is allocated.
  Signed-off-by: Christian König <[email protected]>
  Reviewed-by: Leo Liu <[email protected]>
* radeon/uvd: fix contex buffer destruction in the error path | Christian König | 2016-07-08 | 1 | -6/+2
  Destroying an unallocated buffer is harmless.
  Signed-off-by: Christian König <[email protected]>
  Reviewed-by: Leo Liu <[email protected]>
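The observation that destroying an unallocated buffer is harmless is what lets an error path drop its per-buffer checks. A minimal sketch of the idea, using plain heap memory and hypothetical names rather than the driver's real buffer objects:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical buffer; mem == NULL means "not allocated". */
struct sketch_buffer {
    void *mem;
    size_t size;
};

/* Allocates the backing memory; returns false on failure. */
static bool sketch_create_buffer(struct sketch_buffer *buf, size_t size)
{
    assert(buf->mem == NULL);
    buf->mem = malloc(size);
    buf->size = buf->mem ? size : 0;
    return buf->mem != NULL;
}

/* Safe on allocated, unallocated, and already destroyed buffers,
 * because free(NULL) is defined to do nothing. */
static void sketch_destroy_buffer(struct sketch_buffer *buf)
{
    free(buf->mem);
    buf->mem = NULL;
    buf->size = 0;
}
```

With a destroy function like this, an error path can call it on every buffer unconditionally instead of tracking which ones were successfully created.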
* radeon/uvd: move polaris fw check into radeon_video.c v2 | Christian König | 2016-07-08 | 2 | -11/+13
  It's actually not very clever to claim to support H.264 and then fail to create a decoder.
  v2: prefix FW macro with UVD_.
  Signed-off-by: Christian König <[email protected]>
  Reviewed-by: Leo Liu <[email protected]>
* radeon/video: fix coding style in radeon_video.c v2 | Christian König | 2016-07-08 | 1 | -15/+15
  v2: fix other tabs as well.
  Signed-off-by: Christian König <[email protected]>
  Reviewed-by: Leo Liu <[email protected]>
* radeonsi: explicitly choose center locations for 1xAA on Polaris | Nicolai Hähnle | 2016-07-08 | 1 | -0/+7
  Unlike SC, the small primitive filter does not automatically use center locations in 1xAA mode, so this is needed to avoid artifacts caused by the small primitive filter discarding triangles that it shouldn't.

  As a side effect of how the effective number of samples is now calculated, this patch also avoids submitting the sample locations for line/poly smoothing when they're not really needed.

  Cc: 12.0 <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeon/llvm: Use alloca instructions for larger arrays | Tom Stellard | 2016-07-06 | 2 | -25/+151
  We were storing arrays in vectors, which was leading to some really bad spill code for large arrays. alloca instructions are a better fit for arrays and LLVM optimizations are more geared toward dealing with allocas instead of vectors.

  For arrays that have 16 or fewer 32-bit elements, we will continue to use vectors, because this will force LLVM to store them in registers and use indirect registers, which is usually faster for small arrays.

  In the future we should use allocas for all arrays and teach LLVM how to store allocas in registers.

  This fixes the piglit test: spec/glsl-1.50/execution/geometry/max-input-component
  Reviewed-by: Marek Olšák <[email protected]>
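The size rule in this commit message (which the log shows was reverted on 2016-07-14) amounts to a simple threshold. The sketch below only models the decision described in the message. The enum, constant, and function names are hypothetical, and no LLVM API is involved.

```c
#include <assert.h>

/* Hypothetical labels for the two storage strategies. */
enum sketch_array_storage {
    SKETCH_STORE_VECTOR, /* kept in registers, indirect register access */
    SKETCH_STORE_ALLOCA  /* stack slot created by an alloca instruction */
};

/* Threshold stated in the commit message: 16 32-bit elements. */
#define SKETCH_MAX_VECTOR_ELEMS 16u

/* Chooses a storage strategy from the number of 32-bit elements. */
static enum sketch_array_storage
sketch_choose_array_storage(unsigned num_elems)
{
    assert(num_elems > 0);
    return num_elems <= SKETCH_MAX_VECTOR_ELEMS ? SKETCH_STORE_VECTOR
                                                : SKETCH_STORE_ALLOCA;
}
```

An array of exactly 16 elements still takes the vector path; the alloca path starts at 17.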
* radeon/llvm: Add helpers for loading and storing data from arrays. | Tom Stellard | 2016-07-06 | 1 | -10/+41
  Reviewed-by: Marek Olšák <[email protected]>
* radeon/llvm: Remove uses_temp_indirect_addressing() function | Tom Stellard | 2016-07-06 | 1 | -23/+1
  bld->indirect_files is never set, so this function always returns false.
  Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: add depth/stencil_adjusted output to surface computation | Nicolai Hähnle | 2016-07-06 | 2 | -2/+10
  This fixes a rare bug with stencil texturing -- seen on Polaris and Tonga, though it's basically a function of the memory configuration so could affect other parts as well.

  Fixes piglit "unaligned-blit * stencil downsample" and various "fbo-depth-array *stencil*" tests.
  Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: allocate only the required plane for flushed depth | Nicolai Hähnle | 2016-07-06 | 1 | -3/+34
  Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: replace is_flushing_texture with db_compatible | Nicolai Hähnle | 2016-07-06 | 2 | -2/+3
  This is a left-over of when I considered generalizing the separate stencil support. I do prefer the new name since it emphasizes what flushing vs. non-flushing means from a functional point-of-view, namely special handling of the texture format.
  v2: adjust r600_init_color_surface as well
  Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: add can_sample_z/s flags for textures | Nicolai Hähnle | 2016-07-06 | 2 | -4/+24
  v2: adjust r600_init_color_surface as well
  Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon/winsyses: remove unused stencil_offset | Nicolai Hähnle | 2016-07-06 | 1 | -1/+0
  Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: remove redundant null-pointer check | Nicolai Hähnle | 2016-07-06 | 1 | -2/+1
  v2: keep using r600_texture_reference
  Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: print StencilLayout only once | Nicolai Hähnle | 2016-07-06 | 1 | -2/+2
  It is the same for all levels.
  Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: flush stdout after printing texture information | Nicolai Hähnle | 2016-07-06 | 1 | -0/+1
  Reviewed-by: Marek Olšák <[email protected]>
* radeon/vce: update encRefPic addr and array mode to tiled | Leo Liu | 2016-07-05 | 1 | -0/+1
  Signed-off-by: Leo Liu <[email protected]>
  Reviewed-by: Christian König <[email protected]>
* radeon/vce: increase cpb height alignment | Leo Liu | 2016-07-05 | 1 | -1/+1
  Height should be aligned to 2 macroblocks, which makes it safer for tiled mode
  Signed-off-by: Leo Liu <[email protected]>
  Reviewed-by: Christian König <[email protected]>
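As a worked example of the alignment rule above, assuming the usual 16-pixel H.264 macroblock height, two macroblocks give a 32-pixel alignment. The helper names below are hypothetical, and the log does not state what the previous alignment was.

```c
#include <assert.h>

/* Assumption: macroblocks are 16 pixels tall, as in H.264. */
#define SKETCH_MACROBLOCK_SIZE 16u

/* Rounds value up to the next multiple of a power-of-two alignment. */
static unsigned sketch_align(unsigned value, unsigned alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Height of the coded picture buffer, aligned to 2 macroblocks. */
static unsigned sketch_cpb_height(unsigned height)
{
    return sketch_align(height, 2 * SKETCH_MACROBLOCK_SIZE);
}
```

Under these assumptions a 1080-line frame gets a 1088-line buffer and a 720-line frame gets a 736-line buffer.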
* gallium/radeon: add and use radeon_info::max_alloc_size (v2) | Marek Olšák | 2016-07-05 | 2 | -6/+6
  v2: - squashed the patches
      - use INT_MAX
      - clamp max_const_buffer_size
      - check the DRM version in radeon
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Vedran Miletić <[email protected]>
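The v2 notes mention INT_MAX and clamping max_const_buffer_size. One plausible reading, sketched here with hypothetical names, is that a 64-bit allocation limit has to be clamped before it is reported through an int-sized capability. How the driver actually performs the clamp is not shown in this log.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: derives an int-sized constant buffer limit
 * from a 64-bit maximum allocation size. */
static int sketch_max_const_buffer_size(uint64_t max_alloc_size)
{
    /* A zero limit would mean the field was never filled in. */
    assert(max_alloc_size > 0);

    if (max_alloc_size > (uint64_t)INT_MAX)
        return INT_MAX;
    return (int)max_alloc_size;
}
```

The comparison is done in 64 bits before the cast, so a large limit saturates at INT_MAX instead of wrapping to a negative value.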
* radeonsi: print LLVM IRs to ddebug logs | Marek Olšák | 2016-07-05 | 2 | -0/+2
  Getting LLVM IRs of hanging shaders has never been easier.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: remove unused code - radeon_llvm_util.* | Marek Olšák | 2016-07-05 | 3 | -165/+0
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: keep using v_rcp_f32 for division in future LLVM (v2) | Marek Olšák | 2016-07-05 | 2 | -2/+30
  This will be needed after some LLVM changes that haven't landed yet.
  v2: - use LLVMIsConstant to fix an LLVM assertion failure. LLVMSetMetadata doesn't work with constants.
      - don't set float metadata as string
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeon/uvd: fix overflow error while calculating bit stream buffer size | Indrajit Das | 2016-07-04 | 1 | -1/+1
  Reviewed-by: Christian König <[email protected]>
* radeon/uvd: fix a h265 context size bug | sonjiang | 2016-06-29 | 1 | -0/+3
  Signed-off-by: sonjiang <[email protected]>
  Cc: "12.0" <[email protected]>
  Reviewed-by: Alex Deucher <[email protected]>
  Reviewed-by: Christian König <[email protected]>
* radeon/uvd: seperate uvd context buffer from DPB | sonjiang | 2016-06-29 | 1 | -9/+97
  Signed-off-by: sonjiang <[email protected]>
  Cc: "12.0" <[email protected]>
  Reviewed-by: Alex Deucher <[email protected]>
  Reviewed-by: Christian König <[email protected]>
* radeon uvd add uvd fw version for amdgpu | sonjiang | 2016-06-29 | 1 | -0/+1
  Signed-off-by: sonjiang <[email protected]>
  Cc: "12.0" <[email protected]>
  Reviewed-by: Alex Deucher <[email protected]>
  Reviewed-by: Christian König <[email protected]>
* gallium/radeon: remove zombie textures kept alive by DCC stat gathering | Marek Olšák | 2016-06-29 | 1 | -12/+27
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: don't re-create queries for DCC stat gathering | Marek Olšák | 2016-06-29 | 3 | -5/+10
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: assume X11 DRI3 can use at most 5 back buffers | Marek Olšák | 2016-06-29 | 1 | -1/+5
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: separate DCC starts as disabled (ps_draw_ratio = 0) | Marek Olšák | 2016-06-29 | 1 | -9/+10
  DRI3:
  - Only slow clears can enable it for the first frame.
  - A good PS/draw ratio can enable it for other frames.

  DRI2:
  - Only slow clears can enable it for a frame.
  - Page-flipped color buffers are unref'd at the end of each frame, so it can't be enabled in any other way.
  - Relying on slow clears is sufficient for our synthetic benchmarks.

  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: R600_DEBUG=nodccfb disables separate DCC | Marek Olšák | 2016-06-29 | 3 | -1/+4
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: add and use r600_texture_reference | Marek Olšák | 2016-06-29 | 3 | -8/+11
  Reviewed-by: Alex Deucher <[email protected]>
  Reviewed-by: Vedran Miletić <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: add a HUD query for PS draw ratio stats from separate DCC | Marek Olšák | 2016-06-29 | 4 | -0/+8
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: add a heuristic enabling DCC for scanout surfaces (v2) | Marek Olšák | 2016-06-29 | 3 | -2/+305
  DCC for displayable surfaces is allocated in a separate buffer and is enabled or disabled based on PS invocations from 2 frames ago (to let queries go idle) and the number of slow clears from the current frame.

  At least an equivalent of 5 fullscreen draws or slow clears must be done to enable DCC. (PS invocations / (width * height) + num_slow_clears >= 5)

  Pipeline statistic queries are always active if a color buffer that can have separate DCC is bound, even if separate DCC is disabled. That means the window color buffer is always monitored and DCC is enabled only when the situation is right.

  The tracking of per-texture queries in r600_common_context is quite ugly, but I don't see a better way.

  The first fast clear always enables DCC. DCC decompression can disable it. A later fast clear can enable it again. Enable/disable typically happens only once per frame.

  The impact is expected to be negligible because games usually don't have a high level of overdraw. DCC usually activates when too much blending is happening (smoke rendering) or when testing glClear performance and CMASK isn't supported (Stoney).

  v2: rename stuff, add assertions
  Reviewed-by: Nicolai Hähnle <[email protected]>
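The enable condition quoted in this commit message, PS invocations / (width * height) + num_slow_clears >= 5, can be written out as a small predicate. This is a sketch only: the function name is hypothetical, and the use of floating-point division is an assumption, since the message does not say how the ratio is computed. The caller is expected to pass the PS invocation count from 2 frames ago and the slow-clear count from the current frame.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the equivalent of at least 5 fullscreen draws
 * or slow clears has been reached. */
static bool sketch_should_enable_dcc(uint64_t ps_invocations,
                                     unsigned width, unsigned height,
                                     unsigned num_slow_clears)
{
    double ps_draw_ratio;

    assert(width > 0 && height > 0);

    /* Number of fullscreen-equivalent draws. */
    ps_draw_ratio = (double)ps_invocations / ((double)width * height);

    return ps_draw_ratio + num_slow_clears >= 5.0;
}
```

For a 1920x1080 surface, four fullscreen-equivalent draws plus one slow clear reach the threshold, while two draws plus two slow clears do not.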
* gallium/radeon: add state setup for a separate DCC buffer | Marek Olšák | 2016-06-29 | 2 | -3/+23
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: always calculate DCC info even if it's not used immediately | Marek Olšák | 2016-06-29 | 1 | -1/+2
  for later use
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: add flag R600_QUERY_HW_FLAG_BEGIN_RESUMES | Marek Olšák | 2016-06-29 | 2 | -1/+4
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use conformant line rasterization | Marek Olšák | 2016-06-29 | 2 | -2/+16
  AA lines are not completely correct (see TODO), but everything else should be.
  + 3 linestipple piglits
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeon/vce: use vce structure for vce 52 firmware | Boyuan Zhang | 2016-06-28 | 5 | -98/+517
  Signed-off-by: Boyuan Zhang <[email protected]>
  Reviewed-by: Christian König <[email protected]>
  Reviewed-by: Leo Liu <[email protected]>