summaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* r300: use the new parent/child pools for transfers (v2)Nicolai Hähnle2016-10-055-7/+11
| | | | | | v2: slab_alloc_st -> slab_alloc Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: use the new parent/child pools for transfersNicolai Hähnle2016-10-053-6/+11
| | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97894 Reviewed-by: Marek Olšák <[email protected]>
* gallivm: Use AVX2 gather instrinsics.Jose Fonseca2016-10-041-0/+95
| | | | | | v2: Use AVX2 gather for non aligned loads too. Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Use 8 wide AoS sampling on AVX2.Roland Scheidegger2016-10-041-5/+6
| | | | | | | | | | v2: Make sure that with num_lods > 1 and min_filter != mag_filter we still enter the splitting path. So this case would still use 4-wide aos path (as a side note, the 4-wide aos sampling path could actually be improved quite a bit if we have avx2, by just doing the filtering with 256bit vectors). Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: Basic AVX2 support.José Fonseca2016-10-044-28/+98
| | | | | | v2: pblendb -> pblendvb Reviewed-by: Roland Scheidegger <[email protected]>
* st/omx/dec/h265: add scaling list dataLeo Liu2016-10-041-5/+97
| | | | | | | | | Specified by subclause 7.3.4 v2: get the loop optimized Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Christian König <[email protected]>
* st/omx/dec/h265: fix the skip for before and after listLeo Liu2016-10-041-3/+4
| | | | | | | | | | For reference picture sets, there are cases that rps will not always be used. Once detect the unused flag from encoded bitstream, we should not add this rps to any list, otherwise pass the incorrect reference and skip the correct rps. Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* st/omx/dec/h265: set the default reference picture set for referenceLeo Liu2016-10-041-2/+4
| | | | | | | | | | It will fix the corruption for frame, that only has one stort term ref picture set, we set NULL rps for this case previously, causing taking incorrect reference. Instead we should take that only short term set as reference Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* st/omx/dec/h265: decoder size should follow from spsLeo Liu2016-10-042-7/+8
| | | | | | | | | | | | | The video size from format container is not always compatible with the size from codec bitstream, the HW decoder should take the size information from bitstream, otherwise the corruption appears with clip that has different size info between bitstream and format container So we are passing width(height)_in_samples from sequence parameter set to video decoder. Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* st/omx/dec/h265: increase dpb max size to 32Leo Liu2016-10-041-1/+1
| | | | | | For clip with frame delta poc over 16 Signed-off-by: Leo Liu <[email protected]>
* radeonsi: optionally run the LLVM IR verifier passNicolai Hähnle2016-10-045-9/+38
| | | | | | | | This is enabled automatically if shader printing is enabled, or separately by R600_DEBUG=checkir. Catch mal-formed IR before it crashes in a later pass. Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: fix argument type of llvm.{cttz,ctlz}.i32 intrinsicsNicolai Hähnle2016-10-041-2/+2
| | | | | | Caught by R600_DEBUG=checkir (next commit). Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: unify the creation of basic blocksNicolai Hähnle2016-10-041-10/+24
| | | | | | | This changes the order of basic blocks to be equal to the order of code in the original TGSI, which is nice for making sense of shader dumps. Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: merge branch and loop flow control stacksNicolai Hähnle2016-10-042-82/+78
| | | | Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: simplify if/else/endif blocksNicolai Hähnle2016-10-042-25/+18
| | | | | | | In particular, we no longer emit an else block when there is no ELSE instruction. Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: label basic blocks by the corresponding TGSI pcNicolai Hähnle2016-10-041-0/+17
| | | | Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: cleanup and fix branch emitsNicolai Hähnle2016-10-041-37/+14
| | | | | | | | | | | | | Some of the existing code is needlessly complicated. The basic principle should be: control-flow opcodes emit branches to properly terminate the current block, _unless_ the current block already has a terminator (which happens if and only if there was a BRK or CONT). This also fixes a bug where multiple terminators were created in a block. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97887 Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]>
* winsys/radeon: add buffer_get_reloc_offsetNicolai Hähnle2016-10-044-2/+25
| | | | | | | | | | | Really fix the bug that was supposed to be fixed by commits 3e7cced4b and a48bf02d: even when virtual addresses are used, the legacy relocation-based method with offsets relative to the kernel's buffer object are used for video submissions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97969 Reviewed-by: Christian König <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: don't declare LDS in PS when ds_bpermute is usedMarek Olšák2016-10-043-4/+7
| | | | | | | | I guess this is not needed because dead code elimination removes the declaration. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: use DDX/DDY directly in si_llvm_emit_ddxy_interpMarek Olšák2016-10-041-49/+7
| | | | | | | We can finally do this, because the opcodes are scalar now. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: simplify si_llvm_emit_ddxyMarek Olšák2016-10-041-51/+29
| | | | | | | | si_llvm_emit_ddxy is called once per element, so we don't have to generate code for 4 elements at once. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: don't call build_gep0 in si_llvm_emit_ddxy on VIMarek Olšák2016-10-041-5/+9
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: use a helper function for BuildGEP(0, x)Marek Olšák2016-10-041-47/+35
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: remove obsolete shader definitionsMarek Olšák2016-10-041-12/+4
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: remove unnecessary #includesMarek Olšák2016-10-0411-23/+0
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: clean up lucky #include dependenciesMarek Olšák2016-10-042-36/+35
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: don't re-create shader PM4 states after scratch buffer updateMarek Olšák2016-10-043-15/+25
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* gallium/radeon: move r600_common_context::texture_buffers to r600gMarek Olšák2016-10-046-9/+8
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: don't set sampler buffer offsets in create_sampler_viewMarek Olšák2016-10-043-24/+22
| | | | | | | | | do it at bind time, so that pipe_sampler_view is immutable with regard to buffer reallocations and we don't have to remember all existing buffer views. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: optimize si_invalidate_buffer based on bind_historyMarek Olšák2016-10-041-87/+100
| | | | | | | | | Just enclose each section with: if (rbuffer->bind_history & PIPE_BIND_...) Bioshock Infinite: +1% performance Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: track buffer bind historyMarek Olšák2016-10-044-5/+23
| | | | | | | similar to gl_buffer_object::UsageHistory Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: drop support for NULL sampler viewsMarek Olšák2016-10-042-12/+4
| | | | | | | not used anymore. It was used when the polygon stipple texture was constant. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: separate IA_MULTI_VGT_PARAM and VGT_PRIMITIVE_TYPE emissionMarek Olšák2016-10-041-7/+10
| | | | | | | We want to emit IA_MULTI_VGT_PARAM less often because it's a context reg. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: move VGT_LS_HS_CONFIG to derived tess_stateMarek Olšák2016-10-043-28/+14
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: don't check PIPE_BARRIER_MAPPED_BUFFERMarek Olšák2016-10-041-4/+3
| | | | | | | Caches are always flushed at IB boundary. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: parse SURFACE_SYNC correctly on CIK-VIMarek Olšák2016-10-041-9/+16
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* gallium/radeon: inline r600_context_add_resource_sizeMarek Olšák2016-10-042-21/+13
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: Fix primitive restart when index changesJames Legg2016-10-041-7/+7
| | | | | | | | | | | If primitive restart is enabled for two consecutive draws which use different primitive restart indices, then the first draw's primitive restart index was incorrectly used for the second draw. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98025 Cc: 11.1 11.2 12.0 <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* gallium/winsys: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC)Matt Whitlock2016-10-045-5/+9
| | | | | | | | | | Without this fix, duplicated file descriptors leak into child processes. See commit aaac913e901229d11a1894f6aaf646de6b1a542c for one instance where the same fix was employed. Cc: <[email protected]> Signed-off-by: Matt Whitlock <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* st/xa: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)Matt Whitlock2016-10-041-1/+2
| | | | | | | | | | Without this fix, duplicated file descriptors leak into child processes. See commit aaac913e901229d11a1894f6aaf646de6b1a542c for one instance where the same fix was employed. Cc: <[email protected]> Signed-off-by: Matt Whitlock <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* st/dri: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC)Matt Whitlock2016-10-041-2/+3
| | | | | | | | | | Without this fix, duplicated file descriptors leak into child processes. See commit aaac913e901229d11a1894f6aaf646de6b1a542c for one instance where the same fix was employed. Cc: <[email protected]> Signed-off-by: Matt Whitlock <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/auxiliary: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)Matt Whitlock2016-10-041-1/+2
| | | | | | | | | | Without this fix, duplicated file descriptors leak into child processes. See commit aaac913e901229d11a1894f6aaf646de6b1a542c for one instance where the same fix was employed. Cc: <[email protected]> Signed-off-by: Matt Whitlock <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* vl/dri3: fix warning about incompatible pointer typeNayan Deshmukh2016-10-031-1/+1
| | | | | Signed-off-by: Nayan Deshmukh <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* swr: Removed stalling SwrWaitForIdle from queries.Bruce Cherniak2016-10-034-119/+87
| | | | | | | | Previous fundamental change in stats gathering added a temporary SwrWaitForIdle to begin_query and end_query. Code has been reworked to remove stall. Reviewed-by: George Kyriazis <[email protected]>
* swr: [rasterizer core] refactor thread creationTim Rowley2016-10-033-9/+29
| | | | | | | | | | Create worker pool now computes number of worker threads based on things like topologies, etc. and creates the pool but doesn't actually launch the threads. Instead there is a separate start thread pool function. This allows thread resources to be constructed first before threads start. Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer jitter] canonicalize blend compile stateTim Rowley2016-10-032-0/+39
| | | | | | Canonicalize to prevent unnecessary JIT compiles. Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] archrast fixesTim Rowley2016-10-034-7/+14
| | | | | | | - Immediately sleep threads until thread data is initialized - Fix some compile bugs with AR enabled Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer jitter] fixes for icc in vs2015 compat modeTim Rowley2016-10-0312-1404/+1439
| | | | | | | - Move most jitter functionality into SwrJit namespace - Avoid global "using namespace llvm" in headers Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] generalize compute dispatch mechanismTim Rowley2016-10-033-4/+15
| | | | | | Generalize compute dispatch mechanism to support other types of dispatches. Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer common] os.h portability header changesTim Rowley2016-10-031-0/+6
| | | | | | | - Fix conflict between windows MemoryFence and llvm::sys::MemoryFence - Declare gettid() Signed-off-by: Tim Rowley <[email protected]>