summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* winsys/radeon: add buffer_get_reloc_offsetNicolai Hähnle2016-10-044-2/+25
| | | | | | | | | | | Really fix the bug that was supposed to be fixed by commits 3e7cced4b and a48bf02d: even when virtual addresses are used, the legacy relocation-based method with offsets relative to the kernel's buffer object are used for video submissions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97969 Reviewed-by: Christian König <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: don't declare LDS in PS when ds_bpermute is usedMarek Olšák2016-10-043-4/+7
| | | | | | | | I guess this is not needed because dead code elimination removes the declaration. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: use DDX/DDY directly in si_llvm_emit_ddxy_interpMarek Olšák2016-10-041-49/+7
| | | | | | | We can finally do this, because the opcodes are scalar now. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: simplify si_llvm_emit_ddxyMarek Olšák2016-10-041-51/+29
| | | | | | | | si_llvm_emit_ddxy is called once per element, so we don't have to generate code for 4 elements at once. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: don't call build_gep0 in si_llvm_emit_ddxy on VIMarek Olšák2016-10-041-5/+9
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: use a helper function for BuildGEP(0, x)Marek Olšák2016-10-041-47/+35
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: remove obsolete shader definitionsMarek Olšák2016-10-041-12/+4
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: remove unnecessary #includesMarek Olšák2016-10-0411-23/+0
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: clean up lucky #include dependenciesMarek Olšák2016-10-042-36/+35
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: don't re-create shader PM4 states after scratch buffer updateMarek Olšák2016-10-043-15/+25
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* gallium/radeon: move r600_common_context::texture_buffers to r600gMarek Olšák2016-10-046-9/+8
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: don't set sampler buffer offsets in create_sampler_viewMarek Olšák2016-10-043-24/+22
| | | | | | | | | do it at bind time, so that pipe_sampler_view is immutable with regard to buffer reallocations and we don't have to remember all existing buffer views. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: optimize si_invalidate_buffer based on bind_historyMarek Olšák2016-10-041-87/+100
| | | | | | | | | Just enclose each section with: if (rbuffer->bind_history & PIPE_BIND_...) Bioshock Infinite: +1% performance Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: track buffer bind historyMarek Olšák2016-10-044-5/+23
| | | | | | | similar to gl_buffer_object::UsageHistory Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: drop support for NULL sampler viewsMarek Olšák2016-10-042-12/+4
| | | | | | | not used anymore. It was used when the polygon stipple texture was constant. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: separate IA_MULTI_VGT_PARAM and VGT_PRIMITIVE_TYPE emissionMarek Olšák2016-10-041-7/+10
| | | | | | | We want to emit IA_MULTI_VGT_PARAM less often because it's a context reg. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: move VGT_LS_HS_CONFIG to derived tess_stateMarek Olšák2016-10-043-28/+14
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: don't check PIPE_BARRIER_MAPPED_BUFFERMarek Olšák2016-10-041-4/+3
| | | | | | | Caches are always flushed at IB boundary. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: parse SURFACE_SYNC correctly on CIK-VIMarek Olšák2016-10-041-9/+16
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* gallium/radeon: inline r600_context_add_resource_sizeMarek Olšák2016-10-042-21/+13
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: Fix primitive restart when index changesJames Legg2016-10-041-7/+7
| | | | | | | | | | | If primitive restart is enabled for two consecutive draws which use different primitive restart indices, then the first draw's primitive restart index was incorrectly used for the second draw. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98025 Cc: 11.1 11.2 12.0 <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* spirv: replace assert() with unreachable()Timothy Arceri2016-10-041-1/+1
| | | | | | This fixes an uninitialized warning for is_vertex_input. Reviewed-by: Jason Ekstrand <[email protected]>
* intel: use the correct format specifier for printing uint64_tTimothy Arceri2016-10-042-11/+13
| | | | | | Fixes a bunch of warnings in 32-bit builds. Reviewed-by: Iago Toral Quiroga <[email protected]>
* gallium/winsys: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC)Matt Whitlock2016-10-045-5/+9
| | | | | | | | | | Without this fix, duplicated file descriptors leak into child processes. See commit aaac913e901229d11a1894f6aaf646de6b1a542c for one instance where the same fix was employed. Cc: <[email protected]> Signed-off-by: Matt Whitlock <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* st/xa: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)Matt Whitlock2016-10-041-1/+2
| | | | | | | | | | Without this fix, duplicated file descriptors leak into child processes. See commit aaac913e901229d11a1894f6aaf646de6b1a542c for one instance where the same fix was employed. Cc: <[email protected]> Signed-off-by: Matt Whitlock <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* st/dri: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC)Matt Whitlock2016-10-041-2/+3
| | | | | | | | | | Without this fix, duplicated file descriptors leak into child processes. See commit aaac913e901229d11a1894f6aaf646de6b1a542c for one instance where the same fix was employed. Cc: <[email protected]> Signed-off-by: Matt Whitlock <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/auxiliary: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)Matt Whitlock2016-10-041-1/+2
| | | | | | | | | | Without this fix, duplicated file descriptors leak into child processes. See commit aaac913e901229d11a1894f6aaf646de6b1a542c for one instance where the same fix was employed. Cc: <[email protected]> Signed-off-by: Matt Whitlock <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* egl/android: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)Matt Whitlock2016-10-041-1/+2
| | | | | | | | | | Without this fix, duplicated file descriptors leak into child processes. See commit aaac913e901229d11a1894f6aaf646de6b1a542c for one instance where the same fix was employed. Cc: <[email protected]> Signed-off-by: Matt Whitlock <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* intel: fix compilation warning on gen_get_device_infoTapani Pälli2016-10-042-2/+2
| | | | | | | (warning: 'const' type qualifier on return type has no effect) Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* i965: Only emit 1 viewport when possible.Kenneth Graunke2016-10-0310-30/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | In core profile, we support up to 16 viewports. However, in the majority of cases, only 1 of them is actually used - we only need the others if the last shader stage prior to the rasterizer writes gl_ViewportIndex. Processing all 16 viewports adds additional CPU overhead, which hurts CPU-intensive workloads such as Glamor. This meant that switching to core profile actually penalized Glamor to an extent, which is unfortunate. This patch tracks the number of relevant viewports, switching between 1 and ctx->Const.MaxViewports if gl_ViewportIndex is written. A new BRW_NEW_VIEWPORT_COUNT flag tracks this. This could mean re-emitting viewport state when switching, but hopefully this is offset by doing 1/16th of the work in the common case. The new flag is also lighter weight than BRW_NEW_VUE_MAP_GEOM_OUT, which we were using in one case. According to Eric Anholt, x11perf -copypixwin10 performance improves by 11.5094% +/- 3.10841% (n=10) on his Skylake. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Acked-by: Anuj Phogat <[email protected]>
* spirv: translate cull distance semantic.Dave Airlie2016-10-041-1/+1
| | | | | | | | This just translates to the correct cull distance slot. Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* compiler: add printable values for cull distance varyings.Dave Airlie2016-10-041-0/+2
| | | | | | | | We need these for spir-v/nir shaders. Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* nir/spirv/cfg: Use a nop intrinsic for tagging the ends of blocksJason Ekstrand2016-10-032-4/+6
| | | | | | | | | | | | | | | | | | | | | Previously, we were saving off the last nir_block in a vtn_block before moving on so that we could find the nir_block again when it came time to handle phi sources. Unfortunately, NIR's control flow modification code is inconsistent when it comes to how it splits blocks so the block pointer we saved off may point to a block somewhere else in the shader by the time we get around to handling phi sources. In order to get around this, we insert a nop instruction and use that as the logical end of our block. Since the control flow manipulation code respects instructions, the nop will keeps its place like any other instruction and we can easily find the end of our block when we need it. This fixes a bug triggered by a couple of vkQuake shaders. Signed-off-by: Jason Ekstrand <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97233 Cc: "12.0" <[email protected]> Tested-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add a nop intrinsicJason Ekstrand2016-10-031-0/+3
| | | | | | | | | | | This intrinsic has no destination, no sources, no variables, and can be eliminated. In other words, it does nothing and will always get deleted by dead code elimination. However, it does provide a quick-and-easy way to temporarily tag a particular location in a NIR shader. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* intel/isl: Allow non-2D HiZ surfacesJason Ekstrand2016-10-031-2/+2
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/isl: Add a detailed comment about multisampling with HiZJason Ekstrand2016-10-031-2/+58
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/isl: Remove tiling checks from choose_msaa_layoutJason Ekstrand2016-10-032-14/+7
| | | | | | | | | | | We already do those checks in filter_tiling. There's no good reason to repeat them in choose_msaa_layout. If anything they should have been asserts and not "return false" checks. Also, this check was causing us to outright reject multisampled HiZ surfaces which wasn't intended. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/isl: Handle HiZ and CCS tiling more directlyJason Ekstrand2016-10-032-16/+16
| | | | | | | | | | | | The HiZ and CCS tiling formats are always used for HiZ and CCS surfaces respectively. There's no reason why we should go through filter_tiling and it's much easier to always get HiZ and CCS right if we just handle them directly. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/isl: Allow multisampling with ISL_FORMAT_HiZJason Ekstrand2016-10-032-3/+12
| | | | | | | | | | | | | HiZ buffers can be multisampled and, on Broadwell and earlier, simply using interleaved multisampling with a compression block size of 8x4 samples yields the correct HiZ surface size calculations. Unfortunately, choose_msaa_layout was rejecting multisampled HiZ buffers because of format checks. Now that we have a simple helper for determining if a format supports multisampling, that's an easy enough issue to fix. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/isl: Allow creation of 1-D compressed texturesJason Ekstrand2016-10-032-3/+11
| | | | | | | | | | | Compressed 1-D textures are not well-defined thing in either GL or Vulkan. However, auxiliary surfaces are treated as compressed textures in ISL and we can do HiZ and CCS with 1-D so we need to be able to create them. In order to prevent actually using them (the docs say no), we assert in the state setup code. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/isl: Fix up asserts in calc_phys_level0_extent_saJason Ekstrand2016-10-031-2/+4
| | | | | | | | | | | | | The assertion that a format is uncompressed in the multisample layouts isn't quite right. What we really want to assert is that the format supports multisampling which is a bit more complicated query. We also want to assert that it has a block size of 1x1 since we do nothing with the block size in the phys_level0_sa assignment. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/isl: Add a format_supports_multisampling helperJason Ekstrand2016-10-035-36/+33
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* vl/dri3: fix warning about incompatible pointer typeNayan Deshmukh2016-10-031-1/+1
| | | | | Signed-off-by: Nayan Deshmukh <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* swr: Removed stalling SwrWaitForIdle from queries.Bruce Cherniak2016-10-034-119/+87
| | | | | | | | Previous fundamental change in stats gathering added a temporary SwrWaitForIdle to begin_query and end_query. Code has been reworked to remove stall. Reviewed-by: George Kyriazis <[email protected]>
* swr: [rasterizer core] refactor thread creationTim Rowley2016-10-033-9/+29
| | | | | | | | | | Create worker pool now computes number of worker threads based on things like topologies, etc. and creates the pool but doesn't actually launch the threads. Instead there is a separate start thread pool function. This allows thread resources to be constructed first before threads start. Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer jitter] canonicalize blend compile stateTim Rowley2016-10-032-0/+39
| | | | | | Canonicalize to prevent unnecessary JIT compiles. Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] archrast fixesTim Rowley2016-10-034-7/+14
| | | | | | | - Immediately sleep threads until thread data is initialized - Fix some compile bugs with AR enabled Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer jitter] fixes for icc in vs2015 compat modeTim Rowley2016-10-0312-1404/+1439
| | | | | | | - Move most jitter functionality into SwrJit namespace - Avoid global "using namespace llvm" in headers Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] generalize compute dispatch mechanismTim Rowley2016-10-033-4/+15
| | | | | | Generalize compute dispatch mechanism to support other types of dispatches. Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer common] os.h portability header changesTim Rowley2016-10-031-0/+6
| | | | | | | - Fix conflict between windows MemoryFence and llvm::sys::MemoryFence - Declare gettid() Signed-off-by: Tim Rowley <[email protected]>