summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* radeonsi: optimize si_invalidate_buffer based on bind_historyMarek Olšák2016-10-041-87/+100
| | | | | | | | | Just enclose each section with: if (rbuffer->bind_history & PIPE_BIND_...) Bioshock Infinite: +1% performance Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: track buffer bind historyMarek Olšák2016-10-044-5/+23
| | | | | | | similar to gl_buffer_object::UsageHistory Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: drop support for NULL sampler viewsMarek Olšák2016-10-042-12/+4
| | | | | | | not used anymore. It was used when the polygon stipple texture was constant. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: separate IA_MULTI_VGT_PARAM and VGT_PRIMITIVE_TYPE emissionMarek Olšák2016-10-041-7/+10
| | | | | | | We want to emit IA_MULTI_VGT_PARAM less often because it's a context reg. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: move VGT_LS_HS_CONFIG to derived tess_stateMarek Olšák2016-10-043-28/+14
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: don't check PIPE_BARRIER_MAPPED_BUFFERMarek Olšák2016-10-041-4/+3
| | | | | | | Caches are always flushed at IB boundary. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: parse SURFACE_SYNC correctly on CIK-VIMarek Olšák2016-10-041-9/+16
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* gallium/radeon: inline r600_context_add_resource_sizeMarek Olšák2016-10-042-21/+13
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: Fix primitive restart when index changesJames Legg2016-10-041-7/+7
| | | | | | | | | | | If primitive restart is enabled for two consecutive draws which use different primitive restart indices, then the first draw's primitive restart index was incorrectly used for the second draw. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98025 Cc: 11.1 11.2 12.0 <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* spirv: replace assert() with unreachable()Timothy Arceri2016-10-041-1/+1
| | | | | | This fixes an uninitialized warning for is_vertex_input. Reviewed-by: Jason Ekstrand <[email protected]>
* intel: use the correct format specifier for printing uint64_tTimothy Arceri2016-10-042-11/+13
| | | | | | Fixes a bunch of warnings in 32-bit builds. Reviewed-by: Iago Toral Quiroga <[email protected]>
* gallium/winsys: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC)Matt Whitlock2016-10-045-5/+9
| | | | | | | | | | Without this fix, duplicated file descriptors leak into child processes. See commit aaac913e901229d11a1894f6aaf646de6b1a542c for one instance where the same fix was employed. Cc: <[email protected]> Signed-off-by: Matt Whitlock <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* st/xa: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)Matt Whitlock2016-10-041-1/+2
| | | | | | | | | | Without this fix, duplicated file descriptors leak into child processes. See commit aaac913e901229d11a1894f6aaf646de6b1a542c for one instance where the same fix was employed. Cc: <[email protected]> Signed-off-by: Matt Whitlock <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* st/dri: replace calls to dup(2) with fcntl(F_DUPFD_CLOEXEC)Matt Whitlock2016-10-041-2/+3
| | | | | | | | | | Without this fix, duplicated file descriptors leak into child processes. See commit aaac913e901229d11a1894f6aaf646de6b1a542c for one instance where the same fix was employed. Cc: <[email protected]> Signed-off-by: Matt Whitlock <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/auxiliary: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)Matt Whitlock2016-10-041-1/+2
| | | | | | | | | | Without this fix, duplicated file descriptors leak into child processes. See commit aaac913e901229d11a1894f6aaf646de6b1a542c for one instance where the same fix was employed. Cc: <[email protected]> Signed-off-by: Matt Whitlock <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* egl/android: replace call to dup(2) with fcntl(F_DUPFD_CLOEXEC)Matt Whitlock2016-10-041-1/+2
| | | | | | | | | | Without this fix, duplicated file descriptors leak into child processes. See commit aaac913e901229d11a1894f6aaf646de6b1a542c for one instance where the same fix was employed. Cc: <[email protected]> Signed-off-by: Matt Whitlock <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* intel: fix compilation warning on gen_get_device_infoTapani Pälli2016-10-042-2/+2
| | | | | | | (warning: 'const' type qualifier on return type has no effect) Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* i965: Only emit 1 viewport when possible.Kenneth Graunke2016-10-0310-30/+75
| | | | | | | | | | | | | | | | | | | | | | | | | | In core profile, we support up to 16 viewports. However, in the majority of cases, only 1 of them is actually used - we only need the others if the last shader stage prior to the rasterizer writes gl_ViewportIndex. Processing all 16 viewports adds additional CPU overhead, which hurts CPU-intensive workloads such as Glamor. This meant that switching to core profile actually penalized Glamor to an extent, which is unfortunate. This patch tracks the number of relevant viewports, switching between 1 and ctx->Const.MaxViewports if gl_ViewportIndex is written. A new BRW_NEW_VIEWPORT_COUNT flag tracks this. This could mean re-emitting viewport state when switching, but hopefully this is offset by doing 1/16th of the work in the common case. The new flag is also lighter weight than BRW_NEW_VUE_MAP_GEOM_OUT, which we were using in one case. According to Eric Anholt, x11perf -copypixwin10 performance improves by 11.5094% +/- 3.10841% (n=10) on his Skylake. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Acked-by: Anuj Phogat <[email protected]>
* spirv: translate cull distance semantic.Dave Airlie2016-10-041-1/+1
| | | | | | | | This just translates to the correct cull distance slot. Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* compiler: add printable values for cull distance varyings.Dave Airlie2016-10-041-0/+2
| | | | | | | | We need these for spir-v/nir shaders. Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* nir/spirv/cfg: Use a nop intrinsic for tagging the ends of blocksJason Ekstrand2016-10-032-4/+6
| | | | | | | | | | | | | | | | | | | | | Previously, we were saving off the last nir_block in a vtn_block before moving on so that we could find the nir_block again when it came time to handle phi sources. Unfortunately, NIR's control flow modification code is inconsistent when it comes to how it splits blocks so the block pointer we saved off may point to a block somewhere else in the shader by the time we get around to handling phi sources. In order to get around this, we insert a nop instruction and use that as the logical end of our block. Since the control flow manipulation code respects instructions, the nop will keeps its place like any other instruction and we can easily find the end of our block when we need it. This fixes a bug triggered by a couple of vkQuake shaders. Signed-off-by: Jason Ekstrand <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97233 Cc: "12.0" <[email protected]> Tested-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add a nop intrinsicJason Ekstrand2016-10-031-0/+3
| | | | | | | | | | | This intrinsic has no destination, no sources, no variables, and can be eliminated. In other words, it does nothing and will always get deleted by dead code elimination. However, it does provide a quick-and-easy way to temporarily tag a particular location in a NIR shader. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "12.0" <[email protected]>
* intel/isl: Allow non-2D HiZ surfacesJason Ekstrand2016-10-031-2/+2
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/isl: Add a detailed comment about multisampling with HiZJason Ekstrand2016-10-031-2/+58
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/isl: Remove tiling checks from choose_msaa_layoutJason Ekstrand2016-10-032-14/+7
| | | | | | | | | | | We already do those checks in filter_tiling. There's no good reason to repeat them in choose_msaa_layout. If anything they should have been asserts and not "return false" checks. Also, this check was causing us to outright reject multisampled HiZ surfaces which wasn't intended. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/isl: Handle HiZ and CCS tiling more directlyJason Ekstrand2016-10-032-16/+16
| | | | | | | | | | | | The HiZ and CCS tiling formats are always used for HiZ and CCS surfaces respectively. There's no reason why we should go through filter_tiling and it's much easier to always get HiZ and CCS right if we just handle them directly. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/isl: Allow multisampling with ISL_FORMAT_HiZJason Ekstrand2016-10-032-3/+12
| | | | | | | | | | | | | HiZ buffers can be multisampled and, on Broadwell and earlier, simply using interleaved multisampling with a compression block size of 8x4 samples yields the correct HiZ surface size calculations. Unfortunately, choose_msaa_layout was rejecting multisampled HiZ buffers because of format checks. Now that we have a simple helper for determining if a format supports multisampling, that's an easy enough issue to fix. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/isl: Allow creation of 1-D compressed texturesJason Ekstrand2016-10-032-3/+11
| | | | | | | | | | | Compressed 1-D textures are not well-defined thing in either GL or Vulkan. However, auxiliary surfaces are treated as compressed textures in ISL and we can do HiZ and CCS with 1-D so we need to be able to create them. In order to prevent actually using them (the docs say no), we assert in the state setup code. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/isl: Fix up asserts in calc_phys_level0_extent_saJason Ekstrand2016-10-031-2/+4
| | | | | | | | | | | | | The assertion that a format is uncompressed in the multisample layouts isn't quite right. What we really want to assert is that the format supports multisampling which is a bit more complicated query. We also want to assert that it has a block size of 1x1 since we do nothing with the block size in the phys_level0_sa assignment. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/isl: Add a format_supports_multisampling helperJason Ekstrand2016-10-035-36/+33
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* vl/dri3: fix warning about incompatible pointer typeNayan Deshmukh2016-10-031-1/+1
| | | | | Signed-off-by: Nayan Deshmukh <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* swr: Removed stalling SwrWaitForIdle from queries.Bruce Cherniak2016-10-034-119/+87
| | | | | | | | Previous fundamental change in stats gathering added a temporary SwrWaitForIdle to begin_query and end_query. Code has been reworked to remove stall. Reviewed-by: George Kyriazis <[email protected]>
* swr: [rasterizer core] refactor thread creationTim Rowley2016-10-033-9/+29
| | | | | | | | | | Create worker pool now computes number of worker threads based on things like topologies, etc. and creates the pool but doesn't actually launch the threads. Instead there is a separate start thread pool function. This allows thread resources to be constructed first before threads start. Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer jitter] canonicalize blend compile stateTim Rowley2016-10-032-0/+39
| | | | | | Canonicalize to prevent unnecessary JIT compiles. Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] archrast fixesTim Rowley2016-10-034-7/+14
| | | | | | | - Immediately sleep threads until thread data is initialized - Fix some compile bugs with AR enabled Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer jitter] fixes for icc in vs2015 compat modeTim Rowley2016-10-0312-1404/+1439
| | | | | | | - Move most jitter functionality into SwrJit namespace - Avoid global "using namespace llvm" in headers Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer core] generalize compute dispatch mechanismTim Rowley2016-10-033-4/+15
| | | | | | Generalize compute dispatch mechanism to support other types of dispatches. Signed-off-by: Tim Rowley <[email protected]>
* swr: [rasterizer common] os.h portability header changesTim Rowley2016-10-031-0/+6
| | | | | | | - Fix conflict between windows MemoryFence and llvm::sys::MemoryFence - Declare gettid() Signed-off-by: Tim Rowley <[email protected]>
* anv/formats: Fix build on gcc-4 and earlierVille Syrjälä2016-10-031-4/+19
| | | | | | | | | | | | | | | | | | | | gcc-4 and earlier don't allow compound literals where a constant is required in -std=c99/gnu99 mode, so we can't use ISL_SWIZZLE() when populating the anv_formats[] array. There are a few ways around it: First one would be -std=c89/gnu89, but the rest of the code depends on c99 so it's not really an option. The second option would be to upgrade to gcc-5+ where the compiler behaviour was relaxed a bit [1]. And the third option is just to avoid using compound literals. I chose the last option since it keeps gcc-4 and earlier working. [1] https://gcc.gnu.org/gcc-5/porting_to.html Cc: Jason Ekstrand <[email protected]> Cc: Topi Pohjolainen <[email protected]> Fixes: 7ddb21708c80 ("intel/isl: Add an isl_swizzle structure and use it for isl_view swizzles") Signed-off-by: Ville Syrjälä <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* egl: stop claiming support for pbuffer + msaaTapani Pälli2016-10-031-0/+9
| | | | | | | | | | | | | | | | | | | This fixes a crash in egl-create-msaa-pbuffer-surface Piglit test and same crash in many dEQP EGL tests. I also found that some Qt example did a workaround because of this crash: https://bugreports.qt.io/browse/QTBUG-47509 v2: Ian pointed out that v1 removed support for all multisample configs, including window ones. This one removes pbuffer bit when adding configs, now only pbuffer+msaa gets rejected and window+msaa continues to work. Fixed also comment (Emil) Signed-off-by: Tapani Pälli <[email protected]> Cc: [email protected] Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: rename max_ds_* variable to max_tes_*Timothy Arceri2016-10-037-32/+32
| | | | | | Using consistent naming allows us to create macros more easily. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: rename max_hs_* variables to max_tcs_*Timothy Arceri2016-10-037-32/+32
| | | | | | Using consistent naming allows us to create macros more easily. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Drop pointless stage == MESA_SHADER_FRAGMENT checks.Kenneth Graunke2016-10-021-5/+1
| | | | | | | There's an assert right above this. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* glsl: add missing headers to blob.hTimothy Arceri2016-10-021-0/+2
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir/spirv/cfg: Detect switch_break after loop_break/continueJason Ekstrand2016-10-011-2/+2
| | | | | | | | | | | While the current CFG code is valid in the case where a switch break also happens to be a loop continue, it's a bit suboptimal. Since hardware is capable of handling the continue as a direct jump, it's better to use a continue instruction when we can than to bother with all of the nasty switch break lowering. Signed-off-by: Jason Ekstrand <[email protected]> Cc: "12.0" <[email protected]>
* nir/spirv/cfg: Handle switches whose break block is a loop continueJason Ekstrand2016-10-011-0/+13
| | | | | | | | | | | | | | | | It is possible that the break block of a switch is actually the continue of the loop containing the switch. In this case, we need to identify the break block as a continue and break out of current level of CFG handling. If we don't, the continue portion of the loop will get handled twice, once by following after the break and a second time by the loop handling code handling it explicitly. This fixes 6 of the new Vulkan CTS tests: - dEQP-VK.spirv_assembly.instruction.graphics.opphi.out_of_order* - dEQP-VK.spirv_assembly.instruction.graphics.selection_block_order.out_of_order* Signed-off-by: Jason Ekstrand <[email protected]> Cc: "12.0" <[email protected]>
* nir/spirv: add spirv2nir binary to .gitignoreEric Engestrom2016-10-011-0/+1
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir/spirv: improve mmap() error handlingEric Engestrom2016-10-011-1/+9
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir/spirv: improve lseek() error handlingEric Engestrom2016-10-011-2/+10
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir/spirv: add some error checking to open()Eric Engestrom2016-10-011-0/+9
| | | | | | CovID: 1373369 Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>