Commit message | Author | Age | Files | Lines
* radeonsi: set IF_THRESHOLD to 3 | Marek Olšák | 2016-11-15 | 1 | -1/+2

    Piglit regressions (radeonsi or LLVM bugs, they pass on softpipe):
    - glsl-1.10/execution/variable-indexing/vs-output-array-vec3-index-wr
    - glsl-1.10/execution/variable-indexing/vs-output-array-vec4-index-wr
    - glsl-110/execution/variable-indexing/vs-temp-array-mat2-index-col-row-wr
    - glsl-110/execution/variable-indexing/vs-temp-array-mat2-index-row-wr

    Totals:
    SGPRS: 1132185 -> 1168801 (3.23 %)
    VGPRS: 907856 -> 906204 (-0.18 %)
    Spilled SGPRs: 2011 -> 2425 (20.59 %)
    Spilled VGPRs: 368 -> 96 (-73.91 %)
    Scratch VGPRs: 1344 -> 1060 (-21.13 %) dwords per thread
    Code Size: 35916164 -> 35705372 (-0.59 %) bytes
    LDS: 767 -> 767 (0.00 %) blocks
    Max Waves: 194010 -> 194921 (0.47 %)
    Wait states: 0 -> 0 (0.00 %)

    Before:
    VGPR SPILLING APPS     Shaders  SpillVGPR  ScratchVGPR
    alien_isolation           2938         38           40
    bioshock-infinite         1769        245          732
    dirt-showdown              548         85           72
    f1-2015                    776          0          320
    ue4_lightroom_inter..       74          0          180

    After:
    VGPR SPILLING APPS     Shaders  SpillVGPR  ScratchVGPR
    alien_isolation           2938         38           40
    bioshock-infinite         1769          0          480
    dirt-showdown              548         58           40
    f1-2015                    776          0          320
    ue4_lightroom_inter..       74          0          180

    Bioshock and DiRT benefit.

    If I set IF_THRESHOLD=4, tesseract starts spilling VGPRs.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl_to_tgsi: lower small branches based on the CAP | Marek Olšák | 2016-11-15 | 1 | -1/+4

    Reviewed-by: Nicolai Hähnle <[email protected]>

* gallium: add PIPE_SHADER_CAP_LOWER_IF_THRESHOLD | Marek Olšák | 2016-11-15 | 14 | -0/+21

    Reviewed-by: Nicolai Hähnle <[email protected]>

* glsl/lower_if: conditionally lower if-branches based on their size | Marek Olšák | 2016-11-15 | 2 | -7/+50

    Reviewed-by: Nicolai Hähnle <[email protected]>

* glsl/lower_if: don't lower branches touching tess control outputs | Marek Olšák | 2016-11-15 | 6 | -8/+28

    Reviewed-by: Nicolai Hähnle <[email protected]>

* glsl/lower_if: check more node types in check_control_flow -> check_ir_node | Marek Olšák | 2016-11-15 | 1 | -3/+6

    Reviewed-by: Nicolai Hähnle <[email protected]>

* glsl/lower_if: move and rename found_control_flow | Marek Olšák | 2016-11-15 | 1 | -7/+10

    I'll want to update more variables in check_control_flow,
    so using the visitor is convenient.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* util/disk_cache: use unambiguous naming | Marek Olšák | 2016-11-15 | 3 | -112/+114

    Reviewed-by: Emil Velikov <[email protected]>

* util: import cache.c/h from glsl | Marek Olšák | 2016-11-15 | 6 | -42/+17

    It's not dependent on GLSL and it can be useful for shader caches
    that don't deal with GLSL.

    v2: address review comments
    v3: keep the other 3 lines in configure.ac

    Reviewed-by: Emil Velikov <[email protected]>

* gallivm: limit use of setFastMathFlags to LLVM 3.8 and later | Marek Olšák | 2016-11-15 | 1 | -0/+2

    Reviewed-by: Brian Paul <[email protected]>
* intel: Set min_ds_entries on Broxton. | Kenneth Graunke | 2016-11-15 | 1 | -0/+2

    This was missing.

    Cc: [email protected]
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ben Widawsky <[email protected]>

* dri: make use of loader_get_extensions_name(..) helper | Christian Gmeiner | 2016-11-15 | 3 | -7/+7

    Changes since v1:
    - removed not needed includes
    - use the loader version of the helper

    v2 [Emil Velikov]
    - Keep the includes - they are required.

    Signed-off-by: Christian Gmeiner <[email protected]>
    Signed-off-by: Emil Velikov <[email protected]>

* Revert "dri: make use of dri_get_extensions_name(..) helper" | Emil Velikov | 2016-11-15 | 3 | -8/+7

    This reverts commit 1a21d21580965eff751414d140b3c176eeee2eb3.

    Pushed the wrong version of the patch.
* radeonsi: set unsafe fpmath on FP instructions when allowed by R600_DEBUG | Marek Olšák | 2016-11-15 | 1 | -1/+5

    Reviewed-by: Nicolai Hähnle <[email protected]>

* gallivm: add lp_create_builder with an unsafe_fpmath option | Marek Olšák | 2016-11-15 | 2 | -0/+17

    Reviewed-by: Nicolai Hähnle <[email protected]>

* radeonsi: fold some shader context initialization to si_llvm_context_init | Marek Olšák | 2016-11-15 | 3 | -29/+30

    Reviewed-by: Nicolai Hähnle <[email protected]>
* loader: fixup driver names if needed | Christian Gmeiner | 2016-11-15 | 1 | -0/+6

    This makes it possible to 'use' the imx-drm driver. Remember that
    it is not possible to have symbol names in C/C++ with a '-' in them.

    Changes since v1:
    - move the fix to loader.c

    Signed-off-by: Christian Gmeiner <[email protected]>
    Reviewed-by: Eric Engestrom <[email protected]> (v1)
    Reviewed-by: Emil Velikov <[email protected]>
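The constraint mentioned above (no '-' in a C/C++ symbol name) can be illustrated with a small sketch. This is not the code from the commit: `fixup_driver_name` is a hypothetical helper name, and replacing '-' with '_' is an assumed strategy chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical helper (not taken from loader.c): rewrite a driver name
 * in place so that it can be embedded in a C identifier. Every '-' is
 * replaced with '_'; all other characters are left untouched. */
static void fixup_driver_name(char *name)
{
    for (size_t i = 0; name[i] != '\0'; i++) {
        if (name[i] == '-')
            name[i] = '_';
    }
}
```

With this sketch, a writable buffer holding "imx-drm" becomes "imx_drm", which is a valid fragment of a symbol name.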
* dri: make use of dri_get_extensions_name(..) helper | Christian Gmeiner | 2016-11-15 | 3 | -7/+8

    Signed-off-by: Christian Gmeiner <[email protected]>
    Reviewed-by: Emil Velikov <[email protected]>

* loader: add loader_get_extensions_name(..) helper | Christian Gmeiner | 2016-11-15 | 2 | -0/+21

    Changes since v1:
    - renamed function to loader_get_extensions_name
    - moved function into loader

    Signed-off-by: Christian Gmeiner <[email protected]>

    V2: [Emil Velikov]
    - Use local define.

    Signed-off-by: Emil Velikov <[email protected]>

* egl: Use pkg-config for Android NDK build | Gurchetan Singh | 2016-11-15 | 1 | -0/+2

    It's possible to build Mesa for Android using the traditional
    autotools workflow [1]. ChromiumOS fetches Android prebuilts and
    puts them in a sysroot. We now want to use pkg-config to specify
    the location of system headers and libraries [2]. To enable this,
    let's add the required pkg-config checks and link against them.

    [1] https://developer.android.com/ndk/guides/standalone_toolchain.html
    [2] https://chromium-review.googlesource.com/#/c/403237/

    v2: Bundle pkg-config checks together (Emil)
    v3: Provide further context on standalone NDK Mesa build (Emil)

    Reviewed-by: Emil Velikov <[email protected]>
* meta/GetTexSubImage: Account for GL_PACK_SKIP_IMAGES on compressed textures | Eduardo Lima Mitev | 2016-11-15 | 1 | -3/+17

    This option was being ignored when packing compressed
    3D and cube textures.

    Fixes CTS test (on gen8+):
    * GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels_pixelstore

    v2: Drop API checks.
    v3 (Ken): Just apply the existing code in more cases.

    Reviewed-by: Kenneth Graunke <[email protected]>

* anv/format: handle unsupported formats earlier | Iago Toral Quiroga | 2016-11-15 | 1 | -3/+3

    Reviewed-by: Jason Ekstrand <[email protected]>
* main: return error if asking for GL_TEXTURE_BORDER_COLOR in TEXTURE_2D_MULTISAMPLE{_ARRAY} through TexParameter{i,Ii,Iui}v() | Samuel Iglesias Gonsálvez | 2016-11-15 | 1 | -0/+12

    OpenGL ES 3.2 says in section 8.10. "TEXTURE PARAMETERS", at the
    end of the section:

    "An INVALID_ENUM error is generated if target is
    TEXTURE_2D_MULTISAMPLE or TEXTURE_2D_MULTISAMPLE_ARRAY, and pname
    is any sampler state from table 21.12."

    GL_TEXTURE_BORDER_COLOR is present in that table.

    v2:
    - Add check to _mesa_texture_parameteriv() (Kenneth)

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98250
    Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
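The rule quoted from the spec can be sketched as a small validation function. This is only an illustration, not the Mesa change itself: `check_border_color_target` is a hypothetical name, the enum values are defined locally so the block compiles without GL headers, and only the GL_TEXTURE_BORDER_COLOR case from the commit is covered, not the rest of the sampler state in table 21.12.

```c
#include <stdbool.h>

/* Local copies of the standard GL enum values, so that this sketch
 * builds without any GL headers. */
#define GL_NO_ERROR                      0
#define GL_INVALID_ENUM                  0x0500
#define GL_TEXTURE_2D                    0x0DE1
#define GL_TEXTURE_BORDER_COLOR          0x1004
#define GL_TEXTURE_BASE_LEVEL            0x813C
#define GL_TEXTURE_2D_MULTISAMPLE        0x9100
#define GL_TEXTURE_2D_MULTISAMPLE_ARRAY  0x9102

/* Hypothetical check: returns the GL error that a TexParameter*v()
 * call should raise for this target/pname pair. Only the border color
 * case is handled here. */
static unsigned check_border_color_target(unsigned target, unsigned pname)
{
    bool is_multisample = target == GL_TEXTURE_2D_MULTISAMPLE ||
                          target == GL_TEXTURE_2D_MULTISAMPLE_ARRAY;

    if (is_multisample && pname == GL_TEXTURE_BORDER_COLOR)
        return GL_INVALID_ENUM;

    return GL_NO_ERROR;
}
```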
* anv: fix multi level clears with VK_REMAINING_MIP_LEVELS | Lionel Landwerlin | 2016-11-14 | 1 | -2/+2

    A commit from the CTS suite on the 1.0-dev branch started using
    VK_REMAINING_MIP_LEVELS, we're not dealing with it properly for
    clears.

    Fixes: dEQP-VK.api.image_clearing.clear_color_image.*

    Signed-off-by: Lionel Landwerlin <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
    Cc: "13.0" <[email protected]>
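As a rough sketch of what handling VK_REMAINING_MIP_LEVELS involves: the special value means "all levels from the base level to the end of the image", so it has to be resolved to a concrete count before looping over levels. The helper name `resolve_level_count` and the local `REMAINING_MIP_LEVELS` macro are assumptions for illustration; this is not the anv code.

```c
#include <stdint.h>

/* Local stand-in for VK_REMAINING_MIP_LEVELS, which Vulkan defines
 * as (~0U). */
#define REMAINING_MIP_LEVELS (~0u)

/* Hypothetical helper: turn the levelCount of a subresource range into
 * a concrete number of mip levels for an image with image_levels levels. */
static uint32_t resolve_level_count(uint32_t image_levels,
                                    uint32_t base_level,
                                    uint32_t level_count)
{
    if (level_count == REMAINING_MIP_LEVELS)
        return image_levels - base_level;

    return level_count;
}
```

A clear loop would then iterate from `base_level` for `resolve_level_count(...)` levels instead of using the raw `levelCount` field.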
* anv/format: support VK_FORMAT_R8G8B8_SRGB | Iago Toral Quiroga | 2016-11-14 | 1 | -1/+1

    Fixes dEQP-VK.api.image_clearing.clear_color_image.1d_r8g8b8_srgb

    Reviewed-by: Lionel Landwerlin <[email protected]>

* anv/format: handle unsupported formats properly | Iago Toral Quiroga | 2016-11-14 | 1 | -0/+3

    According to the spec for vkGetPhysicalDeviceImageFormatProperties:

    "If format is not a supported image format, or if the combination of
    format, type, tiling, usage, and flags is not supported for images,
    then vkGetPhysicalDeviceImageFormatProperties returns
    VK_ERROR_FORMAT_NOT_SUPPORTED."

    Makes the following Vulkan CTS tests report 'Not Supported' instead
    of crashing:

    dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_unorm
    dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_snorm
    dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_uscaled
    dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_sscaled
    dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_uint
    dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_sint
    dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8_srgb
    dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_unorm
    dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_snorm
    dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_uscaled
    dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_sscaled
    dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_uint
    dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_sint
    dEQP-VK.api.image_clearing.clear_color_image.1d_b8g8r8a8_srgb
    dEQP-VK.api.image_clearing.clear_color_image.1d_r4g4_unorm_pack8
    dEQP-VK.api.image_clearing.clear_color_image.1d_r8_srgb
    dEQP-VK.api.image_clearing.clear_color_image.1d_r8g8_srgb
    dEQP-VK.api.image_clearing.clear_color_image.1d_r8g8b8_srgb
    dEQP-VK.api.image_clearing.clear_color_image.1d_b5g5r5a1_unorm_pack16

    Reviewed-by: Lionel Landwerlin <[email protected]>
* clover: adapt to new error API since LLVM r286752 | Vedran Miletić | 2016-11-14 | 1 | -2/+8

    Tested-by: Dieter Nützel <[email protected]>

* swr: [rasterizer core] remove driverType | Tim Rowley | 2016-11-14 | 5 | -49/+2

    Reviewed-by: Bruce Cherniak <[email protected]>

* swr: [rasterizer archrast] move to pass by value | Tim Rowley | 2016-11-14 | 2 | -2/+2

    Move to pass by value since most events are very small in size. We can
    look at pass by reference but will need to create multiple versions to
    handle temp objects.

    Reviewed-by: Bruce Cherniak <[email protected]>

* swr: [rasterizer core] add mode for aux buffer in the SWR_SURFACE_STATE | Tim Rowley | 2016-11-14 | 1 | -0/+16

    Reviewed-by: Bruce Cherniak <[email protected]>

* swr: [rasterizer common] don't bleed NOMINMAX definition after <windows.h> | Tim Rowley | 2016-11-14 | 1 | -1/+4

    Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer archrast] add events | Tim Rowley | 2016-11-14 | 6 | -6/+541

    Added events for tracking early/late Depth and stencil events, TE patch
    info, GS prim info, and FrontEnd/BackEnd DrawEnd events.

    Reviewed-by: Bruce Cherniak <[email protected]>

* swr: [rasterizer core] fix culling issues | Tim Rowley | 2016-11-14 | 1 | -66/+119

    - Do proper culling of wireframe triangles (including non-culling of
      degenerates)
    - Fix degenerate culling of CCW front-facing triangles in wireframe and
      conservative rast

    Reviewed-by: Bruce Cherniak <[email protected]>

* swr: [rasterizer core/jitter] fix alpha test bug | Tim Rowley | 2016-11-14 | 3 | -3/+15

    Alpha from render target 0 should always be used for alpha test for all
    render targets, according to GL and DX9 specs. Previously we were using
    alpha from the current render target.

    Reviewed-by: Bruce Cherniak <[email protected]>
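The corrected behaviour can be sketched as follows. This is an illustration only, not the swr jitter code: the `rgba` struct, the function name, and the choice of a "greater than reference" compare function are assumptions.

```c
#include <stdbool.h>

struct rgba {
    float r, g, b, a;
};

/* Hypothetical sketch: decide which render targets a fragment may
 * write to after the alpha test. The decision uses the alpha written
 * to render target 0 for every target, so the result is either all
 * targets or none. Assumes a "greater than ref" compare function and
 * at most 8 render targets. Returns a bitmask with one bit per target. */
static unsigned alpha_test_write_mask(const struct rgba *colors,
                                      unsigned num_targets,
                                      float ref)
{
    bool pass = colors[0].a > ref;

    return pass ? ((1u << num_targets) - 1u) : 0u;
}
```

The buggy variant described in the commit would instead compare `colors[i].a` for each target `i`, which lets targets disagree about whether the fragment survives.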
* swr: [rasterizer core] various code style changes | Tim Rowley | 2016-11-14 | 6 | -5/+26

    Reviewed-by: Bruce Cherniak <[email protected]>

* swr: [rasterizer archrast] don't generate empty files | Tim Rowley | 2016-11-14 | 4 | -8/+39

    Don't generate files when no events have been generated outside the
    header events.

    Reviewed-by: Bruce Cherniak <[email protected]>

* swr: [rasterizer archrast] fix open file handle limit issue | Tim Rowley | 2016-11-14 | 1 | -6/+44

    Buffer events ourselves and then, when the buffer is full or we're
    destroying the context, write the contents to file. Previously, we
    were relying on ofstream to buffer for us.

    Reviewed-by: Bruce Cherniak <[email protected]>
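The buffering scheme described above might look roughly like the sketch below. It is written in C for illustration and is not the archrast implementation: the struct, the function names, the 4096-byte capacity, and the choice to open the file only for the duration of a flush are all assumptions.

```c
#include <stdio.h>
#include <string.h>

#define EVENT_BUF_SIZE 4096  /* assumed capacity, for illustration */

/* Hypothetical event writer: events accumulate in memory, and the
 * output file is only opened while a flush is in progress, so no file
 * handle is held between flushes. */
struct event_writer {
    const char *path;
    unsigned char buf[EVENT_BUF_SIZE];
    size_t used;
};

/* Append the buffered bytes to the file. Call this when the buffer is
 * full and once more when the owning context is destroyed.
 * Returns 0 on success, -1 on I/O failure. */
static int event_writer_flush(struct event_writer *w)
{
    if (w->used == 0)
        return 0;

    FILE *f = fopen(w->path, "ab");
    if (!f)
        return -1;

    size_t written = fwrite(w->buf, 1, w->used, f);
    fclose(f);
    if (written != w->used)
        return -1;

    w->used = 0;
    return 0;
}

/* Queue one event, flushing first if it would not fit.
 * Returns 0 on success, -1 if the event is too large or a flush fails. */
static int event_writer_add(struct event_writer *w, const void *ev, size_t size)
{
    if (size > EVENT_BUF_SIZE)
        return -1;

    if (w->used + size > EVENT_BUF_SIZE && event_writer_flush(w) != 0)
        return -1;

    memcpy(w->buf + w->used, ev, size);
    w->used += size;
    return 0;
}
```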
* swr: [rasterizer archrast] fix double free issue | Tim Rowley | 2016-11-14 | 9 | -24/+41

    Reviewed-by: Bruce Cherniak <[email protected]>

* swr: [rasterizer core] separate frontend/backend stats enables | Tim Rowley | 2016-11-14 | 6 | -26/+51

    Reviewed-by: Bruce Cherniak <[email protected]>

* swr: [rasterizer core] 16-wide tile store nearly completed | Tim Rowley | 2016-11-14 | 5 | -314/+917

    * All format combinations coded
    * Fully emulated on AVX2 and AVX
    * Known issue: the MSAA sample locations need to be adjusted for 8x2

    Set ENABLE_AVX512_SIMD16 and USE_8x2_TILE_BACKEND to 1 in knobs.h to enable

    Reviewed-by: Bruce Cherniak <[email protected]>
* i965/vec4: skip registers already marked as no_spill | Juan A. Suarez Romero | 2016-11-14 | 1 | -2/+2

    Do not evaluate spill costs for registers that were already marked as
    no_spill.

    Reviewed-by: Kenneth Graunke <[email protected]>
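A minimal sketch of the idea, assuming a simplified model in which each register use carries a cost weight and may forbid spilling. The `reg_use` struct and the function below are hypothetical and far simpler than the i965 vec4 register allocator.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified record of one register use. */
struct reg_use {
    unsigned reg;        /* virtual register number */
    float weight;        /* cost contribution of this use */
    bool forbids_spill;  /* this use makes the register unspillable */
};

/* Accumulate spill costs per register. Registers already flagged in
 * no_spill are skipped: their cost can never matter, so there is no
 * point in evaluating it. */
static void evaluate_spill_costs(const struct reg_use *uses, size_t num_uses,
                                 float *spill_costs, bool *no_spill)
{
    for (size_t i = 0; i < num_uses; i++) {
        unsigned r = uses[i].reg;

        if (no_spill[r])
            continue;

        if (uses[i].forbids_spill) {
            no_spill[r] = true;
            continue;
        }

        spill_costs[r] += uses[i].weight;
    }
}
```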
* glsl: Don't crash on function names with invalid identifiers. | Kenneth Graunke | 2016-11-12 | 1 | -2/+4

    Karol Herbst's fuzzing efforts noticed that we would segfault on:

        void bug() {
           2(0);
        }

    We just need to bail if the function name isn't an identifier.

    Based on a bug fix by Karol Herbst.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97422
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>

* glsl: Fix assert fails when assignment expressions are in array sizes. | Kenneth Graunke | 2016-11-12 | 1 | -8/+11

    Karol Herbst's fuzzing efforts discovered that we would hit the
    following assert:

        assert(dummy_instructions.is_empty());

    when processing an illegal array size expression of

        float[(1=1)?1:1] t;

    In do_assignment, we realized we needed an rvalue for (1 = 1),
    and generated a temporary variable and assignment from the RHS.
    We've already flagged an error (non-lvalue in assignment), and
    return a bogus value as the rvalue. But process_array_size sees
    the bogus value, which happened to be a constant expression, and
    rightly assumes that processing a constant expression shouldn't
    have generated any code.

    To handle this, make do_assignment not generate any temps or
    assignments when it's already raised an error - just return an
    error value directly.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98694
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Timothy Arceri <[email protected]>
* vc4: Add simulator kernel validation for multithreaded fragment shaders. | Jonas Pfeil | 2016-11-12 | 3 | -5/+76

    This is Jonas Pfeil's code from the kernel, brought back to Mesa by
    anholt.

* vc4: Mark threaded FSes as non-singlethread in the CL. | Eric Anholt | 2016-11-12 | 3 | -1/+6

* vc4: Flag the last thread switch in the program as the last. | Eric Anholt | 2016-11-12 | 3 | -0/+34

    We don't allow the last thread switch to be inside control flow, to be
    sure that we hit the last state exactly once. If the last texturing was
    in control flow, fall back to single threaded.

* vc4: Add THRSW nodes after each tex sample setup in multithreaded mode. | Eric Anholt | 2016-11-12 | 2 | -0/+49

    This is a suboptimal implementation, but Jonas Pfeil found that it was
    still a massive performance gain.

* vc4: Add some spec citations about texture fifo management. | Eric Anholt | 2016-11-12 | 1 | -5/+37

* vc4: Use ra14/rb14 as the spilling registers. | Eric Anholt | 2016-11-12 | 2 | -8/+8

    This makes the raddr fixups compatible with FS threading.

* vc4: Add support for register allocation for threaded shaders. | Eric Anholt | 2016-11-12 | 3 | -20/+85

    We have two major requirements: Make sure that only the bottom half of
    the physical reg space is used, and make sure that none of our values
    are live in an accumulator across a switch.
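The two requirements can be expressed as simple predicates. This is a sketch under stated assumptions, not the vc4 allocator: the register file size of 32, the `live_range` struct, and the convention that a value is live across a switch when the switch lies strictly between its definition and its last use are all chosen for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

#define PHYS_REGS_PER_FILE 32  /* assumed register file size */

/* Requirement 1: a threaded shader may only use the bottom half of
 * the physical register space. */
static unsigned usable_regs_per_file(bool threaded)
{
    return threaded ? PHYS_REGS_PER_FILE / 2 : PHYS_REGS_PER_FILE;
}

/* Hypothetical live range, in instruction indices. */
struct live_range {
    unsigned start;  /* instruction that defines the value */
    unsigned end;    /* last instruction that reads the value */
};

/* Requirement 2: a value must not sit in an accumulator while a
 * thread switch happens. Returns true if the value may be assigned to
 * an accumulator, i.e. no switch falls inside its live range. */
static bool may_use_accumulator(const struct live_range *lr,
                                const unsigned *switch_ips,
                                size_t num_switches)
{
    for (size_t i = 0; i < num_switches; i++) {
        if (lr->start < switch_ips[i] && switch_ips[i] < lr->end)
            return false;
    }
    return true;
}
```

A register allocator built on these predicates would restrict the candidate physical registers with the first function and exclude the accumulator class for any value that fails the second.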