Commit message  Author  Age  Files  Lines
* util/ralloc: Make sizeof(linear_header) a multiple of 8Matt Turner2018-11-121-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prior to this patch sizeof(linear_header) was 20 bytes in a non-debug build on 32-bit platforms. We do some pointer arithmetic to calculate the next available location with ptr = (linear_size_chunk *)((char *)&latest[1] + latest->offset); in linear_alloc_child(). The &latest[1] adds 20 bytes, so an allocation would only be 4-byte aligned. On 32-bit SPARC a 'sttw' instruction (which stores a consecutive pair of 4-byte registers to memory) requires an 8-byte aligned address. Such an instruction is used to store to an 8-byte integer type, like intmax_t which is used in glcpp's expression_value_t struct. As a result of the 4-byte alignment returned by linear_alloc_child() we would generate a SIGBUS (unaligned exception) on SPARC. According to the GNU libc manual malloc() always returns memory that has at least an alignment of 8-bytes [1]. I think our allocator should do the same. So, simple fix with two parts: (1) Increase SUBALLOC_ALIGNMENT to 8 unconditionally. (2) Mark linear_header with an aligned attribute, which will cause its sizeof to be rounded up to that alignment. (We already do this for ralloc_header) With this done, all Mesa's unit tests now pass on SPARC. [1] https://www.gnu.org/software/libc/manual/html_node/Aligned-Memory-Blocks.html Fixes: 47e17586924f ("glcpp: use the linear allocator for most objects") Bug: https://bugs.gentoo.org/636326 Reviewed-by: Eric Anholt <[email protected]>
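The effect of the aligned attribute on sizeof can be sketched in isolation. The struct and function names below are hypothetical stand-ins, not the real ralloc definitions, and the attribute syntax assumes GCC or Clang. The 20-byte header is modelled as five 4-byte fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 20-byte header: five 4-byte fields,
 * so the natural alignment is only 4. */
struct plain_header {
   uint32_t fields[5];
};

/* Same fields, but the aligned attribute raises the alignment to 8,
 * which makes the compiler round sizeof up to a multiple of 8. */
struct aligned_header {
   uint32_t fields[5];
} __attribute__((aligned(8)));

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t align_up(size_t n, size_t align)
{
   return (n + align - 1) & ~(align - 1);
}
```

With the plain struct, `&hdr[1]` lands 20 bytes past an 8-byte-aligned header and is therefore only 4-byte aligned; with the attribute it lands 24 bytes past, which keeps 8-byte alignment.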
* util/ralloc: Switch from DEBUG to NDEBUGMatt Turner2018-11-121-14/+4
| | | | | | | The debug code is all asserts, so protect it with the same thing that controls assert. Reviewed-by: Eric Anholt <[email protected]>
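A minimal sketch of the idea, assuming a debug-only field that is checked solely through assert(). The struct, field, and constant names are hypothetical and do not reproduce the ralloc code. Because assert() compiles to nothing when NDEBUG is defined, guarding the field with the same macro keeps the two in step.

```c
#include <assert.h>
#include <stddef.h>

#ifndef NDEBUG
#define CANARY_VALUE 0xC0FFEEu /* hypothetical marker value */
#endif

struct header {
#ifndef NDEBUG
   unsigned canary; /* only exists when assert() is active */
#endif
   size_t size;
};

static void header_init(struct header *h, size_t size)
{
#ifndef NDEBUG
   h->canary = CANARY_VALUE;
#endif
   h->size = size;
}

static size_t header_size(const struct header *h)
{
   /* With NDEBUG defined the assert argument is never compiled,
    * so the missing field is not an error. */
   assert(h->canary == CANARY_VALUE);
   return h->size;
}
```

Had the field been guarded by a separate DEBUG macro, a build with asserts enabled but DEBUG undefined would fail to compile the assert.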
* nir: add support for removing redundant stores to copy prop varTimothy Arceri2018-11-131-10/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For example the following type of thing is seen in TCS from a number of Vulkan and DXVK games: vec1 32 ssa_557 = deref_var &oPatch (shader_out float) vec1 32 ssa_558 = intrinsic load_deref (ssa_557) () vec1 32 ssa_559 = deref_var &oPatch@42 (shader_out float) vec1 32 ssa_560 = intrinsic load_deref (ssa_559) () vec1 32 ssa_561 = deref_var &oPatch@43 (shader_out float) vec1 32 ssa_562 = intrinsic load_deref (ssa_561) () intrinsic store_deref (ssa_557, ssa_558) (1) /* wrmask=x */ intrinsic store_deref (ssa_559, ssa_560) (1) /* wrmask=x */ intrinsic store_deref (ssa_561, ssa_562) (1) /* wrmask=x */ No shader-db changes on i965 (SKL). vkpipeline-db results RADV (VEGA): Totals from affected shaders: SGPRS: 7832 -> 7728 (-1.33 %) VGPRS: 6476 -> 6740 (4.08 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 469572 -> 456596 (-2.76 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 989 -> 960 (-2.93 %) Wait states: 0 -> 0 (0.00 %) The Max Waves and VGPRS changes here are misleading. What is happening is a bunch of TCS outputs are being optimised away as they are now recognised as unused. This results in more varyings being compacted via nir_compact_varyings() which can result in more register pressure when they are not packed in an optimal way. This is an existing problem independent of this patch. I've run some benchmarks and haven't noticed any performance regressions in affected games. Reviewed-by: Jason Ekstrand <[email protected]>
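The pattern above (load a variable, then store the loaded value straight back) can be sketched with a toy model. This is not the NIR pass itself: the instruction struct, the fixed variable table, and the assumption that distinct variables never alias are all simplifications made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum op { OP_LOAD, OP_STORE };

/* Toy instruction: a load defines value id `ssa` from `var`,
 * a store writes value id `ssa` to `var`. */
struct instr {
   enum op op;
   int var;
   int ssa;
   bool removed;
};

#define MAX_VARS 8

/* Marks stores that write back the value the variable is already known
 * to hold. Returns the number of stores marked as removed. */
static int remove_redundant_stores(struct instr *code, size_t count)
{
   int known[MAX_VARS]; /* value id currently held by each variable, or -1 */
   int removed = 0;

   for (int i = 0; i < MAX_VARS; i++)
      known[i] = -1;

   for (size_t i = 0; i < count; i++) {
      struct instr *in = &code[i];
      if (in->var < 0 || in->var >= MAX_VARS)
         continue;

      if (in->op == OP_LOAD) {
         /* Remember the first value id known to equal the variable. */
         if (known[in->var] < 0)
            known[in->var] = in->ssa;
      } else if (known[in->var] == in->ssa) {
         /* Storing the value the variable already holds: redundant. */
         in->removed = true;
         removed++;
      } else {
         known[in->var] = in->ssa;
      }
   }
   return removed;
}
```

Fed the six instructions from the example (three loads followed by three stores of the loaded values), the sketch marks all three stores as removable.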
* anv/i965: make use of nir_link_constant_varyings()Timothy Arceri2018-11-131-0/+3
| | | | | | | | | | | | | | | | | | | shader-db results for SKL: total instructions in shared programs: 13106498 -> 13091573 (-0.11%) instructions in affected programs: 1186244 -> 1171319 (-1.26%) helped: 6186 HURT: 0 total cycles in shared programs: 332062633 -> 331961653 (-0.03%) cycles in affected programs: 8537165 -> 8436185 (-1.18%) helped: 5371 HURT: 862 LOST: 6 GAINED: 14 Reviewed-by: Jason Ekstrand <[email protected]>
* egl: Improve the debugging of gbm format matching in DRI configs.Eric Anholt2018-11-121-2/+3
| | | | | | | | | | | | | | | | | Previously the debug would be: libEGL debug: No DRI config supports native format 0x20203852 libEGL debug: No DRI config supports native format 0x38385247 but libEGL debug: No DRI config supports native format R8 libEGL debug: No DRI config supports native format GR88 is a lot easier to understand. Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Daniel Stone <[email protected]>
* gbm: Introduce a helper function for printing GBM format names.Eric Anholt2018-11-122-0/+26
| | | | | | | | | | This requires that the caller make a little (stack) allocation to store the string. v2: Use gbm_format_canonicalize (suggested by Daniel) Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Daniel Stone <[email protected]>
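How a format code turns into a readable name can be sketched as follows, assuming the code is a little-endian FourCC (four ASCII characters packed into 32 bits). The function and struct names are hypothetical, not the helper this commit adds, and the sketch skips the canonicalization step. The caller supplies the small buffer, matching the stack allocation mentioned above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Caller-provided storage: four characters plus the terminating NUL. */
struct format_name_buf {
   char name[5];
};

/* Unpack the four ASCII bytes of a FourCC code, lowest byte first. */
static const char *format_to_name(uint32_t format, struct format_name_buf *buf)
{
   buf->name[0] = (char)(format & 0xff);
   buf->name[1] = (char)((format >> 8) & 0xff);
   buf->name[2] = (char)((format >> 16) & 0xff);
   buf->name[3] = (char)((format >> 24) & 0xff);
   buf->name[4] = '\0';
   return buf->name;
}
```

For the two codes quoted in the egl commit, 0x38385247 decodes to "GR88" and 0x20203852 decodes to "R8" followed by two padding spaces.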
* gbm: Move gbm_format_canonicalize() to the core.Eric Anholt2018-11-123-16/+19
| | | | | | | I want it for the format name debugging code. Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Daniel Stone <[email protected]>
* meson: fix libatomic testsDylan Baker2018-11-121-1/+2
| | | | | | | | | | | There are two problems: 1) the extra underscore in MISSING_64BIT_ATOMICS 2) we should link with libatomic if the previous test decided we needed it Fixes: d1992255bb29054fa51763376d125183a9f602f3 ("meson: Add build Intel "anv" vulkan driver") Reviewed-and-Tested-by: Matt Turner <[email protected]>
* mesa: mark GL_SR8_EXT non-renderable on GLESMarek Olšák2018-11-121-0/+1
| | | | | | Fixes: dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.sr8_ext Reviewed-by: Ilia Mirkin <[email protected]>
* st/mesa: disable L3 thread pinningMarek Olšák2018-11-121-9/+0
| | | | | | | This implementation can have massive drawbacks. Cc: 18.3 <[email protected]> Reviewed-by: Edmondo Tommasina <[email protected]>
* nir: add lowering for ffloorChristian Gmeiner2018-11-122-0/+4
| | | | | Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* util: Fix warning in u_cpu_detect on non-x86Alyssa Rosenzweig2018-11-121-2/+2
| | | | | | | | | regs is only set and used on x86; on other platforms (like ARM), this code causes a trivial warning, solved by moving the regs declaration to the architecture-dependent usage. Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Alyssa Rosenzweig <[email protected]>
* meson: Don't set -WallDylan Baker2018-11-121-2/+2
| | | | | | | | | meson does this for you with its warn levels, so we don't need to set it ourselves. Fixes: d1992255bb29054fa51763376d125183a9f602f3 ("meson: Add build Intel "anv" vulkan driver") Reviewed-by: Eric Engestrom <[email protected]>
* freedreno/drm: fix unused 'entry' warningsRob Clark2018-11-121-2/+0
| | | | | | | Looks like importing libdrm_freedreno into mesa crossed paths with e27902a2613. Signed-off-by: Rob Clark <[email protected]>
* i965: add support for sampling from AYUVLionel Landwerlin2018-11-124-0/+11
| | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* dri: add AYUV formatLionel Landwerlin2018-11-123-0/+4
| | | | | | | | v2: Add an AYUV entry in the android backend (Tapani) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* nir/lower_tex: Add AYUV lowering supportLionel Landwerlin2018-11-122-0/+20
| | | | | | | | | | | | | | | Byte ordering is : 0: V 1: U 2: Y 3: A v2: Split refactoring of alpha channel (Lionel) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> (v1) Acked-by: Eric Engestrom <[email protected]> (v2)
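The byte ordering listed above can be read as a simple unpacking step. This is only a sketch of the memory layout, with a hypothetical struct and function name; it does not reproduce the NIR lowering or any colour-space conversion.

```c
#include <assert.h>
#include <stdint.h>

/* One unpacked AYUV texel. */
struct ayuv_texel {
   uint8_t a, y, u, v;
};

/* px points at the four bytes of one texel in memory order:
 * byte 0 = V, byte 1 = U, byte 2 = Y, byte 3 = A. */
static struct ayuv_texel unpack_ayuv(const uint8_t px[4])
{
   struct ayuv_texel t = {
      .v = px[0],
      .u = px[1],
      .y = px[2],
      .a = px[3],
   };
   return t;
}
```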
* nir/lower_tex: add alpha channel parameter for yuv loweringLionel Landwerlin2018-11-121-6/+11
| | | | | | | | | We're about to introduce AYUV support which provides its own alpha channel. So give alpha as a parameter and set it to 1 on existing formats. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* radv: make use of num_good_cu_per_sh in si_emit_graphics() tooSamuel Pitoiset2018-11-121-2/+1
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: clean up setting partial_es_wave for distributed tess on VISamuel Pitoiset2018-11-121-7/+4
| | | | | | | | Only needed when the pipeline actually uses tessellation. I don't think that changes anything, except improving readability. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: cleanup and document a Hawaii bug with offchip buffersSamuel Pitoiset2018-11-121-9/+8
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* glsl/test: Fix use after free in test_optpass.Hanno Böck2018-11-121-1/+4
| | | | | | | | | | | The variable state is free'd and afterwards state->error is used as the return value, resulting in a use after free bug detected by memory safety tools like address sanitizer. Signed-off-by: Hanno Böck <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108636 Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
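The usual shape of such a fix is to copy the field out before releasing the object. The sketch below uses plain malloc/free and a hypothetical struct and function, whereas the real test program uses Mesa's own state object and allocator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the parse state. */
struct parse_state {
   bool error;
};

/* Frees the state and reports whether an error was recorded. */
static int finish(struct parse_state *state)
{
   /* Read the flag while the object is still alive... */
   const bool error = state->error;

   /* ...then release it. Returning state->error after this point
    * would be a use after free. */
   free(state);

   return error ? 1 : 0;
}
```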
* nir: don't pack varyings ints with floats unless flatTimothy Arceri2018-11-121-4/+7
| | | | | | Fixes: 1c9c42d16b4c ("nir: add varying component packing helpers") Reviewed-by: Jason Ekstrand <[email protected]>
* nir: add glsl_type_is_integer() helperTimothy Arceri2018-11-122-0/+6
| | | | | | Fixes: 1c9c42d16b4c ("nir: add varying component packing helpers") Reviewed-by: Jason Ekstrand <[email protected]>
* intel/fs: Prevent emission of IR instructions not aligned to their own ↵Francisco Jerez2018-11-091-3/+17
| | | | | | | | | | | | | | | | | execution size. This can occur during payload setup of SIMD-split send message instructions, which can lead to the emission of header setup instructions with a non-zero channel group and fixed SIMD width. Such instructions could end up using undefined channel enable signals except they don't care since they're always marked force_writemask_all. Not known to affect correctness of any workload at this point, but it would be trivial to back-port to stable if something comes up. Reported-by: Sagar Ghuge <[email protected]> Reviewed-by: Matt Turner <[email protected]> Tested-by: Sagar Ghuge <[email protected]>
* st/mesa: make use of nir_link_constant_varyings()Timothy Arceri2018-11-101-0/+3
| | | | | | | | | | | | | | | | | | Shader-db results radeonsi (VEGA): Totals from affected shaders: SGPRS: 161464 -> 161368 (-0.06 %) VGPRS: 86904 -> 86292 (-0.70 %) Spilled SGPRs: 296 -> 314 (6.08 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 3618596 -> 3573852 (-1.24 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 26189 -> 26276 (0.33 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Eric Anholt <[email protected]>
* nir: add new linking opt nir_link_constant_varyings()Timothy Arceri2018-11-102-0/+111
| | | | | | | This pass moves constant outputs to the consuming shader stage where possible. Reviewed-by: Eric Anholt <[email protected]>
* st/nine: clean up thread shutdown sequence a bitAndre Heider2018-11-091-4/+2
| | | | | | | Just break out of the loop instead, it does the same thing. Signed-off-by: Andre Heider <[email protected]> Reviewed-by: Axel Davy <[email protected]>
* st/nine: plug thread related leaksAndre Heider2018-11-092-0/+9
| | | | | Signed-off-by: Andre Heider <[email protected]> Reviewed-by: Axel Davy <[email protected]>
* st/nine: fix stack corruption due to ABI mismatchAndre Heider2018-11-091-1/+13
| | | | | | | | | | | | | | This fixes various crashes and hangs when using nine's 'thread_submit' feature. On 64bit, the thread function's data argument would just be NULL. On 32bit, the data argument would be garbage depending on the compiler flags (in my case -march>=core2). Fixes: f3fa7e3068512d ("st/nine: Use WINE thread for threadpool") Cc: [email protected] Signed-off-by: Andre Heider <[email protected]> Reviewed-by: Axel Davy <[email protected]>
* radeonsi: stop command submission with PIPE_CONTEXT_LOSE_CONTEXT_ON_RESET onlyMarek Olšák2018-11-0915-20/+27
| | | | Tested-by: Dieter Nützel <[email protected]>
* gallium: add PIPE_CONTEXT_LOSE_CONTEXT_ON_RESETMarek Olšák2018-11-092-0/+6
| | | | Tested-by: Dieter Nützel <[email protected]>
* radeonsi: don't set the CB clear color registers for 0/1 clear colors on Raven2Marek Olšák2018-11-094-3/+11
| | | | and add has_dcc_constant_encode.
* radeonsi: use better DCC clear codesMarek Olšák2018-11-091-5/+21
| | | | Tested-by: Dieter Nützel <[email protected]>
* ac/surface: remove the overallocation workaround for Vega12Marek Olšák2018-11-091-4/+0
| | | | | | not needed anymore (probably since the tile_swizzle fix) Reviewed-by: Samuel Pitoiset <[email protected]>
* intel/aub_read: remove useless breaksLionel Landwerlin2018-11-091-6/+0
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* Revert "mesa: expose NV_conditional_render on GLES"Erik Faye-Lund2018-11-092-3/+3
| | | | This reverts commit 5213be9fab72548c799b30e320dd1b257534f096.
* Revert "mesa/main: fixup make check after NV_conditional_render for gles"Erik Faye-Lund2018-11-092-6/+0
| | | | This reverts commit cccd7a253f9ed14ea748a222f58b0e5c895eb939.
* mesa/main: fixup make check after NV_conditional_render for glesErik Faye-Lund2018-11-092-0/+6
| | | | | | | | | It seems I missed some details when exposing NV_conditional_render on GLES; this fixes up "make check". Fixes: 5213be9fab7 ("mesa: expose NV_conditional_render on GLES") Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-and-Tested-by: Eric Engestrom <[email protected]>
* radv: include LLVM IR in the VK_AMD_shader_info "disassembly"Nicolai Hähnle2018-11-091-0/+1
| | | | | | | Helpful for debugging compiler backend problems: this allows us to easily retrieve the LLVM IR from RenderDoc. Reviewed-by: Samuel Pitoiset <[email protected]>
* mesa: expose NV_conditional_render on GLESErik Faye-Lund2018-11-092-3/+3
| | | | | | | | The extension spec has been updated to include GLES 2 support, so let's enable it there. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nir/constant_folding: fix incorrect bit-size checkIago Toral Quiroga2018-11-091-3/+1
| | | | | | | | | nir_alu_type_get_type_size takes a type as parameter and we were passing a bit-size instead, which did what we wanted by accident, since a bit-size of zero matches nir_type_invalid, which has a size of 0 too. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* intel/compiler: fix node interference of simd16 instructionsIago Toral Quiroga2018-11-091-19/+17
| | | | | | | | | | | | | | | | | SIMD16 instructions need to have additional interferences to prevent source / destination hazards when the source and destination registers are off by one register. While we already have code to handle this, it was only running for SIMD16 dispatches; however, we can have SIMD16 instructions in a SIMD8 dispatch. An example of this is pull constant loads since commit b56fa830c6095, but there are more cases. This fixes a number of CTS test failures found in work-in-progress tests that were hitting this situation for 16-wide pull constants in a SIMD8 program. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* gallivm: fix improper clamping of vertex index when fetching gs inputsRoland Scheidegger2018-11-091-10/+31
| | | | | | | | | | | | | | | | | | | | Because we only have one file_max for the (2d) gs input file, the value actually represents the max of attrib and vertex index (although I'm not entirely sure if we really want the max, since the max valid value of the vertex dimension can be easily deduced from the input primitive). Thus in cases where the number of inputs is higher than the number of vertices per prim, we did not properly clamp the vertex index, which would result in out-of-bound fetches, potentially causing segfaults (the segfaults seemed actually difficult to trigger, but valgrind certainly wasn't happy). This might have happened even if the shader did not actually try to fetch bogus vertices, if the fetching happened in non-active conditional clauses. To fix simply use the correct max vertex index value (derived from the input prim type) instead when clamping for this case. Reviewed-by: Jose Fonseca <[email protected]>
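The clamping described above can be sketched as follows, assuming the usual vertex counts per geometry shader input primitive (1, 2, 3, 4 and 6). The enum and function names are hypothetical, and the real code emits LLVM IR rather than running scalar C.

```c
#include <assert.h>

/* Hypothetical input primitive types for a geometry shader. */
enum input_prim {
   PRIM_POINTS,
   PRIM_LINES,
   PRIM_TRIANGLES,
   PRIM_LINES_ADJACENCY,
   PRIM_TRIANGLES_ADJACENCY
};

/* Number of vertices the shader receives per input primitive. */
static unsigned vertices_per_prim(enum input_prim prim)
{
   switch (prim) {
   case PRIM_POINTS:              return 1;
   case PRIM_LINES:               return 2;
   case PRIM_TRIANGLES:           return 3;
   case PRIM_LINES_ADJACENCY:     return 4;
   case PRIM_TRIANGLES_ADJACENCY: return 6;
   }
   return 1;
}

/* Clamp against the last valid vertex of the primitive, not against a
 * shared maximum that also covers the attribute dimension. */
static unsigned clamp_vertex_index(unsigned index, enum input_prim prim)
{
   const unsigned max_index = vertices_per_prim(prim) - 1;
   return index < max_index ? index : max_index;
}
```

With a triangle input and more than three attributes, a shared maximum would let index 5 through; clamping against the primitive limits it to 2.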
* i965: Lift restriction in external textures for EGLImage supportAditya Swarup2018-11-083-15/+0
| | | | | | | | | | | | | | | | Fixes Skqp's unitTest_EGLImageTest test. For Intel platforms, we support external textures only for EGLImages created with EGL_EXT_image_dma_buf_import. This restriction seems to be Intel specific and not present for other platforms. While running SKQP test - unitTest_EGLImageTest, GL_INVALID is sent to the test because of this restriction. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105301 Signed-off-by: Aditya Swarup <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* glsl: Add pragma to disable all warningsIan Romanick2018-11-088-10/+89
| | | | | | | | | | | | | | | | | | | | | | | | | | Use #pragma warning(off) and #pragma warning(on) to disable or enable all warnings. This is a big hammer. If we ever need a smaller hammer, we can enhance this functionality. There is one lame thing about this. Because we parse everything, create an AST, then convert the AST to GLSL IR, we have to treat the #pragma like a statement. This means that you can't do something like ' void ' #pragma warning(off) ' __foo ' #pragma warning(on) ' (float param0); Fixing that would, as far as I can tell, require a huge amount of work. I did try just handling the #pragma during parsing (like we do for state for the whole shader. v2: Fix the #pragma lines in the commit message that git-commit ate. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Add warning tests for identifiers with __Ian Romanick2018-11-082-0/+25
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Add an assert to optimize_frontfacing_ternaryJason Ekstrand2018-11-081-0/+3
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* anv: Use nir_src_is_const and friends in lowering codeJason Ekstrand2018-11-082-12/+9
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* intel/analyze_ubo_ranges: Use nir_src_is_const and friendsJason Ekstrand2018-11-081-8/+7
| | | | Reviewed-by: Kenneth Graunke <[email protected]>