summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* ac/nir_to_llvm: fix component packing for double outputsTimothy Arceri2018-03-281-1/+3
| | | | | | | | | | We need to wait until after the writemask is widened before we adjust it for component packing. Together with the previous patch this fixes a number of arb_enhanced_layouts component layout piglit tests. Reviewed-by: Marek Olšák <[email protected]>
* st/glsl_to_nir: fix driver location for dual-slot packed doublesTimothy Arceri2018-03-281-6/+16
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: fix scanning of multi-slot output varyingsTimothy Arceri2018-03-281-109/+127
| | | | | | | | | | This fixes tcs/tes varying arrays where we dont lower indirects and therefore don't split arrays. Here we also fix useagemask for dual slot doubles. Fixes a number of arb_tessellation_shader piglit tests. Reviewed-by: Marek Olšák <[email protected]>
* broadcom/vc5: Fix RG16I/UI texture sampling.Eric Anholt2018-03-271-2/+2
| | | | | | | How many times did I look at this table without noticing the missing 'G' in the texture column? Fixes KHR-GLES3.copy_tex_image_conversions.required.* on 7268.
* nir: fix generated nir_intrinsics.c for MSVCRob Clark2018-03-271-0/+4
| | | | | | | | | | Apparently it is not happy about things like: .foo = {} So skip over initializers for empty lists. Fixes: 76dfed8ae2d5c6c509eb2661389be3c6a25077df Reported-by: Roland Scheidegger <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* docs: update calendar 18.0.0 is outEmil Velikov2018-03-271-22/+4
| | | | Signed-off-by: Emil Velikov <[email protected]>
* docs: add news item and link release notes for 18.0.0Emil Velikov2018-03-272-0/+8
| | | | Signed-off-by: Emil Velikov <[email protected]>
* docs: add sha256 checksums for 18.0.0Emil Velikov2018-03-271-1/+2
| | | | | Signed-off-by: Emil Velikov <[email protected]> (cherry picked from commit fb64913d195112462786c0459d12f4bc8e7adee7)
* docs: Update 18.0.0 release notesEmil Velikov2018-03-272-74/+320
| | | | | | | | Note: the file was originally 17.4.0, yet git stuggles to detect the move :-\ Signed-off-by: Emil Velikov <[email protected]> (cherry picked from commit dceb1ce807a8b0ab32dc16b38040969bdbcc0d1b)
* nir: mako all the intrinsicsRob Clark2018-03-2711-619/+727
| | | | | | | | | | | | | | | | | | | | | | | I threatened to do this a long time ago.. I probably *should* have done it a long time ago when there where many fewer intrinsics. But the system of macro/#include magic for dealing with intrinsics is a bit annoying, and python has the nice property of optional fxn params, making it possible to define new intrinsics while ignoring parameters that are not applicable (and naming optional params). And not having to specify various array lengths explicitly is nice too. I think the end result makes it easier to add new intrinsics. v2: couple small fixes found with a test program to compare the old and new tables v3: misc comments, don't rely on capture=true for meson.build, get rid of system_values table to avoid return value of intrinsic() and *mostly* remove side-effects, add autotools build support v4: scons build Signed-off-by: Rob Clark <[email protected]> Acked-by: Dylan Baker <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
* nir: fix per_vertex_output intrinsicRob Clark2018-03-271-1/+1
| | | | | | | | | This is supposed to have both BASE and COMPONENT but num_indices was inadvertantly set to 1. Cc: <[email protected]> Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* glsl_types: fix build break with intel/msvc compilerRob Clark2018-03-271-83/+24
| | | | | | | | | | | | | | | | | | | | The VECN() macro was taking advantage of a GCC specific feature that is not available on lesser compilers, mostly for the purposes of avoiding a macro that encoded a return statement. But as suggested by Ian, we could just have the macro produce the entire method body and avoid the need for this. So let's do that instead. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105740 Fixes: f407edf3407396379e16b0be74b8d3b85d2ad7f0 Cc: Emil Velikov <[email protected]> Cc: Timothy Arceri <[email protected]> Cc: Roland Scheidegger <[email protected]> Cc: Ian Romanick <[email protected]> Signed-off-by: Rob Clark <[email protected]> Acked-by: Timothy Arceri <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: add GL_HALF_FLOAT as supported type to readpixelsLin Johnson2018-03-271-0/+2
| | | | | | | | | | | | | | | | | | | EXT_color_buffer_float spec states: "An INVALID_OPERATION error is generated ... if the color buffer is a floating-point format and type is not FLOAT, HALF FLOAT, or UNSIGNED_INT_10F_11F_11F_REV." This means that GL_HALF_FLOAT type should be supported when color buffer has floating-point format. Fixes Android CTS test android.view.cts.PixelCopyTest. v2: remove comments of EXT_color_buffer_half_float as EXT_color_buffer_float can use type GL_HALF_FLOAT Signed-off-by: Lin Johnson <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* broadcom/vc5: Fix swizzling of RGB10_A2UI render targets.Eric Anholt2018-03-261-1/+1
| | | | | | | This is the actual hardware layout, and we were only swizzling R/B back around in texturing. Fixes part of KHR-GLES3.copy_tex_image_conversions.required.cubemap_negx_cubemap_negx in simulation.
* broadcom/vc5: Fix extraneous register index in QIR dumping of TLBU writes.Eric Anholt2018-03-261-0/+1
| | | | Just like TLB without a config uniform, we don't have a register index.
* broadcom/vc5: Implement workaround for GFXH-1431.Eric Anholt2018-03-261-1/+5
| | | | | This should fix some blending errors, but doesn't impact any testcases in the CTS.
* broadcom/vc5: Fix EZ disabling and allow using GT/GE direction as well.Eric Anholt2018-03-265-21/+111
| | | | | | | Once we've disabled EZ for some draws, we need to not use EZ on future draws. Implementing that made implementing the GT/GE direction trivial. Fixes KHR-GLES3.shaders.fragdepth.compare.no_write on V3D 4.1 simulation.
* broadcom/vc5: Disable TF on V3D 4.x when drawing with queries disabled.Eric Anholt2018-03-262-0/+8
| | | | | | On 3.x, we just don't flag the primitive as needing TF, but those primitive bits are now allocated to the new primitive types. Now we need to actually update the enable flag at draw time.
* broadcom/vc5: Disable transform feedback on V3D 4.x at the end of the job.Eric Anholt2018-03-263-5/+29
| | | | | | The next job from this client will turn it back on unless TF gets disabled, but we don't want the state to leak from this client to another (which causes GPU hangs).
* broadcom/vc5: Move the BCL epilogue code to a per-version compile.Eric Anholt2018-03-265-24/+67
| | | | I need to do some new packets for transform feedback on 4.1.
* broadcom/vc5: Fix transform feedback in the presence of point size.Eric Anholt2018-03-263-4/+23
| | | | | | | I had this note to myself, and it turns out that a lot of CTS tests use XFB with points to get data out without using a fragment shader. Keep track of two sets of precomputed TF specs (point size in VPM prologue or not), and switch between them when we enable/disable point size.
* broadcom/vc5: Split transform feedback specs update from buffers.Eric Anholt2018-03-261-27/+32
| | | | | The specs update will be changing based on additional state flags in the next commit, and this unindents the buffer update code.
* broadcom/vc5: Limit each transform feedback data spec to 16 dwords.Eric Anholt2018-03-262-14/+31
| | | | | | | | | The length-1 field only has 4 bits, so we need to generate separate specs when there's too much TF output per buffer. Fixes GTF-GLES3.gtf.GL3Tests.transform_feedback.transform_feedback_builtin_type and transform_feedback_max_interleaved.
* gallium/u_vbuf: Protect against overflow with large instance divisors.Eric Anholt2018-03-261-1/+10
| | | | | | | | | | | | GTF-GLES3.gtf.GL3Tests.instanced_arrays.instanced_arrays_divisor uses -1 as a divisor, so we would overflow to count=0 and upload no data, triggering the assert below. We want to upload 1 element in this case, fixing the test on VC5. v2: Use some more obvious logic, and explain why we don't use the normal round_up(). Reviewed-by: Brian Paul <[email protected]>
* st: Allow accelerated CopyTexImage from RGBA to RGB.Eric Anholt2018-03-261-6/+26
| | | | | | | | | | | | | | | | There's nothing to worry about here -- the A channel just gets dropped by the blit. This avoids a segfault in the fallback path when copying from a RGBA16_SINT renderbuffer to a RGB16_SINT destination represented by an RGBA16_SINT texture (the fallback path tries to get/fetch to float buffers, but the float pack/unpack functions are NULL for SINT/UINT). Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgba16i on VC5. v2: Extract the logic to a helper function and explain what's going on better. v3: const-qualify args Reviewed-by: Brian Paul <[email protected]>
* winsys/amdgpu: always allow GTT placements on APUsMarek Olšák2018-03-261-7/+5
| | | | Reviewed-by: Christian König <[email protected]>
* radeonsi: don't reallocate on DMABUF export if local BOs are disabledMarek Olšák2018-03-264-5/+9
|
* glsl: fix infinite loop caused by bug in loop unrolling passTimothy Arceri2018-03-271-1/+1
| | | | | | | | | | | | | | | | | | Just checking for 2 jumps is not enough to be sure we can do a complex loop unroll. We need to make sure we also have also found 2 loop terminators. Without this we were attempting to unroll a loop where the second jump was nested inside multiple ifs which loop analysis is unable to detect as a terminator. We ended up splicing out the first terminator but failed to actually unroll the loop, this resulted in the creation of a possible infinite loop. Fixes: 646621c66da9 "glsl: make loop unrolling more like the nir unrolling path" Tested-by: Gert Wollny <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105670
* gallium: Do not add -Wframe-address option for gcc <= 4.4.Vinson Lee2018-03-261-1/+1
| | | | | | | | | | | | | | | | This patch fixes these build errors with GCC 4.4. Compiling src/gallium/auxiliary/util/u_debug_stack.c ... src/gallium/auxiliary/util/u_debug_stack.c: In function ‘debug_backtrace_capture’: src/gallium/auxiliary/util/u_debug_stack.c:268: error: #pragma GCC diagnostic not allowed inside functions src/gallium/auxiliary/util/u_debug_stack.c:269: error: #pragma GCC diagnostic not allowed inside functions src/gallium/auxiliary/util/u_debug_stack.c:271: error: #pragma GCC diagnostic not allowed inside functions Fixes: 370e356ebab4 ("gallium: silence __builtin_frame_address nonzero argument is unsafe warning") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105529 Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallium: Correct minor typo in header commentsAlyssa Rosenzweig2018-03-261-1/+1
| | | | | Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* intel/aubinator_error_decode: Decode more registers.Rafael Antognolli2018-03-261-0/+12
| | | | | | Decode SC_INSTDONE, ROW_INSTDONE and SAMPLER_INSTDONE. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/genxml: Add SAMPLER_INSTDONE register.Rafael Antognolli2018-03-266-0/+139
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/genxml: Add ROW_INSTDONE register.Rafael Antognolli2018-03-266-0/+114
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/genxml: Add SC_INSTDONE register.Rafael Antognolli2018-03-266-0/+140
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* i965/vec4: Fix null destination register in 3-source instructionsIan Romanick2018-03-262-0/+27
| | | | | | | | | | | | | | | | | | | | | | | A recent commit (see below) triggered some cases where conditional modifier propagation and dead code elimination would cause a MAD instruction like the following to be generated: mad.l.f0 null, ... Matt pointed out that fs_visitor::fixup_3src_null_dest() fixes cases like this in the scalar backend. This commit basically ports that code to the vec4 backend. NOTE: I have sent a couple tests to the piglit list that reproduce this bug *without* the commit mentioned below. This commit fixes those tests. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]> Tested-by: Tapani Pälli <[email protected]> Cc: [email protected] Fixes: ee63933a7 ("nir: Distribute binary operations with constants into bcsel") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105704
* nir: Don't condition 'a-b < 0' -> 'a < b' on is_not_used_by_conditionalIan Romanick2018-03-262-18/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that i965 recognizes that a-b generates the same conditions as 'a < b', there is no reason to condition this transformation on 'is not used by conditional.' Since this was the only user of the is_not_used_by_conditional function, delete it. All Gen6+ platforms had similar results. (Skylake shown) total instructions in shared programs: 14400775 -> 14400595 (<.01%) instructions in affected programs: 36712 -> 36532 (-0.49%) helped: 182 HURT: 26 helped stats (abs) min: 1 max: 2 x̄: 1.13 x̃: 1 helped stats (rel) min: 0.15% max: 1.82% x̄: 0.70% x̃: 0.62% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.24% max: 1.02% x̄: 0.82% x̃: 0.90% 95% mean confidence interval for instructions value: -0.97 -0.76 95% mean confidence interval for instructions %-change: -0.59% -0.43% Instructions are helped. total cycles in shared programs: 532929592 -> 532926345 (<.01%) cycles in affected programs: 478660 -> 475413 (-0.68%) helped: 187 HURT: 22 helped stats (abs) min: 2 max: 200 x̄: 20.99 x̃: 18 helped stats (rel) min: 0.23% max: 24.10% x̄: 1.48% x̃: 1.03% HURT stats (abs) min: 1 max: 214 x̄: 30.86 x̃: 11 HURT stats (rel) min: 0.01% max: 23.06% x̄: 3.12% x̃: 0.86% 95% mean confidence interval for cycles value: -19.50 -11.57 95% mean confidence interval for cycles %-change: -1.42% -0.58% Cycles are helped. GM45 and Iron Lake had similar results. (Iron Lake shown) total cycles in shared programs: 177851578 -> 177851810 (<.01%) cycles in affected programs: 24408 -> 24640 (0.95%) helped: 2 HURT: 4 helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 helped stats (rel) min: 0.42% max: 0.47% x̄: 0.44% x̃: 0.44% HURT stats (abs) min: 24 max: 108 x̄: 60.00 x̃: 54 HURT stats (rel) min: 0.52% max: 1.62% x̄: 1.04% x̃: 1.02% 95% mean confidence interval for cycles value: -7.75 85.08 95% mean confidence interval for cycles %-change: -0.39% 1.49% Inconclusive result (value mean confidence interval includes 0). Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/vec4: Propagate conditional modifiers from compares to addsIan Romanick2018-03-261-5/+65
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | No changes on Broadwell or later as those platforms do not use the vec4 backend. Ivy Bridge and Haswell had similar results. (Ivy Bridge shown) total instructions in shared programs: 11682119 -> 11681056 (<.01%) instructions in affected programs: 150403 -> 149340 (-0.71%) helped: 950 HURT: 0 helped stats (abs) min: 1 max: 16 x̄: 1.12 x̃: 1 helped stats (rel) min: 0.23% max: 2.78% x̄: 0.82% x̃: 0.71% 95% mean confidence interval for instructions value: -1.19 -1.04 95% mean confidence interval for instructions %-change: -0.84% -0.79% Instructions are helped. total cycles in shared programs: 257495842 -> 257495238 (<.01%) cycles in affected programs: 270302 -> 269698 (-0.22%) helped: 271 HURT: 13 helped stats (abs) min: 2 max: 14 x̄: 2.42 x̃: 2 helped stats (rel) min: 0.06% max: 1.13% x̄: 0.32% x̃: 0.28% HURT stats (abs) min: 2 max: 12 x̄: 4.00 x̃: 4 HURT stats (rel) min: 0.15% max: 1.18% x̄: 0.30% x̃: 0.26% 95% mean confidence interval for cycles value: -2.41 -1.84 95% mean confidence interval for cycles %-change: -0.31% -0.26% Cycles are helped. Sandy Bridge total instructions in shared programs: 10430493 -> 10429727 (<.01%) instructions in affected programs: 120860 -> 120094 (-0.63%) helped: 766 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.30% max: 2.70% x̄: 0.78% x̃: 0.73% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -0.80% -0.75% Instructions are helped. total cycles in shared programs: 146138718 -> 146138446 (<.01%) cycles in affected programs: 244114 -> 243842 (-0.11%) helped: 132 HURT: 0 helped stats (abs) min: 2 max: 4 x̄: 2.06 x̃: 2 helped stats (rel) min: 0.03% max: 0.43% x̄: 0.16% x̃: 0.19% 95% mean confidence interval for cycles value: -2.12 -2.00 95% mean confidence interval for cycles %-change: -0.18% -0.15% Cycles are helped. GM45 and Iron Lake had identical results. (Iron Lake shown) total instructions in shared programs: 7780251 -> 7780248 (<.01%) instructions in affected programs: 175 -> 172 (-1.71%) helped: 3 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 1.49% max: 2.44% x̄: 1.81% x̃: 1.49% total cycles in shared programs: 177851584 -> 177851578 (<.01%) cycles in affected programs: 9796 -> 9790 (-0.06%) helped: 3 HURT: 0 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.05% max: 0.08% x̄: 0.06% x̃: 0.05% Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/vec4: Allow cmod propagation when src0 is a uniform or shader inputIan Romanick2018-03-261-1/+2
| | | | | | | | | | No shader-db changes. This source must have been written by a previous instruction, so it cannot be a uniform or a shader input. However, this change allows the next commit to help more shaders. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Propagate conditional modifiers from compares to addsIan Romanick2018-03-262-5/+400
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The math inside the add and the cmp in this instruction sequence is the same. We can utilize this to eliminate the compare. add(8) g5<1>F g2<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; cmp.z.f0(8) null<1>F g2<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch }; (-f0) sel(8) g8<1>F (abs)g5<8,8,1>F 3e-37F { align1 1Q }; This is reduced to: add.z.f0(8) g5<1>F g2<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; (-f0) sel(8) g8<1>F (abs)g5<8,8,1>F 3e-37F { align1 1Q }; This optimization pass could do even better. The nature of converting vectorized code from the GLSL front end to scalar code in NIR results in sequences like: add(8) g7<1>F g4<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; add(8) g6<1>F g3<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; add(8) g5<1>F g2<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; cmp.z.f0(8) null<1>F g2<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch }; (-f0) sel(8) g8<1>F (abs)g5<8,8,1>F 3e-37F { align1 1Q }; cmp.z.f0(8) null<1>F g3<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch }; (-f0) sel(8) g10<1>F (abs)g6<8,8,1>F 3e-37F { align1 1Q }; cmp.z.f0(8) null<1>F g4<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch }; (-f0) sel(8) g12<1>F (abs)g7<8,8,1>F 3e-37F { align1 1Q }; In this sequence, only the first cmp.z is removed. With different scheduling, all 3 could get removed. Skylake total instructions in shared programs: 14407009 -> 14400173 (-0.05%) instructions in affected programs: 1307274 -> 1300438 (-0.52%) helped: 4880 HURT: 0 helped stats (abs) min: 1 max: 33 x̄: 1.40 x̃: 1 helped stats (rel) min: 0.03% max: 8.70% x̄: 0.70% x̃: 0.52% 95% mean confidence interval for instructions value: -1.45 -1.35 95% mean confidence interval for instructions %-change: -0.72% -0.69% Instructions are helped. total cycles in shared programs: 532943169 -> 532923528 (<.01%) cycles in affected programs: 14065798 -> 14046157 (-0.14%) helped: 2703 HURT: 339 helped stats (abs) min: 1 max: 1062 x̄: 12.27 x̃: 2 helped stats (rel) min: <.01% max: 28.72% x̄: 0.38% x̃: 0.21% HURT stats (abs) min: 1 max: 739 x̄: 39.86 x̃: 12 HURT stats (rel) min: 0.02% max: 27.69% x̄: 1.38% x̃: 0.41% 95% mean confidence interval for cycles value: -8.66 -4.26 95% mean confidence interval for cycles %-change: -0.24% -0.14% Cycles are helped. LOST: 0 GAINED: 1 Broadwell total instructions in shared programs: 14719636 -> 14712949 (-0.05%) instructions in affected programs: 1288188 -> 1281501 (-0.52%) helped: 4845 HURT: 0 helped stats (abs) min: 1 max: 33 x̄: 1.38 x̃: 1 helped stats (rel) min: 0.03% max: 8.00% x̄: 0.70% x̃: 0.52% 95% mean confidence interval for instructions value: -1.43 -1.33 95% mean confidence interval for instructions %-change: -0.72% -0.68% Instructions are helped. total cycles in shared programs: 559599253 -> 559581699 (<.01%) cycles in affected programs: 13315565 -> 13298011 (-0.13%) helped: 2600 HURT: 269 helped stats (abs) min: 1 max: 2128 x̄: 12.24 x̃: 2 helped stats (rel) min: <.01% max: 23.95% x̄: 0.41% x̃: 0.20% HURT stats (abs) min: 1 max: 790 x̄: 53.07 x̃: 20 HURT stats (rel) min: 0.02% max: 15.96% x̄: 1.55% x̃: 0.75% 95% mean confidence interval for cycles value: -8.47 -3.77 95% mean confidence interval for cycles %-change: -0.27% -0.18% Cycles are helped. LOST: 0 GAINED: 8 Haswell total instructions in shared programs: 12978609 -> 12973483 (-0.04%) instructions in affected programs: 932921 -> 927795 (-0.55%) helped: 3480 HURT: 0 helped stats (abs) min: 1 max: 33 x̄: 1.47 x̃: 1 helped stats (rel) min: 0.03% max: 7.84% x̄: 0.78% x̃: 0.58% 95% mean confidence interval for instructions value: -1.53 -1.42 95% mean confidence interval for instructions %-change: -0.80% -0.75% Instructions are helped. total cycles in shared programs: 410270788 -> 410250531 (<.01%) cycles in affected programs: 10986161 -> 10965904 (-0.18%) helped: 2087 HURT: 254 helped stats (abs) min: 1 max: 2672 x̄: 14.63 x̃: 4 helped stats (rel) min: <.01% max: 39.61% x̄: 0.42% x̃: 0.21% HURT stats (abs) min: 1 max: 519 x̄: 40.49 x̃: 16 HURT stats (rel) min: 0.01% max: 12.83% x̄: 1.20% x̃: 0.47% 95% mean confidence interval for cycles value: -12.82 -4.49 95% mean confidence interval for cycles %-change: -0.31% -0.18% Cycles are helped. LOST: 0 GAINED: 5 Ivy Bridge total instructions in shared programs: 11686082 -> 11681548 (-0.04%) instructions in affected programs: 937696 -> 933162 (-0.48%) helped: 3150 HURT: 0 helped stats (abs) min: 1 max: 33 x̄: 1.44 x̃: 1 helped stats (rel) min: 0.03% max: 7.84% x̄: 0.69% x̃: 0.49% 95% mean confidence interval for instructions value: -1.49 -1.38 95% mean confidence interval for instructions %-change: -0.71% -0.67% Instructions are helped. total cycles in shared programs: 257514962 -> 257492471 (<.01%) cycles in affected programs: 11524149 -> 11501658 (-0.20%) helped: 1970 HURT: 239 helped stats (abs) min: 1 max: 3525 x̄: 17.48 x̃: 3 helped stats (rel) min: <.01% max: 49.60% x̄: 0.46% x̃: 0.17% HURT stats (abs) min: 1 max: 1358 x̄: 50.00 x̃: 15 HURT stats (rel) min: 0.02% max: 59.88% x̄: 1.84% x̃: 0.65% 95% mean confidence interval for cycles value: -17.01 -3.35 95% mean confidence interval for cycles %-change: -0.33% -0.08% Cycles are helped. LOST: 9 GAINED: 1 Sandy Bridge total instructions in shared programs: 10432841 -> 10429893 (-0.03%) instructions in affected programs: 685071 -> 682123 (-0.43%) helped: 2453 HURT: 0 helped stats (abs) min: 1 max: 9 x̄: 1.20 x̃: 1 helped stats (rel) min: 0.02% max: 7.55% x̄: 0.64% x̃: 0.46% 95% mean confidence interval for instructions value: -1.23 -1.17 95% mean confidence interval for instructions %-change: -0.67% -0.62% Instructions are helped. total cycles in shared programs: 146133660 -> 146134195 (<.01%) cycles in affected programs: 3991634 -> 3992169 (0.01%) helped: 1237 HURT: 153 helped stats (abs) min: 1 max: 2853 x̄: 6.93 x̃: 2 helped stats (rel) min: <.01% max: 29.00% x̄: 0.24% x̃: 0.14% HURT stats (abs) min: 1 max: 1740 x̄: 59.56 x̃: 12 HURT stats (rel) min: 0.03% max: 78.98% x̄: 1.96% x̃: 0.42% 95% mean confidence interval for cycles value: -5.13 5.90 95% mean confidence interval for cycles %-change: -0.17% 0.16% Inconclusive result (value mean confidence interval includes 0). LOST: 0 GAINED: 1 GM45 and Iron Lake had similar results (GM45 shown): total instructions in shared programs: 4800332 -> 4798380 (-0.04%) instructions in affected programs: 565995 -> 564043 (-0.34%) helped: 1451 HURT: 0 helped stats (abs) min: 1 max: 20 x̄: 1.35 x̃: 1 helped stats (rel) min: 0.05% max: 5.26% x̄: 0.47% x̃: 0.31% 95% mean confidence interval for instructions value: -1.40 -1.29 95% mean confidence interval for instructions %-change: -0.50% -0.45% Instructions are helped. total cycles in shared programs: 122032318 -> 122027798 (<.01%) cycles in affected programs: 8334868 -> 8330348 (-0.05%) helped: 1029 HURT: 1 helped stats (abs) min: 2 max: 40 x̄: 4.43 x̃: 2 helped stats (rel) min: <.01% max: 1.83% x̄: 0.09% x̃: 0.04% HURT stats (abs) min: 38 max: 38 x̄: 38.00 x̃: 38 HURT stats (rel) min: 0.25% max: 0.25% x̄: 0.25% x̃: 0.25% 95% mean confidence interval for cycles value: -4.70 -4.08 95% mean confidence interval for cycles %-change: -0.09% -0.08% Cycles are helped. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Allow cmod propagation when src0 is a uniform or shader inputIan Romanick2018-03-261-1/+2
| | | | | | | | | | No shader-db changes. This source must have been written by a previous instruction, so it cannot be a uniform or a shader input. However, this change allows the next commit to help about 900 more shaders. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965: Add negative_equals methodsIan Romanick2018-03-267-0/+72
| | | | | | | | | | | | | | | | | | | | | | This method is similar to the existing ::equals methods. Instead of testing that two src_regs are equal to each other, it tests that one is the negation of the other. v2: Simplify various checks based on suggestions from Matt. Use src_reg::type instead of fixed_hw_reg.type in a check. Also suggested by Matt. v3: Rebase on 3 years. Fix some problems with negative_equals with VF constants. Add fs_reg::negative_equals. v4: Replace the existing default case with BRW_REGISTER_TYPE_UB, BRW_REGISTER_TYPE_B, and BRW_REGISTER_TYPE_NF. Suggested by Matt. Expand the FINISHME comment to better explain why it isn't already finished. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]> [v3] Reviewed-by: Matt Turner <[email protected]>
* mesa/st/tests: Use tgsi opcode enum also in the test classesGert Wollny2018-03-262-8/+8
| | | | | | | Fixes: ec478cf9c31K ("st/mesa,tgsi: use enum tgsi_opcode") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105737 Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* meson: fix header check messageEric Engestrom2018-03-261-1/+1
| | | | | | | | before: Checking if "endian.h works" compiles: YES after: Checking if "endian.h" compiles: YES Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* glsl_types: vec8/vec16 supportRob Clark2018-03-256-10/+23
| | | | | | | Not used in GL but 8 and 16 component vectors exist in OpenCL. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* glsl_types: refactor/prep for vec8/vec16Rob Clark2018-03-253-149/+49
| | | | | | | | | | | | | Refactor things so there isn't so much typing involved to add new things. Also drops a pointless conditional (out of bounds rows or columns already returns error_type in all paths.. might as well drop it rather than make the check more convoluted in the next patch by adding the vec8/vec16 case). Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* anv: Set genX_table for gen11Jordan Justen2018-03-231-0/+3
| | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Add gen11 to anv_genX_callJordan Justen2018-03-231-0/+3
| | | | | Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* vbo: Make sure the internal VAO's stay within limits.Mathias Fröhlich2018-03-232-1/+6
| | | | | Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Mathias Fröhlich <[email protected]>
* mesa: Flag early if we modify a SharedAndImmutable VAO.Mathias Fröhlich2018-03-231-0/+6
| | | | | Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Mathias Fröhlich <[email protected]>
* mesa: When copying a VAO also copy the vertex attribute mode.Mathias Fröhlich2018-03-231-0/+1
| | | | | Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Mathias Fröhlich <[email protected]>