| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
brw_bo_alloc no longer uses this parameter, so there's no point.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Buffers are always page aligned on 965+ hardware; I believe this extra
parameter is a vestige from the Gen2-3 era.
All callers pass 0, and in fact we assert that the alignment is 0 unless
BO_ALLOC_BUSY is set (for some reason). We can just drop the parameter
and set the value to 0 explicitly.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
intel_miptree_create_for_bo does not actually allocate a BO, so
specifying allocation flags accomplishes nothing and is confusing.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
This is just zero - passing nothing already gives us a post-sync
operation of "nothing".
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
I have no idea why but having dest_components == -1 was causing a memory
leak somewhere. Without this, you can't get through a full shader-db
run without running out of memory.
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This was causing us to walk dest_components times over a thing with no
destination. This happened to work because all of the image intrinsics
without a destination also happened to have dest_components == 0. We
shouldn't be reading dest_components if has_dest == false.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
There were two problems, both of which are fixed now:
- The indirect address was not being shifted by 4
- The indirect address was being placed as an argument in the offset case
This fixes some of the new interpolateAt* piglits which now test for
these situations.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When an if nesting inside anouther if is optimised away we can
end up with a loop terminator and following block that looks like
this:
if ssa_596 {
block block_5:
/* preds: block_4 */
vec1 32 ssa_601 = load_const (0xffffffff /* -nan */)
break
/* succs: block_8 */
} else {
block block_6:
/* preds: block_4 */
/* succs: block_7 */
}
block block_7:
/* preds: block_6 */
vec1 32 ssa_602 = phi block_6: ssa_552
vec1 32 ssa_603 = phi block_6: ssa_553
vec1 32 ssa_604 = iadd ssa_551, ssa_66
The problem is the phis. Loop unrolling expects the last block in
the loop to be empty once we splice the instructions in the last
block into the continue branch. The problem is we cant move phis
so here we lower the phis to regs when preparing the loop for
unrolling. As it could be possible to have multiple additional
blocks/ifs following the terminator we just convert all phis at
the top level of the loop body for simplicity.
We also add some comments to loop_prepare_for_unroll() while we
are here.
Fixes: 51daccb289eb "nir: add a loop unrolling pass"
Reviewed-by: Jason Ekstrand <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105670
|
|
|
|
|
|
|
| |
Fixes piglit test:
tests/spec/arb_enhanced_layouts/execution/component-layout/vs-fs-array-interleave-range.shader_test
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The location was only being incremented the first time we processed a
location. This meant we would incorrectly skip some elements of
an array if the first element was packed and proccessed previously
but other elements were not.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We need to wait until after the writemask is widened before we
adjust it for component packing.
Together with the previous patch this fixes a number of
arb_enhanced_layouts component layout piglit tests.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This fixes tcs/tes varying arrays where we dont lower indirects and
therefore don't split arrays. Here we also fix useagemask for dual
slot doubles.
Fixes a number of arb_tessellation_shader piglit tests.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
How many times did I look at this table without noticing the missing 'G'
in the texture column?
Fixes KHR-GLES3.copy_tex_image_conversions.required.* on 7268.
|
|
|
|
|
|
|
|
|
|
| |
Apparently it is not happy about things like: .foo = {}
So skip over initializers for empty lists.
Fixes: 76dfed8ae2d5c6c509eb2661389be3c6a25077df
Reported-by: Roland Scheidegger <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit fb64913d195112462786c0459d12f4bc8e7adee7)
|
|
|
|
|
|
|
|
| |
Note: the file was originally 17.4.0, yet git stuggles to detect the
move :-\
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit dceb1ce807a8b0ab32dc16b38040969bdbcc0d1b)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I threatened to do this a long time ago.. I probably *should* have done
it a long time ago when there where many fewer intrinsics. But the
system of macro/#include magic for dealing with intrinsics is a bit
annoying, and python has the nice property of optional fxn params,
making it possible to define new intrinsics while ignoring parameters
that are not applicable (and naming optional params). And not having to
specify various array lengths explicitly is nice too.
I think the end result makes it easier to add new intrinsics.
v2: couple small fixes found with a test program to compare the old and
new tables
v3: misc comments, don't rely on capture=true for meson.build, get rid
of system_values table to avoid return value of intrinsic() and
*mostly* remove side-effects, add autotools build support
v4: scons build
Signed-off-by: Rob Clark <[email protected]>
Acked-by: Dylan Baker <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is supposed to have both BASE and COMPONENT but num_indices was
inadvertantly set to 1.
Cc: <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The VECN() macro was taking advantage of a GCC specific feature that is
not available on lesser compilers, mostly for the purposes of avoiding a
macro that encoded a return statement.
But as suggested by Ian, we could just have the macro produce the entire
method body and avoid the need for this. So let's do that instead.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105740
Fixes: f407edf3407396379e16b0be74b8d3b85d2ad7f0
Cc: Emil Velikov <[email protected]>
Cc: Timothy Arceri <[email protected]>
Cc: Roland Scheidegger <[email protected]>
Cc: Ian Romanick <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
Acked-by: Timothy Arceri <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
EXT_color_buffer_float spec states:
"An INVALID_OPERATION error is generated ... if the color buffer is
a floating-point format and type is not FLOAT, HALF FLOAT, or
UNSIGNED_INT_10F_11F_11F_REV."
This means that GL_HALF_FLOAT type should be supported when color
buffer has floating-point format.
Fixes Android CTS test android.view.cts.PixelCopyTest.
v2: remove comments of EXT_color_buffer_half_float as
EXT_color_buffer_float can use type GL_HALF_FLOAT
Signed-off-by: Lin Johnson <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
| |
This is the actual hardware layout, and we were only swizzling R/B back
around in texturing. Fixes part of
KHR-GLES3.copy_tex_image_conversions.required.cubemap_negx_cubemap_negx in
simulation.
|
|
|
|
| |
Just like TLB without a config uniform, we don't have a register index.
|
|
|
|
|
| |
This should fix some blending errors, but doesn't impact any testcases in
the CTS.
|
|
|
|
|
|
|
| |
Once we've disabled EZ for some draws, we need to not use EZ on future
draws. Implementing that made implementing the GT/GE direction trivial.
Fixes KHR-GLES3.shaders.fragdepth.compare.no_write on V3D 4.1 simulation.
|
|
|
|
|
|
| |
On 3.x, we just don't flag the primitive as needing TF, but those
primitive bits are now allocated to the new primitive types. Now we need
to actually update the enable flag at draw time.
|
|
|
|
|
|
| |
The next job from this client will turn it back on unless TF gets
disabled, but we don't want the state to leak from this client to another
(which causes GPU hangs).
|
|
|
|
| |
I need to do some new packets for transform feedback on 4.1.
|
|
|
|
|
|
|
| |
I had this note to myself, and it turns out that a lot of CTS tests use
XFB with points to get data out without using a fragment shader. Keep
track of two sets of precomputed TF specs (point size in VPM prologue or
not), and switch between them when we enable/disable point size.
|
|
|
|
|
| |
The specs update will be changing based on additional state flags in the
next commit, and this unindents the buffer update code.
|
|
|
|
|
|
|
|
|
| |
The length-1 field only has 4 bits, so we need to generate separate specs
when there's too much TF output per buffer.
Fixes
GTF-GLES3.gtf.GL3Tests.transform_feedback.transform_feedback_builtin_type
and transform_feedback_max_interleaved.
|
|
|
|
|
|
|
|
|
|
|
|
| |
GTF-GLES3.gtf.GL3Tests.instanced_arrays.instanced_arrays_divisor uses -1
as a divisor, so we would overflow to count=0 and upload no data,
triggering the assert below. We want to upload 1 element in this case,
fixing the test on VC5.
v2: Use some more obvious logic, and explain why we don't use the normal
round_up().
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's nothing to worry about here -- the A channel just gets dropped by
the blit. This avoids a segfault in the fallback path when copying from a
RGBA16_SINT renderbuffer to a RGB16_SINT destination represented by an
RGBA16_SINT texture (the fallback path tries to get/fetch to float
buffers, but the float pack/unpack functions are NULL for SINT/UINT).
Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgba16i on VC5.
v2: Extract the logic to a helper function and explain what's going on
better.
v3: const-qualify args
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
| |
Reviewed-by: Christian König <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Just checking for 2 jumps is not enough to be sure we can do a
complex loop unroll. We need to make sure we also have also found
2 loop terminators.
Without this we were attempting to unroll a loop where the second
jump was nested inside multiple ifs which loop analysis is unable
to detect as a terminator. We ended up splicing out the first
terminator but failed to actually unroll the loop, this resulted
in the creation of a possible infinite loop.
Fixes: 646621c66da9 "glsl: make loop unrolling more like the nir unrolling path"
Tested-by: Gert Wollny <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105670
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes these build errors with GCC 4.4.
Compiling src/gallium/auxiliary/util/u_debug_stack.c ...
src/gallium/auxiliary/util/u_debug_stack.c: In function ‘debug_backtrace_capture’:
src/gallium/auxiliary/util/u_debug_stack.c:268: error: #pragma GCC diagnostic not allowed inside functions
src/gallium/auxiliary/util/u_debug_stack.c:269: error: #pragma GCC diagnostic not allowed inside functions
src/gallium/auxiliary/util/u_debug_stack.c:271: error: #pragma GCC diagnostic not allowed inside functions
Fixes: 370e356ebab4 ("gallium: silence __builtin_frame_address nonzero argument is unsafe warning")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105529
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
| |
Decode SC_INSTDONE, ROW_INSTDONE and SAMPLER_INSTDONE.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A recent commit (see below) triggered some cases where conditional
modifier propagation and dead code elimination would cause a MAD
instruction like the following to be generated:
mad.l.f0 null, ...
Matt pointed out that fs_visitor::fixup_3src_null_dest() fixes cases
like this in the scalar backend. This commit basically ports that code
to the vec4 backend.
NOTE: I have sent a couple tests to the piglit list that reproduce this
bug *without* the commit mentioned below. This commit fixes those
tests.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Tested-by: Tapani Pälli <[email protected]>
Cc: [email protected]
Fixes: ee63933a7 ("nir: Distribute binary operations with constants into bcsel")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105704
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that i965 recognizes that a-b generates the same conditions as 'a <
b', there is no reason to condition this transformation on 'is not used
by conditional.'
Since this was the only user of the is_not_used_by_conditional function,
delete it.
All Gen6+ platforms had similar results. (Skylake shown)
total instructions in shared programs: 14400775 -> 14400595 (<.01%)
instructions in affected programs: 36712 -> 36532 (-0.49%)
helped: 182
HURT: 26
helped stats (abs) min: 1 max: 2 x̄: 1.13 x̃: 1
helped stats (rel) min: 0.15% max: 1.82% x̄: 0.70% x̃: 0.62%
HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel) min: 0.24% max: 1.02% x̄: 0.82% x̃: 0.90%
95% mean confidence interval for instructions value: -0.97 -0.76
95% mean confidence interval for instructions %-change: -0.59% -0.43%
Instructions are helped.
total cycles in shared programs: 532929592 -> 532926345 (<.01%)
cycles in affected programs: 478660 -> 475413 (-0.68%)
helped: 187
HURT: 22
helped stats (abs) min: 2 max: 200 x̄: 20.99 x̃: 18
helped stats (rel) min: 0.23% max: 24.10% x̄: 1.48% x̃: 1.03%
HURT stats (abs) min: 1 max: 214 x̄: 30.86 x̃: 11
HURT stats (rel) min: 0.01% max: 23.06% x̄: 3.12% x̃: 0.86%
95% mean confidence interval for cycles value: -19.50 -11.57
95% mean confidence interval for cycles %-change: -1.42% -0.58%
Cycles are helped.
GM45 and Iron Lake had similar results. (Iron Lake shown)
total cycles in shared programs: 177851578 -> 177851810 (<.01%)
cycles in affected programs: 24408 -> 24640 (0.95%)
helped: 2
HURT: 4
helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4
helped stats (rel) min: 0.42% max: 0.47% x̄: 0.44% x̃: 0.44%
HURT stats (abs) min: 24 max: 108 x̄: 60.00 x̃: 54
HURT stats (rel) min: 0.52% max: 1.62% x̄: 1.04% x̃: 1.02%
95% mean confidence interval for cycles value: -7.75 85.08
95% mean confidence interval for cycles %-change: -0.39% 1.49%
Inconclusive result (value mean confidence interval includes 0).
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
No changes on Broadwell or later as those platforms do not use the vec4
backend.
Ivy Bridge and Haswell had similar results. (Ivy Bridge shown)
total instructions in shared programs: 11682119 -> 11681056 (<.01%)
instructions in affected programs: 150403 -> 149340 (-0.71%)
helped: 950
HURT: 0
helped stats (abs) min: 1 max: 16 x̄: 1.12 x̃: 1
helped stats (rel) min: 0.23% max: 2.78% x̄: 0.82% x̃: 0.71%
95% mean confidence interval for instructions value: -1.19 -1.04
95% mean confidence interval for instructions %-change: -0.84% -0.79%
Instructions are helped.
total cycles in shared programs: 257495842 -> 257495238 (<.01%)
cycles in affected programs: 270302 -> 269698 (-0.22%)
helped: 271
HURT: 13
helped stats (abs) min: 2 max: 14 x̄: 2.42 x̃: 2
helped stats (rel) min: 0.06% max: 1.13% x̄: 0.32% x̃: 0.28%
HURT stats (abs) min: 2 max: 12 x̄: 4.00 x̃: 4
HURT stats (rel) min: 0.15% max: 1.18% x̄: 0.30% x̃: 0.26%
95% mean confidence interval for cycles value: -2.41 -1.84
95% mean confidence interval for cycles %-change: -0.31% -0.26%
Cycles are helped.
Sandy Bridge
total instructions in shared programs: 10430493 -> 10429727 (<.01%)
instructions in affected programs: 120860 -> 120094 (-0.63%)
helped: 766
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.30% max: 2.70% x̄: 0.78% x̃: 0.73%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -0.80% -0.75%
Instructions are helped.
total cycles in shared programs: 146138718 -> 146138446 (<.01%)
cycles in affected programs: 244114 -> 243842 (-0.11%)
helped: 132
HURT: 0
helped stats (abs) min: 2 max: 4 x̄: 2.06 x̃: 2
helped stats (rel) min: 0.03% max: 0.43% x̄: 0.16% x̃: 0.19%
95% mean confidence interval for cycles value: -2.12 -2.00
95% mean confidence interval for cycles %-change: -0.18% -0.15%
Cycles are helped.
GM45 and Iron Lake had identical results. (Iron Lake shown)
total instructions in shared programs: 7780251 -> 7780248 (<.01%)
instructions in affected programs: 175 -> 172 (-1.71%)
helped: 3
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.49% max: 2.44% x̄: 1.81% x̃: 1.49%
total cycles in shared programs: 177851584 -> 177851578 (<.01%)
cycles in affected programs: 9796 -> 9790 (-0.06%)
helped: 3
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.05% max: 0.08% x̄: 0.06% x̃: 0.05%
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
No shader-db changes. This source must have been written by a previous
instruction, so it cannot be a uniform or a shader input. However, this
change allows the next commit to help more shaders.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The math inside the add and the cmp in this instruction sequence is the
same. We can utilize this to eliminate the compare.
add(8) g5<1>F g2<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted };
cmp.z.f0(8) null<1>F g2<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch };
(-f0) sel(8) g8<1>F (abs)g5<8,8,1>F 3e-37F { align1 1Q };
This is reduced to:
add.z.f0(8) g5<1>F g2<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted };
(-f0) sel(8) g8<1>F (abs)g5<8,8,1>F 3e-37F { align1 1Q };
This optimization pass could do even better. The nature of converting
vectorized code from the GLSL front end to scalar code in NIR results in
sequences like:
add(8) g7<1>F g4<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted };
add(8) g6<1>F g3<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted };
add(8) g5<1>F g2<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted };
cmp.z.f0(8) null<1>F g2<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch };
(-f0) sel(8) g8<1>F (abs)g5<8,8,1>F 3e-37F { align1 1Q };
cmp.z.f0(8) null<1>F g3<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch };
(-f0) sel(8) g10<1>F (abs)g6<8,8,1>F 3e-37F { align1 1Q };
cmp.z.f0(8) null<1>F g4<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch };
(-f0) sel(8) g12<1>F (abs)g7<8,8,1>F 3e-37F { align1 1Q };
In this sequence, only the first cmp.z is removed. With different
scheduling, all 3 could get removed.
Skylake
total instructions in shared programs: 14407009 -> 14400173 (-0.05%)
instructions in affected programs: 1307274 -> 1300438 (-0.52%)
helped: 4880
HURT: 0
helped stats (abs) min: 1 max: 33 x̄: 1.40 x̃: 1
helped stats (rel) min: 0.03% max: 8.70% x̄: 0.70% x̃: 0.52%
95% mean confidence interval for instructions value: -1.45 -1.35
95% mean confidence interval for instructions %-change: -0.72% -0.69%
Instructions are helped.
total cycles in shared programs: 532943169 -> 532923528 (<.01%)
cycles in affected programs: 14065798 -> 14046157 (-0.14%)
helped: 2703
HURT: 339
helped stats (abs) min: 1 max: 1062 x̄: 12.27 x̃: 2
helped stats (rel) min: <.01% max: 28.72% x̄: 0.38% x̃: 0.21%
HURT stats (abs) min: 1 max: 739 x̄: 39.86 x̃: 12
HURT stats (rel) min: 0.02% max: 27.69% x̄: 1.38% x̃: 0.41%
95% mean confidence interval for cycles value: -8.66 -4.26
95% mean confidence interval for cycles %-change: -0.24% -0.14%
Cycles are helped.
LOST: 0
GAINED: 1
Broadwell
total instructions in shared programs: 14719636 -> 14712949 (-0.05%)
instructions in affected programs: 1288188 -> 1281501 (-0.52%)
helped: 4845
HURT: 0
helped stats (abs) min: 1 max: 33 x̄: 1.38 x̃: 1
helped stats (rel) min: 0.03% max: 8.00% x̄: 0.70% x̃: 0.52%
95% mean confidence interval for instructions value: -1.43 -1.33
95% mean confidence interval for instructions %-change: -0.72% -0.68%
Instructions are helped.
total cycles in shared programs: 559599253 -> 559581699 (<.01%)
cycles in affected programs: 13315565 -> 13298011 (-0.13%)
helped: 2600
HURT: 269
helped stats (abs) min: 1 max: 2128 x̄: 12.24 x̃: 2
helped stats (rel) min: <.01% max: 23.95% x̄: 0.41% x̃: 0.20%
HURT stats (abs) min: 1 max: 790 x̄: 53.07 x̃: 20
HURT stats (rel) min: 0.02% max: 15.96% x̄: 1.55% x̃: 0.75%
95% mean confidence interval for cycles value: -8.47 -3.77
95% mean confidence interval for cycles %-change: -0.27% -0.18%
Cycles are helped.
LOST: 0
GAINED: 8
Haswell
total instructions in shared programs: 12978609 -> 12973483 (-0.04%)
instructions in affected programs: 932921 -> 927795 (-0.55%)
helped: 3480
HURT: 0
helped stats (abs) min: 1 max: 33 x̄: 1.47 x̃: 1
helped stats (rel) min: 0.03% max: 7.84% x̄: 0.78% x̃: 0.58%
95% mean confidence interval for instructions value: -1.53 -1.42
95% mean confidence interval for instructions %-change: -0.80% -0.75%
Instructions are helped.
total cycles in shared programs: 410270788 -> 410250531 (<.01%)
cycles in affected programs: 10986161 -> 10965904 (-0.18%)
helped: 2087
HURT: 254
helped stats (abs) min: 1 max: 2672 x̄: 14.63 x̃: 4
helped stats (rel) min: <.01% max: 39.61% x̄: 0.42% x̃: 0.21%
HURT stats (abs) min: 1 max: 519 x̄: 40.49 x̃: 16
HURT stats (rel) min: 0.01% max: 12.83% x̄: 1.20% x̃: 0.47%
95% mean confidence interval for cycles value: -12.82 -4.49
95% mean confidence interval for cycles %-change: -0.31% -0.18%
Cycles are helped.
LOST: 0
GAINED: 5
Ivy Bridge
total instructions in shared programs: 11686082 -> 11681548 (-0.04%)
instructions in affected programs: 937696 -> 933162 (-0.48%)
helped: 3150
HURT: 0
helped stats (abs) min: 1 max: 33 x̄: 1.44 x̃: 1
helped stats (rel) min: 0.03% max: 7.84% x̄: 0.69% x̃: 0.49%
95% mean confidence interval for instructions value: -1.49 -1.38
95% mean confidence interval for instructions %-change: -0.71% -0.67%
Instructions are helped.
total cycles in shared programs: 257514962 -> 257492471 (<.01%)
cycles in affected programs: 11524149 -> 11501658 (-0.20%)
helped: 1970
HURT: 239
helped stats (abs) min: 1 max: 3525 x̄: 17.48 x̃: 3
helped stats (rel) min: <.01% max: 49.60% x̄: 0.46% x̃: 0.17%
HURT stats (abs) min: 1 max: 1358 x̄: 50.00 x̃: 15
HURT stats (rel) min: 0.02% max: 59.88% x̄: 1.84% x̃: 0.65%
95% mean confidence interval for cycles value: -17.01 -3.35
95% mean confidence interval for cycles %-change: -0.33% -0.08%
Cycles are helped.
LOST: 9
GAINED: 1
Sandy Bridge
total instructions in shared programs: 10432841 -> 10429893 (-0.03%)
instructions in affected programs: 685071 -> 682123 (-0.43%)
helped: 2453
HURT: 0
helped stats (abs) min: 1 max: 9 x̄: 1.20 x̃: 1
helped stats (rel) min: 0.02% max: 7.55% x̄: 0.64% x̃: 0.46%
95% mean confidence interval for instructions value: -1.23 -1.17
95% mean confidence interval for instructions %-change: -0.67% -0.62%
Instructions are helped.
total cycles in shared programs: 146133660 -> 146134195 (<.01%)
cycles in affected programs: 3991634 -> 3992169 (0.01%)
helped: 1237
HURT: 153
helped stats (abs) min: 1 max: 2853 x̄: 6.93 x̃: 2
helped stats (rel) min: <.01% max: 29.00% x̄: 0.24% x̃: 0.14%
HURT stats (abs) min: 1 max: 1740 x̄: 59.56 x̃: 12
HURT stats (rel) min: 0.03% max: 78.98% x̄: 1.96% x̃: 0.42%
95% mean confidence interval for cycles value: -5.13 5.90
95% mean confidence interval for cycles %-change: -0.17% 0.16%
Inconclusive result (value mean confidence interval includes 0).
LOST: 0
GAINED: 1
GM45 and Iron Lake had similar results (GM45 shown):
total instructions in shared programs: 4800332 -> 4798380 (-0.04%)
instructions in affected programs: 565995 -> 564043 (-0.34%)
helped: 1451
HURT: 0
helped stats (abs) min: 1 max: 20 x̄: 1.35 x̃: 1
helped stats (rel) min: 0.05% max: 5.26% x̄: 0.47% x̃: 0.31%
95% mean confidence interval for instructions value: -1.40 -1.29
95% mean confidence interval for instructions %-change: -0.50% -0.45%
Instructions are helped.
total cycles in shared programs: 122032318 -> 122027798 (<.01%)
cycles in affected programs: 8334868 -> 8330348 (-0.05%)
helped: 1029
HURT: 1
helped stats (abs) min: 2 max: 40 x̄: 4.43 x̃: 2
helped stats (rel) min: <.01% max: 1.83% x̄: 0.09% x̃: 0.04%
HURT stats (abs) min: 38 max: 38 x̄: 38.00 x̃: 38
HURT stats (rel) min: 0.25% max: 0.25% x̄: 0.25% x̃: 0.25%
95% mean confidence interval for cycles value: -4.70 -4.08
95% mean confidence interval for cycles %-change: -0.09% -0.08%
Cycles are helped.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
No shader-db changes. This source must have been written by a previous
instruction, so it cannot be a uniform or a shader input. However, this
change allows the next commit to help about 900 more shaders.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|