diff options
author | Chad Versace <[email protected]> | 2012-09-04 12:15:29 -0700 |
---|---|---|
committer | Chad Versace <[email protected]> | 2012-09-25 10:58:45 -0700 |
commit | 413c4914129cd26ca87960852d8c0264c0fb29e7 (patch) | |
tree | 722c87715533b2ac9fe9d57553d766436ff5be5a /src/mesa/swrast/s_points.h | |
parent | 581619f5a70c0e2ef6ac6d1e810b4f5a6e6416d4 (diff) |
intel: Improve teximage perf for Google Chrome paint rects (v3)
This patch reduces the time spent in glTexImage and glTexSubImage by
over 5x on Sandybridge for the workload described below.
It adds a new fast path for glTexImage2D and glTexSubImage2D,
intel_texsubimage_tiled_memcpy, which is optimized for Google Chrome's
paint rectangles. The fast path is implemented only for 2D GL_BGRA
textures for chipsets with a LLC.
=== Performance Analysis ===
Workload description:
Personalize your google.com page with a wallpaper. Start chromium
with flags "--ignore-gpu-blacklist --enable-accelerated-painting
--force-compositing-mode". Start recording with chrome://tracing. Visit
google.com and wait for page to finish rendering. Measure the time spent
by process CrGpuMain in GLES2DecoderImpl::HandleTexImage2D and
HandleTexSubImage2D.
System config:
cpu: Sandybridge Mobile GT2+ (0x0126)
kernel 3.4.9 x86_64
chromium 21.0.1180.89 (154005)
Statistics:
| N Median Avg Stddev
--------------|-------------------------
before (msec) | 8 472.5 463.75 72.6
after (msec) | 8 78.0 79.6 5.7
Arithmetic difference at 95.0% confidence:
-384.1 +/- 55.2 msec
-82.8% +/- 11.9%
Ratio at 95.0% confidence:
5.81 +/- 0.119
v2:
- Replace check for `intel->gen >= 6` with `intel->has_llc`, per
danvet.
- Fix typo in comment, s/throuh/through/.
- Swap 'before' and 'after' rows in stat table.
v3:
- If the current batch references the bo, then flush batch before mapping
the bo. Found by Chris.
- Restrict supported texture images to level 0 of target
GL_TEXTURE_2D. This avoids an arithmetic bug in calculating image
offsets within the miptree, found by Paul. This restriction does not
diminish this patch's benefit to Chrome OS performance.
- Use less instructions for bit6 swizzling, suggested by Paul.
- Remove erroneous comment about Y-tiling, for Paul.
- Print perf_debug messages when flushing and stalling.
- Update stats in commit message; run workload under a release build
rather than a debug build.
Note: This is a candidate for the 9.0 branch.
Acked-by: Eric Anholt <[email protected]>
CC: Stéphane Marchesin <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
Diffstat (limited to 'src/mesa/swrast/s_points.h')
0 files changed, 0 insertions, 0 deletions