summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* r600g: fix EXP on CaymanMarek Olšák2012-09-271-4/+2
| | | | NOTE: This is a candidate for the stable branches.
* r600g: fix RSQ of negative value on CaymanMarek Olšák2012-09-271-0/+5
| | | | NOTE: This is a candidate for the stable branches.
* r600g: fix instance divisor on CaymanMarek Olšák2012-09-271-19/+35
| | | | | | Not sure if this is the best way to fix it. NOTE: This is a candidate for the stable branches.
* r600g: flush FMASK and CMASK when changing colorbuffers on EvergreenMarek Olšák2012-09-276-1/+18
| | | | | | This fixes rare graphical corruption. NOTE: This is a candidate for the stable branches.
* r600g: use invalid DB hardware formats to disable depth/stencilMarek Olšák2012-09-273-2/+23
|
* intel: Fix segfault in intel_texsubimage_tiled_memcpyChad Versace2012-09-271-2/+1
| | | | | | | | | | | | | | | The function segfaulted when a game called glTexSubImage2D on a texture with internalformat/format/type = GL_SLUMINANCE8/GL_BGRA/GL_UNSIGNED_BYTE. The function only supports MESA_FORMAT_ARGB8888 and returns early if it detects an unsupported format. Clearly, its detection condition was insufficient. This patch fixes it to explicity check for MESA_FORMAT_ARGB8888. Note: This is a candidate for the 9.0 branch (fixes 413c491). Reviewed-and-tested-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* i965: Do texture swizzling in hardware on Haswell.Kenneth Graunke2012-09-262-5/+52
| | | | | | | | | | | | | | | | | | | Haswell supports EXT_texture_swizzle and legacy DEPTH_TEXTURE_MODE swizzling by setting SURFACE_STATE entries. This means we don't have to bake the swizzle settings into the shader code by emitting MOV instructions, and thus don't have to recompile shaders whenever the swizzles change. Unfortunately, we can't handle GL_ALPHA this way: unlike all the others, which store the comparison result in the .r channel (and possibly others as well), GL_ALPHA puts it in the .a channel. The GLSL 1.30+ style functions which return a float always simply return the .r channel, which would be zero if we handled this as a surface override. In this case, fall back to doing it the old way. DEPTH_TEXTURE_MODE = GL_ALPHA isn't an interesting performance path anyway. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Refactor texture swizzle generation into a helper.Kenneth Graunke2012-09-263-49/+60
| | | | | | | It's going to be reused in a second place soon. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]>
* radeon/llvm: improve select_cc lowering to generate CND* more oftenVincent Lejeune2012-09-274-41/+103
| | | | | | | | v2: - Simplify isZero() - Remove a unused function prototype - Clean whitespace trails Reviewed-by: Tom Stellard <[email protected]>
* intel: Fix size of temporary etc1 bufferChad Versace2012-09-261-3/+4
| | | | | | | | | | | | | | | | Fixes valgrind errors in piglit test oes_compressed_etc1_rgb8_texture-miptree: an invalid write in _mesa_store_compressed_store_texsubimage() at line 4406 and invalid reads in texcompress_etc_tmp.h:etc1_parse_block(). The calculation of the size of the temporary etc1 buffer allocated by intel_miptree_map_etc1() was incorrect. Sometimes the allocated buffer was too small, sometimes too large. This patch corrects the size to that expected by _mesa_store_compressed_store_texsubimage(). Note: This is candidate for the 9.0 branch. Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* radeonsi: fix truncated register define.Alex Deucher2012-09-261-3/+3
| | | | Signed-off-by: Alex Deucher <[email protected]>
* mesa: move _mesa_es_error_check_format_and_type() to glformats.cBrian Paul2012-09-265-69/+73
| | | | | | Where the non-ES _mesa_error_check_format_and_type() function lives. Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: move GL_HALF_FLOAT_OES definition to glheader.hBrian Paul2012-09-263-11/+7
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: minor fix to glTexSubImage error messageBrian Paul2012-09-261-1/+2
|
* mesa: consolidate sub-texture error checking codeBrian Paul2012-09-261-168/+94
| | | | | | Do all error checking of glTexSubImage, glCopyTexSubImage and glCompressedTexSubImage's xoffset, yoffset, zoffset, width, height, and depth params in one place.
* mesa: consolidate glTexSubImage() error checkingBrian Paul2012-09-261-40/+38
|
* mesa: consolidate glCompressedTexSubImage() error checkingBrian Paul2012-09-261-77/+88
| | | | | Do all the checking in one function instead of two and fix up some of the error checking.alignment check
* mesa: consolidate subtexture xoffset/yoffset/width/height error checking codeBrian Paul2012-09-261-88/+73
| | | | | This is the code that checks if a subtexture region is aligned to the compressed format's block size.
* mesa: consolidate glCopyTexSubImage error checkingBrian Paul2012-09-261-79/+60
| | | | Do all the checking in one function instead of two.
* mesa: fix incorrect error for glCompressedSubTexImageBrian Paul2012-09-261-3/+3
| | | | | | | | | If a subtexture region isn't aligned to the compressed block size, return GL_INVALID_OPERATION, not gl_INVALID_VALUE. NOTE: This is a candidate for the stable branches. Reviewed-by: Eric Anholt <[email protected]>
* radeonsi: move draw cmds to si_commands.cChristian Koenig2012-09-263-14/+35
| | | | | Signed-off-by: Christian Koenig <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: start seperating commands into si_commands.cChristian Koenig2012-09-263-4/+10
| | | | | Signed-off-by: Christian Koenig <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: get rid of evergreen_hw_context.cChristian Koenig2012-09-263-50/+3
| | | | | Signed-off-by: Christian Koenig <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: remove unused codeChristian Koenig2012-09-261-19/+0
| | | | | Signed-off-by: Christian Koenig <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: start reworking inferred state handlingChristian König2012-09-264-6/+4
| | | | | | | | | | | Instead of tracking the inferred state changes separately just check if queued and emitted states are the same. This patch just reworks the update of the SPI map between vs and ps, but there are probably more cases like this. Signed-off-by: Christian König <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* gles3: Prohibit set/get of GL_FRAMEBUFFER_SRGB.Paul Berry2012-09-252-2/+9
| | | | | | | | | | | | | | | | GLES 3 supports sRGB functionality, but it does not expose the GL_FRAMEBUFFER_SRGB enable/disable bit. Instead the implementation is expected to behave as though that bit is always enabled. This patch ensures that ctx->Color.sRGBEnabled (the internal variable tracking GL_FRAMEBUFFER_SRGB) is initially true in GLES 2/3 contexts, and that it cannot be modified through the GLES 3 API. This is safe for GLES 2, since ctx->Color.sRGBEnabled has no effect on non-sRGB formats, and GLES 2 doesn't support any sRGB formats. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* meta: Properly save/restore GL_FRAMEBUFFER_SRGB in Meta.Paul Berry2012-09-252-27/+21
| | | | | | | | | | | | | | | | | | | Previously, meta logic was saving and restoring the value of GL_FRAMEBUFFER_SRGB in an ad-hoc fashion. As a result, it was not properly disabled and/or restored for some meta operations. This patch causes GL_FRAMEBUFFER_SRGB to be saved/restored in the conventional way of meta-ops (using _mesa_meta_begin() and _mesa_meta_end()). It is now reliably saved/restored for _mesa_meta_BlitFramebuffer, _mesa_meta_GenerateMipmap, and decompress_texture_image, and preserved for all other meta ops. Fixes piglit tests "ARB_framebuffer_sRGB/blit renderbuffer {linear_to_srgb,srgb} scaled {disabled,enabled}". Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* enable: Create _mesa_set_framebuffer_srgb() function for use by meta ops.Paul Berry2012-09-252-3/+22
| | | | | | | | | | | | | | | GLES3 supports sRGB formats, but it does not support the GL_FRAMEBUFFER_SRGB enable/disable flag (instead it behaves as if this flag is always enabled). Therefore, meta ops that need to disable GL_FRAMEBUFFER_SRGB will need a backdoor mechanism to do so when the API is GLES3. We were already doing a similar thing for GL_MULTISAMPLE, which has the same constraints. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* targets/xorg-i915: Rename driver to i915_drv.so.Matt Turner2012-09-252-2/+2
| | | | | | | modesetting_drv.so is undescriptive and collides with xf86-video-modesetting. Reviewed-by: Jakob Bornecrantz <[email protected]>
* intel: Improve teximage perf for Google Chrome paint rects (v3)Chad Versace2012-09-253-0/+186
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch reduces the time spent in glTexImage and glTexSubImage by over 5x on Sandybridge for the workload described below. It adds a new fast path for glTexImage2D and glTexSubImage2D, intel_texsubimage_tiled_memcpy, which is optimized for Google Chrome's paint rectangles. The fast path is implemented only for 2D GL_BGRA textures for chipsets with a LLC. === Performance Analysis === Workload description: Personalize your google.com page with a wallpaper. Start chromium with flags "--ignore-gpu-blacklist --enable-accelerated-painting --force-compositing-mode". Start recording with chrome://tracing. Visit google.com and wait for page to finish rendering. Measure the time spent by process CrGpuMain in GLES2DecoderImpl::HandleTexImage2D and HandleTexSubImage2D. System config: cpu: Sandybridge Mobile GT2+ (0x0126) kernel 3.4.9 x86_64 chromium 21.0.1180.89 (154005) Statistics: | N Median Avg Stddev --------------|------------------------- before (msec) | 8 472.5 463.75 72.6 after (msec) | 8 78.0 79.6 5.7 Arithmetic difference at 95.0% confidence: -384.1 +/- 55.2 msec -82.8% +/- 11.9% Ratio at 95.0% confidence: 5.81 +/- 0.119 v2: - Replace check for `intel->gen >= 6` with `intel->has_llc`, per danvet. - Fix typo in comment, s/throuh/through/. - Swap 'before' and 'after' rows in stat table. v3: - If the current batch references the bo, then flush batch before mapping the bo. Found by Chris. - Restrict supported texture images to level 0 of target GL_TEXTURE_2D. This avoids an arithmetic bug in calculating image offsets within the miptree, found by Paul. This restriction does not diminish this patch's benefit to Chrome OS performance. - Use less instructions for bit6 swizzling, suggested by Paul. - Remove erroneous comment about Y-tiling, for Paul. - Print perf_debug messages when flushing and stalling. - Update stats in commit message; run workload under a release build rather than a debug build. Note: This is a candidate for the 9.0 branch. Acked-by: Eric Anholt <[email protected]> CC: Stéphane Marchesin <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* clover: Fix build with libclang v3.2Tom Stellard2012-09-251-0/+5
| | | | Reviewed-by: Francisco Jerez <[email protected]>
* clover: Query device for CL_DEVICE_MAX_MEM_ALLOC_SIZE v2Tom Stellard2012-09-253-1/+9
| | | | | | | | v2: - Use driver reported values and don't correct them to the OpenCL required minimum. Reviewed-by: Francisco Jerez <[email protected]>
* gallium: Add PIPE_COMPUTE_CAP_MAX_MEM_ALLOC_SIZE v2Tom Stellard2012-09-253-1/+20
| | | | | | v2: - Add comment in screen.rst - Report OpenCL required minimum for r600g
* r600g: Handle multiple kernels in the same program v2Tom Stellard2012-09-255-21/+84
| | | | | v2: - Use pc parameter of launch_grid
* clover: Handle multiple kernels in the same program v2Blaž Tomažič2012-09-252-33/+37
| | | | | | | | v2: Tom Stellard - Use pc parameter of launch_grid() Reviewed-by: Francisco Jerez <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* mesa: remove 'struct' from texenv_fragment_programBrian Paul2012-09-251-13/+13
| | | | | | texenv_fragment_program is declared as a class. Fixes warnings with MSVC. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Allow fast depth clears if scissoring doesn't do anything.Kenneth Graunke2012-09-251-1/+13
| | | | | | | | | | | | | | | | A game we're working with leaves scissoring enabled, but frequently sets the scissor rectangle to the size of the whole screen. In that case, scissoring has no effect, so it's safe to go ahead with a fast clear. Chad believe this should help with Oliver McFadden's "Dante" as well. v2/Chad: Use the drawbuffer dimensions rather than the miptree slice dimensions. The miptree slice may be slightly larger due to alignment restrictions. Signed-off-by: Kenneth Graunke <[email protected]> Signed-off-by: Chad Versace <[email protected]> Reviewed-and-tested-by: Oliver McFadden <[email protected]>
* i965: Don't spill "smeared" registers.Paul Berry2012-09-251-0/+15
| | | | | | | | | | | | Fixes an assertion failure when compiling certain shaders that need both pull constants and register spilling: brw_eu_emit.c:204: validate_reg: Assertion `execsize >= width' failed. NOTE: This is a candidate for release branches. Signed-off-by: Paul Berry <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nv50/ir/ra: Fix register interference tracking.Jay Cornwall2012-09-251-4/+4
| | | | See fdo bug 55224.
* i965/blorp: Fix sRGB MSAA resolves.Paul Berry2012-09-242-8/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | Commit e2249e8c4d06a85d6389ba1689e15d7e29aa4dff (i965/blorp: Add support for blits between SRGB and linear formats) changed blorp to always configure surface states for in linear format (even if the underlying surface is sRGB). This allowed sRGB-to-linear and linear-to-sRGB blits to occur without causing the image to be inappropriately brightened or darkened. However, it broke sRGB MSAA resolves, since they rely on the destination buffer format being sRGB in order to ensure that samples are averaged together in sRGB-correct fashion. This patch fixes the problem by instead configuring the source buffer to use the *same* format as the destination buffer. This ensures that the image won't be brightened or darkened, but preserves proper sRGB averaging. Fixes piglit tests "EXT_framebuffer_multisample/accuracy srgb". Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55265 NOTE: This is a candidate for stable release branches. Reviewed-by: Eric Anholt <[email protected]> Reviewed-and-tested-by: Kenneth Graunke <[email protected]>
* darwin: do not create double-buffered offscreen pixel formatsJonas Maebe2012-09-241-1/+1
| | | | | | http://xquartz.macosforge.org/trac/ticket/536 Signed-off-by: Jeremy Huddleston Sequoia <[email protected]>
* radeon/llvm: Fix instruction encoding for r600 family GPUsTom Stellard2012-09-243-15/+14
| | | | | | Tested-by: Michel Dänzer <[email protected]> https://bugs.freedesktop.org/show_bug.cgi?id=55217
* build: remove signbit check in configure.acBrian Paul2012-09-241-7/+0
| | | | | | | We now have a fallback macro in imports.h This reverts part of 0f3ba405. Reviewed-by: Matt Turner <[email protected]>
* mesa: add signbit() macroBrian Paul2012-09-241-0/+7
| | | | Reviewed-by: Matt Turner <[email protected]>
* r600g: Set RADEON_FLUSH_KEEP_TILING_FLAGS when emitting compute csTom Stellard2012-09-241-1/+7
|
* build: substitute X11_INCLUDES variableRobert Bragg2012-09-241-0/+1
| | | | | | | | | | There are a few automake files that reference $(X11_INCLUDES) such as src/glx/Makefile.am but configure.ac wasn't declaring the variable for substitution. This would break builds of glx if libxcb, for example, was installed in its own prefix since AM_CFLAGS wouldn't coincidentally list the needed include path in that case. Reviewed-by: Matt Turner <[email protected]>
* Use signbit() in IS_NEGATIVE and DIFFERENT_SIGNSMatt Turner2012-09-242-19/+9
| | | | | | | | | | signbit() appears to be available everywhere (even MSVC according to MSDN), so let's use it instead of open-coding some messy and confusing bit twiddling macros. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=54805 Reviewed-by: Paul Berry <[email protected]> Suggested-by: Ian Romanick <[email protected]>
* clover: Silence narrowing conversion warnings in resource.cpp.Francisco Jerez2012-09-241-3/+3
|
* clover: Handle NULL value for clEnqueueNDRangeKernel local_work_sizeTom Stellard2012-09-241-7/+6
| | | | [ Francisco Jerez: Slight simplification. ]
* i965/blorp: Increase Y alignment for multisampled stencil blits.Paul Berry2012-09-241-2/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is a band-aid fix for a bug in commit 5fd67fa (i965/blorp: Reduce alignment restrictions for stencil blits), which causes multisampled stencil blits to work incorrectly on Sandy Bridge. When blitting to or from a normal stencil buffer, we have to use a coordinate transformation that swizzles coordinates to account for the fact that stencil buffers use W tiling, but the most similar tiling format available for textures and render targets is Y tiling. The differences between W and Y tiling cause pixels to be scrambled within a block of size 8x4 (width x height) as measured relative to a W tile, or 16x2 as measured relative to a Y tile. So in order to make sure that pixels at the edges of the blit aren't lost, we need to align the rendering rectangle (and the buffer sizes) to multiples of the 8x4 block size. This alignment happens in the brw_blorp_blit_params constructor, whereas the determination of how to swizzle the coordinates happens during code generation, in the brw_blorp_blit_program class. When blitting to or from a multisampled stencil buffer, the coordinate swizzling is more complex, because it has to account for the interleaving pattern of samples, which uses 4x4 blocks for 4x MSAA and 8x4 blocks for 8x MSAA. The end result is that if multisampling is in use, the 16x2 block size (relative so a Y tile) needs to be expanded to 16x4, and the corresponding size relative to a W tile expands to 8x8. The problem doesn't affect Ivy Bridge severely enough to crop up in Piglit tests because on Ivy Bridge we have to disable multisampling when blitting *to* a multisampled stencil buffer (the blorp compiler generates code to compensate for the fact that multisampling is disabled). However I suspect a bug is still present because we don't disable multisampling when blitting *from* a multisampled stencil buffer. This patch fixes the problem by doubling the vertical alignment requirement when blitting to or from a multisampled stencil buffer, and multisampling has not been disabled. In the long run I would like to rework the brw_blorp_blit_params constructor--it's difficult to follow and has had several subtle bugs like this one. However this band-aid fix should be suitable for cherry-picking to release branches. Fixes Piglit tests "unaligned-blit {2,4} stencil {msaa,upsample}" on Sandy Bridge. NOTE: This is a candidate for stable release branches. Reviewed-by: Kenneth Graunke <[email protected]>