path: root/src/mesa/drivers
Commit message | Author | Age | Files | Lines
* i965/vs: Fix up swizzle for dereference_array of matrices. | Eric Anholt | 2012-05-17 | 1 | -2/+2
  Fixes assertion failure in piglit:
    vs-mat2-struct-assignment.shader_test
    vs-mat2-array-assignment.shader_test

  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: Do more register coalescing by using the interference graph. | Eric Anholt | 2012-05-17 | 2 | -0/+62
  By using the live variables code for determining interference, we can handle coalescing in the presence of control flow, which the other register coalescing path couldn't.

  Total instructions: 207184 -> 206990
  74/1246 programs affected (5.9%)
  33993 -> 33799 instructions in affected programs (0.6% reduction)

  There is a newerth shader that loses out, because of some extra MOVs that now get their dead-code nature obscured by coalescing. This should be fixed by doing better at dead code elimination.
* i965/blorp: Move exec() out of brw_blorp_params. | Paul Berry | 2012-05-15 | 3 | -6/+9
  No functional change. This patch replaces the brw_blorp_params::exec() method with a global function brw_blorp_exec() that performs the operation described by the params data structure.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen6: Initial implementation of MSAA. | Paul Berry | 2012-05-15 | 23 | -121/+662
  This patch enables MSAA for Gen6, by modifying intel_mipmap_tree to understand multisampled buffers, adapting the rendering pipeline setup to enable multisampled rendering, and adding multisample resolve operations to brw_blorp_blit.cpp. Some preparation work is also included for Gen7, but it is not yet enabled.

  MSAA support is still fairly preliminary. In particular, the following are not yet supported:
  - Fully general blits between MSAA and non-MSAA buffers.
  - Formats other than RGBA8, DEPTH24, and STENCIL8.
  - Centroid interpolation.
  - Coverage parameters (glSampleCoverage, GL_SAMPLE_ALPHA_TO_COVERAGE, GL_SAMPLE_ALPHA_TO_ONE, GL_SAMPLE_COVERAGE, GL_SAMPLE_COVERAGE_VALUE, GL_SAMPLE_COVERAGE_INVERT).

  Fixes piglit tests "EXT_framebuffer_multisample/accuracy" on i965/Gen6.

  v2:
  - In intel_alloc_renderbuffer_storage(), quantize the requested number of samples to the next higher sample count supported by the hardware. This ensures that a query of GL_SAMPLES will return the correct value. It also ensures that MSAA is fully disabled on Gen7 for now (since Gen7 MSAA support doesn't work yet).
  - When reading from a non-MSAA surface, ensure that s_is_zero is true so that we won't try to read from a nonexistent sample.
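The v2 note above describes rounding a requested sample count up to the next count the hardware supports. A minimal sketch of that idea follows; the function name, the list-based interface, and the clamping of oversized requests are assumptions made for illustration and are not taken from intel_alloc_renderbuffer_storage().

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: round `requested` up to the next entry of `supported`,
// which must be sorted in ascending order. Returns 0 (single-sampled) when the
// request is 0 or when the hardware list is empty, i.e. MSAA is disabled.
unsigned quantize_num_samples(unsigned requested,
                              const std::vector<unsigned> &supported)
{
    assert(std::is_sorted(supported.begin(), supported.end()));

    if (requested == 0 || supported.empty())
        return 0;

    for (unsigned count : supported) {
        if (count >= requested)
            return count;
    }

    // Assumption: requests above the hardware maximum are clamped. A real
    // driver may instead have rejected such a request earlier.
    return supported.back();
}
```

With a supported list of just 4, a request for 2 samples is stored as 4, so a later query reports the count actually in use; with an empty list every request collapses to 0, which mirrors how the message describes keeping MSAA off on Gen7.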
* i965/gen6+: Add code to perform blits on the render path ("blorp"). | Paul Berry | 2012-05-15 | 8 | -27/+1730
  This patch expands the "blorp" component to be able to perform blits as well as HiZ resolves. The new blitting code is located in brw_blorp_blit.cpp. This includes the necessary fragment shader code to look up pixels in the source buffer (which is configured as a texture) and output them to the destination buffer (which is configured as the render target).

  Most of the time the fragment shader code is simple and straightforward, since it merely has to apply a coordinate offset, read from the texture, and write to the render target. However, in the case of blitting stencil buffers, things are more complicated, since the GPU stores stencil data using W tiling, and W tiling is not supported for textures or render targets. So, we set up the stencil buffers as Y tiled, and emit fragment shader code that adjusts the coordinates to account for the difference between W and Y tiling. Furthermore, since a rectangular region in W tiling does not necessarily correspond to a rectangular region in Y tiling, we widen the rectangle primitive to the nearest tile boundary and have the fragment shader "kill" any pixels that don't fall inside the actual desired destination rectangle.

  All of this is a necessary prerequisite for implementing MSAA, since we'll need to be able to blit between multisample color, depth, and stencil buffers and their non-multisampled counterparts, and none of the existing blitting mechanisms support multisampling.

  In addition, the new blitting code should speed up operations where we previously fell back to software rasterization, such as blitting of stencil buffers. The current fallback sequence is: first we try to do a blit using the hardware blitting engine. If that fails we try to do a blit using the render path. If that also fails then we do the blit using a meta-op (which may or may not fall back to software rasterization).

  Note that blitting using the render path has some limitations at the moment: it only supports a few formats, and it doesn't support clipping or scissoring. These limitations will be addressed in future patch series.

  v2:
  - Add the code that configures the WM program to gen{6,7}_emit_wm_config() and gen7_emit_ps_config() rather than creating separate ...enable() functions.
  - Call intel_prepare_render before determining which miptrees we are blitting from/to, because it may cause miptrees to be reallocated.
  - Allow the blit to mirror X and/or Y coordinates.
  - Disable blorp blits on Gen7 for now, since they aren't working yet.
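The "widen to the nearest tile boundary, then kill pixels outside the desired rectangle" step described above can be sketched on the CPU as follows. This is only an illustration of the arithmetic: the Rect type, the function names, and the power-of-two tile dimensions passed in are assumptions, and the W-to-Y coordinate translation done by the real fragment shader is left out entirely.

```cpp
#include <cassert>
#include <cstdint>

// Half-open rectangle: covers x in [x0, x1) and y in [y0, y1).
struct Rect { uint32_t x0, y0, x1, y1; };

// Round down / up to a multiple of `align`, which must be a power of two.
constexpr uint32_t align_down(uint32_t v, uint32_t align) { return v & ~(align - 1); }
constexpr uint32_t align_up(uint32_t v, uint32_t align) { return (v + align - 1) & ~(align - 1); }

// Grow the rectangle outward so that every edge lies on a tile boundary.
Rect widen_to_tiles(const Rect &r, uint32_t tile_w, uint32_t tile_h)
{
    assert(tile_w != 0 && (tile_w & (tile_w - 1)) == 0);
    assert(tile_h != 0 && (tile_h & (tile_h - 1)) == 0);
    return { align_down(r.x0, tile_w), align_down(r.y0, tile_h),
             align_up(r.x1, tile_w),   align_up(r.y1, tile_h) };
}

// Pixels rasterized inside the widened rectangle but outside the rectangle
// that was actually requested must be discarded.
bool should_kill(const Rect &wanted, uint32_t x, uint32_t y)
{
    return x < wanted.x0 || x >= wanted.x1 || y < wanted.y0 || y >= wanted.y1;
}
```

For example, with a hypothetical 64x64 tile, a requested rectangle of (10, 20)-(70, 30) widens to (0, 0)-(128, 64), and should_kill() rejects everything in the widened area that lies outside the original rectangle.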
* i965: Expose surface setup internals for use by blits. | Paul Berry | 2012-05-15 | 3 | -2/+4
  This patch exposes the functions brw_get_surface_tiling_bits and gen7_set_surface_tiling, so that they can be re-used when setting up surface states in gen6_blorp.cpp and gen7_blorp.cpp.

  Reviewed-by: Chad Versace <[email protected]>
* i965: split gen{6,7}_blorp_exec functions into manageable chunks. | Paul Berry | 2012-05-15 | 3 | -522/+647
  This patch splits up the gen6_blorp_exec and gen7_blorp_exec functions, which were very long, into simple component functions. With a few exceptions, there is one function per state packet. This will allow blit functionality to be added without significantly complicating the code.

  Reviewed-by: Chad Versace <[email protected]>

  v2: Rename the functions gen{6,7}_emit_wm_disable() to gen{6,7}_emit_wm_config() (since the WM is not actually disabled during HiZ ops; it simply doesn't have a program). Also, on gen7, split out the configuration of 3DSTATE_PS to a separate function gen7_emit_ps_config().
* i965: Parameterize HiZ code to prepare for adding blitting. | Paul Berry | 2012-05-15 | 7 | -177/+335
  This patch groups together the parameters used by the HiZ functions into a new data structure, brw_hiz_resolve_params, rather than passing each parameter individually between the HiZ functions. This data structure is a subclass of brw_blorp_params, which represents the parameters of a general-purpose blit or resolve operation. A future patch will add another subclass for blits.

  In addition, this patch generalizes the (width, height) parameters to a full rect (x0, y0, x1, y1), since blitting operations will need to be able to operate on arbitrary rectangles. Also, it renames several of the HiZ functions to reflect the expanded role they will serve.

  v2: Rename brw_hiz_resolve_params to brw_hiz_op_params. Move gen{6,7}_blorp_exec() functions back into gen{6,7}_blorp.h.

  Reviewed-by: Chad Versace <[email protected]>
* i965: Implement guardband clipping on Ivybridge. | Kenneth Graunke | 2012-05-15 | 2 | -5/+15
  Improves performance in Citybench:
  - 320x240: 9.19589% +/- 0.557621%
  - 1280x480: 3.90797% +/- 0.774429%

  No apparent difference in OpenArena.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Implement guardband clipping on Sandybridge. | Kenneth Graunke | 2012-05-15 | 2 | -10/+15
  Improves performance in Citybench:
  - 320x240: 19.8008% +/- 0.937818%
  - 1280x480: 6.53856% +/- 0.859083%

  No apparent difference in OpenArena nor Xonotic.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* Revert "i965/fs: Jump from discard statements to the end of the program when done." | Eric Anholt | 2012-05-14 | 4 | -126/+5
  This reverts commit 31866308fcf989df992ace28b5b986c3d3770e90.

  Fixes piglit glsl-fs-discard-exit-3 and unigine tropics rendering.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Remove the requirement of no dead code for interference checks. | Eric Anholt | 2012-05-14 | 1 | -12/+12
  This will be convenient when I want to comment out optimization code to see the raw program being optimized, but more importantly will let the interference check be used during optimization.

  Acked-by: Kenneth Graunke <[email protected]>
* i965/fs: Add support for copy propagation. | Eric Anholt | 2012-05-14 | 5 | -0/+143
  We could do more by handling abs/negate and non-GRF sources, but this is a good start. Improves tropics performance 0.30% +/- .17% (n=43).

  shader-db results:
  Total instructions: 208032 -> 207184
  60/1246 programs affected (4.8%)
  23286 -> 22438 instructions in affected programs (3.6% reduction)

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: When doing no work for live interval calculation, do no allocation. | Eric Anholt | 2012-05-14 | 1 | -7/+7
  When I had a bug causing the backend to never finish optimizing, it also sent me deep into swap. This avoids extra memory allocation per trip through optimization, and thus may reduce the peak memory allocation of the driver even in the success case.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen7: Set tile_x/y to 0 in the no-stencil case. | Eric Anholt | 2012-05-14 | 1 | -1/+1
  Fixes compiler warnings.
* intel: Fix signed/unsigned comparison warnings. | Eric Anholt | 2012-05-14 | 2 | -5/+6
* intel: Fix compile warning from 7b6424143d8bf572cadd46adcbaa91d2a5598635 | Eric Anholt | 2012-05-14 | 1 | -2/+2
* intel: Fix compiler warning from 3cd7bee48f7caf7850ea64d40f43875d4c975507 | Eric Anholt | 2012-05-14 | 1 | -2/+0
* i965/fs: Add a local common subexpression elimination pass. | Kenneth Graunke | 2012-05-14 | 4 | -0/+201
  Total instructions: 18210 -> 17836
  49/163 programs affected (30.1%)
  12888 -> 12514 instructions in affected programs (2.9% reduction)

  This reduces Lightsmark's "Scale down filter" shader from 395 instructions to 283, a whopping 28%. It also reduces register pressure significantly: the SIMD8 program now uses 29 registers instead of 101, giving us more than enough room for a SIMD16 program.

  v2: Add && !inst->conditional_mod to the "skip some instructions" check.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Use a const reference in fs_reg::equals instead of a pointer. | Kenneth Graunke | 2012-05-14 | 3 | -16/+16
  This lets you omit some ampersands and is more idiomatic C++. Using const also marks the function as not altering either register (which was obvious, but nice to enforce).

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
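The difference between the pointer and const-reference forms can be illustrated with a small stand-in type. SimpleReg and its two fields are hypothetical and far simpler than the real fs_reg; only the calling convention is the point here.

```cpp
#include <cassert>

// Hypothetical, heavily simplified stand-in for a register description.
struct SimpleReg {
    int file;
    int nr;

    // Pointer form: every caller has to write a.equals_ptr(&b).
    bool equals_ptr(const SimpleReg *other) const
    {
        assert(other != nullptr);
        return file == other->file && nr == other->nr;
    }

    // Const-reference form: callers write a.equals(b), there is no null case
    // to handle, and the trailing const plus the const parameter guarantee
    // that neither register is modified.
    bool equals(const SimpleReg &other) const
    {
        return file == other.file && nr == other.nr;
    }
};
```

Both members compare the same fields; the reference version just moves the "cannot be null, cannot be modified" contract into the signature.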
* nouveau/vieux: finish != flush, how about we do that.. | Ben Skeggs | 2012-05-12 | 3 | -0/+23
  Signed-off-by: Ben Skeggs <[email protected]>
* i965/hiz: Convert gen{6,7}_hiz.h to gen{6,7}_blorp.h | Paul Berry | 2012-05-10 | 5 | -5/+5
  This patch renames the gen6_hiz.h and gen7_hiz.h files to correspond to the renames of the corresponding .cpp files (see previous commit).

  Reviewed-by: Chad Versace <[email protected]>
* i965/hiz: Convert gen{6,7}_hiz.c to C++ | Paul Berry | 2012-05-10 | 3 | -3/+3
  This patch converts the files gen6_hiz.c and gen7_hiz.c to C++, in preparation for expanding the HiZ code to support arbitrary blits. The new files are called gen6_blorp.cpp and gen7_blorp.cpp to reflect the expanded role that this code will serve--"blorp" stands for "BLit Or Resolve Pass".

  Reviewed-by: Chad Versace <[email protected]>
* i965/hiz: Make void pointer type casts explicit | Paul Berry | 2012-05-10 | 1 | -5/+7
  Previous to this patch, gen6_hiz.c contained two implicit type casts from void * to a non-void pointer type. This is allowed in C but not in C++. This patch makes the type casts explicit, so that gen6_hiz.c can be converted into a C++ file.

  Reviewed-by: Chad Versace <[email protected]>
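A short sketch of the language rule involved: C accepts an assignment from void * to another object pointer type without a cast, whereas C++ requires the conversion to be written out. The helper names below are hypothetical, and the sketch uses static_cast because it is written as C++; in a file that still has to compile as C, the equivalent would be a C-style cast such as (float *) data.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: `data` points at `count` floats handed over as void *.
float sum_floats(void *data, std::size_t count)
{
    assert(data != nullptr || count == 0);

    // C would accept `float *values = data;`. C++ rejects that implicit
    // conversion, so the cast has to be explicit.
    float *values = static_cast<float *>(data);

    float total = 0.0f;
    for (std::size_t i = 0; i < count; ++i)
        total += values[i];
    return total;
}

// Small usage example: pass an array through the untyped interface.
float demo_sum()
{
    float vertices[3] = { 1.0f, 2.0f, 3.0f };
    return sum_floats(vertices, 3);
}
```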
* intel: Work around differences between C and C++ scoping rules. | Paul Berry | 2012-05-10 | 2 | -25/+29
  In C++, if a struct is defined inside another struct, or its name is first seen inside a struct or function, the struct is nested inside the namespace of the struct or function it appears in. In C, all structs are visible from toplevel.

  This patch explicitly moves the declarations of intel_batchbuffer to toplevel, so that it does not get nested inside a namespace when header files are included from C++.

  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Chad Versace <[email protected]>
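The scoping difference is easiest to see with a struct defined inside another struct. The type names below are hypothetical and do not reflect the real intel_batchbuffer layout; the sketch only shows why moving the definition to toplevel gives C and C++ the same type name.

```cpp
#include <cassert>
#include <type_traits>

// Defined inside another struct: C sees a toplevel `struct batch_nested`,
// but C++ names this type context_nested::batch_nested.
struct context_nested {
    struct batch_nested { int used; } batch;
};

// Defined at toplevel first: both languages agree on the name.
struct batch_toplevel { int used; };

struct context_fixed {
    struct batch_toplevel batch;   // refers to ::batch_toplevel
};

// The nested type and the toplevel type are unrelated as far as C++ is concerned.
static_assert(!std::is_same<context_nested::batch_nested, batch_toplevel>::value,
              "nested and toplevel structs are distinct types in C++");

// A function written against the toplevel name works with context_fixed.
int batch_used(const batch_toplevel &b) { return b.used; }
```

A C++ file that refers to plain `batch_nested` would fail to compile, while `batch_toplevel` is usable from either language.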
* intel: Add extern "C" declarations to headers | Paul Berry | 2012-05-10 | 10 | -1/+75
  These declarations are necessary to allow C++ code to call C code without causing unresolved symbols (which would make the driver fail to load).

  Reviewed-by: Chad Versace <[email protected]>
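The usual header pattern for this looks roughly as follows. The function name is hypothetical, and for the sake of a self-contained example the definition sits in the same translation unit; in a driver it would live in a separate C source file.

```cpp
#include <cassert>

/* --- what a shared header would contain (hypothetical function name) --- */
#ifdef __cplusplus
extern "C" {
#endif

int example_batch_space(int used, int size);

#ifdef __cplusplus
}
#endif

/* --- definition; in a real project this would be compiled as C --- */
extern "C" int example_batch_space(int used, int size)
{
    assert(used <= size);
    return size - used;
}
```

Without the extern "C" wrapper, a C++ caller would look for a name-mangled symbol, while the C object file exports the plain name, and the link (or the driver load) would fail with an unresolved symbol.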
* i965: fix wrong cube/3D texture layout | Yuanhan Liu | 2012-05-09 | 1 | -2/+3
  Fix wrong cube/3D texture layout for the trailing levels whose width or height is smaller than the align unit.

  From 965 B-spec http://intellinuxgraphics.org/VOL_1_graphics_core.pdf at page 135:

    All of the LOD=0 q-planes are stacked vertically, then below that, the LOD=1 qplanes are stacked two-wide, then the LOD=2 qplanes are stacked four-wide below that, and so on.

  Thus we should always increase pack_x_nr, which may make the pitch of LODn greater than the pitch of LOD0. So we should update mt->total_width when needed.

  This would fix the following webgl test case on all gen4 platforms:
    conformance/textures/texture-size-cube-maps.html

  NOTE: This is a candidate for stable release branches.

  Signed-off-by: Yuanhan Liu <[email protected]>
* mesa: move gl_client_array*[] from vbo_draw_func into gl_context | Marek Olšák | 2012-05-08 | 3 | -7/+5
  In the future we'd like to treat vertex arrays as a state and not as a parameter to the draw function. This is the first step towards that goal. Part of the goal is to avoid array re-validation for every draw call.

  This commit adds:
    const struct gl_client_array **gl_context::Array::_DrawArrays.

  The pointer is changed in:
  * vbo_draw_method
  * vbo_rebase_prims - unused by gallium
  * vbo_split_prims - unused by gallium
  * st_RasterPos

  Reviewed-by: Brian Paul <[email protected]>
* i965/Gen7: Work around GPU hangs due to misaligned depth coordinate offsets. | Paul Berry | 2012-05-07 | 2 | -0/+54
  In i965 Gen7, Mesa has for a long time used the "depth coordinate offset X/Y" settings (in 3DSTATE_DEPTH_BUFFER) to cause the GPU to render to miplevels other than 0. Unfortunately, this doesn't work, because these offsets must be aligned to multiples of 8, and miplevels in the depth buffer are only guaranteed to be aligned to multiples of 4. When the offsets aren't aligned to a multiple of 8, the GPU sometimes hangs.

  As a temporary measure, to avoid GPU hangs, this patch smashes the 3 LSB's of "depth coordinate offset X/Y" to 0. This results in incorrect rendering to mipmapped depth textures, but that seems like a reasonable stopgap while we figure out a better solution.

  Avoids GPU hangs in piglit test "depthstencil-render-miplevels" at texture sizes that are not powers of 2.

  Reviewed-by: Chad Versace <[email protected]>
* i965/Gen6: Work around GPU hangs due to misaligned depth coordinate offsets. | Paul Berry | 2012-05-07 | 2 | -0/+54
  In i965 Gen6, Mesa has for a long time used the "depth coordinate offset X/Y" settings (in 3DSTATE_DEPTH_BUFFER) to cause the GPU to render to miplevels other than 0. Unfortunately, this doesn't work, because these offsets must be aligned to multiples of 8, and miplevels in the depth buffer are only guaranteed to be aligned to multiples of 4. When the offsets aren't aligned to a multiple of 8, the GPU sometimes hangs.

  As a temporary measure, to avoid GPU hangs, this patch smashes the 3 LSB's of "depth coordinate offset X/Y" to 0. This results in incorrect rendering to mipmapped depth textures, but that seems like a reasonable stopgap while we figure out a better solution. (Note that we have only ever observed this GPU hang on Gen6 when HiZ is enabled, so another possible stopgap would be to disable HiZ).

  Avoids GPU hangs in piglit test "depthstencil-render-miplevels" at texture sizes that are not powers of 2.

  Reviewed-by: Chad Versace <[email protected]>
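"Smashing the 3 LSB's to 0" in the two workaround commits above amounts to rounding an offset down to a multiple of 8. A minimal sketch of the bit operation, using a hypothetical function name rather than anything from the driver:

```cpp
#include <cassert>
#include <cstdint>

// Clear the three least significant bits, i.e. round down to a multiple of 8.
// An offset that is only 4-aligned (such as 12) therefore moves to the
// previous 8-aligned position (8), which is why rendering can land in the
// wrong place while the hang is avoided.
inline uint32_t smash_low_3_bits(uint32_t offset)
{
    return offset & ~UINT32_C(7);
}
```

Offsets that are already multiples of 8 pass through unchanged, so only miplevels whose position is 4-aligned but not 8-aligned are affected.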
* i965: Fix mipmap offsets for HiZ and separate stencil buffers. | Paul Berry | 2012-05-07 | 8 | -60/+319
  When rendering to a miplevel other than 0 within a color, depth, stencil, or HiZ buffer, we need to tell the GPU to render to an offset within the buffer, so that the data is written into the correct miplevel. We do this using a coarse offset (in pages), and a fine adjustment (the so-called "tile_x" and "tile_y" values, which are measured in pixels).

  We have always computed the coarse offset and fine adjustment using intel_renderbuffer_tile_offsets() function. This worked fine for color and combined depth/stencil buffers, but failed to work properly when HiZ and separate stencil were in use. It failed to work because there is only one set of fine adjustment controls shared by the HiZ, depth, and stencil buffers, so we need to choose tile_x and tile_y values that are compatible with the tiling of all three buffers, and then compute separate coarse offsets for each buffer.

  This patch fixes the HiZ and separate stencil case by replacing the call to intel_renderbuffer_tile_offsets() with calls to two functions: intel_region_get_tile_masks(), which determines how much of the adjustment can be performed using offsets and how much can be performed using tile_x and tile_y, and intel_region_get_aligned_offset(), which computes the coarse offset.

  intel_region_get_tile_offsets() is still used for color renderbuffers, so to avoid code duplication, I've re-worked it to use intel_region_get_tile_masks() and intel_region_get_aligned_offset().

  On i965 Gen6, fixes piglit tests "texturing/depthstencil-render-miplevels 1024 X" where X is one of (depth, depth_and_stencil, depth_stencil_single_binding, depth_x, depth_x_and_stencil, stencil, stencil_and_depth, stencil_and_depth_x).

  On i965 Gen7, the variants of "texturing/depthstencil-render-miplevels" that contain a stencil buffer still fail, due to another problem: Gen7 seems to ignore the 3 LSB's of the tile_y adjustment (and possibly also tile_x).

  v2: Removed spurious comments. Added assertions to check preconditions of intel_region_get_aligned_offset().

  Reviewed-by: Chad Versace <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
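The split between a coarse offset and a fine tile_x/tile_y adjustment can be sketched with bit masks. Everything below is an illustrative assumption: the types and function names are hypothetical (they are not the signatures of intel_region_get_tile_masks() or intel_region_get_aligned_offset()), the masks are taken to be of the form 2^n - 1 in pixels, combining the per-buffer masks with a bitwise OR is one plausible way to pick values compatible with several tilings, and the conversion of the aligned part into a page offset is omitted.

```cpp
#include <cassert>
#include <cstdint>

// Intra-tile masks in pixels; each must be of the form 2^n - 1.
struct TileMasks { uint32_t x, y; };

// Result of splitting a pixel position: the tile-aligned part feeds the
// coarse (per-buffer) offset, the remainder becomes tile_x / tile_y.
struct SplitOffset { uint32_t aligned_x, aligned_y, tile_x, tile_y; };

// Assumption: a shared fine adjustment must cover the largest tile among the
// buffers involved, which for 2^n - 1 masks is their bitwise OR.
TileMasks combine_masks(TileMasks a, TileMasks b)
{
    return { a.x | b.x, a.y | b.y };
}

SplitOffset split_offset(uint32_t x, uint32_t y, TileMasks masks)
{
    assert(((masks.x + 1) & masks.x) == 0);
    assert(((masks.y + 1) & masks.y) == 0);
    return { x & ~masks.x, y & ~masks.y, x & masks.x, y & masks.y };
}
```

For a position of (130, 70) with masks (127, 31), the aligned part is (128, 64) and the fine adjustment is (2, 6). Because the aligned part is a multiple of every buffer's tile size once the masks are combined, each buffer can turn it into its own coarse offset independently.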
* intel: Disable ARB_framebuffer_object in ES contexts | Chad Versace | 2012-05-07 | 1 | -2/+0
  This patch removes ARB_framebuffer_object from the GLES1 and GLES2 extension lists in intel_extensions_es.c.

  Fixes a crash in the Android browser on Ice Cream Sandwich.

  The Android browser crashed because it did the following, which is legal in GLES2 but not in ARB_framebuffer_object.

      glGenFramebuffers(1, &fb);
      glBindFramebuffer(GL_FRAMEBUFFER, fb);
      // render render render...
      glDeleteFramebuffers(1, &fb);
      // go do other stuff...
      glBindFramebuffer(GL_FRAMEBUFFER, fb);
      // This bind unexpectedly failed, and the app panics.

  The semantics of glBindFramebuffer specified by ARB_framebuffer_object (a desktop GL extension) and GLES2 specs are incompatible. The ideal solution to fix this is to create separate API entry points for glBindFramebuffer, one for GL and the other for GLES2. But, until that work is complete, disabling ARB_framebuffer_object in GLES2 contexts safely fixes the problem.

  Likewise, the semantics of glBindFramebuffer in ARB_framebuffer_object and of glBindFramebufferOES in OES_framebuffer_object (a GLES1 extension) are incompatible. Even though the functions have different names, the semantic difference still results in a bug because both API calls are implemented by a single function, _mesa_BindFramebufferEXT, which handles the semantic difference incorrectly. Again, disabling ARB_framebuffer_object in GLES1 contexts safely fixes this problem.

  According to the ARB_framebuffer_object spec, the extension is an amalgamation of
      EXT_framebuffer_object
      EXT_framebuffer_blit
      EXT_packed_depth_stencil
      EXT_framebuffer_multisample
  By disabling this extension, however, no functionality is removed from GLES1 and GLES2 contexts because 1) the first three extensions are explicitly enabled in Intel's ES extension lists and 2) no functionality of the last extension is exposed in an ES context.

  Note: This is a candidate for the 8.0 branch.

  See-also: http://www.mail-archive.com/[email protected]/msg21006.html
  CC: Charles Johnson <[email protected]>
  CC: Sean Kelley <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  Signed-off-by: Chad Versace <[email protected]>
* intel: Remove pointless software fallback for glBitmap on Gen6. | Kenneth Graunke | 2012-05-04 | 1 | -4/+0
  We already have a meta path below that works just fine; no apparent regressions in oglconform.

  NOTE: This is a candidate for the 8.0 branch.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=46834
  Reviewed-by: Chad Versace <[email protected]>
* i965/fs: Fix regression in comparison handling from ANDs change. | Eric Anholt | 2012-05-04 | 2 | -0/+18
  I had fixed up the logic ops for delayed ANDing, but not equality comparisons on bools. Fixes new piglit fs-bool-less-compare-true.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48629
* i965: Add a comment about the state flag for sRGBEnabled. | Eric Anholt | 2012-05-04 | 2 | -4/+10
  I thought this might be _NEW_COLOR, but it isn't.
* intel: Return success when asked to allocate a 0-width/height renderbuffer. | Eric Anholt | 2012-05-04 | 1 | -0/+3
  It seems silly that GL lets you allocate these given that they're framebuffer attachment incomplete, but the webgl conformance tests actually go looking to see if the getters on 0-width/height depth/stencil renderbuffers return good values. By failing out here, they all got smashed to 0, which turned out to be correct for all the getters they tested except for GL_RENDERBUFFER_INTERNAL_FORMAT. Now, by succeeding but not making a miptree, that one also returns the expected value.

  Acked-by: Kenneth Graunke <[email protected]>
* i965: Add support for GL_ARB_draw_buffers_blend. | Eric Anholt | 2012-05-04 | 2 | -6/+10
  Tested with piglit fbo-draw-buffers-blend and intel oglconform.

  Reviewed-by: Kenneth Graunke <[email protected]>
* gbm: Add gbm_bo_write entry point | Kristian Høgsberg | 2012-05-03 | 2 | -2/+31
  This new gbm entry point allows writing data into a gbm bo. The bo has to be created with the GBM_BO_USE_WRITE flag, and it's only required to work for GBM_BO_USE_CURSOR_64X64 bos.

  The gbm API is designed to be the glue layer between EGL and KMS, but there was never a mechanism to initialize a buffer suitable for use with KMS hw cursors. The hw cursor bo is typically not compatible with anything EGL can render to, and thus there's no way to get data into such a bo.

  gbm_bo_write() fills that gap while staying out of the efficient cpu->gpu pixel transfer business.

  Reviewed-by: Ander Conselvan de Oliveira <[email protected]>
* dri/nv10-nv20: Add support for S3TC | Viktor Novotný | 2012-05-02 | 4 | -0/+28
  Signed-off-by: Viktor Novotný <[email protected]>
  Signed-off-by: Francisco Jerez <[email protected]>
* dri/nouveau: Add general support for compressed formats. | Viktor Novotný | 2012-05-02 | 4 | -33/+138
  Signed-off-by: Viktor Novotný <[email protected]>
  Signed-off-by: Francisco Jerez <[email protected]>
* xlib: use _mesa_is_winsys/user_fbo() helpers | Brian Paul | 2012-05-01 | 1 | -5/+6
  Reviewed-by: Eric Anholt <[email protected]>
* intel: use _mesa_is_winsys/user_fbo() helpers | Brian Paul | 2012-05-01 | 16 | -32/+48
  Reviewed-by: Eric Anholt <[email protected]>
* nouveau: use _mesa_is_winsys/user_fbo() helpers | Brian Paul | 2012-05-01 | 3 | -3/+7
  Reviewed-by: Eric Anholt <[email protected]>
* radeon: use _mesa_is_winsys/user_fbo() helpers | Brian Paul | 2012-05-01 | 4 | -10/+14
  Reviewed-by: Alex Deucher <[email protected]>
* i965: Support Android RGBX8888 format for EGL generated images | Sean V Kelley | 2012-04-30 | 2 | -0/+12
  Enabled MESA_FORMAT_RGBX8888_REV for RGBX.

  Android software requires RGBX8888 format to be supported for software rendering. That requires EGL to be capable of generating images from this format.

  Signed-off-by: Sean V Kelley <[email protected]>
  Signed-off-by: Chad Versace <[email protected]>
* intel: set dri_format field for all images | Ander Conselvan de Oliveira | 2012-04-30 | 1 | -0/+18
  Only images created with intel_create_image() had the field properly set. Set it also on intel_dup_image(), intel_create_image_from_name() and intel_create_image_from_renderbuffer().
* intel: properly return the image format on intel_query_image | Ander Conselvan de Oliveira | 2012-04-30 | 1 | -1/+2
* autoconf: pass -Wall to automake | Dylan Noblesmith | 2012-04-29 | 2 | -6/+6
  And fix these warnings that appear at autoreconf time:

    "`:='-style assignments are not portable"

  v2: Fix the recently-converted-to-automake r600.
* i965/fs: Fix FB writes that tried to use the non-existent m16 register. | Kenneth Graunke | 2012-04-27 | 1 | -1/+4
  A little analysis shows that the worst-case value for "nr" is 17:
  - base_mrf = 2 ... 2
  - header present (say gen == 5) ... 4
  - aa_dest_stencil_reg (stencil test) ... 5
  - SIMD16 mode: += 4 * reg_width ... 13
  - source_depth_to_render_target ... 15
  - dest_depth_reg ... 17

  This resulted in us setting base_mrf to 2 and mlen to 15. In other words, we'd try to use m2..m16. But m16 doesn't exist pre-Gen6. Also, the instruction scheduler data structures use arrays of size 16, so this would cause us to access them out of bounds.

  While the debugger system routine may need m0 and m1, we don't use it today, so the simplest solution is just to move base_mrf back to 1. That way, our worst case message fits in m1..m15, which is legal.

  An alternative would be to fail on SIMD16 in this case, but that seems a bit unfortunate if there's no real need to reserve m0 and m1.

  Fixes new piglit test shaders/depth-test-and-write on Ironlake.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48218
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
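The worst-case arithmetic in the message above can be replayed in a few lines. The struct and function below are hypothetical and only mirror the running total listed in the commit message (with reg_width = 2 for SIMD16); they are not the driver's actual FB write code.

```cpp
#include <cassert>

// Summary of a worst-case framebuffer write message.
struct FbWriteLayout {
    int base_mrf;  // first message register used
    int mlen;      // number of message registers
    int last_mrf;  // highest message register touched
};

// Replays the running total from the commit message for a given base_mrf.
FbWriteLayout worst_case_fb_write(int base_mrf)
{
    assert(base_mrf >= 0);
    const int reg_width = 2;   // SIMD16: each payload value spans two registers

    int nr = base_mrf;
    nr += 2;                   // header present
    nr += 1;                   // aa_dest_stencil_reg
    nr += 4 * reg_width;       // four color channels in SIMD16
    nr += reg_width;           // source depth to render target
    nr += reg_width;           // destination depth

    return { base_mrf, nr - base_mrf, nr - 1 };
}
```

With base_mrf = 2 the message has mlen 15 and ends at m16, which is one past the m0..m15 range described in the message; with base_mrf = 1 the same 15 registers end at m15.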
* i965/vs: Fix texelFetchOffset() | Eric Anholt | 2012-04-24 | 1 | -3/+23
  It appears that when using 'ld' with the offset bits, address bounds checking happens before the offset is applied, so parts of the drawing in piglit texelFetchOffset() with a negative texcoord go black.