summaryrefslogtreecommitdiffstats
path: root/src/mesa
Commit message (Collapse)AuthorAgeFilesLines
* st/mesa: set stObj->lastLevel in guess_and_alloc_textureVadim Girlin2012-05-231-0/+2
| | | | | | | | | | Fixes lockups/asserts with depthstencil-render-miplevels tests and r600g. Should also fix https://bugs.freedesktop.org/show_bug.cgi?id=50033 NOTE: This is a candidate for the 8.0 branch. Signed-off-by: Vadim Girlin <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* i965: Completely annotate the batch bo when aub dumping.Paul Berry2012-05-226-27/+117
| | | | | | | | | | | | | | | | | | | Previously, when the environment variable INTEL_DEBUG=aub was set, mesa would simply instruct DRM to start dumping data to an .aub file, but we would not provide DRM with any information about the format of the data in various buffers. As a result, a lot of the data in the generate .aub file would be unannotated, making further data analysis difficult. This patch causes the entire contents of each batch buffer to be annotated using the data in brw->state_batch_list (which was previously used only to annotate the output of INTEL_DEBUG=bat). This includes data that was allocated by brw_state_batch, such as binding tables, surface and sampler states, depth/stencil state, and so on. The new annotation mechanism requires DRM version 2.4.34. Reviewed-by: Eric Anholt <[email protected]>
* intel: When AUB dumping, flush before emitting final bitmap command.Paul Berry2012-05-221-1/+3
| | | | | | | | | | | | | When we are generating an AUB dump, we make a final call to aub_dump_bmp() as the context is being destroyed, to ensure that any rendering performed before the application exits can be seen during a simulation run. However, we were doing this before flushing the batch buffer; as a result simulation runs would not always see the effect of all rendering commands. This patch flushes the batch buffer just before making the final call to aub_dump_bmp(), to ensure that all rendering is properly captured in the final bitmap.
* st/mesa: use pipe_sampler_view_release() in st_destroy_context_priv()Brian Paul2012-05-191-1/+1
| | | | | | | | Fixes another case of sampler views being created by one context, shared by another, then deleted by the first, leaving a dangling pipe context pointer. Reviewed-by: José Fonseca <[email protected]>
* mesa: use F_TO_I() instead of IROUND()Brian Paul2012-05-194-130/+130
| | | | | | | | Use it where performance matters more and the exact method of float->int conversion/rounding isn't terribly important. There should no net change here since F_TO_I() is the new name of the old IROUND() function. Reviewed-by: José Fonseca <[email protected]>
* mesa: reimplement IROUND(), add F_TO_I()Brian Paul2012-05-191-21/+36
| | | | | | | | | | | | | | The different implementations of IROUND() behaved differently and in the case of fistp, depended on the current x86 FPU rounding mode. This caused some tests like piglit roundmode-pixelstore and roundmode-getintegerv to fail on 32-bit x86 but pass on 64-bit x86. Now IROUND() always rounds to the nearest integer (away from zero). The new F_TO_I function converts a float to an int by whatever means is fastest. We'll use this where we're more concerned with performance and not too worried to how the conversion is done. Reviewed-by: José Fonseca <[email protected]>
* mesa: fix Z32_FLOAT -> uint conversion functionsBrian Paul2012-05-191-2/+2
| | | | | | | | The IROUND converted all arguments to 0 or 1. That's not what we wanted. NOTE: This is a candidate for the 8.0 branch. Reviewed-by: José Fonseca <[email protected]>
* st/mesa: remove unused pipe variableBrian Paul2012-05-191-1/+0
|
* st/mesa: added st_print_current_vertex_program(), for debuggingBrian Paul2012-05-192-0/+27
|
* mesa: add GLSL_REPORT_ERRORS debug flagBrian Paul2012-05-192-0/+15
| | | | | | | If the MESA_GLSL env var contains "errors", GLSL compilation and link errors will be reported to stderr. Reviewed-by: Ian Romanick <[email protected]>
* mesa: add some comments on shaderapi.c functionsBrian Paul2012-05-191-1/+14
|
* mesa: Remove undefinition of _P symbol.Vinson Lee2012-05-181-6/+0
| | | | | | | IRIX isn't used anymore. Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* state_tracker: remove sw_primitive_restart from st_contextJordan Justen2012-05-172-2/+0
| | | | | | | | The VBO module now can handle primitive restart in software if required. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* state_tracker: remove software handling of primitive restartJordan Justen2012-05-171-178/+2
| | | | | | | | The VBO module now can handle primitive restart in software if required. Therefore this support is no londer required. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* state_tracker: set PrimitiveRestartInSoftware if neededJordan Justen2012-05-171-0/+1
| | | | | | | | | If the PIPE_CAP_PRIMITIVE_RESTART screen param is not set, then enable PrimitiveRestartInSoftware to enable software primitive restart support in the VBO module. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* vbo: use software primitive restart in the VBO moduleJordan Justen2012-05-171-6/+37
| | | | | | | | | | When PrimitiveRestartInSoftware is set, the VBO module will handle primitive restart scenarios before calling the vbo->draw_prims drawing function. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: add PrimitiveRestartInSoftware to gl_context.ConstJordan Justen2012-05-172-0/+8
| | | | | | | | | | | If set, then the VBO module will handle all primitive restart scenarios before calling the driver draw_prims. Software primitive restart support is disabled by default. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* vbo: add software primitive restart supportJordan Justen2012-05-174-0/+241
| | | | | | | | | | | vbo_sw_primitive_restart implements primitive restart in software by splitting primitive draws apart. This is based on similar support in mesa/state_tracker/st_draw.c. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Check for framebuffer completeness before looking at the rb.Eric Anholt2012-05-171-6/+6
| | | | | | | | | Otherwise, an incomplete framebuffer could have a NULL _ColorReadBuffer and we'd deref that. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Fix assertion failure when a cube face is not present.Eric Anholt2012-05-171-1/+2
| | | | | | Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/vs: Fix up swizzle for dereference_array of matrices.Eric Anholt2012-05-171-2/+2
| | | | | | | | | Fixes assertion failure in piglit: vs-mat2-struct-assignment.shader_test vs-mat2-array-assignment.shader_test Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Throw error on glGetActiveUniform inside Begin/End.Eric Anholt2012-05-171-0/+2
| | | | | | | | Fixes piglit GL_ARB_shader_objeccts/getactiveuniform-beginend. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: Do more register coalescing by using the interference graph.Eric Anholt2012-05-172-0/+62
| | | | | | | | | | | | | | By using the live variables code for determining interference, we can handle coalescing in the presence of control flow, which the other register coalescing path couldn't. Total instructions: 207184 -> 206990 74/1246 programs affected (5.9%) 33993 -> 33799 instructions in affected programs (0.6% reduction) There is a newerth shader that loses out, because of some extra MOVs that now get their dead-code nature obscured by coalescing. This should be fixed by doing better at dead code elimination.
* st/mesa: set PIPE_BIND_STREAM_OUTPUT for TFB target in st_bufferobj_dataChristoph Bumiller2012-05-171-0/+3
|
* i965/blorp: Move exec() out of brw_blorp_params.Paul Berry2012-05-153-6/+9
| | | | | | | | | No functional change. This patch replaces the brw_blorp_params::exec() method with a global function brw_blorp_exec() that performs the operation described by the params data structure. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen6: Initial implementation of MSAA.Paul Berry2012-05-1523-121/+662
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch enables MSAA for Gen6, by modifying intel_mipmap_tree to understand multisampled buffers, adapting the rendering pipeline setup to enable multisampled rendering, and adding multisample resolve operations to brw_blorp_blit.cpp. Some preparation work is also included for Gen7, but it is not yet enabled. MSAA support is still fairly preliminary. In particular, the following are not yet supported: - Fully general blits between MSAA and non-MSAA buffers. - Formats other than RGBA8, DEPTH24, and STENCIL8. - Centroid interpolation. - Coverage parameters (glSampleCoverage, GL_SAMPLE_ALPHA_TO_COVERAGE, GL_SAMPLE_ALPHA_TO_ONE, GL_SAMPLE_COVERAGE, GL_SAMPLE_COVERAGE_VALUE, GL_SAMPLE_COVERAGE_INVERT). Fixes piglit tests "EXT_framebuffer_multisample/accuracy" on i965/Gen6. v2: - In intel_alloc_renderbuffer_storage(), quantize the requested number of samples to the next higher sample count supported by the hardware. This ensures that a query of GL_SAMPLES will return the correct value. It also ensures that MSAA is fully disabled on Gen7 for now (since Gen7 MSAA support doesn't work yet). - When reading from a non-MSAA surface, ensure that s_is_zero is true so that we won't try to read from a nonexistent sample.
* i965/gen6+: Add code to perform blits on the render path ("blorp").Paul Berry2012-05-158-27/+1730
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch expands the "blorp" component to be able to perform blits as well as HiZ resolves. The new blitting code is located in brw_blorp_blit.cpp. This includes the necessary fragment shader code to look up pixels in the source buffer (which is configured as a texture) and output them to the destination buffer (which is configured as the render target). Most of the time the fragment shader code is simple and straightforward, since it merely has to apply a coordinate offset, read from the texture, and write to the render target. However, in the case of blitting stencil buffers, things are more complicated, since the GPU stores stencil data using W tiling, and W tiling is not supported for textures or render targets. So, we set up the stencil buffers as Y tiled, and emit fragment shader code that adjusts the coordinates to account for the difference between W and Y tiling. Furthermore, since a rectangular region in W tiling does not necessarily correspond to a rectangular region in Y tiling, we widen the rectangle primitive to the nearest tile boundary and have the fragment shader "kill" any pixels that don't fall inside the actual desired destination rectangle. All of this is a necessary prerequisite for implementing MSAA, since we'll need to be able to blit between multisample color, depth, and stencil buffers and their non-multisampled counterparts, and none of the existing blitting mechanisms support multisampling. In addition, the new blitting code should speed up operations where we previously fell back to software rasterization, such as blitting of stencil buffers. The current fallback sequence is: first we try to do a blit using the hardware blitting engine. If that fails we try to do a blit using the render path. If that also fails then we do the blit using a meta-op (which may or may not fall back to software rasterization). Note that blitting using the render path has some limitations at the moment: it only supports a few formats, and it doesn't support clipping or scissoring. These limitations will be addressed in future patch series. v2: - Add the code that configures the WM program to gen{6,7}_emit_wm_config() and gen7_emit_ps_config() rather than creating separate ...enable() functions. - Call intel_prepare_render before determining which miptrees we are blitting from/to, because it may cause miptrees to be reallocated. - Allow the blit to mirror X and/or Y coordinates. - Disable blorp blits on Gen7 for now, since they aren't working yet.
* i965: Expose surface setup internals for use by blits.Paul Berry2012-05-153-2/+4
| | | | | | | | This patch exposes the functions brw_get_surface_tiling_bits and gen7_set_surface_tiling, so that they can be re-used when setting up surface states in gen6_blorp.cpp and gen7_blorp.cpp. Reviewed-by: Chad Versace <[email protected]>
* i965: split gen{6,7}_blorp_exec functions into manageable chunks.Paul Berry2012-05-153-522/+647
| | | | | | | | | | | | | | | | | This patch splits up the gen6_blorp_exec and gen7_blorp_exec functions, which were very long, into simple component functions. With a few exceptions, there is one function per state packet. This will allow blit functionality to be added without significantly complicating the code. Reviewed-by: Chad Versace <[email protected]> v2: Rename the functions gen{6,7}_emit_wm_disable() to gen{6,7}_emit_wm_config() (since the WM is not actually disabled during HiZ ops; it simply doesn't have a program). Also, on gen7, split out the configration of 3DSTATE_PS to a separate function gen7_emit_ps_config().
* i965: Parameterize HiZ code to prepare for adding blitting.Paul Berry2012-05-157-177/+335
| | | | | | | | | | | | | | | | | | | This patch groups together the parameters used by the HiZ functions into a new data structure, brw_hiz_resolve_params, rather than passing each parameter individually between the HiZ functions. This data structure is a subclass of brw_blorp_params, which represents the parameters of a general-purpose blit or resolve operation. A future patch will add another subclass for blits. In addition, this patch generalizes the (width, height) parameters to a full rect (x0, y0, x1, y1), since blitting operations will need to be able to operate on arbitrary rectangles. Also, it renames several of the HiZ functions to reflect the expanded role they will serve. v2: Rename brw_hiz_resolve_params to brw_hiz_op_params. Move gen{6,7}_blorp_exec() functions back into gen{6,7}_blorp.h. Reviewed-by: Chad Versace <[email protected]>
* i965: Implement guardband clipping on Ivybridge.Kenneth Graunke2012-05-152-5/+15
| | | | | | | | | | | Improves performance in Citybench: - 320x240: 9.19589% +/- 0.557621% - 1280x480: 3.90797% +/- 0.774429% No apparent difference in OpenArena. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Implement guardband clipping on Sandybridge.Kenneth Graunke2012-05-152-10/+15
| | | | | | | | | | | Improves performance in Citybench: - 320x240: 19.8008% +/- 0.937818% - 1280x480: 6.53856% +/- 0.859083% No apparent difference in OpenArena nor Xonotic. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* Revert "i965/fs: Jump from discard statements to the end of the program when ↵Eric Anholt2012-05-144-126/+5
| | | | | | | | | | done." This reverts commit 31866308fcf989df992ace28b5b986c3d3770e90. Fixes piglit glsl-fs-discard-exit-3 and unigine tropics rendering. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Remove the requirement of no dead code for interference checks.Eric Anholt2012-05-141-12/+12
| | | | | | | | This will be convenient when I want to comment out optimization code to see the raw program being optimized, but more importantly will let the interference check be used during optimization. Acked-by: Kenneth Graunke <[email protected]>
* i965/fs: Add support for copy propagation.Eric Anholt2012-05-145-0/+143
| | | | | | | | | | | | We could do more by handling abs/negate and non-GRF sources, but this is a good start. Improves tropics performance 0.30% +/- .17% (n=43). shader-db results: Total instructions: 208032 -> 207184 60/1246 programs affected (4.8%) 23286 -> 22438 instructions in affected programs (3.6% reduction) Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: When doing no work for live interval calculation, do no allocation.Eric Anholt2012-05-141-7/+7
| | | | | | | | | When I had a bug causing the backend to never finish optimizing, it also sent me deep into swap. This avoids extra memory allocation per trip through optimization, and thus may reduce the peak memory allocation of the driver even in the success case. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen7: Set tile_x/y to 0 in the no-stencil case.Eric Anholt2012-05-141-1/+1
| | | | Fixes compiler warnings.
* intel: Fix signed/unsigned comparison warnings.Eric Anholt2012-05-142-5/+6
|
* intel: Fix compile warning from 7b6424143d8bf572cadd46adcbaa91d2a5598635Eric Anholt2012-05-141-2/+2
|
* intel: Fix compiler warning from 3cd7bee48f7caf7850ea64d40f43875d4c975507Eric Anholt2012-05-141-2/+0
|
* i965/fs: Add a local common subexpression elimination pass.Kenneth Graunke2012-05-144-0/+201
| | | | | | | | | | | | | | | | | Total instructions: 18210 -> 17836 49/163 programs affected (30.1%) 12888 -> 12514 instructions in affected programs (2.9% reduction) This reduces Lightsmark's "Scale down filter" shader from 395 instructions to 283, a whopping 28%. It also reduces register pressure significantly: the SIMD8 program now uses 29 registers instead of 101, giving us more than enough room for a SIMD16 program. v2: Add && !inst->conditional_mod to the "skip some instructions" check. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Use a const reference in fs_reg::equals instead of a pointer.Kenneth Graunke2012-05-143-16/+16
| | | | | | | | | | This lets you omit some ampersands and is more idiomatic C++. Using const also marks the function as not altering either register (which was obvious, but nice to enforce). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* mesa: print the Git SHA1 in GL_VERSION for ES1 and ES2.Oliver McFadden2012-05-141-2/+10
| | | | | Signed-off-by: Oliver McFadden <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* mesa: GLES specifies restrictions on uniform matrix transpose.Oliver McFadden2012-05-141-0/+10
| | | | | | | | | GL_INVALID_VALUE is generated if transpose is not GL_FALSE. http://www.khronos.org/opengles/sdk/docs/man/xhtml/glUniform.xml Signed-off-by: Oliver McFadden <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* nouveau/vieux: finish != flush, how about we do that..Ben Skeggs2012-05-123-0/+23
| | | | Signed-off-by: Ben Skeggs <[email protected]>
* mesa: add DEBUG_INCOMPLETE_TEXTURE, DEBUG_INCOMPLETE_FBO flagsBrian Paul2012-05-114-24/+21
| | | | | | | Instead of having to hack the code to enable these debugging options, set them through the MESA_DEBUG env var. Reviewed-by: Eric Anholt <[email protected]>
* mesa: implement DEBUG_ALWAYS_FLUSH debug optionBrian Paul2012-05-113-0/+32
| | | | | | | | This flag has been around for a while but it wasn't actually used anywhere. Now, setting this flag causes a glFlush() to be issued after each drawing call (including glBegin/End, glDrawElements, glDrawArrays, glDrawPixels, glCopyPixels and glBitmap).
* mesa: define DEBUG_SILENT flag, use in output_if_debug()Brian Paul2012-05-113-12/+13
|
* mesa: clean-up the debug/verbose flag setup codeBrian Paul2012-05-111-26/+46
| | | | Split the verbose and debug flag setup code into separate functions.
* mesa: do FLUSH_VERTICES() in _mesa_flush/finish()Brian Paul2012-05-111-2/+4
| | | | | | | This was being done in the _mesa_Flush/Finish() calls but if there was an internal call to _mesa_flush/finish() the FLUSH_VERTICES() wouldn't happen. Looks like only the intel and radeon drivers made such calls in MakeCurrent().