summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* r600g: enable DUAL_EXPORT mode when possible on r6xx/r7xxJerome Glisse2012-06-273-18/+57
| | | | | | | DUAL_EXPORT can be enabled on r6xx/r7xx when all CBs use 16-bit export and there is no depth/stencil export. Signed-off-by: Jerome Glisse <[email protected]>
* r600g: enable DUAL_EXPORT mode when possibleVadim Girlin2012-06-274-6/+55
| | | | | | | | It seems DUAL_EXPORT on evergreen may be enabled when all CBs use 16-bit export mode (EXPORT_4C_16BPC), also there should be at least one CB, and the PS shouldn't export depth/stencil. Signed-off-by: Vadim Girlin <[email protected]>
* r600g: avoid unnecessary shader exports v2Vadim Girlin2012-06-276-28/+41
| | | | | | | | | | | | In some cases TGSI shader has more color outputs than the number of CBs, so it seems we need to limit the number of color exports. This requires different shader variants depending on the nr_cbufs, but on the other hand we are doing less exports, which are very costly. v2: fix various piglit regressions Signed-off-by: Vadim Girlin <[email protected]> Signed-off-by: Jerome Glisse <[email protected]>
* r600g: cache shader variants instead of rebuilding v3Vadim Girlin2012-06-275-94/+217
| | | | | | | | | | | | | | | | | | | Shader variants are stored in the list, the key for lookup is based on the states that require different hw shaders - currently it's rctx->two_side (all gpus) and rctx->nr_cbufs (evergreen/cayman, when writes_all property is set). v2: - use simple list instead of keymap as suggested by Marek on irc - call r600_adjust_gprs from r600_bind_vs_shader for r6xx/r7xx (r600_shader_select isn't used for vertex shaders currently) v3: - fix call to r600_adjust_gprs - do it after updating current shader Improves performance for some apps, e.g. FlightGear - see https://bugs.freedesktop.org/show_bug.cgi?id=50360 Signed-off-by: Vadim Girlin <[email protected]>
* svga: handle missing PIPE_CAP_x queriesBrian Paul2012-06-261-9/+14
| | | | | | And fix incorrect error message for a bad shader type/number. Reviewed-by: Marek Olšák <[email protected]>
* llvmpipe: handle more PIPE_CAP_x queriesBrian Paul2012-06-261-4/+48
| | | | | | | | | As with the previous commit for softpipe. v2: remove 'default' case to get compile-time warning Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* softpipe: handle more PIPE_CAP_x queriesBrian Paul2012-06-261-3/+31
| | | | | | | | | | These all return zero. Add a debug_printf() to catch the default case so we don't accidently mishandle something important in the future. v2: remove 'default' case to get compile-time warning Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* svga: return 1 for PIPE_CAP_MIXED_COLORBUFFER_FORMATSBrian Paul2012-06-261-1/+3
| | | | | | | | | | | This is actually required for GL_ARB_framebuffer_object, but the state tracker doesn't currently check it. Direct3D 9 allows mixed format color buffers with some restrictions. Setting this allows Unigine Heaven 2.5 and 3.0 to run. Tested both on GL and D3D hosts. Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Jakob Bornecrantz <[email protected]>
* glsl: fix comment typoBrian Paul2012-06-261-1/+1
|
* u2f_emit: Fix type parameter in LLVM call.Olivier Galibert2012-06-261-1/+1
| | | | | | | | The type is the destination type (i.e. float vector) and not the source type. Fixes piglit fs-{in,de}crement-uint. Signed-off-by: Olivier Galibert <[email protected]> Signed-off-by: José Fonseca <[email protected]>
* i965/msaa: Set KILL_ENABLE when GL_ALPHA_TO_COVERAGE enabled.Paul Berry2012-06-262-4/+6
| | | | | | | | | | | | | | | i965 hardware needs to be informed of situations in which it's possible for pixels (or samples) to be discarded for reasons other than depth/stencil testing (e.g. due to an explicit "discard" in the fragment shader). One of these situations is when GL_ALPHA_TO_COVERAGE is enabled, since that can cause samples to be discarded by the color calculator when the pixel's alpha value is less than 1.0. Without this patch, GL_ALPHA_TO_COVERAGE does not take effect on depth buffers. Reviewed-by: Anuj Phogat <[email protected]>
* i965/msaa: Implement GL_SAMPLE_ALPHA_TO_{COVERAGE,ONE}.Paul Berry2012-06-261-1/+9
| | | | | | | | | | | | | | | | | | | | | | This patch enables the multisampling parameters GL_SAMPLE_ALPHA_TO_COVERAGE and GL_SAMPLE_ALPHA_TO_ONE, which allow the fragment shader's alpha output to be converted into a sample coverage mask and ignored for blending. i965 supports these parameters through the BLEND_STATE structure. The GL spec allows, but does not require, the implementation to dither the conversion from alpha to a sample coverage mask, so that alpha values that aren't a multiple of 1/num_samples result in the correct proportion of samples being lit. A bit exists in the BLEND_STATE structure to enable this functionality, but according to the hardware docs it must be disabled on Sandy Bridge (see the Sandy Bridge PRM, Vol2, Part1, p379: AlphaToCoverage Dither Enable). So it is enabled for Gen7 only. Fixes piglit tests "EXT_framebuffer_multisample/sample-alpha-to-{coverage,one} {2,4}". Reviewed-by: Anuj Phogat <[email protected]>
* i965/msaa: Implement glSampleCoverage.Paul Berry2012-06-264-7/+26
| | | | | | | | | | | | | | This patch enables glSampleCoverage() functionality, which allows the client program to specify that only a portion of the samples be lit up when performing multisampled rendering. i965 supports glSampleCoverage() through the 3DSTATE_SAMPLE_MASK command packet, which allows the driver to specify a bitfield indicating which samples to light up. Fixes piglit tests "EXT_framebuffer_multisample/sample-coverage {2,4} {inverted,non-inverted}". Reviewed-by: Anuj Phogat <[email protected]>
* st/wgl: Add a few more comments.José Fonseca2012-06-262-6/+38
|
* r600g: don't disable streamout if it hasn't been startedMarek Olšák2012-06-261-1/+1
|
* u_blitter: disable streamout before renderingMarek Olšák2012-06-261-0/+10
| | | | | | This fixes piglit EXT_transform_feedback tests: - intervening-read output - intervening-read prims_written
* i965/fs: Fix conversions float->bool, int->boolChad Versace2012-06-251-7/+7
| | | | | | | | | | | | | | | | | Fixes gles2conform GL.equal.equal_bvec2_frag. This fixes brw_fs_visitor's translation of ir_unop_f2b. It used CMP to convert the float to one of 0 or ~0. However, the convention in the compiler is that true is represented by 1, not ~0. This patch adds an AND to convert ~0 to 1. By inspection, a similar problem existed with ir_unop_i2b, with a similar fix. [v2 kayden]: eliminate extra temporary register. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=49621 Signed-off-by: Chad Versace <[email protected]>
* st/wgl: 80-column wrappingBrian Paul2012-06-252-7/+12
|
* mesa: new MESA_LOG_FILE env var to log errors, warnings, etc., to a fileBrian Paul2012-06-251-2/+12
| | | | Reviewed-by: Jose Fonseca <[email protected]>
* r600g: inline r600_blit_push_depth and use resource_copy_regionMarek Olšák2012-06-253-18/+11
| | | | | We are going to have a separate resource for depth texturing and transfers and this is just a transfer thing.
* r600g: split flushed depth texture creation and flushingMarek Olšák2012-06-255-16/+34
|
* i965/msaa: Add backend support for centroid interpolation.Paul Berry2012-06-253-11/+32
| | | | | | | | | | | | | | | This patch causes the fragment shader to be configured correctly (and the correct code to be generated) for centroid interpolation. This required two changes: brw_compute_barycentric_interp_modes() needs to determine when centroid barycentric coordinates need to be included in the pixel shader thread payload, and fs_visitor::emit_general_interpolation() needs to interpolate using the correct set of barycentric coordinates. Fixes piglit tests "EXT_framebuffer_multisample/interpolation {2,4} centroid-edges" on i965. Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Refactor interpolation code to prepare for adding centroid support.Paul Berry2012-06-252-8/+17
| | | | Reviewed-by: Eric Anholt <[email protected]>
* i965/msaa: Adapt clip setup for centroid noperspective interpolation.Paul Berry2012-06-253-2/+6
| | | | | | | | | | | | | | | | | | To save time, we only instruct the clip stage of the pipeline to compute noperspective barycentric coordinates if those coordinates are needed by the fragment shader. Previously, we would determine whether the coordinates were needed by seeing whether the fragment shader used the BRW_WM_NONPERSPECTIVE_PIXEL_BARYCENTRIC interpolation mode. However, with MSAA, it's possible that the fragment shader might use BRW_WM_NONPERSPECTIVE_CENTROID_BARYCENTRIC instead. In the future, when we support ARB_sample_shading, it might use BRW_WM_NONPERSPECTIVE_SAMPLE_BARYCENTRIC. This patch modifies the upload_clip_state() functions to check for all three possible noperspective interpolation modes. Reviewed-by: Eric Anholt <[email protected]>
* glsl: Add IsCentroid bitfield to gl_fragment_program.Paul Berry2012-06-252-1/+11
| | | | | | | | | This bitfield tells the back-ends which of a fragment shader's inputs require centroid interpolation. It is only set for GLSL fragment shaders, since assembly fragment shaders don't support centroid interpolation. Reviewed-by: Eric Anholt <[email protected]>
* st/mesa: added some simple fbo debugging/helper codeBrian Paul2012-06-251-1/+25
|
* llvmpipe: fix the LP_NO_RAST debug optionBrian Paul2012-06-254-24/+22
| | | | | | | | It was only no-oping the clear() function, not actual triangle rasterization. Move the no_rast field from lp_context down into lp_rasterizer so it's accessible where it's needed. Reviewed-by: Jose Fonseca <[email protected]>
* scons: Add glsl/glcpp to the include path.Vinson Lee2012-06-231-2/+2
| | | | | | | | | | Fixes this build failure on Solaris. Compiling build/sunos-debug/glsl/glcpp/glcpp-lex.c ... "src/glsl/glcpp/glcpp-lex.l", line 30: cannot find include file: "glcpp-parse.h" Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* automake: add missing inclusion of GL headersLaurent Carlier2012-06-222-0/+2
| | | | | | | Building fail when GL headers are not installed in the system, so add inclusion of these headers. Signed-off-by: Brian Paul <[email protected]>
* mesa: #define fprintf to be __mingw_fprintf() on Mingw32Brian Paul2012-06-221-0/+10
| | | | | | So that formats such as "%llx" are understood. Reviewed-by: Kenneth Graunke <[email protected]>
* svga: init pointer to NULL to silence MSVC warningBrian Paul2012-06-221-1/+1
|
* clover: Add --with-clang-libdir option and verify CLANG_RESOURCE_DIRTom Stellard2012-06-221-1/+1
| | | | | | | | | | | | | | $CLANG_RESOURCE_DIR is the directory that contains all resources needed by clang to compile programs. When clover uses clang to compile kernels it needs to specify a resource dir, so that clang can find its internal headers (e.g. stddef.h). clang defines $CLANG_RESOURCE_DIR as $CLANG_LIBDIR/clang/$CLANG_VERSION This patch adds the --with-clang-libdir option in order to accommodate clang intalls to non-standard locations, and it also adds a check to the configure script to verify that $CLANG_RESOURCE_DIR/include contains the necessary header files.
* i965: Compute dFdy() correctly for FBOs.Paul Berry2012-06-226-9/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | On i965, dFdx() and dFdy() are computed by taking advantage of the fact that each consecutive set of 4 pixels dispatched to the fragment shader always constitutes a contiguous 2x2 block of pixels in a fixed arrangement known as a "sub-span". So we calculate dFdx() by taking the difference between the values computed for the left and right halves of the sub-span, and we calculate dFdy() by taking the difference between the values computed for the top and bottom halves of the sub-span. However, there's a subtlety when FBOs are in use: since FBOs use a coordinate system where the origin is at the upper left, and window system framebuffers use a coordinate system where the origin is at the lower left, the computation of dFdy() needs to be negated for FBOs. This patch modifies the fragment shader back-ends to negate the value of dFdy() when an FBO is in use. It also modifies the code that populates the program key (brw_wm_populate_key() and brw_fs_precompile()) so that they always record in the program key whether we are rendering to an FBO or to a window system framebuffer; this ensures that the fragment shader will get recompiled when switching between FBO and non-FBO use. This will result in unnecessary recompiles of fragment shaders that don't use dFdy(). To fix that, we will need to adapt the GLSL and NV_fragment_program front-ends to record whether or not a given shader uses dFdy(). I plan to implement this in a future patch series; I've left FIXME comments in the code as a reminder. Fixes Piglit test "fbo-deriv". NOTE: This is a candidate for stable release branches. Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: minor transform feedback commentsBrian Paul2012-06-221-0/+2
|
* mesa: fix comments on UBO buffer binding functionsBrian Paul2012-06-221-4/+7
| | | | The old comments were for transform feedback.
* draw: Handle the case when there isn't a fragment shader.Olivier Galibert2012-06-221-10/+17
| | | | | Signed-off-by: Olivier Galibert <[email protected]> Signed-off-by: José Fonseca <[email protected]>
* r600g: Unify SURFACE_SYNC packet emission for 3D and computeTom Stellard2012-06-213-101/+30
| | | | | | Drop the compute specific evergreen_set_buffer_sync() function and instead use the r600_surface_sync_command atom for emitting SURFACE_SYNC packets.
* r600g: Enable reusing of compute stateTom Stellard2012-06-211-6/+9
|
* r600g: Fix reading vtx instruction offset from bytestreamTom Stellard2012-06-211-1/+1
|
* radeon/llvm: Turn on the BitExtract peephole optimizationTom Stellard2012-06-212-5/+32
| | | | | Thie BitExtract optimization folds a mask and shift operation together into a single instruction (BFE_UINT).
* radeon/llvm: Lower ROTL to BIT_ALIGNTom Stellard2012-06-216-1/+54
|
* radeon/llvm: Use the VLIW Scheduler for R600->NITom Stellard2012-06-2112-8/+75
| | | | | | | | | | | | | | | | | | | It's not optimal, but it's better than the register pressure scheduler that was previously being used. The VLIW scheduler currently ignores all the complicated instruction groups restrictions and just tries to fill the instruction groups with as many instructions as possible. Though, it does know enough not to put two trans only instructions in the same group. We are able to ignore the instruction group restrictions in the LLVM backend, because the finalizer in r600_asm.c will fix any illegal instruction groups the backend generates. Enabling the VLIW scheduler improved the run time for a sha1 compute shader by about 50%. I'm not sure what the impact will be for graphics shaders. I tested Lightsmark with the VLIW scheduler enabled and the framerate was about the same, but it might help apps that use really big shaders.
* mesa: set GL_ARB_uniform_buffer_object extension year to 2009Brian Paul2012-06-211-1/+1
|
* mesa: Add a comment explaining my thoughts on glBindBufferBase().Eric Anholt2012-06-211-0/+26
| | | | | Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Add support for glGetIntegeri_v from GL_ARB_uniform_buffer_object.Eric Anholt2012-06-211-0/+24
| | | | | | | Fixes piglit ARB_uniform_buffer_object/getintegeri_v. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Add support for glBindBufferBase/Range on GL_UNIFORM_BUFFER.Eric Anholt2012-06-211-0/+85
| | | | | | | | | | | | Fixes piglits: GL_ARB_uniform_buffer_object/bindbuffer-general-point. GL_ARB_uniform_buffer_object/negative-bindbuffer-buffer GL_ARB_uniform_buffer_object/negative-bindbuffer-index GL_ARB_uniform_buffer_object/negative-bindbuffer-target GL_ARB_uniform_buffer_object/negative-bindbufferrange-range Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Move glBindBufferBase and glBindBufferRange() to bufferobj.Eric Anholt2012-06-214-59/+103
| | | | | | | | | | | The rest of the TFB implementation remains in transformfeedback.c, and this will be shared with UBOs. v2: Move the size/offset checks shared with UBOs to common code as well. (Kenneth's review) Reviewed-by: Brian Paul <[email protected]> (v1) Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Move buffer object dispatch setup to bufferobj.c.Eric Anholt2012-06-213-11/+21
| | | | | Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Add indexed binding points for uniform buffer objects.Eric Anholt2012-06-212-0/+51
| | | | | Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Add support for the GL_UNIFORM_BUFFER general binding point.Eric Anholt2012-06-213-0/+22
| | | | | | | Fixes piglit ARB_uniform_buffer_object/buffer-targets. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>