Commit message (Collapse) | Author | Age | Files | Lines
* st/xorg: fix masked transformations | Lucas Stach | 2012-07-20 | 1 | -20/+40

    Someone tried to be clever and "optimized" add_vertex_data2() to just use
    two points for the texture coordinates and then reuse individual
    components. Sadly this is not how matrix multiplication works.

    Fixes rendercheck -t tmcoords

    Signed-off-by: Lucas Stach <[email protected]>
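To illustrate why the shortcut described in the commit above fails, here is a minimal sketch in C. It is not the st/xorg code: `vec2`, `map_point()` and `corner_shortcut()` are hypothetical names, and the 3x3 row-major matrix layout is an assumption. It transforms only two opposite corners of a rectangle and mixes their components to get a third corner, then compares that with transforming the third corner directly.

```c
#include <assert.h>

/* Hypothetical types and helpers for illustration; not the st/xorg code. */
typedef struct { float x, y; } vec2;

/* Apply a row-major 3x3 matrix to (x, y, 1) and do the perspective divide. */
static vec2 map_point(const float m[9], float x, float y)
{
    float w = m[6] * x + m[7] * y + m[8];
    assert(w != 0.0f);
    vec2 r = { (m[0] * x + m[1] * y + m[2]) / w,
               (m[3] * x + m[4] * y + m[5]) / w };
    return r;
}

/* The shortcut: transform only (x0, y0) and (x1, y1), then build the
 * corner (x1, y0) by mixing components of the two results. */
static vec2 corner_shortcut(const float m[9],
                            float x0, float y0, float x1, float y1)
{
    vec2 p0 = map_point(m, x0, y0);
    vec2 p1 = map_point(m, x1, y1);
    vec2 r = { p1.x, p0.y };
    return r;
}

static void example(void)
{
    const float scale[9]  = { 2, 0, 1,   0, 3, 1,   0, 0, 1 }; /* axis-aligned */
    const float rotate[9] = { 0, -1, 0,  1, 0, 0,   0, 0, 1 }; /* 90 degrees */

    /* Scale plus translate: the shortcut happens to agree. */
    vec2 a = map_point(scale, 2, 0);
    vec2 b = corner_shortcut(scale, 0, 0, 2, 1);
    assert(a.x == b.x && a.y == b.y);

    /* Rotation: each output component depends on both inputs, so it breaks. */
    vec2 c = map_point(rotate, 2, 0);
    vec2 d = corner_shortcut(rotate, 0, 0, 2, 1);
    assert(c.x != d.x || c.y != d.y);
}
```

The shortcut is only valid when the matrix keeps x and y independent; any rotation, shear or projective term couples them, so every corner has to go through the full multiplication.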
* i965/blorp: Use IMS layout when texturing from depth/stencil surfaces. | Paul Berry | 2012-07-20 | 1 | -23/+43

    Previously, on Gen7, when texturing from a depth or stencil surface, the
    blorp engine would configure the 3D pipeline as though the input surface
    was non-multisampled, and perform the necessary coordinate transformations
    in the fragment shader to account for the IMS layout. This meant
    outputting a lot of extra fragment shader code, and it raised some
    uncertainty about how to deal with very large surfaces.

    This patch modifies blorp to configure the 3D pipeline properly for IMS
    layout when reading from depth and stencil surfaces.

    Reviewed-by: Anuj Phogat <[email protected]>
* i965/blorp: Loosen assertions in compute_msaa_layout_for_pipeline. | Paul Berry | 2012-07-20 | 1 | -7/+2

    Previously, on Gen7, compute_msaa_layout_for_pipeline() would verify that
    IMS layout is not used. However, now that we configure SURFACE_STATE
    correctly for IMS surfaces, IMS layout is available.

    Reviewed-by: Anuj Phogat <[email protected]>
* i965/blorp: Configure SURFACE_STATE correctly for IMS surfaces. | Paul Berry | 2012-07-20 | 3 | -6/+14

    This patch modifies gen7_set_surface_num_multisamples() to set up the
    SURFACE_STATE appropriately for texturing from IMS format MSAA surfaces
    (which are only used on Gen7 for depth and stencil buffers). Since the
    function now sets more than just the number of multisamples, it's been
    renamed to gen7_set_surface_msaa().

    This will make it possible to remove some kludginess from the blorp
    engine.

    Reviewed-by: Anuj Phogat <[email protected]>
* i965/blorp: Optimize manual_blend() for compressed multisampled surfaces. | Paul Berry | 2012-07-20 | 1 | -0/+23

    When downsampling a compressed multisampled surface, we can take a
    shortcut to downsample any pixels that were completely covered by a
    single primitive. In this case, the first color value we fetch is the
    correct final color for the downsampled pixel, so we can skip the rest of
    the blending operation.

    Reviewed-by: Anuj Phogat <[email protected]>
* i965/blorp: Fix integer downsampling on Gen7. | Paul Berry | 2012-07-20 | 2 | -11/+55

    When downsampling an integer-format buffer on Gen7, we need to use the
    "avg" instruction rather than the "add" instruction, to ensure that we
    don't overflow the range of 32-bit integers. Also, we need to use the
    proper register type (BRW_REGISTER_TYPE_D or BRW_REGISTER_TYPE_UD) for
    intermediate color data and for writing to the render target.

    Note: this patch causes blorp to use the proper register type for all
    operations (downsampling, upsampling, and ordinary blits). Strictly
    speaking, this is only necessary for downsampling, because the other
    operations exclusively use MOV instructions on the color data. But it's
    simpler to use the proper register type in all cases.

    Reviewed-by: Anuj Phogat <[email protected]>
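The overflow concern in the commit above can be sketched in plain C for unsigned 32-bit samples. This is an illustration only, not the blorp shader code: `avg_ud()`, `downsample4_avg()` and `downsample4_add()` are hypothetical names, and reducing four samples as a tree of pairwise averages is an assumption suggested by the pairwise formula in the manual_blend() precision commit in this log.

```c
#include <assert.h>
#include <stdint.h>

/* Rounding-up average of two unsigned values; the 64-bit intermediate
 * means the sum itself cannot wrap. */
static uint32_t avg_ud(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a + b + 1) >> 1);
}

/* Assumed reduction: average pairs, then average the two results. */
static uint32_t downsample4_avg(const uint32_t s[4])
{
    return avg_ud(avg_ud(s[0], s[1]), avg_ud(s[2], s[3]));
}

/* Add-then-divide in 32 bits: the sum wraps modulo 2^32 for large samples. */
static uint32_t downsample4_add(const uint32_t s[4])
{
    return (s[0] + s[1] + s[2] + s[3]) / 4;
}

static void example(void)
{
    const uint32_t bright[4] = { UINT32_MAX, UINT32_MAX, UINT32_MAX, UINT32_MAX };
    const uint32_t small[4]  = { 1, 2, 3, 4 };

    assert(downsample4_avg(bright) == UINT32_MAX);  /* stays in range */
    assert(downsample4_add(bright) != UINT32_MAX);  /* wrapped sum */
    assert(downsample4_avg(small) == 3);            /* each step rounds up */
}
```

Because every averaging step rounds upward, the tree can land one above a truncating mean, as the `{1, 2, 3, 4}` case shows; it never leaves the range of the inputs.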
* i965/blorp: Modify manual_blend() to avoid unnecessary loss of precision. | Paul Berry | 2012-07-20 | 1 | -27/+90

    When downsampling from an MSAA image to a single-sampled image, it is
    inevitable that some loss of numerical precision will occur, since we
    have to use 32-bit floating point registers to hold the intermediate
    results while blending. However, it seems reasonable to expect that when
    all samples corresponding to a given pixel have the exact same color
    value, there will be no loss of precision.

    Previously, we averaged samples as follows:

        blend = (((sample[0] + sample[1]) + sample[2]) + sample[3]) / 4

    This had the potential to lose numerical precision when all samples have
    the same color value, since ((sample[0] + sample[1]) + sample[2]) may not
    be precisely representable as a 32-bit float, even if the individual
    samples are.

    This patch changes the formula to:

        blend = ((sample[0] + sample[1]) + (sample[2] + sample[3])) / 4

    This avoids any loss of precision in the event that all samples are the
    same, by ensuring that each addition operation adds two equal values.

    As a side benefit, this puts the formula in the form we will need in
    order to implement correct blending of integer formats.

    Reviewed-by: Anuj Phogat <[email protected]>
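The two observations in the commit above can be checked on the CPU with a small C sketch. It assumes `float` is IEEE 754 binary32, and `blend4_pairwise()` and `triple_is_rounded()` are hypothetical helper names. It only checks that the intermediate three-sample sum can be inexact and that the pairwise formula returns equal samples unchanged; it makes no claim about the final value the old formula yields on any particular hardware.

```c
#include <assert.h>
#include <stdbool.h>

/* The pairwise formula from the commit message, for four samples. */
static float blend4_pairwise(const float s[4])
{
    return ((s[0] + s[1]) + (s[2] + s[3])) / 4.0f;
}

/* True when (x + x) + x had to be rounded to fit in a 32-bit float.
 * The volatile stores force each partial sum down to float precision. */
static bool triple_is_rounded(float x)
{
    volatile float two_x = x + x;
    volatile float three_x = two_x + x;
    return (double)three_x != 3.0 * (double)x;
}

static void example(void)
{
    const float x = 16777215.0f;            /* 2^24 - 1, exactly representable */
    const float same[4] = { x, x, x, x };

    assert(triple_is_rounded(x));           /* 3x needs more than 24 bits */
    assert(!triple_is_rounded(1.0f));       /* small values are fine */
    assert(blend4_pairwise(same) == x);     /* x+x, 2x+2x and /4 are exact */
}
```

Adding two equal values only bumps the exponent, which is why each step of the pairwise form is exact as long as nothing overflows or underflows.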
* i965: Add support for AVG instruction. | Paul Berry | 2012-07-20 | 2 | -0/+23

    From the Ivy Bridge PRM, Vol4 Part3 p152:

        "The avg instruction performs component-wise integer average of src0
        and src1 and stores the results in dst. An integer average uses
        integer upward rounding. It is equivalent to increment one to the
        addition of src0 and src1 and then apply an arithmetic right shift to
        this intermediate value."

    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Anuj Phogat <[email protected]>
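As a reading aid for the quoted PRM text, the semantics for one signed 32-bit component could be modelled in C roughly as follows. `avg_d()` is a hypothetical name, and this is a model of the quoted sentence only, not of any other hardware behavior (saturation, source modifiers and so on are not considered).

```c
#include <assert.h>
#include <stdint.h>

/* (src0 + src1 + 1) arithmetically shifted right by one.
 * The sum is formed in 64 bits so it cannot overflow. The shift is written
 * as a floor division because right-shifting a negative value is
 * implementation-defined in C. */
static int32_t avg_d(int32_t src0, int32_t src1)
{
    int64_t sum = (int64_t)src0 + (int64_t)src1 + 1;
    int64_t half = sum >= 0 ? sum / 2 : -((1 - sum) / 2);
    return (int32_t)half;
}

static void example(void)
{
    assert(avg_d(1, 2) == 2);                           /* 1.5 rounds up */
    assert(avg_d(-1, -2) == -1);                        /* -1.5 rounds up */
    assert(avg_d(INT32_MAX, INT32_MAX) == INT32_MAX);   /* no overflow */
    assert(avg_d(INT32_MIN, INT32_MIN) == INT32_MIN);
}
```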
* i965: Replace fs_visitor::kill_emitted with gl_fragment_program::UsesKill. | Paul Berry | 2012-07-20 | 2 | -4/+1

    The kill_emitted variable was duplicating the functionality of
    gl_fragment_program::UsesKill. There's no need for both.

    Reviewed-by: Eric Anholt <[email protected]>
* mesa: Set gl_fragment_program::UsesKill in do_set_program_inouts. | Paul Berry | 2012-07-20 | 4 | -33/+14

    Previously, the code for setting this flag for GLSL programs was
    duplicated in three places: brw_link_shader(), glsl_to_tgsi_visitor, and
    ir_to_mesa_visitor. In addition to the unnecessary duplication, there was
    a performance problem on i965: brw_link_shader() set the flag before
    doing its final round of optimizations, which meant that if the
    optimizations managed to eliminate all the discard operations, the flag
    would still be set, resulting (at least in theory) in slower performance.

    This patch consolidates all of the code that sets UsesKill for GLSL
    programs into do_set_program_inouts(), which already is doing a similar
    job for UsesDFdy, and which occurs after i965's final round of
    optimizations.

    Non-GLSL programs (ARB programs and the state tracker's glBitmap program)
    are unaffected.

    Reviewed-by: Eric Anholt <[email protected]>
* gallium-egl: Move wayland query_buffer implementationKristian Høgsberg2012-07-197-32/+54
| | | | | | | Move it to native_wayland_drm_bufmgr_helper.c which only gets compiled when wayland is enabled and which already includes the right headers. Signed-off-by: Kristian Høgsberg <[email protected]>
* softpipe: Fix segfault with fbo-cubemap. | Olivier Galibert | 2012-07-19 | 1 | -1/+6

    The cube sampler generates two-dimensional texture coordinates and hence
    passes NULL for the array for the third one. The actual 2D sampler, lower
    in the pipe, knew not to use that array since it didn't need it.

    But the samplers have become single-texel and the coordinate array
    dereference has been moved up one step, to a level where the code does
    not know only two coordinates are used. Hence the segfault.

    The simplest fix by far is to add a third dummy coordinate array in the
    call to the next pipe step, which will be dereferenced to a harmless 0
    which then will be happily ignored by the sampler.

    Fixes https://bugs.freedesktop.org/show_bug.cgi?id=52250

    Signed-off-by: Olivier Galibert <[email protected]>
    Reviewed-by: Brian Paul <[email protected]>
* wayland: Support EGL_WIDTH and EGL_HEIGHT queries for wl_bufferKristian Høgsberg2012-07-193-5/+23
| | | | | | We're going to make the public wl_buffer struct as small as possible. Signed-off-by: Kristian Høgsberg <[email protected]>
* wayland: Use existing EGL_TEXTURE_FORMAT for querying wl_buffer texture formatKristian Høgsberg2012-07-194-55/+46
| | | | | | | | We also reuse EGL_TEXTURE_RGBA and EGL_TEXTURE_RGB, adding only the new planar YUV texture formats: EGL_TEXTURE_Y_U_V_WL, EGL_TEXTURE_Y_UV_WL and EGL_TEXTURE_Y_XUXV_WL. Signed-off-by: Kristian Høgsberg <[email protected]>
* gallium-egl: Implement eglQueryWaylandBufferWLKristian Høgsberg2012-07-191-1/+31
| | | | | | Support this query for gallium EGL too. Signed-off-by: Kristian Høgsberg <[email protected]>
* glsl: Remove open coded version of ir_variable::interpolation_string().Kenneth Graunke2012-07-191-15/+1
| | | | | | | | Presumably the function didn't exist when we wrote this code. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Avoid unnecessary recompiles for shaders that don't use dFdy(). | Paul Berry | 2012-07-19 | 4 | -14/+10

    The i965 back-end needs to compile dFdy() differently for FBOs and window
    system framebuffers, because Y coordinates are flipped between the two
    (see commit 82d2596: i965: Compute dFdy() correctly for FBOs).

    This patch avoids unnecessarily recompiling shaders that don't use
    dFdy(), by only setting render_to_fbo in the wm program key if the shader
    actually uses dFdy().

    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
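The key-based caching idea in the commit above can be sketched as follows. This is not the i965 code: `struct wm_key_sketch`, `make_key()` and `needs_recompile()` are hypothetical, and a real program key carries many more fields. The point is only that a field which does not affect the generated code should not vary in the key.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, heavily reduced stand-in for a fragment program key. */
struct wm_key_sketch {
    bool render_to_fbo;
};

/* Record the FBO/window distinction only if the shader can observe it. */
static struct wm_key_sketch make_key(bool uses_dfdy, bool drawing_to_fbo)
{
    struct wm_key_sketch key;
    memset(&key, 0, sizeof key);
    key.render_to_fbo = uses_dfdy && drawing_to_fbo;
    return key;
}

/* A cached program is reusable only when its key matches byte for byte. */
static bool needs_recompile(const struct wm_key_sketch *cached,
                            const struct wm_key_sketch *wanted)
{
    return memcmp(cached, wanted, sizeof *cached) != 0;
}

static void example(void)
{
    /* Shader without dFdy(): same key for window and FBO rendering. */
    struct wm_key_sketch win = make_key(false, false);
    struct wm_key_sketch fbo = make_key(false, true);
    assert(!needs_recompile(&win, &fbo));

    /* Shader with dFdy(): the two targets need different code. */
    struct wm_key_sketch win_d = make_key(true, false);
    struct wm_key_sketch fbo_d = make_key(true, true);
    assert(needs_recompile(&win_d, &fbo_d));
}
```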
* glsl: Set UsesDFdy appropriately for GLSL shaders.Paul Berry2012-07-191-5/+17
| | | | | | | | | | | | | | | | This patch updates the ir_set_program_inouts_visitor so that it also sets gl_fragment_program::UsesDFdy. This is a bit of a hack (since dFdy() isn't an input or an output), but there's no other obvious visitor to squeeze this functionality into, and it would be silly to create a brand new visitor just for this purpose. v2: use local 'fprog' var to avoid repeated casting. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Set UsesDFdy appropriately for assembly programs.Paul Berry2012-07-193-0/+4
| | | | | | Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Add UsesDFdy to struct gl_fragment_program.Paul Berry2012-07-192-0/+3
| | | | | | | | | | | | The i965 back-end needs to compile dFdy() differently for FBOs and window system framebuffers, because Y coordinates are flipped between the two (see commit 82d2596: i965: Compute dFdy() correctly for FBOs). This boolean will allow it to avoid unnecessarily recompiling shaders that don't use dFdy(). Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* drirc: Add disable_blend_func_extended workaround for Unigine OilRush.Kenneth Graunke2012-07-191-0/+6
| | | | | | | | | | The previous commit implemented the workaround, cited a bug report about OilRush, but actually only enabled the workaround for the demos. Turn it on for OilRush too. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50291 Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Add a driconf option to disable GL_ARB_blend_func_extended. | Kenneth Graunke | 2012-07-19 | 4 | -2/+17

    Unigine Heaven (at least) has a bug where it incorrectly uses the
    GL_ARB_blend_func_extended extension.

    Dual source blending allows two color outputs per render target;
    individual shader outputs can be assigned to be either the first or
    second blending input by setting the 'index' via one of two methods:

    - An API call: glBindFragDataLocationIndexed()
    - The GLSL 'layout' qualifier provided by GL_ARB_explicit_attrib_location

    Both of these only work on user defined fragment shader outputs; it's an
    error to use either on built-in outputs like gl_FragData.

    Unigine uses gl_FragData and gl_FragColor exclusively, and doesn't even
    attempt to use either method to set index == 1. However, it does set the
    blending function to SRC1 enums, which requires a fragment shader output
    with index == 1 or else rendering is undefined.

    In other words, enabling ARB_blend_func_extended causes Unigine to render
    incorrectly, resulting in an apparent regression, even though our driver
    code (as far as I can tell) is perfectly fine.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=50291
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chad Versace <[email protected]>
    Reviewed-by: Eric Anholt <[email protected]>
* mesa: remove stale commentBrian Paul2012-07-181-1/+0
|
* mesa: use gl_program cast wrappersBrian Paul2012-07-186-49/+37
| | | | | | | In a few cases, remove unneeded casts. And fix a few other const-correctness issues. Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: add some gl_program cast wrappersBrian Paul2012-07-181-0/+42
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* r600g: setup streamout before calling last r600_need_cs_space before drawing | Marek Olšák | 2012-07-18 | 1 | -6/+6

    This fixes CS checker errors due to registers not being initialized,
    because the flush occurred after dirty state was emitted but before
    drawing.
* i965/fs: Make register spill/unspill only do the regs for that instruction. | Eric Anholt | 2012-07-18 | 1 | -33/+33

    Previously, if we were spilling the result of a texture call, we would
    store all 4 regs, then for each use of one of those regs as the source of
    an instruction, we would unspill all 4 regs even though only one was
    needed.

    In both lightsmark and l4d2 with my current graphics config, the shaders
    that produce spilling do so on split GRFs, so this doesn't help them out.
    However, in a capture of the l4d2 shaders with a different snapshot and
    playing the game instead of using a demo, it reduced one shader from 2817
    instructions to 2179, due to choosing a now-cheaper texture result to
    spill instead of piles of texcoords.

    v2: Fix comment noted by Ken, and fix the if condition associated with it
        for the current state of what constitutes a partial write of the
        destination.

    Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* i965/fs.h: Refactor tests for instructions modifying a register.Eric Anholt2012-07-184-34/+16
| | | | | | | | | | There's one instance of a potential behavior change: propagate_constants may now propagate into a part of a vgrf after a different part of it was overwritten by a send that returns multiple registers. I don't think we ever generate IR that meets that condition, but it's something to note if we bisect behavior change to this. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Replace usage is_tex() with regs_written() checks.Eric Anholt2012-07-181-9/+9
| | | | | | | | | | In these places, we care about any sort of send that hits more than one reg, not just textures. We don't yet have anything else returning more than one reg, so there's no change. v2: Use mlen instead of is_tex() for the is-it-a-send check. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Rename virtual_grf_next to virtual_grf_count.Eric Anholt2012-07-186-22/+21
| | | | | | | "count" is a more useful name, since most of the time we're using it for looping over the variables. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Move a block out of a loop in live variables setup.Eric Anholt2012-07-181-4/+5
| | | | | | This was accidentally copy-and-pasted inside. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/msaa: Disable alpha-to-{coverage, one} when drawbuffer zero is in integer format | Anuj Phogat | 2012-07-18 | 1 | -7/+21

    OpenGL specification 3.3 (page 196), section 4.1.3 says:

        "If drawbuffer zero is not NONE and the buffer it references has an
        integer format, the SAMPLE_ALPHA_TO_COVERAGE and SAMPLE_ALPHA_TO_ONE
        operations are skipped."

    This should work properly even if there are other draw buffers that are
    not in integer format.

    This patch makes the following piglit tests pass on mesa:

        int-draw-buffers-alpha-to-coverage
        int-draw-buffers-alpha-to-one

    Reviewed-by: Chad Versace <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
    Signed-off-by: Anuj Phogat <[email protected]>
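The rule quoted from the specification in the commit above depends only on draw buffer zero, which a short C sketch can make explicit. The struct and function names below are hypothetical and do not correspond to Mesa's state structures.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of one draw buffer binding. */
struct draw_buffer_sketch {
    bool is_none;      /* bound to NONE */
    bool is_integer;   /* references an integer-format buffer */
};

/* Returns whether an alpha-to-coverage or alpha-to-one operation that the
 * application enabled should actually run. Only bufs[0] is consulted. */
static bool alpha_op_active(bool enabled,
                            const struct draw_buffer_sketch *bufs,
                            unsigned count)
{
    if (count > 0 && !bufs[0].is_none && bufs[0].is_integer)
        return false;
    return enabled;
}

static void example(void)
{
    /* Buffer zero is integer, buffer one is not: the operation is skipped. */
    const struct draw_buffer_sketch int_first[2] = {
        { false, true }, { false, false }
    };
    /* Buffer zero is not integer, buffer one is: the operation still runs. */
    const struct draw_buffer_sketch int_second[2] = {
        { false, false }, { false, true }
    };

    assert(!alpha_op_active(true, int_first, 2));
    assert(alpha_op_active(true, int_second, 2));
}
```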
* st/xorg: attach EDID to outputsLucas Stach2012-07-181-1/+36
| | | | | | | | Allows tools like GNOME's monitor configuration to show meaningful names. v2: fix resource leak Signed-off-by: Lucas Stach <[email protected]>
* st/xorg: remove superfluous memsetLucas Stach2012-07-181-2/+0
| | | | | | exaDriverAlloc() uses calloc, which already initialises pExa to zero. Signed-off-by: Lucas Stach <[email protected]>
* st/xorg: reorder exa context creation and use screen param queriesLucas Stach2012-07-181-7/+8
| | | | | | | | | Gives the x-server a more accurate description of the exa hardware capabilities. v2: drop NPOT check Signed-off-by: Lucas Stach <[email protected]>
* softpipe: Take all lods into account when texture sampling.Olivier Galibert2012-07-182-766/+645
| | | | | | | | | | | | This patch churns a lot because it needs to change 4-wide filters into single pixel filters, since each fragment may use a different filter. The only case not entirely supported is the anisotropic filtering. Not sure what we want to do there, since a full quad is required by that filter. Signed-off-by: Olivier Galibert <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* r600g: implement wait-free buffer transfer for DISCARD_RANGEMarek Olšák2012-07-184-16/+50
| | | | | Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Christian König <[email protected]>
* r600g: accelerate buffer copyingMarek Olšák2012-07-181-23/+47
| | | | | | | This will be useful for efficient handling of the DISCARD transfer flags. Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Christian König <[email protected]>
* r600g: update R600_MAX_DRAW_CS_DWORDS to take draw-opaque into accountMarek Olšák2012-07-182-4/+2
|
* r600g: move VGT_STRMOUT_DRAW_OPAQUE_OFFSET initialization into invariant stateMarek Olšák2012-07-183-1/+3
|
* r600g: only set the index type if drawing is indexedMarek Olšák2012-07-181-4/+5
|
* r600g: remove debug code for streamoutMarek Olšák2012-07-181-11/+0
|
* r600g: inline r600_context_draw_opaque_countMarek Olšák2012-07-183-32/+21
|
* r600g: fix alphatest without a colorbuffer on evergreenMarek Olšák2012-07-181-1/+4
|
* r600g: fix alphatest without a colorbuffer on r6xx-r7xxMarek Olšák2012-07-181-6/+10
|
* r600g: always derive alphatest state from the first colorbufferMarek Olšák2012-07-184-14/+22
|
* r600g: atomize alphatest stateMarek Olšák2012-07-186-46/+52
|
* r600g: try to fix line stippling with lineloopsMarek Olšák2012-07-181-1/+2
| | | | The piglit test is failing, but visually it looks almost correct.
* r600g: optimize uploading depth texturesMarek Olšák2012-07-181-11/+5
| | | | | | | Make it only copy the portion of a depth texture being uploaded and not the whole 2D layer. There is also a little code cleanup.
* r600g: remove needless wrapper r600_texture_depth_flushMarek Olšák2012-07-183-35/+15
|