summaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers
Commit message (Collapse)AuthorAgeFilesLines
* intel: Add I8 and L8 to intel_mesa_format_to_rb_datatype().Eric Anholt2011-04-181-0/+2
| | | | | | Fixes warnings in fbo-storage-formats. Reviewed-by: Brian Paul <[email protected]>
* Revert "intel: Add spans code for the ARB_texture_rg support."Eric Anholt2011-04-181-122/+0
| | | | | | | | | This reverts what remains of commit 28bab24e1698843e27d27204a1117066e7ffeabb. It was garbage, trying to use a MESA_FORMAT enum as a preprocessor token, and I don't know how I thought it was even tested. Reviewed-by: Brian Paul <[email protected]>
* intel: Use mesa core's R8, RG88, R16, RG1616 RB accessors.Eric Anholt2011-04-181-25/+4
| | | | | | | Fixes: ARB_texture_rg/fbo-alphatest-formats Reviewed-by: Brian Paul <[email protected]>
* intel: Use Mesa core's renderbuffer accessors for depth.Eric Anholt2011-04-181-33/+15
| | | | | | | | | | | Since we're using GTT mappings now (no manual detiling), there's really nothing special to accessing these buffers, other than needing the new RowStride field of gl_renderbuffer to accomodate padding. Reduces the driver size by 2.7kb, and improves glean depthStencil performance 3-10x (!) Reviewed-by: Brian Paul <[email protected]>
* intel: Use _mesa_base_tex_format for FBO texture attachments.Eric Anholt2011-04-181-1/+1
| | | | | | | | | | | | The _mesa_base_fbo_format variant doesn't handle some texture internalformats, such as "3". Fixes: fbo-blending-formats. fbo-alphatest-formats EXT_texture_sRGB/fbo-alphatest-formats Reviewed-by: Brian Paul <[email protected]>
* i965: Quit spamming gen6 DP read/write send instructions with gen5 bits.Eric Anholt2011-04-171-6/+0
| | | | | | This was copy-and-paste from originally trying to get DP read/write working reliably, and notably for other common messages (URB, sampler) we weren't doing this.
* i965/fs: Add gen6 register spilling support.Eric Anholt2011-04-175-31/+58
| | | | | | | | | | | Most of this is code movement to get the scratch space allocated in a shared location. Other than that, the only real changes are that the old oword block messages now operate on oword-aligned areas (with new messages for unaligned access, which we don't do), and that the caching control is in the SFID part of the descriptor instead of message control. Fixes glsl-fs-convolution-1.
* r300/compiler: Fix incorrect presubtract conversionTom Stellard2011-04-161-0/+24
| | | | | | | ADD instructions with constant swizzles can't be converted to presubtract operations. NOTE: This is a candidate for the 7.9 and 7.10 branches.
* Revert "r300/compiler: Don't try to convert RGB to Alpha in full instructions"Marek Olšák2011-04-151-2/+1
| | | | | | This reverts commit cd2857fae16e1352f39b37f611797e66619d3fe5. It breaks Unigine Heaven.
* mesa: finish up ARB_texture_floatMarek Olšák2011-04-152-2/+0
| | | | | | | | | | | | | | | Squashed commit of the following: Author: Marek Olšák <[email protected]> mesa: handle floating-point formats in _mesa_base_fbo_format mesa: add ARB/ATI_texture_float, remove MESAX_texture_float commit 123bb110852739dffadcc81ad80b005b1c4f586d Author: Luca Barbieri <[email protected]> Date: Wed Aug 25 01:35:42 2010 +0200 mesa: compute floatMode for FBOs and return it on RGBA_FLOAT_MODE
* i965/fs: Constant-fold immediates in src0 of SEL instructions.Eric Anholt2011-04-134-0/+16
| | | | | | | | | | | This is like what we do for add/mul, but we have to invert the predicate to choose the other source instead. This removes 5 extra moves of constants in nexuiz shaders. No statistically significant performance difference on my Sandybridge laptop (n=5). Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: Constant-fold immediates in src0 of CMP instructions.Eric Anholt2011-04-133-0/+45
| | | | | | | This is like what we do with add/mul, but we also have to flip the conditional test. Reviewed-by: Ian Romanick <[email protected]>
* i965: Change assertion condition from implicit to explicitChad Versace2011-04-121-2/+1
| | | | | | | | | | | | | ... because grokking explicit assertions requires fewer neurons. In brw_misc_state.c:emit_depthbuffer, change assertion condition tiling != I915_TILING_X && tiling != I915_TILING_NONE to tiling == I915_TILING_Y Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* i965: Define BRW_DEPTHFORMAT_D24_UNORM_X8_UINTChad Versace2011-04-121-0/+1
| | | | | | | This depth format was added in Gen5. Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* i965: Document brw_context.state.depth_regionChad Versace2011-04-122-1/+23
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* i965: Remove unnecessary release/reference of brw_context.state.depth_regionChad Versace2011-04-121-6/+4
| | | | | | | | Release the old depth region and reference the new one *only* if it has changed. Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Chad Versace <[email protected]>
* i965: Add comments about URB size units and limits.Kenneth Graunke2011-04-122-4/+10
| | | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Acked-by: Chris Wilson <[email protected]>
* i965: Never enable the GS on Gen6.Kenneth Graunke2011-04-121-32/+16
| | | | | | | | | | | | | Prior to Gen6, we use the GS for breaking down quads, quad-strips, and line loops. On Gen6, earlier stages already take care of this, so we never need the GS. Since this code is likely completely untested, remove it for now. We can write new code when enabling real geometry shaders. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* Revert "i965: Reinstate max-index paranoia"Chris Wilson2011-04-121-1/+1
| | | | | | | | | | | | | | | | | | This reverts commit b4cbd2b312d53a50603e2cda925711bc9def4517. It looked like a safe sanity check. It missed the issue of the start of the buffer not being at 0, but even that was not enough to explain why setting the max vertex index caused glyphs to be dropped from the game 'Achron'. Instead, the issue appears to be related to the use of the vertex bias and so we would need to re-emit the max-index every time we adjusted the bias, so re-emitting the relocations and defeating the original optimisation. Reported-and-tested-by: Thomas Jones <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=35163 Signed-off-by: Chris Wilson <[email protected]>
* nouveau_vieux: fix build since sampler objects mergeDave Airlie2011-04-123-12/+12
|
* r600: silence various compiler warningsBrian Paul2011-04-117-10/+19
|
* Merge branch 'arb_sampler_objects'Brian Paul2011-04-1134-228/+232
|\
| * mesa: fixup r600 DRI driver for sampler object changesBrian Paul2011-04-114-26/+26
| |
| * mesa: move sampler state into new gl_sampler_object typeBrian Paul2011-04-1034-218/+222
| | | | | | | | | | | | gl_texture_object contains an instance of this type for the regular texture object sampling state. glGenSamplers() generates new instances of gl_sampler_object which can override that state with glBindSampler().
* | Revert "i965: clear global offset to zero in m0.2 for VS DP read."Zou Nan hai2011-04-121-9/+0
| | | | | | | | | | This reverts commit 66b66295d0bc856c69fdcccc22575580c7ecee16. it was already fixed by commit 9d60a7ce08a67eb8b79c60f829d090ba4a37ed7e
* | i965: Remove hint_gs_always and resulting dead codeIan Romanick2011-04-113-76/+13
| | | | | | | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* | intel: Fix ROUND_DOWN_TO macroIan Romanick2011-04-111-3/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously the macro would (ALIGN(value - alignment - 1, alignment)). At the very least, this was missing parenthesis around "alignment - 1". As a result, if value was already aligned, it would be reduced by alignment. Condisder: x = ROUND_DOWN_TO(256, 128); This becomes: x = ALIGN(256 - 128 - 1, 128); Or: x = ALIGN(127, 128); Which becomes: x = 128; This macro is currently only used in brw_state_batch (brw_state_batch.c). It looks like the original version of this macro would just use too much space in the batch buffer. It's possible, but not at all clear to me from the code, that the original behavior is actually desired. In any case, this patch does not cause any piglit regressions on my Ironlake system. I also think that ALIGN_FLOOR would be a better name for this macro, but ROUND_DOWN_TO matches rounddown in the Linux kernel. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Keith Whitwell <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* | i965: Move the SF VP from state caching to state streaming.Eric Anholt2011-04-113-8/+17
| | | | | | | | | | | | | | | | | | This is a 49.6% +/- 2.0% (n=9, IPS outlier removed) performance improvement for the hacked-up-for-cache-misses scissor-many, and no statistically significant performance difference for the hacked-up-for-cache-hits version (n=9, IPS outlier removed). No statistically significant performance difference from ETQW (n=5) from these last two commits.
* | i965: Change the SF unit from state caching to state streaming.Eric Anholt2011-04-113-107/+45
| | | | | | | | | | | | | | | | | | This is a 28.1% +/- 1.4% (n=10) performance improvement for the hacked-up-for-cache-misses scissor-many (n=10), and no statistically significant wall-time performance difference for the hacked-up-for-cache-hits version (n=9, first outlier in each removed since IPS was warming up. User time increased by about 4.7%, but kernel time decreased equivalently).
* | i965: Turn SF unit and viewport structs into pointers to prep for streaming.Eric Anholt2011-04-111-69/+70
|/ | | | I wanted to separate this mechanical change from the actual work.
* Fix GET_PROGRAM_NAME() on Solaris to not try to modify a read-only stringAlan Coopersmith2011-04-081-1/+19
| | | | Signed-off-by: Alan Coopersmith <[email protected]>
* i965/fs: Remove broken optimization for live intervals in loops.Eric Anholt2011-04-081-4/+2
| | | | | | | | | The theory here was to detect a temporary variable used within a loop, and avoid considering it live across the entire loop. However, it was overeager and failed when the first definition of the variable appeared within the loop but was only conditionally defined. Fixes glsl-fs-loop-redundant-condition.
* i965: clear global offset to zero in m0.2 for VS DP read.Zou Nan hai2011-04-071-0/+9
| | | | Signed-off-by: Zou Nan hai <[email protected]>
* r300/compiler: Don't try to convert RGB to Alpha in full instructionsTom Stellard2011-04-061-1/+2
| | | | Note: This is a candidate for the 7.10 branch.
* dri: Remove driver date from renderer stringIan Romanick2011-04-0516-50/+14
| | | | | | Reviewed-by: Corbin Simpson <[email protected]> Reviewed-by: Brian Paul <[email protected]> Tested-by: Sedat Dilek <[email protected]>
* r300/compiler: Fix vertex shader MAD instructions with constant swizzlesTom Stellard2011-04-051-0/+18
| | | | NOTE: This is a candidate for the 7.9 and 7.10 branches.
* r300c: fix build since last r300g commitDave Airlie2011-04-051-1/+1
|
* r300g: fix RG/LATC1_SNORM by doing UNORM->SNORM conversion in the shaderMarek Olšák2011-04-052-6/+60
|
* r300/compiler: implement the CND opcodeMarek Olšák2011-04-056-1/+16
| | | | No one uses it now, but I will need it for a lowering pass.
* r300/compiler: set the MSB of ADDR for inline constantsMarek Olšák2011-04-051-2/+5
| | | | The docs say so.
* i965: Add the missing supportable EXT_texture_snorm formatsIan Romanick2011-04-042-0/+9
| | | | | | | | | | | | This class of hardware can natively sample all of the snorm surface formats that DX10 requires, but it can't do some of the legacy GL formats. In particular, all of the alpha, luminance, and intensity formats are unsupported. This partially fixes the breakage in glean's pixelFormats test since GL_EXT_texture_snorm support was added to Mesa. Reviewed-by: Kenneth Graunke <[email protected]>
* r300/compiler: apply the texture swizzle to shadow pass and fail values tooMarek Olšák2011-04-041-8/+20
| | | | | | | | | | | | Piglit tests: - glsl-fs-shadow2d-01 - glsl-fs-shadow2d-02 - glsl-fs-shadow2d-03 - fs-shadow2d-red-01 - fs-shadow2d-red-02 - fs-shadow2d-red-03 NOTE: This is a candidate for the stable branches.
* r300/compiler: propagate SaturateMode down to the result of shadow comparisonMarek Olšák2011-04-041-0/+3
| | | | NOTE: This is a candidate for the stable branches.
* r600c: add new ontario pci idsAlex Deucher2011-04-042-0/+4
| | | | Signed-off-by: Alex Deucher <[email protected]>
* Revert "r300/compiler: Remove obsolete compiler passes"Tom Stellard2011-04-027-0/+415
| | | | | | | This reverts commit 9f013a8233197d4a0482661cb37cfeac1a61b804. These passes are still need for non-GLSL paths like g3dvl and ARB programs.
* i965/fs: Switch W and 1/W in Sandybridge interpolation setup.Kenneth Graunke2011-04-021-4/+4
| | | | | | | | | | | | Various documentation mentions that "W" is handed to the WM stage, but further digging seems to indicate that they really mean 1/W. The code here is still unclear, but changing this fixes piglit test "fragcoord_w" on Sandybridge as well as a Khronos ES2 conformance test. I also tested 3DMarkMobile ES2.0's taiji and hoverjet demos, as well as Nexuiz, just to be safe. NOTE: This is a candidate for the 7.10 branch.
* i965: Fix null register use in Sandybridge implied move resolution.Kenneth Graunke2011-04-031-9/+8
| | | | | | | | | | | Fixes regressions caused by commit 9a21bc6401, namely GPU hangs when running gnome-shell or compiz (Mesa bugs #35820 and #35853). I incorrectly refactored the case that dealt with ARF_NULL; even in that case, the source register needs to be changed to the MRF. NOTE: This is a candidate for the 7.10 branch (if 9a21bc6401 is cherry-picked, take this one too).
* i965: Fix the VS thread limits for GT1, and clarify the WM limits on both.Eric Anholt2011-04-013-4/+13
|
* r300/compiler: Remove obsolete compiler passesTom Stellard2011-03-317-415/+0
| | | | | | Branch emulation and loop unrolling are done in the GLSL frontend. Transforming loops is no longer needed for fragment shaders, but it is still necessary for vertex shaders.
* intel: Fix regression in clear_with_blit from 7bae1c3dChris Wilson2011-03-311-11/+12
| | | | | | | | | | Oops, the mask was being used in the loop to determine whether to use include the stencil || depth values. This began to fail when mask was cleared at the beginning of the loop. So reorder the tests and do the work up-front along with determining the depth_stencil value to use. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=35822 Signed-off-by: Chris Wilson <[email protected]>