path: root/src/mesa/drivers
Columns: commit message, author, age, files changed, lines changed (-removed/+added)
* i965/gen7: Skip checking if we need a GS program for now.Eric Anholt2012-02-211-1/+0
| | | | | | | We always say no. Improves VS state change microbenchmark performance 7.68747% +/- 1.40826% (n=10). Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Compute required barycentric interp modes once at FS compile time.Eric Anholt2012-02-214-20/+17
| | | | | | | Improves VS state change microbenchmark performance 1.78817% +/- 0.556878% (n=25). Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move VUE map computation to once at VS compile time.Eric Anholt2012-02-2112-49/+42
| | | | | | | | | | With this and the previous patch, 640x480 nexuiz is running 0.169118% +/- 0.0863696% faster (n=121). On a VS state change microbenchmark, performance is increased 8.28645% +/- 0.460478% (n=52). v2: Fix CACHE_NEW_VS comment. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Make the userclip flag for the VUE map come from VS prog data.Eric Anholt2012-02-2110-39/+29
| | | | | | | | This reduces recomputation of state based on non-clipping-related transform changes, and is a step toward removing VUE map recomputation. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Make the dummy fragment shader work in SIMD16 mode.Kenneth Graunke2012-02-181-5/+7
| | | | | | | | | | If you're resorting to the dummy shader, you've probably already turned off SIMD16 mode. But if you didn't, it would die in a fire. We could either fail to compile in SIMD16 mode...or just fix it. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Fix GPU hangs in the dummy fragment shader.Kenneth Graunke2012-02-181-0/+2
| | | | | | | | The dummy FB write failed to specify EOT and a message length, causing the GPU to hang. Now we can enjoy "everyone's favorite color" again. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* intel: Fix rendering from textures after RenderTexture().Eric Anholt2012-02-175-61/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There's a serious trap for drivers: RenderTexture() does not indicate that the texture is currently bound to the draw buffer, despite FinishRenderTexture() signaling that the texture is just now being unbound from the draw buffer. We were acting as if RenderTexture() *was* the start of rendering and that we could make texturing incoherent with the current contents of the renderbuffer. This caused intel oglconform sRGB Mipmap.1D_textures to fail, because we got a call to TexImage() and thus RenderTexture() on a texture bound to a framebuffer that wasn't the draw buffer, so we skipped validating the new image into the texture object used for rendering. We can't (easily) make RenderTexture() indicate the start of drawing, because both our driver and gallium are using it as the moment to set up the renderbuffer wrapper used for things like MapRenderbuffer(). Instead, postpone the setup of the workaround render target miptree until update_renderbuffer time, so that we no longer need to skip validation of miptrees used as render targets. As a bonus, this should make GL_NV_texture_barrier possible. (This also fixes a regression in the gen4 small-mipmap rendering since 3b38b33c1648b07e75dc4d8340758171e109c598, which switched set_draw_offset from image->mt to irb->mt but didn't move the irb->mt replacement up before set_draw_offset). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44961 NOTE: This is a candidate for the 8.0 branch.
* intel: Improve the fallback debug for framebuffer status checks.Eric Anholt2012-02-171-2/+17
|
* i965: Emit Ivybridge VS workaround flushes.Kenneth Graunke2012-02-154-2/+29
| | | | | | | | | | | | | I recently discovered this text in the BSpec. It seems wise to comply, though I haven't observed it to fix anything yet. Fixes a regression in glean/fbo since 28cfa1fa213fe. NOTE: This is a candidate for stable release branches. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45221 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Take # of components into account in try_rewrite_rhs_to_dst.Kenneth Graunke2012-02-151-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | Commit dc7f449d1ac53a66e6efb56ccf2a5953418a26ca introduced a new method for avoiding MOVs: try to rewrite the destination of the instruction that produced the RHS so it writes into the LHS. Unfortunately, this is not safe for swizzled texturing operations, as they return a set of four contiguous registers. Consider the following: (assign (x) (var_ref vec_ctor_x) (swiz x (tex vec4 (var_ref m_sampY) (var_ref m_cordY) 0 1 ()))) In this case, the source and destination registers are equal, since reg_offset is 0 for both. Yet, this is only a partial move: the texture operation generates four registers, and the LHS only covers one. Fixes color distortion in XBMC when using GLSL shaders. NOTE: This is a candidate for the 8.0 branch (with the previous commit). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44333 Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]>
* i965/fs: Add a new fs_inst::regs_written function.Kenneth Graunke2012-02-151-0/+12
| | | | | | | | | | | | | Certain instructions write more than one register. Texturing, for example, returns 4 registers. (We set rlen to 4 even for TXS and float shadow sampling.) Some math functions return 2. Most return 1. The next commit introduces a use of this function. NOTE: This is a candidate for the 8.0 branch (dependency of a fix). Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]>
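The register counts described above (4 for texturing, 2 for some math functions, 1 for most instructions) and the safety check that the following commit builds on them can be sketched roughly as below. This is only an illustration in plain C: the enum, the function names, and the exact comparison are assumptions, not the driver's actual `fs_inst::regs_written` code, which is not shown in this log.

```c
#include <stdbool.h>

/* Hypothetical instruction classes; the real driver inspects opcodes. */
enum fs_op_kind {
    FS_OP_TEXTURE,        /* texturing: returns 4 contiguous registers */
    FS_OP_MATH_TWO_REG,   /* math functions that return 2 registers */
    FS_OP_OTHER           /* everything else: 1 register */
};

/* Number of registers an instruction of the given class writes. */
int regs_written(enum fs_op_kind kind)
{
    switch (kind) {
    case FS_OP_TEXTURE:
        return 4;
    case FS_OP_MATH_TWO_REG:
        return 2;
    default:
        return 1;
    }
}

/*
 * Assumed form of the guard: retargeting the producing instruction's
 * destination to the LHS is only safe when the instruction writes exactly
 * as many registers as the LHS covers. Otherwise it is a partial move.
 */
bool can_rewrite_rhs_to_dst(enum fs_op_kind rhs_kind, int lhs_components)
{
    return regs_written(rhs_kind) == lhs_components;
}
```

With this sketch, a swizzled texture result assigned to a single component (`can_rewrite_rhs_to_dst(FS_OP_TEXTURE, 1)`) is rejected, which matches the XBMC case described in the following commit.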
* meta: Avoid FBO resizing/reallocating in decompress_texture_imageAnuj Phogat2012-02-151-1/+1
| | | | | | | | | | Reallocate/resize decompress FBO only if texture image width/height is greater than existing decompress FBO width/height. This is a candidate for stable branches. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Brian Paul <[email protected]>
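A minimal sketch of the grow-only policy described above, assuming a small struct that caches the decompress FBO dimensions. The struct and function names are hypothetical, and growing each dimension independently is an assumption made for illustration; the actual one-line change in meta is not reproduced here.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the cached decompress FBO size. */
struct decompress_fbo {
    int width;
    int height;
};

/*
 * Grow the cached FBO only when the requested image does not fit.
 * Returns true when a reallocation would be needed, false when the
 * existing storage can be reused.
 */
bool ensure_decompress_fbo_size(struct decompress_fbo *fbo, int width, int height)
{
    if (width <= fbo->width && height <= fbo->height)
        return false;   /* existing FBO is large enough, reuse it */

    if (width > fbo->width)
        fbo->width = width;
    if (height > fbo->height)
        fbo->height = height;
    return true;        /* caller would reallocate storage here */
}
```

Decompressing a sequence of progressively smaller mipmap levels then triggers at most one allocation, for the largest level.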
* i915: Fix type of "specoffset" variable.Paul Berry2012-02-141-1/+1
| | | | | | | | | | | | | | | | | Commit 2e5a1a2 (intel: Convert from GLboolean to 'bool' from stdbool.h.) converted the "specoffset" local variable (in intel_tris.c) from a GLboolean to a bool. However, GLboolean was the wrong type for specoffset--it should have been a GLuint (to match the declaration of specoffset in struct intel_context). This patch changes specoffset to the proper type. Fixes piglit test general/two-sided-lighting-separate-specular. This is a candidate for stable branches. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45917 Reviewed-by: Kenneth Graunke <[email protected]>
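Why the `bool` conversion exposed the wrong type can be illustrated with a short C sketch. The helper names are hypothetical and the value 12 is an arbitrary example offset, not one taken from the driver: converting any non-zero value to C99 `bool` yields 1, so an offset stored that way loses its magnitude, whereas an unsigned integer keeps it.

```c
#include <stdbool.h>

/* Storing an offset in a bool collapses every non-zero value to 1. */
unsigned store_offset_as_bool(unsigned offset)
{
    bool stored = offset;   /* conversion to bool: 0 stays 0, anything else becomes 1 */
    return stored;
}

/* Storing it in an unsigned integer preserves the offset. */
unsigned store_offset_as_uint(unsigned offset)
{
    unsigned stored = offset;
    return stored;
}
```

For example, `store_offset_as_bool(12)` returns 1 while `store_offset_as_uint(12)` returns 12.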
* i965/fs: Enable register spilling on gen7 too.Eric Anholt2012-02-141-2/+0
| | | | | | | | | | It turns out the same messages work on gen7, we were just being paranoid. Fixes the penumbra shadows mode of Lightsmark since the register allocation fix. NOTE: This is a candidate for release branches. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Report the failure message when failing to compile the fragment shader.Eric Anholt2012-02-141-0/+3
| | | | | | | | We just abort later, but at least this should result in more informative bug reports. NOTE: This is a candidate for release branches. Reviewed-by: Kenneth Graunke <[email protected]>
* meta: Add pixel store/pack operations in decompress_texture_imageAnuj Phogat2012-02-131-5/+3
| | | | | | | | | | | | | | | | | | | | This patch adds the pixel store operations in decompress_texture_image(). decompress_texture_image() is used in glGetTexImage() for compressed textures with unsigned, normalized values. It also fixes the failures in intel oglconform pxstore-gettex due to following sub test cases: - Test all mipmaps with byte swapping enabled - Test all small mipmaps with all allowable alignment values - Test subimage packing for all mipmap levels Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=40864 Note: This is a candidate for stable branches Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* i965: Fix border color on Ironlake.Kenneth Graunke2012-02-101-1/+1
| | | | | | | | | | | | | | | | | Ironlake appears to check our pointer against the General State Base Address upper bound, rather than ignoring the zero bound as it ought. Unfortunately, since we leave GSBA set to zero, there is no logical upper bound. Set it to the maximum possible value, which should work since our virtual addresses only go up to 2GB. +94 piglits. NOTE: This is a candidate for stable release branches. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=28924 Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Add support for generating MADs.Eric Anholt2012-02-103-0/+56
| | | | | | | | | | | | | Improves nexuiz performance 0.65% +/- .10% (n=5) on my gen6, and .39% +/- .11% (n=10) on gen7. No statistically significant performance difference on warsow (n=5, but only one shader has MADs). v2: Add support for MADs in 16-wide by using compression control. v3: Don't generate MADs when it will force an immediate to be moved to a temp. (it's not clear whether this is a win or not, but it should result in less questionable change to codegen compared to v2). Reviewed-by: Kenneth Graunke <[email protected]> (v2)
* i965/fs: Add missing register allocation for 3rd sources.Eric Anholt2012-02-101-0/+2
| | | | | | | Our only instruction with a 3rd source so far was linterp, and that value was never register-allocated. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add support for the MAD opcode on gen6+.Eric Anholt2012-02-105-20/+342
| | | | | | v2: Fix MRF handling on gen7. Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* radeon: build fix after 9d9111108eadd65708899284b1cfa9ca425f3ac8Alex Deucher2012-02-101-1/+1
| | | | Signed-off-by: Alex Deucher <[email protected]>
* meta: replace abort() with _mesa_problem()Brian Paul2012-02-101-1/+2
| | | | Reviewed-by: José Fonseca <[email protected]>
* i965/gen7: Fix the length of the MULTISAMPLE state packet in the HiZ op.Eric Anholt2012-02-091-1/+1
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965/gen7: Fix the length of the DS state packet in the HiZ op.Eric Anholt2012-02-091-1/+1
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* i965/gen7: Fix GPU hangs from the HiZ op.Eric Anholt2012-02-091-2/+3
| | | | | | The wm max threads is in the same dword as the dispatch enable. The hardware gets super angry if you set max threads to 0, even if you aren't dispatching threads.
* i965: Remove file i965/junk, accidentally added in 7b36c68Chad Versace2012-02-081-0/+0
|
* i965: Remove broken symlink to intel_decode.c.Kenneth Graunke2012-02-071-1/+0
| | | | Eric removed intel_decode.c in 61b9ccd9e298ca1d3db55aee0cb2ff78662d6fa6.
* i965/fs: Implement GL_CLAMP behavior on texture rectangles on gen6+.Eric Anholt2012-02-071-5/+49
| | | | | | | | | | | We were doing saturate-based clamping on the [0,width] or [0,height] coordinate, which meant only the first pixel was addressable. Fixes piglit ARB_texture_rectangle/texwrap-RECT-bordercolor NOTE: This is a candidate for the 8.0 release branch. Reviewed-by: Ian Romanick <[email protected]>
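The addressing problem can be sketched in C as follows. Rectangle textures use unnormalized coordinates in [0, width], so a saturate (a clamp to [0, 1]) maps almost every coordinate onto the first pixel. The second function shows a clamp against the texture size instead; that this is the shape of the actual fix is an assumption, and both function names are hypothetical.

```c
/* Saturate: clamp to [0, 1], appropriate only for normalized coordinates. */
float saturate(float x)
{
    if (x < 0.0f)
        return 0.0f;
    if (x > 1.0f)
        return 1.0f;
    return x;
}

/* Clamp an unnormalized rectangle-texture coordinate to [0, size]. */
float clamp_rect_coord(float coord, float size)
{
    if (coord < 0.0f)
        return 0.0f;
    if (coord > size)
        return size;
    return coord;
}
```

For a 64-texel-wide rectangle texture, `saturate(37.5f)` yields 1.0, while `clamp_rect_coord(37.5f, 64.0f)` leaves the coordinate at 37.5.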
* i965/fs: Move GL_CLAMP handling to coordinate setup.Eric Anholt2012-02-071-29/+21
| | | | | | | | | We should be able to merge self-move instruction into the MRF move anyway, and this simplifies things for the next commit. NOTE: This is a candidate for the 8.0 release branch. Reviewed-by: Ian Romanick <[email protected]>
* i965: Fix HiZ change compiler warning.Eric Anholt2012-02-071-1/+0
|
* i965: Rewrite the HiZ opChad Versace2012-02-0719-517/+1146
The HiZ op was implemented as a meta-op. This patch reimplements it by emitting a special HiZ batch. This fixes several known bugs, and likely a lot of undiscovered ones too.

==== Why the HiZ meta-op needed to die ====

The HiZ op was implemented as a meta-op, which caused lots of trouble. All other meta-ops occur as a result of some GL call (for example, glClear and glGenerateMipmap), but the HiZ meta-op was special. It was called in places that Mesa (in particular, the vbo and swrast modules) did not expect---and were not prepared for---state changes to occur (for example: glDraw; glCallList; within glBegin/End blocks; and within swrast_prepare_render as a result of intel_miptree_map).

In an attempt to work around these unexpected state changes, I added two hooks in i965:

- A hook for glDraw, located in brw_predraw_resolve_buffers (which is called in the glDraw path). This hook detected if a predraw resolve meta-op had occurred, and would hackishly repropagate some GL state if necessary. This ensured that the meta-op state changes would not interfere with the vbo module's subsequent execution of glDraw.
- A hook for glBegin, implemented by brwPrepareExecBegin. This hook resolved all buffers before entering a glBegin/End block, thus preventing an infinitely recurring call to vbo_exec_FlushVertices. The vbo module calls vbo_exec_FlushVertices to flush its vertex queue in response to GL state changes.

Unfortunately, these hooks were not sufficient. The meta-op state changes still interacted badly with glPopAttrib (as discovered in bug 44927) and with swrast rendering (as discovered by debugging gen6's swrast fallback for glBitmap). I expect there are more undiscovered bugs. Rather than play whack-a-mole in a minefield, the sane approach is to replace the HiZ meta-op with something safer.

==== How it was killed ====

This patch consists of several logical components:

1. Rewrite the HiZ op by replacing function gen6_resolve_slice with gen6_hiz_exec and gen7_hiz_exec. The new functions do not call a meta-op, but instead manually construct and emit a batch to "draw" the HiZ op's rectangle primitive. The new functions alter no GL state.
2. Add fields to brw_context::hiz for the new HiZ op.
3. Emit a workaround flush when toggling 3DSTATE_VS.VsFunctionEnable.
4. Kill all dead HiZ code:
   - the function gen6_resolve_slice
   - the dirty flag BRW_NEW_HIZ
   - the dead fields in brw_context::hiz
   - the state packet manipulation triggered by the now removed brw_context::hiz::op
   - the meta-op workaround in brw_predraw_resolve_buffers (discussed above)
   - the meta-op workaround brwPrepareExecBegin (discussed above)

Note: This is a candidate for the 8.0 branch.

Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Paul Berry <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43327
Reported-by: [email protected]
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44927
Reported-by: [email protected]
Signed-off-by: Chad Versace <[email protected]>
* intel: Avoid divide by zero for very small linear blitsIan Romanick2012-02-071-1/+1
| | | | | | | | | | | | | | | | | | | | | | If size is small (such as 1), pitch = ROUND_DOWN_TO(MIN2(size, (1 << 15) - 1), 4); makes pitch = 0. Then height = size / pitch; causes a division-by-zero exception. If pitch is zero, set height to 1 and avoid the division. This fixes piglit's bin/getteximage-formats test and glean's bufferObject test. NOTE: This is a candidate for the 8.0 release branch. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44971
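The arithmetic in this message can be reproduced with a small C sketch. The two expressions for `pitch` and `height` come from the commit message; the macro definitions (in particular `ROUND_DOWN_TO` as a power-of-two mask), the struct, and the function name are assumptions made so the example is self-contained.

```c
/* Assumed helper macros; ROUND_DOWN_TO requires a power-of-two alignment. */
#define MIN2(a, b) ((a) < (b) ? (a) : (b))
#define ROUND_DOWN_TO(value, alignment) ((value) & ~((alignment) - 1))

/* Hypothetical container for the computed blit dimensions. */
struct blit_dims {
    unsigned pitch;
    unsigned height;
};

/* Compute pitch and height for a linear blit of `size` bytes. */
struct blit_dims linear_blit_dims(unsigned size)
{
    struct blit_dims dims;

    dims.pitch = ROUND_DOWN_TO(MIN2(size, (1u << 15) - 1), 4u);

    /* Sizes below 4 round down to a pitch of 0: use one row, skip the division. */
    dims.height = (dims.pitch == 0) ? 1 : size / dims.pitch;

    return dims;
}
```

For `size = 1`, the pitch rounds down to 0 and the height becomes 1 without dividing; for `size = 65536`, the pitch is 32764 and the height is 2.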
* intel: Remove num_mapped_regions assertion from _intel_batchbuffer_flushIan Romanick2012-02-071-7/+0
| | | | | | | | | | | | | | There are cases where a buffer can be mapped while another buffer is flushed. This can happen in the CopyPixels meta-op path for piglit's fbo-mipmap-copypix. After some discussion with Eric, it seems this assertion is no longer necessary, and it has always been too strict. NOTE: This is a candidate for the 8.0 branch. Signed-off-by: Ian Romanick <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=43328 Cc: Eric Anholt <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* dri: Don't build libdricommon.la if we don't need itJon TURNEY2012-02-061-1/+5
| | | | | | | | | Refine 80aa78142d12b21dd7d4f0edc786af98a159a80f "dri: make sure to build libdricommon.la" so we don't build libdricommon if we aren't building a dri driver which needs it (i.e. if we are just building swrast). In particular, this restores the ability to build the swrast dri driver without needing xf86drm.h. Signed-off-by: Jon TURNEY <[email protected]>
* intel: check for LLC support when reading mapsEugeni Dodonov2012-02-041-1/+1
| | | | | | | | This checks for advertised LLC support by the GPU instead of relying on the GPU generation for detection. Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Eugeni Dodonov <[email protected]>
* intel: verify if hardware has LLC supportEugeni Dodonov2012-02-044-0/+12
| | | | | | | Rely on the libdrm HAS_LLC parameter to verify whether the hardware supports LLC. If the libdrm version does not support this check, fall back to the older way of detecting it, which assumed that GPUs newer than GEN6 have it. Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Eugeni Dodonov <[email protected]>
* intel: FBOs with texture border are unsupportedIan Romanick2012-02-031-0/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | FBOs differ from textures in a significant way. With textures, we can strip the border and get correct rendering except when the application fetches texels outside [0,1]. With an FBO, the pixel at (0,0) is in the border. The ARB_framebuffer_object spec says: "If the attached image is a texture image, then the window coordinates (x[w], y[w]) correspond to the texel (i, j, k), from figure 3.10 as follows: i = (x[w] - b) j = (y[w] - b) k = (layer - b) where <b> is the texture image's border width..." Since the border doesn't exist, we can never render any pixels in the correct location. Just mark these FBOs FRAMEBUFFER_UNSUPPORTED. NOTE: This is a candidate for the 8.0 branch. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=42336
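The quoted mapping i = x[w] - b can be worked through with a short C sketch. The function names are hypothetical, and treating interior texels as indices 0 through size - 1 is an assumption based on the usual texel numbering.

```c
#include <stdbool.h>

/* Texel index for a window coordinate, per the quoted i = x[w] - b. */
int texel_from_window_coord(int window_coord, int border)
{
    return window_coord - border;
}

/* Assumed convention: interior texels are 0 .. interior_size - 1. */
bool is_border_texel(int texel, int interior_size)
{
    return texel < 0 || texel >= interior_size;
}
```

With a border width of 1, window coordinate 0 maps to texel -1, a border texel. Once the driver strips the border, that texel no longer exists, which is why such FBOs are reported as unsupported.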
* dri: Add Unigine Tropics as an app that requires the GLSL warn workaround.Eric Anholt2012-02-031-0/+3
| | | | | | | I wasn't seeing it be needed because of the previous bug. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eugeni Dodonov <[email protected]>
* dri: Fix typo in xml file that made all applications use the workaround.Eric Anholt2012-02-031-1/+1
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eugeni Dodonov <[email protected]>
* Revert "Fix underlinking in libOSMesa since commit adefee5 "Always build ↵Brian Paul2012-02-021-2/+0
| | | | | | | shared glapi"" This reverts commit 4e5a8937d1a1bfb2a3bd067ed01e036728675fc2. ... to fix build with --enable-osmesa
* Revert "automake: src/mesa/drivers/osmesa"Matt Turner2012-01-314-101/+53
| | | | This reverts commit 275ac7e5c1fd6c1847a428192fe259e50690fced.
* Revert "automake: src/glsl and src/glsl/glcpp"Matt Turner2012-01-311-1/+1
| | | | This reverts commit 9947656168d09f9019600fccc42ca8e0de49b83a.
* osmesa: set RefCount = 1 in new_osmesa_renderbuffer()Brian Paul2012-01-311-0/+1
| | | | | This was lost during the renderbuffer overhaul work. Fixes a failed refcount assertion.
* osmesa: Fix osmesa_context.DataType type.Vinson Lee2012-01-311-1/+1
| | | | | | | | | | | | | | | | | | Fixes these GCC warnings. osmesa.c: In function ‘osmesa_renderbuffer_storage’: osmesa.c:417: warning: comparison is always false due to limited range of data type osmesa.c:423: warning: comparison is always false due to limited range of data type osmesa.c:431: warning: comparison is always false due to limited range of data type osmesa.c:437: warning: comparison is always false due to limited range of data type osmesa.c:447: warning: comparison is always false due to limited range of data type osmesa.c:453: warning: comparison is always false due to limited range of data type osmesa.c:463: warning: comparison is always false due to limited range of data type osmesa.c:466: warning: comparison is always false due to limited range of data type osmesa.c:476: warning: comparison is always false due to limited range of data type osmesa.c:479: warning: comparison is always false due to limited range of data type Signed-off-by: Vinson Lee <[email protected]> Signed-off-by: Brian Paul <[email protected]>
* automake: src/glsl and src/glsl/glcppMatt Turner2012-01-301-1/+1
| | | | | | Reviewed-by: Eric Anholt <[email protected]> Tested-by: Eric Anholt <[email protected]> Signed-off-by: Matt Turner <[email protected]>
* automake: src/mesa/drivers/osmesaMatt Turner2012-01-304-53/+101
|
* dri: Add a default drirc to be installed to provide application workarounds.Eric Anholt2012-01-302-0/+9
| | | | | | Specifically, this being present works around a bug in Unigine Sanctuary on i965 which previously resulted in bad rendering. NOTE: This is a candidate for the 8.0 branch. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add a driconf option to force GLSL extension behavior to "warn".Eric Anholt2012-01-303-1/+14
| | | | | | | | | This can be used to work around broken application behavior, like in Unigine where it attempts to use texture arrays without declaring either "#extension GL_EXT_texture_array : enable" or "#version 130". NOTE: This is a candidate for the 8.0 branch. Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Use libdrm's decode functionality instead of the gpu-tools copy.Eric Anholt2012-01-305-2795/+41
| | | | | | While typing out the new decode, I added a fallback mode for dumping when we fail to re-map the BO after execution. This should get us a minimal dump when trying to dump a batch that results in a GPU hang.
* i965: Fix segfault with INTEL_DEBUG=batch on gen7 with samplers present.Eric Anholt2012-01-301-1/+0
| | | | This was a leftover from the conversion of this file for state streaming.