path: root/src
Commit message | Author | Age | Files | Lines
* clover: add dynamic_cast results checking down in clSetKernelArgument() code path. | Dmitry Cherkassov | 2013-03-24 | 1 | -0/+12
  Signed-off-by: Dmitry Cherkassov <[email protected]>
  Signed-off-by: Francisco Jerez <[email protected]>
* gallivm: Add code for rgb9e5 shared exponent format to float conversion | Roland Scheidegger | 2013-03-24 | 3 | -3/+118
  And use this (and the code for r11g11b10 packed float to float conversion) in the soa texturing code (the generated code looks quite good). This is probably an order of magnitude faster than using the fallback (not measured).
  Tested with piglit texwrap GL_EXT_packed_float and GL_EXT_texture_shared_exponent respectively (didn't find much else using it).
  Reviewed-by: Jose Fonseca <[email protected]>
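For readers unfamiliar with the format, the rgb9e5 decode that this commit vectorizes can be sketched in scalar form. This is a minimal Python sketch that follows the GL_EXT_texture_shared_exponent layout (three 9-bit mantissas in the low bits, one shared 5-bit exponent in the top bits, exponent bias 15). The function name `rgb9e5_to_floats` is made up for illustration, and this is not the gallivm code itself.

```python
def rgb9e5_to_floats(packed):
    """Decode one 32-bit rgb9e5 texel into an (r, g, b) tuple of floats.

    Assumed bit layout (per GL_EXT_texture_shared_exponent):
      bits  0..8   red mantissa
      bits  9..17  green mantissa
      bits 18..26  blue mantissa
      bits 27..31  shared exponent (bias 15)
    """
    red = packed & 0x1FF
    green = (packed >> 9) & 0x1FF
    blue = (packed >> 18) & 0x1FF
    exponent = (packed >> 27) & 0x1F
    # The mantissas have no implicit leading one, so each channel is simply
    # mantissa * 2^(exponent - bias - mantissa_bits).
    scale = 2.0 ** (exponent - 15 - 9)
    return (red * scale, green * scale, blue * scale)
```

With the shared exponent set to 24 the scale is exactly 1.0, which makes the function easy to check by hand.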
* gallium,st/mesa: don't use blit-based transfers with software rasterizers | Marek Olšák | 2013-03-23 | 15 | -1/+35
  The blit-based paths for TexImage, GetTexImage, and ReadPixels aren't very fast with a software rasterizer. Now Gallium drivers have the ability to turn them off.
  Reviewed-by: Brian Paul <[email protected]>
  Tested-by: Brian Paul <[email protected]>
* st/mesa: implement blit-based ReadPixels | Marek Olšák | 2013-03-23 | 3 | -13/+189
  Initial version contributed by: Martin Andersson <[email protected]>
  This is only used if the memcpy path cannot be used and if no transfer ops are needed. It's pretty similar to our TexImage and GetTexImage implementations.
  The motivation behind this is to be able to use ReadPixels every frame and still have at least 20 fps (or 60 fps with a powerful GPU and CPU) instead of 0.5 fps.
  Reviewed-by: Brian Paul <[email protected]>
  Tested-by: Brian Paul <[email protected]>
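The order in which the ReadPixels paths are tried, as described in this commit and the one above it, can be sketched as a small decision function. This is only an illustration under stated assumptions: the function name, its parameters, and the returned path labels are hypothetical, and the real conditions in st/mesa are more detailed.

```python
def choose_readpixels_path(can_memcpy, transfer_ops, driver_prefers_blit):
    """Pick a ReadPixels path (hypothetical sketch, not the st/mesa code).

    can_memcpy          -- True if the formats match so a plain copy works
    transfer_ops        -- bitmask of required pixel transfer operations
    driver_prefers_blit -- False for drivers that turned blit transfers off,
                           such as software rasterizers
    """
    if can_memcpy:
        return "memcpy"
    # The blit path is only used when memcpy is impossible and no transfer
    # ops are needed; drivers may also opt out of it entirely.
    if transfer_ops == 0 and driver_prefers_blit:
        return "blit"
    return "slow"
```

Keeping the decision in one place mirrors the motivation given in the commit below for a common, format-independent memcpy path.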
* mesa: add common format-independent memcpy-based ReadPixels path | Marek Olšák | 2013-03-23 | 4 | -37/+167
  I'll need the _mesa_readpixels_needs_slow_path function for the blit-based version, but it's also useful to have this memcpy-based path in one place and not scattered across several functions.
  v2: add "const" to function parameters
  Reviewed-by: Brian Paul <[email protected]>
  Tested-by: Brian Paul <[email protected]>
* mesa: add helper func for checking combined depthstencil buffers from st/mesa | Marek Olšák | 2013-03-23 | 5 | -44/+35
  Reviewed-by: Brian Paul <[email protected]>
  Tested-by: Brian Paul <[email protected]>
* mesa: add a common function returning transfer ops for ReadPixels | Marek Olšák | 2013-03-23 | 1 | -20/+74
  I'll need both new functions for later. For now, it consolidates the code for determining what the transfer ops should be and makes it a little bit smarter.
  v2: added "const"
  Reviewed-by: Brian Paul <[email protected]>
  Tested-by: Brian Paul <[email protected]>
* mesa: handle HALF_FLOAT like FLOAT in get_tex_rgba | Marek Olšák | 2013-03-23 | 1 | -0/+1
  NOTE: This is a candidate for the stable branches.
  Reviewed-by: Brian Paul <[email protected]>
  Tested-by: Brian Paul <[email protected]>
* llvmpipe: add EXT_packed_float render target format support | Roland Scheidegger | 2013-03-22 | 4 | -2/+382
  New conversion code to handle conversion from/to r11g11b10 AoS to/from SoA floats, and also add code for conversion from rgb9e5 AoS to float SoA (which works pretty much the same as r11g11b10 except for the packing). (This code should also be used for texture sampling instead of relying on u_format conversion but it's not yet, so rgb9e5 is unused.)
  Unfortunately a crazy amount of hacks is necessary to get the conversion code running in llvmpipe's generate_unswizzled_blend, which isn't well suited for formats where the storage representation has nothing to do with what's needed for blending. Moreover, the conversion goes from packed AoS values (the storage format) to float SoA values, because this is much more natural for the conversion, and likewise from SoA values to packed AoS values. But the "blend" (which includes trivial things like partial mask) works on AoS values, so incoming fs values go SoA->AoS, values from the destination go packed AoS->SoA->AoS, then the blend runs, then the result goes AoS->SoA->packed AoS. That probably isn't the most efficient way, though the shuffles are probably bearable.
  Passes piglit fbo-blending-formats (with GL_EXT_packed_float parameter), still need to verify Inf/NaNs (where most of the complexity in the conversion comes from actually).
  v2: drop the (very bogus) rgb9e5 part, and do component extraction in the helper code for r11g11b10 to float conversion, making the code slightly more compact (suggested by Jose), now that there are no other callers left this works quite well. (Could do the same for the opposite way but it's less than ideal there, final part of packing needs to be done in caller anyway and there'd be another conditional.)
  v3: minor style and comment fixes. Also fix a potential issue with negative zero being potentially returned by max(src, zero) as we don't have well-defined min/max behavior (fortunately no additional cost).
  Reviewed-by: Jose Fonseca <[email protected]>
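To make the Inf/NaN remark concrete, here is a scalar sketch of the r11g11b10 unpacking direction. It assumes the usual packed-float layout (red in bits 0..10 and green in bits 11..21, each with a 6-bit mantissa; blue in bits 22..31 with a 5-bit mantissa; all channels unsigned with a 5-bit exponent of bias 15). The helper names are hypothetical, and the llvmpipe code works on SoA vectors rather than one texel at a time.

```python
import math


def unpack_small_float(bits, mantissa_bits):
    """Decode an unsigned small float with a 5-bit exponent (bias 15)."""
    mantissa = bits & ((1 << mantissa_bits) - 1)
    exponent = (bits >> mantissa_bits) & 0x1F
    fraction = mantissa / float(1 << mantissa_bits)
    if exponent == 0:
        # Zero and denormals: no implicit leading one.
        return fraction * 2.0 ** -14
    if exponent == 31:
        # All-ones exponent encodes Inf (mantissa 0) or NaN (otherwise).
        return math.inf if mantissa == 0 else math.nan
    return (1.0 + fraction) * 2.0 ** (exponent - 15)


def r11g11b10_to_floats(packed):
    """Decode one 32-bit r11g11b10 texel into an (r, g, b) tuple."""
    red = unpack_small_float(packed & 0x7FF, 6)
    green = unpack_small_float((packed >> 11) & 0x7FF, 6)
    blue = unpack_small_float((packed >> 22) & 0x3FF, 5)
    return (red, green, blue)
```

The three branches in `unpack_small_float` are where the special cases live; a vectorized version has to express them with selects instead of branches.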
* r600g: Honour legacy debugging environment variables | Michel Dänzer | 2013-03-22 | 1 | -0/+10
  This helps minimize confusion / effort when moving between branches or helping others.
  Reviewed-by: Alex Deucher <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* freedreno: add pipe->blit | Rob Clark | 2013-03-21 | 1 | -3/+48
  Signed-off-by: Rob Clark <[email protected]>
* i965: Add a driconf option to disable flush throttling. | Paul Berry | 2013-03-21 | 4 | -2/+15
  Normally when submitting the first batch buffer after a flush, we check whether the GPU has completed processing of the first batch buffer of the previous frame. If it hasn't, we wait for it to finish before submitting any more batches. This prevents GPU-heavy and CPU-light applications from racing too far ahead of the current frame, but at the expense of possibly lower frame rates.
  Sometimes when benchmarking we want to disable this mechanism. This patch adds the driconf option "disable_throttling" to disable the throttling mechanism.
  Reviewed-by: Eric Anholt <[email protected]>
* mesa: Implement TEXTURE_IMMUTABLE_LEVELS for ES 3.0. | Matt Turner | 2013-03-21 | 3 | -0/+14
  NOTE: This is a candidate for the 9.1 branch.
  Fixes piglit's texture-immutable-levels test.
  Reported-by: Marek Olšák <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* glx: Build with VISIBILITY_CFLAGS in automake | Adam Jackson | 2013-03-21 | 1 | -0/+1
  Note: This is a candidate for the stable branches.
  Signed-off-by: Adam Jackson <[email protected]>
* softpipe: silence some asst. MSVC type warnings in sp_tex_sample.c | Brian Paul | 2013-03-21 | 1 | -6/+6
* softpipe: silence some MSVC signed/unsigned warnings | Brian Paul | 2013-03-21 | 2 | -2/+2
* softpipe: silence some MSVC float/double warnings | Brian Paul | 2013-03-21 | 1 | -6/+6
* rbug: silence some MSVC signed/unsigned warnings | Brian Paul | 2013-03-21 | 2 | -2/+2
* postprocess: silence some MSVC float/int warnings | Brian Paul | 2013-03-21 | 2 | -4/+4
* meta: fix incorrect slice, r coordinate computation | Brian Paul | 2013-03-21 | 1 | -4/+9
  The arithmetic to convert a 3D texture slice to an R coordinate was incorrect. Found when MSVC warned of a divide by zero.
  Note that we don't actually ever hit this path. We don't decompress slices of 3D textures and we don't support 3D mipmap generation yet.
* vega: fix MSVC warning about missing return statement | Brian Paul | 2013-03-21 | 1 | -0/+1
* meta: minor indentation fix | Brian Paul | 2013-03-21 | 1 | -1/+1
* radeonsi: Emit pixel shader state even when only the vertex shader changed | Michel Dänzer | 2013-03-21 | 1 | -0/+5
  Fixes random failures with piglit glsl-max-varyings.
  NOTE: This is a candidate for the 9.1 branch.
  Reviewed-by: Christian König <[email protected]>
* i965/vs: Add IR dumping for immediates. | Kenneth Graunke | 2013-03-20 | 1 | -0/+16
  This makes dump_instructions more useful.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* glsl: Add built-in functions for GLSL 1.50. | Kenneth Graunke | 2013-03-20 | 2 | -0/+1145
  This makes basic built-in functions work in GLSL 1.50. It supports everything except the new Geometry Shader functions.
  The new 150.glsl file is 140.glsl plus ARB_texture_multisample.glsl; 150.frag is identical to 140.frag except for the #version bump.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
* glsl: Add sampler2DMS/sampler2DMSArray types to GLSL 1.50. | Kenneth Graunke | 2013-03-20 | 2 | -1/+12
  GLSL 1.50 includes support for the new sampler types introduced by the ARB_texture_multisample extension.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
* glsl: Bump standalone compiler versions to 1.50. | Kenneth Graunke | 2013-03-20 | 2 | -3/+3
  The version bumps are necessary in order to compile built-ins for 1.50.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
* i965: Don't use texture swizzling to force alpha to 1.0 if unnecessary. | Kenneth Graunke | 2013-03-20 | 1 | -1/+2
  Commit 33599433c7 began setting the texture swizzle mode to XYZ1 for RED, RG, and RGB textures in order to force alpha to 1.0 in case we actually stored the texture as RGBA.
  This had an unforeseen performance implication: the shader precompile assumes that the texture swizzle mode will be XYZW for non-shadow sampler types. By setting it to XYZ1, this means every shader used with a RED, RG, or RGB texture has to be recompiled. This is a very common case.
  Unfortunately, there's no way to improve the precompile, since RGBA textures still need XYZW, and there's no way to know by looking at the shader source what texture formats might be used.
  However, we only need to smash alpha to 1.0 if the texture's memory format actually has alpha bits. If not, the sampler already returns 1.0 for us without any special swizzling. XRGB8888, for example, is a very common case where this occurs.
  This partially fixes a performance regression since commit 33599433c7. More work is required to fully fix it in all cases. This at least helps Warsow.
  NOTE: This is a candidate for the 9.1 branch.
  Reviewed-by: Carl Worth <[email protected]>
  Signed-off-by: Kenneth Graunke <[email protected]>
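The rule described here can be sketched as a small predicate. This is an illustrative Python sketch only: the function name, the string labels for base formats, and the boolean parameter are assumptions made for the example, not the i965 data structures.

```python
def texture_alpha_swizzle(base_format, memory_format_has_alpha):
    """Return the swizzle mode for a texture (hypothetical sketch).

    base_format             -- the API-visible base format, e.g. "RGB"
    memory_format_has_alpha -- True if the format actually stored in
                               memory carries alpha bits
    """
    formats_without_alpha = {"RED", "RG", "RGB"}
    # Only override alpha when stale alpha bits could leak through.
    # Otherwise keep XYZW, which is what the precompile assumed.
    if base_format in formats_without_alpha and memory_format_has_alpha:
        return "XYZ1"
    return "XYZW"
```

An RGB texture stored as XRGB8888 would take the second branch, so the precompiled shader could be reused.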
* i965: Don't print a fatal-looking message if intelCreateContext fails. | Kenneth Graunke | 2013-03-20 | 1 | -1/+0
  With the old context creation mechanism, an application asked the GL to give it a context. Failing to produce a context was a fatal error.
  Now, with GLX_ARB_create_context, the application can request a specific version. If it's higher than the maximum version we support, context creation will fail. But this is a normal error that applications recover from.
  In particular, the new glxinfo tries to create OpenGL 4.3, 4.2, 4.1, 4.0, 3.3, and 3.2 contexts before finally succeeding at creating a 3.1 context. This led to it printing the following message 6 times:
  "brwCreateContext: failed to init intel context"
  There's no need to alarm users (and developers) with such a message.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
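The probing pattern that makes a failed context creation a normal event can be sketched as follows. This is a Python model under stated assumptions: `create_best_context`, `fake_driver`, and the version list are hypothetical stand-ins, not the GLX API or the glxinfo source.

```python
def create_best_context(try_create, versions):
    """Try versions from highest to lowest and return the first success.

    try_create -- callable taking a (major, minor) tuple and returning a
                  context object, or None when the version is unsupported
    Returns (version, context, failures), where failures counts the
    rejected attempts before the successful one.
    """
    failures = 0
    for version in versions:
        context = try_create(version)
        if context is not None:
            return version, context, failures
        failures += 1  # an expected, recoverable failure
    return None, None, failures


# Versions named in the commit message, highest first.
PROBED_VERSIONS = [(4, 3), (4, 2), (4, 1), (4, 0), (3, 3), (3, 2), (3, 1)]


def fake_driver(max_version):
    """Build a stand-in for a driver that supports up to max_version."""
    def try_create(version):
        return {"version": version} if version <= max_version else None
    return try_create
```

Against a driver that tops out at 3.1, the loop fails six times before it succeeds, which matches the six messages mentioned above.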
* i965/gen7: Align all depth miplevels to 8 in the X direction. | Eric Anholt | 2013-03-20 | 1 | -1/+9
  On an INTEL_DEBUG=perf piglit run on IVB, reduces the instances of "HW workaround: blit" (the printouts from the misaligned-depth workaround blits) from 725 to 675.
  It doesn't totally eliminate the workaround blit, because we still have problems with Y offsets that we can't fix (since texturing can only align miplevels up to 2 or 4, not 8).
  No regressions on piglit/es3conform on IVB.
  Reviewed-by: Kenneth Graunke <[email protected]>
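Aligning a miplevel's X offset to a multiple of 8 is the standard round-up-to-power-of-two computation. The sketch below shows that arithmetic in Python; the function name is hypothetical and it does not reproduce how the i965 miptree layout code applies it.

```python
def align_up(value, alignment):
    """Round value up to the next multiple of a power-of-two alignment."""
    if alignment <= 0 or alignment & (alignment - 1) != 0:
        raise ValueError("alignment must be a positive power of two")
    # Add alignment - 1, then clear the low bits.
    return (value + alignment - 1) & ~(alignment - 1)
```

For example, an X offset of 5 moves to 8 while an offset of 8 stays where it is.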
* nvc0: fix max varying count, move CLIPVERTEX,FOG out of the way | Christoph Bumiller | 2013-03-20 | 3 | -12/+36
  The card spews an error if I use all 128 generic slots. Apparently the real limit isn't just dictated by the address space layout.
* gallium: add TGSI_SEMANTIC_TEXCOORD,PCOORD v3 | Christoph Bumiller | 2013-03-20 | 28 | -92/+171
  This makes it possible to identify gl_TexCoord and gl_PointCoord for drivers where sprite coordinate replacement is restricted.
  The new PIPE_CAP_TGSI_TEXCOORD decides whether these varyings should be hidden behind the GENERIC semantic or not.
  With this patch only nvc0 and nv30 will request that they be used.
  v2: introduce a CAP so other drivers don't have to bother with the new semantic
  v3: adapt to the introduction of the gl_varying_slot enum
* gallium-egl: Fix compile errors introduced in de315f76a | Kristian Høgsberg | 2013-03-19 | 5 | -5/+5
  The commit changed API in a helper library shared by both egl_dri2 and the gallium egl state tracker, but only egl_dri2 was updated to use the new interface.
  Tested-by: Giulio Camuffo <[email protected]>
* i965/fs: Avoid unnecessary recompiles due to POS bit of proj_attrib_mask. | Paul Berry | 2013-03-19 | 2 | -2/+12
  Previous to this patch, when using fixed function fragment shading, bit VARYING_BIT_POS of brw_wm_prog_key::proj_attrib_mask was being set differently during precompiles and normal usage. During precompiles it was being set only if the fragment shader reads from window position (which it never does), so it was always being set to 0. During normal usage it was being set if the vertex shader writes to all 4 components of gl_Position (which it usually does), so it was usually being set to 1. As a result, we were almost always doing an extra recompile for the fixed function fragment shader.
  The recompile was totally unnecessary, though, because brw_wm_prog_key::proj_attrib_mask is only consulted for fs_visitor::emit_general_interpolation(), which isn't used for VARYING_SLOT_POS.
  This patch avoids the unnecessary recompile by always setting bit VARYING_BIT_POS of brw_wm_prog_key::proj_attrib_mask to 1.
  Reviewed-by: Kenneth Graunke <[email protected]>
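Why a single differing key bit forces a recompile can be shown with a toy program cache. This is a Python model under stated assumptions: `ProgramCache` and `make_key` are hypothetical, the key is reduced to one field, and `VARYING_BIT_POS` is taken to be bit 0 as the commit below describes.

```python
VARYING_BIT_POS = 1 << 0  # assumed to be bit 0 of proj_attrib_mask


class ProgramCache:
    """Toy cache that compiles once per distinct program key."""

    def __init__(self):
        self.programs = {}
        self.compiles = 0

    def get(self, key):
        if key not in self.programs:
            self.compiles += 1  # cache miss: a (re)compile happens here
            self.programs[key] = "program for key %r" % (key,)
        return self.programs[key]


def make_key(proj_attrib_mask):
    """Build a simplified stand-in for brw_wm_prog_key."""
    return ("fixed-function-fs", proj_attrib_mask)


# Before the patch: precompile with the bit clear, draw with it set.
before = ProgramCache()
before.get(make_key(0))
before.get(make_key(VARYING_BIT_POS))

# After the patch: both paths always set the bit, so the key matches.
after = ProgramCache()
after.get(make_key(VARYING_BIT_POS))
after.get(make_key(VARYING_BIT_POS))
```

In this model `before.compiles` ends up at 2 and `after.compiles` at 1, because the cache only sees whether the keys compare equal, not whether the differing bit matters to code generation.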
* ff_fragment_shader: Don't do unnecessary (and dangerous) uniform setup. | Paul Berry | 2013-03-19 | 1 | -16/+0
  Previously, right after calling _mesa_glsl_link_shader(), the fixed function fragment shader code made several calls with the ostensible purpose of setting up uniforms for the fragment shader it just created.
  These calls are unnecessary, since _mesa_glsl_link_shader() calls driver->LinkShader(), which takes care of calling these functions (or their equivalent). Also, they are dangerous to call after _mesa_glsl_link_shader() has returned, because on back-ends such as i965 which do precompilation, _mesa_glsl_link_shader() may have already cached pointers to the existing uniform structures; attempting to set up the uniforms again invalidates those cached pointers.
  It was only by sheer coincidence that this wasn't manifesting itself as a bug. It turns out that i965's precompile mechanism was always setting bit 0 of brw_wm_prog_key::proj_attrib_mask to 0 for fixed function fragment shaders, but during normal usage this bit usually gets set to 1. As a result, the precompiled shader (with its invalid uniform pointers) was not being used.
  I'm about to introduce some changes that cause bit 0 of proj_attrib_mask to be set consistently between precompilation and normal usage, so to avoid regressions I need to get rid of the dangerous duplicate uniform setup code first.
  Reviewed-by: Ian Romanick <[email protected]>
* i965: Avoid unnecessary copy when depthstencil workaround invoked by clear. | Paul Berry | 2013-03-19 | 10 | -17/+52
  Since apps typically begin rendering with a call to glClear(), it is likely that when brw_workaround_depthstencil_alignment() moves a miplevel to a temporary buffer, it can avoid doing a blit, since the contents of the miplevel are about to be erased.
  This patch adds the necessary plumbing to determine when brw_workaround_depthstencil_alignment() is being called as a consequence of glClear(), and avoids the unnecessary blit when it is safe to do so.
  Reviewed-by: Chad Versace <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  v2: Eliminate unnecessary call to _mesa_is_depthstencil_format(). Fix handling of depth buffer in depth/stencil format.
  v3: Use correct bitfields for clear_mask. Fix handling of depth buffer in depth/stencil format when hardware uses separate stencil. When invalidating, make sure we still reassociate the image to the new miptree.
  Reviewed-by: Eric Anholt <[email protected]>
* r600g: don't emit SQ_DYN_GPR_RESOURCE_LIMIT_1 on cayman | Alex Deucher | 2013-03-19 | 1 | -1/+0
  Doesn't exist on the asic and will cause a CS rejection if VM is disabled.
  Note: this is a candidate for the 9.1 branch.
  Signed-off-by: Alex Deucher <[email protected]>
* r600g: emit DB_SRESULTS_COMPARE_STATE0 on r6xx/r7xx | Alex Deucher | 2013-03-19 | 2 | -1/+3
  Not using HiS yet, but matches what we do on evergreen+.
  Signed-off-by: Alex Deucher <[email protected]>
* winsys/svga: improve error/debug message output | Brian Paul | 2013-03-19 | 3 | -25/+35
  Use vmw_printf() just for extra debugging info (off by default). Use vmw_error() for real errors/failures/etc that we definitely want to report.
  Reviewed-by: José Fonseca <[email protected]>
* tgsi: fix uninitialized declaration array fields | Brian Paul | 2013-03-19 | 1 | -0/+13
  Fixes a few regressions since the TGSI array changes.
  Reviewed-by: José Fonseca <[email protected]>
* egl_dri2: Lower __DRI_IMAGE version requirement back to 1 | Kristian Høgsberg | 2013-03-19 | 2 | -2/+13
  We check the extension version manually instead and verify that we have the createImageFromFds function before enabling prime fd passing.
* radeon/llvm: Do not link against libgallium when building statically. | Maarten Lankhorst | 2013-03-19 | 1 | -1/+4
  NOTE: This is a candidate for the 9.1 branch.
  Tested-by: Vincent Lejeune <[email protected]>
  Signed-off-by: Maarten Lankhorst <[email protected]>
* gles2: Add an ABI-check test | Matt Turner | 2013-03-19 | 2 | -0/+294
  Checks that no functions are exported that are not part of the ABI.
  Note that currently we are exporting functions that are aliased to functions that are part of the ABI. They shouldn't be exported, but the XML descriptions don't adequately describe this case.
* gles1: Add an ABI-check test | Matt Turner | 2013-03-19 | 2 | -0/+256
  Checks that no functions are exported that are not part of the ABI.
  Note that currently we are exporting functions that are aliased to functions that are part of the ABI. They shouldn't be exported, but the XML descriptions don't adequately describe this case.
* gallium/egl: fix out-of-tree build | Andreas Boll | 2013-03-19 | 1 | -1/+1
  Taken from downstream: http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/15-fix-oot-build.diff;h=7040999a22d3937d0578cfd85ee2c71d7dc614bb;hb=refs/heads/ubuntu%2B1
  NOTE: This is a candidate for the 9.1 branch.
  Acked-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* osmesa: fix out-of-tree build | Andreas Boll | 2013-03-19 | 1 | -0/+1
  Taken from downstream: http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/14-fix-osmesa-build.diff;h=00581d0e1833c5492d9050e1bf3d5e658cad782e;hb=refs/heads/ubuntu%2B1
  v2: Move the added line immediately after -I$(top_srcdir)/src/mapi
  NOTE: This is a candidate for the 9.1 and 9.0 branches.
  Acked-by: Kenneth Graunke <[email protected]> (v1)
  Reviewed-by: Matt Turner <[email protected]>
* mesa: use ieee fp on s390 and m68k | Andreas Boll | 2013-03-19 | 1 | -1/+2
  Taken from downstream: http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/02_use-ieee-fp-on-s390-and-m68k.patch;h=d3d6c1d7fec3c72ecf320706167deb61c52636c3;hb=refs/heads/ubuntu%2B1
  Fixes Debian bug #349437.
  Patch written by David Nusinow.
  NOTE: This is a candidate for stable branches.
  Acked-by: Kenneth Graunke <[email protected]>
  Acked-by: Matt Turner <[email protected]>
* gallivm: fix return opcode handling in main function of a shader | Roland Scheidegger | 2013-03-19 | 2 | -3/+18
  If we're in some conditional or loop we must not return, or the code after the condition is never executed.
  (v2): We also can't just continue as if nothing happened, since the mask update code would later check if we actually have a mask, so we need to remember that there was a return in main where we didn't exit. To illustrate this: a ret in an if clause would cause a mask update, which is still ok as we're in a conditional, but after the endif the mask update code would drop the mask, hence bringing execution back to pixels which should have their execution mask set to zero by the ret. Thanks to Christoph Bumiller for figuring this out.
  This fixes https://bugs.freedesktop.org/show_bug.cgi?id=62357.
  Note: This is a candidate for the stable branches.
  Reviewed-by: Jose Fonseca <[email protected]>
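The masking problem described above can be modelled with a few lines of Python. This is a conceptual sketch only: `MaskState` and its methods are hypothetical, each list element stands for one pixel (lane) of a SoA vector, and the gallivm code builds LLVM IR rather than evaluating masks directly.

```python
class MaskState:
    """Per-lane execution masks for a shader run on several pixels at once."""

    def __init__(self, lanes):
        self.cond_stack = []
        self.cond_mask = [True] * lanes
        # Lanes that have not executed a ret yet. This mask must outlive
        # the conditional in which the ret occurred.
        self.ret_mask = [True] * lanes

    def exec_mask(self):
        """Lanes that are currently allowed to execute."""
        return [c and r for c, r in zip(self.cond_mask, self.ret_mask)]

    def begin_if(self, condition):
        self.cond_stack.append(self.cond_mask)
        self.cond_mask = [m and c for m, c in zip(self.cond_mask, condition)]

    def end_if(self):
        self.cond_mask = self.cond_stack.pop()

    def ret(self):
        # Do not leave the function: the other lanes still have work to do.
        # Only switch off the lanes that are active right now.
        active = self.exec_mask()
        self.ret_mask = [r and not a for r, a in zip(self.ret_mask, active)]


state = MaskState(4)
state.begin_if([True, False, True, False])
state.ret()
state.end_if()
lanes_after_endif = state.exec_mask()
```

After the endif only lanes 1 and 3 remain active. If the return mask were dropped at that point, lanes 0 and 2 would wrongly resume execution, which is the failure the v2 note describes.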
* freedreno: clear fixes | Rob Clark | 2013-03-19 | 2 | -5/+16
  Some fixes for clearing only depth or only stencil.
  Signed-off-by: Rob Clark <[email protected]>
* radeonsi: enable indirect addressing | Christian König | 2013-03-19 | 2 | -6/+1
  Fixes 16 piglit tests.
  Signed-off-by: Christian König <[email protected]>
  Reviewed-by: Tom Stellard <[email protected]>