path: root/src/gallium/auxiliary
Commit message | Author | Age | Files | Lines
* llvmpipe: do constant buffer bounds checking in shaders | Zack Rusin | 2014-01-16 | 5 | -39/+154
  It's possible to bind a smaller buffer as a constant buffer than what the shader actually uses/requires. This could cause nasty crashes. This patch adds the architecture to pass the maximum allowable constant buffer index to the jit to let it make sure that the constant buffer indices are always within bounds. The behavior follows the d3d10 spec, which says the overflow should always return all zeros, and overflow is only defined as access beyond the size of the currently bound buffer. Accesses beyond the declared shader constant register size are not considered an overflow and are expected to return garbage but consistent garbage (we follow the behavior which some wlk tests expect, which is to return the actual values from the bound buffer).
  Signed-off-by: Zack Rusin <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
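The overflow rule described in this message can be sketched in scalar C. This is only an illustration, not the JIT code the patch generates: `fetch_const_checked` and `num_consts` are hypothetical names, with `num_consts` standing in for the bound-buffer limit that the patch passes to the jit.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a Mesa function): read one constant from the
 * currently bound buffer. `num_consts` is the number of floats actually
 * bound, an assumed stand-in for the limit handed to the jit.
 */
static float
fetch_const_checked(const float *consts, size_t num_consts, size_t index)
{
   assert(consts != NULL || num_consts == 0);

   /* Overflow is defined against the bound buffer: reads past it yield 0. */
   return index < num_consts ? consts[index] : 0.0f;
}
```

A read inside the bound buffer returns the stored value even if the index exceeds what the shader declared, while a read past the bound buffer returns zero.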
* gallium/util: easy fixes for NULL colorbuffers | Marek Olšák | 2014-01-13 | 2 | -1/+7
  Reviewed-by: Brian Paul <[email protected]>
* st/mesa: bind NULL colorbuffers as specified by glDrawBuffers | Marek Olšák | 2014-01-13 | 2 | -0/+25
  An example why it is required:
  Let's say there's a fragment shader writing to gl_FragData[0..1]. The user calls:
    glDrawBuffers(2, {GL_NONE, GL_COLOR_ATTACHMENT0});
  That means gl_FragData[0] is unused and gl_FragData[1] is written to GL_COLOR_ATTACHMENT0.
  st/mesa was skipping the GL_NONE draw buffer, therefore gl_FragData[0] was written to GL_COLOR_ATTACHMENT0, which was wrong.
  This commit fixes it, but drivers must also be fixed not to crash when binding NULL colorbuffers. There is also a new set of piglit tests for this.
  The MSAA state also had to be fixed not to crash when reading fb->cbufs[0].
  Reviewed-by: Brian Paul <[email protected]>
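The mapping in this example can be sketched as follows. It is a simplified model under stated assumptions, not the st/mesa code: `struct sk_surface`, `SK_DRAW_BUFFER_NONE`, `SK_MAX_CBUFS` and `sk_bind_draw_buffers` are all hypothetical names. The point is that slot i of the output array always corresponds to gl_FragData[i], so a GL_NONE entry must occupy its slot as NULL rather than be skipped.

```c
#include <assert.h>
#include <stddef.h>

#define SK_MAX_CBUFS 8
#define SK_DRAW_BUFFER_NONE (-1)   /* hypothetical stand-in for GL_NONE */

struct sk_surface { int id; };     /* hypothetical colorbuffer surface */

/*
 * Build the colorbuffer array for `count` draw buffer slots. Each entry of
 * draw_buffers is an attachment index or SK_DRAW_BUFFER_NONE. Unused slots
 * stay in place as NULL so later slots keep their shader output index.
 */
static unsigned
sk_bind_draw_buffers(const int *draw_buffers, unsigned count,
                     struct sk_surface *attachments, unsigned num_attachments,
                     struct sk_surface *cbufs[SK_MAX_CBUFS])
{
   unsigned i;

   assert(count <= SK_MAX_CBUFS);

   for (i = 0; i < count; i++) {
      int a = draw_buffers[i];

      if (a == SK_DRAW_BUFFER_NONE || (unsigned)a >= num_attachments)
         cbufs[i] = NULL;            /* keep the slot, bind nothing */
      else
         cbufs[i] = &attachments[a];
   }
   return count;
}
```

With draw buffers {NONE, attachment 0}, the result is cbufs[0] == NULL and cbufs[1] pointing at attachment 0, which matches the behaviour the commit describes as correct.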
* cso_context: Fix cso_context::sample_mask initial value. | José Fonseca | 2014-01-07 | 1 | -1/+1
  The initial value of cso_context::sample_mask_saved is irrelevant as it will be overwritten with cso_context::sample_mask in cso_save_sample_mask. Therefore it is cso_context::sample_mask that needs to be properly initialized.
  This fixes regressions in blits and mipmap generation after adding support for sample_mask to llvmpipe.
  Reviewed-by: Roland Scheidegger <[email protected]>
* gallium/draw: remove double semicolon | Dave Airlie | 2014-01-07 | 1 | -1/+1
  code cleanup.
  Signed-off-by: Dave Airlie <[email protected]>
* pipe_loader/sw: close dev->lib when initialization fails | Aaron Watry | 2013-12-23 | 1 | -1/+4
  Prevents a memory leak.
  Reviewed-by: Tom Stellard <[email protected]>
  CC: "10.0" <[email protected]>
* gallium/u_blitter: implement shader-based MSAA resolve with bilinear filtering | Marek Olšák | 2013-12-14 | 3 | -31/+149
  For scaled resolve. The filter is only good for magnification. If somebody has an idea how to implement a good filter for minification, I'm all ears. I'd have to use derivatives probably.
  Reviewed-by: Brian Paul <[email protected]>
* gallium/u_blitter: implement shader-based MSAA resolve | Marek Olšák | 2013-12-14 | 3 | -23/+158
  We need this for integer formats and upside-down blits, which Radeons don't support for MSAA resolving. It can be used by calling util_blitter_blit.
  Reviewed-by: Brian Paul <[email protected]>
* gallium/u_blitter: remove useless parameters from some functions | Marek Olšák | 2013-12-14 | 2 | -22/+13
  Reviewed-by: Brian Paul <[email protected]>
* gallivm: fix pointer type for stmxcsr/ldmxcsr | Roland Scheidegger | 2013-12-14 | 1 | -2/+7
  The argument is an i8 pointer, not an i32 pointer (even though the value actually stored/loaded IS i32). Older llvm versions didn't care, but 3.2 and newer do, leading to crashes.
  Reviewed-by: Zack Rusin <[email protected]>
* swrast* (gallium, classic): add MESA_copy_sub_buffer support (v3) | Dave Airlie | 2013-12-13 | 1 | -1/+1
  This patch adds MESA_copy_sub_buffer support to the dri sw loader and then to gallium state tracker, llvmpipe, softpipe and other bits. It reuses the dri1 driver extension interface, and it updates the swrast loader interface for a new putimage which can take a stride.
  I've tested this with gnome-shell with a cogl hacked to reenable sub copies for llvmpipe and the one piglit test.
  I could probably split this patch up as well.
  v2: pass a pipe_box, to reduce the entrypoints, as per Jose's review, add to p_screen doc comments.
  v3: finish off winsys interfaces, add swrast classic support as well.
  Reviewed-by: Jose Fonseca <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
  swrast: add support for copy_sub_buffer
* util: fix compile breakage | Brian Paul | 2013-12-12 | 1 | -1/+1
  D'oh!
* util: move variable declaration out of for-loop | Brian Paul | 2013-12-12 | 1 | -1/+3
  To fix MSVC build.
* gallium/util: implement new color clear API in u_blitter | Marek Olšák | 2013-12-12 | 1 | -3/+42
* gallium: allow choosing which colorbuffers to clear | Marek Olšák | 2013-12-12 | 3 | -5/+6
  Required for glClearBuffer, which only clears one colorbuffer attachment.
  Example: If the first colorbuffer is float and the second one is int:
    pipe->clear(pipe, PIPE_CLEAR_COLOR0, float_clear_color, ...);
    pipe->clear(pipe, PIPE_CLEAR_COLOR1, int_clear_color, ...);
  This doesn't need any driver changes yet, because all drivers just use:
    if (flags & PIPE_CLEAR_COLOR) ..
  The drivers which support GL 3.0 will have to implement it properly though.
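A driver that implements this "properly" would presumably test one bit per colorbuffer instead of the combined mask. The sketch below assumes a bit layout (depth, stencil, then one bit per colorbuffer) and uses hypothetical `SK_` names so as not to claim the exact values of the real PIPE_CLEAR_* defines.

```c
#include <assert.h>

/* Assumed, illustrative flag layout; not copied from the Gallium headers. */
#define SK_MAX_CBUFS     8
#define SK_CLEAR_DEPTH   (1u << 0)
#define SK_CLEAR_STENCIL (1u << 1)
#define SK_CLEAR_COLOR0  (1u << 2)
#define SK_CLEAR_COLOR   (((1u << SK_MAX_CBUFS) - 1u) << 2)  /* all color bits */

/*
 * Return a bitmask of colorbuffer indices selected by `flags`, limited to
 * the `nr_cbufs` colorbuffers that are bound.
 */
static unsigned
sk_cbufs_to_clear(unsigned flags, unsigned nr_cbufs)
{
   unsigned i, mask = 0;

   assert(nr_cbufs <= SK_MAX_CBUFS);

   for (i = 0; i < nr_cbufs; i++) {
      if (flags & (SK_CLEAR_COLOR0 << i))
         mask |= 1u << i;
   }
   return mask;
}
```

Under this layout a legacy check of the form `flags & SK_CLEAR_COLOR` is still true when only one color bit is set, which is consistent with the message's remark that existing drivers need no change yet.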
* draw: fix vbuf caching of vertices with inject front face | Zack Rusin | 2013-12-10 | 1 | -0/+1
  Caching in the vbuf module meant that once a vertex has been emitted it was cached, but it's possible for a vertex at the same location to be emitted again, but this time with a different front-face semantic. Caching was causing the first version of the vertex to be emitted, which resulted in the renderer getting incorrect front-face attributes. By resetting the vertex_id (which is used for caching) we make sure that once a front-face info has been injected the vertex will end up getting emitted.
  Signed-off-by: Zack Rusin <[email protected]>
  Reviewed-by: José Fonseca <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: fix blending with half-float formats | Zack Rusin | 2013-12-10 | 2 | -0/+82
  The fact that we flush denorms to zero breaks our half-float conversion and blending. This patch enables denorms for blending. It's a little tricky due to the llvm bug that makes it incorrectly reorder the mxcsr intrinsics: http://llvm.org/bugs/show_bug.cgi?id=6393
  Signed-off-by: Zack Rusin <[email protected]>
  Reviewed-by: José Fonseca <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
  Signed-off-by: Zack Rusin <[email protected]>
* st/mesa: implement layered framebuffer clear for the clear_with_quad fallback | Marek Olšák | 2013-12-03 | 2 | -0/+25
  Same approach as in u_blitter.
* gallium/util: implement layered framebuffer clear in u_blitter | Marek Olšák | 2013-12-03 | 6 | -25/+106
  All bound layers (from first_layer to last_layer) should be cleared.
  This uses a vertex shader which outputs gl_Layer = gl_InstanceID, so each instance goes to a different layer. By rendering a quad and setting the instance count to the number of layers, it will trivially clear all layers.
  This requires AMD_vertex_shader_layer (or PIPE_CAP_TGSI_VS_LAYER), which only radeonsi supports at the moment. r600 could do this too. Standard DX11 hardware will have to use a geometry shader though, which has higher overhead.
* trace: Dump PIPE_QUERY_* enums. | José Fonseca | 2013-11-28 | 2 | -0/+36
  Reviewed-by: Roland Scheidegger <[email protected]>
* gallium/cso: fix sampler / sampler_view counts | Roland Scheidegger | 2013-11-28 | 1 | -11/+16
  Now that it is possible to query drivers for the max sampler view it should be safe to increase this without crashing. Not entirely convinced this really works correctly though if state trackers using non-linked sampler / sampler_views use this.
  Reviewed-by: Jose Fonseca <[email protected]>
* gallium: new shader cap bit for the amount of sampler views | Roland Scheidegger | 2013-11-28 | 1 | -0/+2
  Ever since introducing separate sampler and sampler view max this was really missing. Every driver but llvmpipe reports the same number as number of samplers for now, so nothing should break.
  Reviewed-by: Jose Fonseca <[email protected]>
* tgsi: Prevent emission of instructions with empty writemask. | José Fonseca | 2013-11-22 | 2 | -0/+42
  These degenerate instructions can often be emitted by state trackers when the semantics of instructions don't match precisely.
  Reviewed-by: Brian Paul <[email protected]>
* tgsi: Rework calls to ureg_emit_insn(). | José Fonseca | 2013-11-22 | 1 | -96/+104
  Mere syntactical change.
  Reviewed-by: Brian Paul <[email protected]>
* gallium: Make TGSI_SEMANTIC_FOG register four-component wide. | José Fonseca | 2013-11-21 | 2 | -12/+1
  D3D9 Shader Model 2 restricted the fog register to one component, http://msdn.microsoft.com/en-us/library/windows/desktop/bb172945.aspx , but that restriction no longer exists in Shader Model 3, and several WHCK tests enforce that.
  So this change:
  - lifts the single-component restriction TGSI_SEMANTIC_FOG from Gallium interface
  - updates the Mesa state tracker to enforce output fog has (f, 0, 0, 1)
  - draw module was updated to leave TGSI_SEMANTIC_FOG output registers alone
  Several gallium drivers that are going out of their way to clear TGSI_SEMANTIC_FOG components could be simplified in the future.
  Thanks to Si Chen and Michal Krol for identifying the problem.
  Testing done: piglit fogcoord-*.vpfp tests
  Reviewed-by: Roland Scheidegger <[email protected]>
* tgsi_exec: Fix mask calculation for emit_kill_if. | José Fonseca | 2013-11-21 | 1 | -0/+3
  Same as Si Chen's commit e7a5905d8a3960b0981750f8131e3af9acbfcdb8 for tgsi_exec module.
  Not actually tested, because softpipe is failing the test that caught this bug due to unrelated issues.
  Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: Ignore unknown file type in non-debug builds. | Vinson Lee | 2013-11-20 | 1 | -0/+1
  Fixes "Uninitialized pointer read" defect reported by Coverity.
  Signed-off-by: Vinson Lee <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
* u_gen_mipmap: Use untampered cubemap texture coords when generating mipmaps. | José Fonseca | 2013-11-20 | 5 | -6/+19
  It's not necessary to scale down cubemap texture coords when generating mipmaps: we are doing a 2x minification, therefore it's guaranteed that the texture coords will always be at least 1 texel away from the edges.
  Scaling down can actually be harmful, as it may cause artefacts when generating mipmaps with nearest filtering. Sample points will lie exactly in the middle of each 2x2 texels, so the scaling factor was causing different texels to be taken on each quadrant of the cube face. This is apparent with a 1x1 checkerboard pattern in the base mipmap level: instead of the next mipmap level receiving a constant color throughout the face, it will have different colors for each quadrant of the face.
  The behaviour for blits is left untouched for now, but the cubemap texture coord scaling hack should be reconsidered eventually.
  Reviewed-by: Brian Paul <[email protected]>
* gallivm: Fix mask calculation for emit_kill_if. | Si Chen | 2013-11-19 | 1 | -5/+8
  The exec_mask must be taken in consideration, just like emit_kill above.
  The tgsi_exec module has the same bug and should be fixed in a future change.
  Reviewed-by: Roland Scheidegger <[email protected]>
  Reviewed-by: José Fonseca <[email protected]>
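The role of the execution mask here can be sketched with one bit per lane. This is a simplified scalar model, not the gallivm code: `sk_kill_if`, `SK_LANES` and the single condition value per lane are assumptions made for illustration. A lane is only killed when its condition is negative and the lane is actually executing.

```c
#include <assert.h>

#define SK_LANES 4   /* hypothetical vector width for this sketch */

/*
 * `alive` and `exec_mask` hold one bit per lane. `cond[lane] < 0` requests
 * a kill for that lane. Returns the updated alive mask.
 */
static unsigned
sk_kill_if(unsigned alive, unsigned exec_mask, const float cond[SK_LANES])
{
   unsigned lane, kill = 0;

   assert((exec_mask >> SK_LANES) == 0);

   for (lane = 0; lane < SK_LANES; lane++) {
      if (cond[lane] < 0.0f)
         kill |= 1u << lane;
   }

   /* Lanes masked off by control flow must not be affected by the kill. */
   kill &= exec_mask;

   return alive & ~kill;
}
```

Without the `kill &= exec_mask` step, a lane that skipped the branch containing the kill would be discarded anyway, which is the class of bug the message describes.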
* postprocess: document the pp_init() function. | Brian Paul | 2013-11-18 | 1 | -1/+8
  Reviewed-by: Marek Olšák <[email protected]>
* postprocess: move #defines to filters.h | Brian Paul | 2013-11-18 | 2 | -3/+4
  They're not needed in postprocess.h
  Reviewed-by: Marek Olšák <[email protected]>
* postprocess: refactor header files, etc | Brian Paul | 2013-11-18 | 8 | -47/+70
  Move private data structures and function prototypes out of the public postprocess.h header file. Create a pp_private.h for the shared, private data structures, functions. Remove pp_program.h header.
  Reviewed-by: Marek Olšák <[email protected]>
* postprocess: rename program to pp_program | Brian Paul | 2013-11-18 | 8 | -23/+23
  To match the pp_ namespace convention.
  Reviewed-by: Marek Olšák <[email protected]>
* postprocess: simplify pp_free() code | Brian Paul | 2013-11-18 | 1 | -14/+13
  Reviewed-by: Marek Olšák <[email protected]>
* indices: add comments, assertions in u_indices.c file | Brian Paul | 2013-11-15 | 1 | -0/+26
  Reviewed-by: José Fonseca <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
* gallium/pipe_loader: un-reference udev resources when we're done with them. | Aaron Watry | 2013-11-15 | 1 | -0/+3
  Reviewed-by: Tom Stellard <[email protected]>
  CC: "10.0" <[email protected]>
* gallivm: Compile flag to debug TGSI execution through printfs. | José Fonseca | 2013-11-14 | 5 | -47/+222
  It is similar to tgsi_exec.c's DEBUG_EXECUTION compile flag.
  I had prototyped this for a while while debugging an issue, but finally cleaned this up and added a few more bells and whistles.
  v2: Use '$' as marker; better output. Thanks to Brian, Zack and Roland for the reviews.
  Here is a sample output.
    CONST[0].x = 0.00625000009 0.00625000009 0.00625000009 0.00625000009
    CONST[0].y = -0.00714285718 -0.00714285718 -0.00714285718 -0.00714285718
    CONST[0].z = -1 -1 -1 -1
    CONST[0].w = 1 1 1 1
    IN[0].x = 143.5 175.5 175.5 143.5
    IN[0].y = 123.5 123.5 155.5 155.5
    IN[0].z = 0 0 0 0
    IN[0].w = 1 1 1 1
    $ 1: RCP TEMP[0].w, IN[0].wwww
    TEMP[0].w = 1 1 1 1
    $ 2: MAD TEMP[0].xy, IN[0], CONST[0], CONST[0].zwzw
    TEMP[0].x = -0.103124976 0.0968750715 0.0968750715 -0.103124976
    TEMP[0].y = 0.117857158 0.117857158 -0.110714316 -0.110714316
    $ 3: MUL OUT[0].xy, TEMP[0], TEMP[0].wwww
    OUT[0].x = -0.103124976 0.0968750715 0.0968750715 -0.103124976
    OUT[0].y = 0.117857158 0.117857158 -0.110714316 -0.110714316
    $ 4: MUL OUT[0].z, IN[0].zzzz, TEMP[0].wwww
    OUT[0].z = 0 0 0 0
    $ 5: MOV OUT[0].w, TEMP[0]
    OUT[0].w = 1 1 1 1
    $ 6: END
    OUT[0].x = -0.103124976 0.0968750715 0.0968750715 -0.103124976
    OUT[0].y = 0.117857158 0.117857158 -0.110714316 -0.110714316
    OUT[0].z = 0 0 0 0
    OUT[0].w = 1 1 1 1
* gallivm,llvmpipe: fix float->srgb conversion to handle NaNs | Roland Scheidegger | 2013-11-14 | 4 | -26/+43
  d3d10 requires us to convert NaNs to zero for any float->int conversion. We don't really do that but mostly seems to work. In particular I suspect the very common float->unorm8 path only really passes because it relies on sse2 pack intrinsics which just happen to work by luck for NaNs (float->int conversion in hw gives integer indeterminate value, which just happens to be -0x80000000 hence gets converted to zero in the end after pack intrinsics).
  However, float->srgb didn't get so lucky, because we need to clamp before blending and clamping resulted in NaN behavior being undefined (and actually got converted to 1.0 by clamping with sse2). Fix this by using a zero/one clamp with defined nan behavior as we can handle the NaN for free this way.
  I suspect there are more bugs lurking in this area (e.g. converting floats to snorm) as we don't really use defined NaN behavior everywhere but this seems to be good enough.
  While here respecify nan behavior modes a bit, in particular the return_second mode didn't really do what we wanted. From the caller's perspective, we really wanted to say we need the non-nan result, but we already know the second arg isn't a NaN. So we use this now instead, which means that cpu architectures which actually implement min/max by always returning non-nan (that is adhering to ieee754-2008 rules) don't need to bend over backwards for nothing.
  Reviewed-by: Jose Fonseca <[email protected]>
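A zero/one clamp with defined NaN behaviour can be sketched in scalar C by relying on the fact that every ordered comparison with NaN is false. `sk_clamp01` is a hypothetical name, and this is only an illustration of the idea, not the vectorized gallivm code.

```c
#include <assert.h>
#include <math.h>

/*
 * Clamp x to [0, 1], mapping NaN to 0. `!(x > 0.0f)` is true for NaN as
 * well as for zero and negative inputs, so NaN needs no separate test.
 */
static float
sk_clamp01(float x)
{
   float r;

   if (!(x > 0.0f))
      r = 0.0f;
   else
      r = x < 1.0f ? x : 1.0f;

   assert(!isnan(r));   /* the result is always a number */
   return r;
}
```

Because the NaN case falls out of the lower-bound comparison, the NaN handling costs nothing extra, which is in the spirit of the "for free" remark in the message.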
* draw,llvmpipe: use exponent manipulation instead of exp2 for polygon offset | Roland Scheidegger | 2013-11-12 | 1 | -8/+13
  Since we explicitly require an integer input we should avoid using exp2 math (even if we were using optimized versions), which turns the exp2 into an int sub (plus some casts).
  v2: fix bogus uint (needs to be int) math spotted by Matthew, fix comments
  Reviewed-by: Jose Fonseca <[email protected]>
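The trick of replacing exp2 with exponent arithmetic can be sketched for IEEE 754 single precision. Both function names are hypothetical, and the use of 23 mantissa bits for the minimum resolvable depth is an assumption based on the float depth bias rule, not something stated in this message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Build 2^n for integer n by writing the biased exponent field directly. */
static float
sk_exp2_int(int n)
{
   uint32_t bits;
   float f;

   assert(n >= -126 && n <= 127);   /* normal single-precision range only */

   bits = (uint32_t)(n + 127) << 23;
   memcpy(&f, &bits, sizeof f);
   return f;
}

/*
 * Assumed use: 2^(e - 23), where e is the unbiased exponent of z. Only an
 * integer subtraction on the exponent is needed, no exp2 call.
 */
static float
sk_float_mrd(float z)
{
   uint32_t bits;
   int e;

   memcpy(&bits, &z, sizeof bits);
   e = (int)((bits >> 23) & 0xff) - 127;   /* signed math, as the v2 note says */
   return sk_exp2_int(e - 23);
}
```

Note that the exponent is handled as a signed int; doing the subtraction in unsigned arithmetic would wrap for small z.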
* gallium: fix build on GNU/Hurd due to missing PIPE_OS_HURD detection | Cyril Brulebois | 2013-11-12 | 1 | -6/+6
  Thanks to Pino Toscano. Patch from Debian package.
  Cc: "10.0" <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* util: set all unused cbufs to NULL in util_copy_framebuffer_state() | Brian Paul | 2013-11-11 | 1 | -1/+2
  This helps fix an issue in the svga driver, and is just safer all-around.
  Reviewed-by: José Fonseca <[email protected]>
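The effect of this change can be sketched with a reduced framebuffer struct. `struct sk_fb_state` and `sk_copy_fb_state` are hypothetical, and the sketch copies plain pointers where the real helper would presumably manage surface references.

```c
#include <assert.h>
#include <stddef.h>

#define SK_MAX_CBUFS 8

struct sk_fb_state {               /* hypothetical, reduced framebuffer state */
   unsigned nr_cbufs;
   void *cbufs[SK_MAX_CBUFS];
};

/* Copy src into dst and clear every slot beyond src->nr_cbufs. */
static void
sk_copy_fb_state(struct sk_fb_state *dst, const struct sk_fb_state *src)
{
   unsigned i;

   assert(src->nr_cbufs <= SK_MAX_CBUFS);

   for (i = 0; i < src->nr_cbufs; i++)
      dst->cbufs[i] = src->cbufs[i];

   /* Stale pointers left over from a previous, larger state are dropped. */
   for (; i < SK_MAX_CBUFS; i++)
      dst->cbufs[i] = NULL;

   dst->nr_cbufs = src->nr_cbufs;
}
```

Clearing the tail means code that walks all slots never sees a leftover pointer from an earlier framebuffer with more colorbuffers.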
* vl: use a separate context for shader based decode v2 | Christian König | 2013-11-08 | 2 | -61/+124
  This makes VDPAU thread safe again.
  v2: fix some memory leaks reported by Aaron Watry.
  Signed-off-by: Christian König <[email protected]>
* gallivm: deduplicate some indirect register address code | Roland Scheidegger | 2013-11-08 | 1 | -157/+96
  There's only one minor functional change, for immediates the pixel offsets are no longer added since the values are all the same for all elements in any case (it might be better if those weren't stored as soa vectors in the first place maybe).
  Reviewed-by: Zack Rusin <[email protected]>
* draw,llvmpipe,util: add depth bias calculation for arb_depth_buffer_float | Matthew McClure | 2013-11-07 | 7 | -19/+80
  With this patch, the llvmpipe and draw modules will calculate the depth bias according to floating point depth buffer semantics described in the arb_depth_buffer_float specification, when the driver has a z buffer bound with a format type of UTIL_FORMAT_TYPE_FLOAT. By default, the driver will use the existing UNORM calculation for depth bias.
  A new function, draw_set_zs_format, was added to calculate the Minimum Resolvable Depth value and floating point depth sense for the draw module.
  Reviewed-by: Jose Fonseca <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: fix build on GNU/kFreeBSD | Fabio Pedretti | 2013-11-06 | 1 | -1/+1
  Patch from Debian package
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Andreas Boll <[email protected]>
* gallivm: fix indirect addressing of inputs | Roland Scheidegger | 2013-11-06 | 1 | -17/+28
  We weren't adding the soa offsets when constructing the indices for the gather functions. That meant that we were always returning the data in the first element. (Copied straight from the same fix for temps.)
  While here fix up a couple of broken comments in the fetch functions, plus don't name a straight float type float4 which is just confusing.
  Reviewed-by: Jose Fonseca <[email protected]>
  Reviewed-by: Zack Rusin <[email protected]>
* gallivm: optimize lp_build_minify for sse | Roland Scheidegger | 2013-11-05 | 3 | -13/+54
  SSE can't handle true vector shifts (with variable shift count), so llvm is turning them into a mess of extracts, scalar shifts and inserts. It is however possible to emulate them in lp_build_minify with float muls, which should be way faster (saves over 20 instructions per 8-wide lp_build_minify). This wouldn't work for "generic" 32bit shifts though since we've got only 24bits of mantissa (actually for left shifts it would work by using sse41 int mul instead of float mul but not for right shifts).
  Note that this has very limited scope for now, since this is only used with per-pixel lod (otherwise we're avoiding the non-constant shift count by doing per-quad shifts manually), and only 1d textures even then (though the latter should change).
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
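The float-multiply emulation of a variable right shift can be sketched for one scalar value. This is an illustration under stated assumptions, not lp_build_minify itself: `sk_minify` is a hypothetical name, and the way the scale factor 2^-level is built here (by writing the exponent field) is one possible choice. The 24-bit limit on the input reflects the mantissa constraint mentioned in the message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Compute max(1, base_size >> level) without a shift: multiply by 2^-level
 * as a float and truncate. Exact only while base_size fits in the 24-bit
 * float mantissa.
 */
static int
sk_minify(int base_size, int level)
{
   uint32_t bits;
   float scale;
   int size;

   assert(base_size >= 0 && base_size < (1 << 24));
   assert(level >= 0 && level <= 126);

   bits = (uint32_t)(127 - level) << 23;   /* bit pattern of 2^-level */
   memcpy(&scale, &bits, sizeof scale);

   /* Scaling by a power of two is exact; the cast truncates like a shift. */
   size = (int)((float)base_size * scale);

   return size > 1 ? size : 1;
}
```

In a vectorized form each lane would get its own scale factor, which is what removes the need for a per-lane variable shift count.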
* util/u_format: take normalized flag in consideration in util_format_is_rgba8_variant | José Fonseca | 2013-11-05 | 1 | -0/+3
  Just happened to notice it was missing while looking at it.
* gallivm: Remove llvm::DisablePrettyStackTrace for LLVM >= 3.4. | Vinson Lee | 2013-11-04 | 1 | -0/+2
  LLVM 3.4 r193971 removed llvm::DisablePrettyStackTrace and made the pretty stack trace opt-in rather than opt-out.
  The default value of DisablePrettyStackTrace has changed to true in LLVM 3.4 and newer.
  Signed-off-by: Vinson Lee <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=60929
  Reviewed-by: Tom Stellard <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* tgsi/scan: set maximum index for each constant buffer | Marek Olšák | 2013-11-04 | 2 | -1/+13