path: root/src/gallium/drivers/llvmpipe
Commit message | Author | Age | Files | Lines
* llvmpipe: use pipe_sampler_view_release() to avoid segfault | Jonathan Liu | 2013-12-22 | 1 | -0/+6
| | | | | | | | | This fixes another case of faulting when freeing a pipe_sampler_view that belongs to a previously destroyed context. Cc: "10.0" <[email protected]> Signed-off-by: Jonathan Liu <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* llvmpipe: get rid of barycentric calculation of a0 | Roland Scheidegger | 2013-12-14 | 1 | -66/+4
| | | | | | | Didn't really work as well as hoped (in particular it was not generally more accurate), will solve this differently. Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: (trivial) get rid of triangle subdivision code | Roland Scheidegger | 2013-12-14 | 3 | -182/+1
| | | | | | | | This code was always problematic, and with 64bit rasterization we no longer need it at all. Reviewed-by: Zack Rusin <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* swrast* (gallium, classic): add MESA_copy_sub_buffer support (v3) | Dave Airlie | 2013-12-13 | 1 | -3/+3
| | | | | | | | | | | | | | | | | | | | | | | This patch adds MESA_copy_sub_buffer support to the dri sw loader and then to gallium state tracker, llvmpipe, softpipe and other bits. It reuses the dri1 driver extension interface, and it updates the swrast loader interface for a new putimage which can take a stride. I've tested this with gnome-shell with a cogl hacked to reenable sub copies for llvmpipe and the one piglit test. I could probably split this patch up as well. v2: pass a pipe_box, to reduce the entrypoints, as per Jose's review, add to p_screen doc comments. v3: finish off winsys interfaces, add swrast classic support as well. Reviewed-by: Jose Fonseca <[email protected]> Signed-off-by: Dave Airlie <[email protected]> swrast: add support for copy_sub_buffer
* llvmpipe: add plumbing for ARB_depth_clamp | Matthew McClure | 2013-12-11 | 3 | -35/+60
| | | | | | | | | | With this patch llvmpipe will adhere to the ARB_depth_clamp enabled state when clamping the fragment's zw value. To support this, the variant key now includes the depth_clamp state. key->depth_clamp is derived from pipe_rasterizer_state's (depth_clip == 0), thus depth clamp is only enabled when depth clip is disabled. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: José Fonseca <[email protected]>
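A minimal sketch of the key derivation described above, assuming simplified stand-in structs (`rast_state_sketch`, `variant_key_sketch`) rather than the real Mesa types. It only illustrates that `depth_clamp` is the negation of `depth_clip` and that two keys differing in that bit select different shader variants.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins; the real structs carry many more fields. */
struct rast_state_sketch  { unsigned depth_clip; };
struct variant_key_sketch { unsigned depth_clamp; unsigned other_state; };

/* Build a variant key: depth clamp is on exactly when depth clip is off. */
static void key_init(struct variant_key_sketch *key,
                     const struct rast_state_sketch *rast,
                     unsigned other_state)
{
   assert(key && rast);
   memset(key, 0, sizeof *key);   /* zero everything so memcmp is meaningful */
   key->depth_clamp = (rast->depth_clip == 0);
   key->other_state = other_state;
}

/* Keys that compare equal can share one compiled variant. */
static bool key_equal(const struct variant_key_sketch *a,
                      const struct variant_key_sketch *b)
{
   return memcmp(a, b, sizeof *a) == 0;
}
```

Because the flag is part of the key, toggling depth clip forces a different variant instead of reusing code compiled for the other clamp state.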
* llvmpipe: add a very useful (disabled) debugging output | Zack Rusin | 2013-12-10 | 1 | -0/+20
| | | | | | | | Disabled by default, but it's very useful when needed. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: fix blending with half-float formats | Zack Rusin | 2013-12-10 | 1 | -5/+26
| | | | | | | | | | | | | The fact that we flush denorms to zero breaks our half-float conversion and blending. This patch enables denorms for blending. It's a little tricky due to the llvm bug that makes it incorrectly reorder the mxcsr intrinsics: http://llvm.org/bugs/show_bug.cgi?id=6393 Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: clamp fragment shader depth write to the current viewport depth range. | Matthew McClure | 2013-12-09 | 13 | -29/+255
| | | | | | | | | | | | | | | | | With this patch, generate_fs_loop will clamp any fragment shader depth writes to the viewport's min and max depth values. Viewport selection is determined by the geometry shader output for the viewport array index. If no index is specified, then the default viewport index is zero. Semantics for this path can be found in draw_clamp_viewport_idx and lp_clamp_viewport_idx. lp_jit_viewport was created to store viewport information visible to JIT code, and is validated when the LP_NEW_VIEWPORT dirty flag is set. lp_rast_shader_inputs is responsible for passing the viewport_index through the rasterizer stage to fragment stage (via lp_jit_thread_data). Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: José Fonseca <[email protected]>
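The clamping described above can be sketched roughly as follows. The struct, the limit `SKETCH_MAX_VIEWPORTS`, and the rule that an out-of-range index falls back to viewport zero are assumptions made for illustration; the authoritative semantics live in draw_clamp_viewport_idx and lp_clamp_viewport_idx.

```c
#include <assert.h>

#define SKETCH_MAX_VIEWPORTS 16   /* assumed limit, not taken from the commit */

/* Hypothetical per-viewport data visible to the fragment stage. */
struct viewport_sketch { float min_depth, max_depth; };

/* Assumed rule: indices outside the valid range select viewport zero. */
static int clamp_viewport_idx(int idx)
{
   return (idx >= 0 && idx < SKETCH_MAX_VIEWPORTS) ? idx : 0;
}

/* Clamp a shader-written depth to the selected viewport's depth range. */
static float clamp_depth_write(const struct viewport_sketch *viewports,
                               int viewport_index, float z)
{
   const struct viewport_sketch *vp;
   assert(viewports);
   vp = &viewports[clamp_viewport_idx(viewport_index)];
   if (z < vp->min_depth) return vp->min_depth;
   if (z > vp->max_depth) return vp->max_depth;
   return z;
}
```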
* gallium: add support for AMD_vertex_shader_layer | Marek Olšák | 2013-12-03 | 1 | -0/+2
|
* gallium: new shader cap bit for the amount of sampler views | Roland Scheidegger | 2013-11-28 | 1 | -0/+5
| | | | | | | | | Ever since introducing separate sampler and sampler view max this was really missing. Every driver but llvmpipe reports the same number as number of samplers for now, so nothing should break. Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: support 8bit subpixel precision | Zack Rusin | 2013-11-25 | 8 | -148/+321
| | | | | | | | | | | | | 8 bit precision is required by d3d10 but unfortunately requires 64 bit rasterizer. This commit implements 64 bit rasterization with full support for 8bit subpixel precision. It's a combination of all individual commits from the llvmpipe-rast-64 branch. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
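To illustrate why 8 subpixel bits push the rasterizer to 64-bit math, here is a small sketch of a fixed-point edge function. All names are hypothetical and the code is not taken from llvmpipe. With 8 fractional bits, a coordinate of 8192 pixels becomes 2^21 in fixed point, so the product of two such differences (2^42) no longer fits in 32 bits.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SUBPIXEL_BITS 8
#define SKETCH_SUBPIXEL_ONE  (1 << SKETCH_SUBPIXEL_BITS)

/* Snap a pixel coordinate to 1/256-pixel fixed point (round to nearest). */
static int32_t to_fixed(float v)
{
   assert(v >= -4194304.0f && v <= 4194304.0f);   /* keep result inside int32 */
   return (int32_t)(v * (float)SKETCH_SUBPIXEL_ONE + (v >= 0.0f ? 0.5f : -0.5f));
}

/* Edge function for edge a->b evaluated at p, widened to 64 bits before
 * multiplying so the products cannot overflow. */
static int64_t edge_function(int32_t ax, int32_t ay,
                             int32_t bx, int32_t by,
                             int32_t px, int32_t py)
{
   return (int64_t)(bx - ax) * (py - ay) - (int64_t)(by - ay) * (px - ax);
}
```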
* llvmpipe: (trivial) disable new accurate origin calculation | Roland Scheidegger | 2013-11-22 | 1 | -1/+1
| | | | It looks like there's some bugs in it...
* llvmpipe: calculate more accurate interpolation value at origin | Roland Scheidegger | 2013-11-21 | 1 | -6/+82
| | | | | | | | | | | | | | | | | Some rounding errors could crop up when calculating a0. Use a more accurate method (barycentric interpolation essentially) to fix this, though to fix the REAL problem (which is that our interpolation will give very bad results with small triangles far away from the origin when they have steep gradients) this does absolutely nothing (actually makes it worse). (To fix the real problem, either would need to use a vertex corner (or some other point inside the tri) as starting point value instead of fb origin and pass that down to interpolation, or mimic what hw does, use barycentric interpolation (using the coordinates extracted from the rasterizer edge functions) - maybe another time.) Some (silly) tests though really want a high accuracy at fb origin and don't care much about anything else (Just. Don't. Ask.). Reviewed-by: Jose Fonseca <[email protected]>
* gallium/drivers: compact compiler flags into Automake.inc | Emil Velikov | 2013-11-16 | 1 | -7/+6
| | | | | | | | | | * minimise flags duplication * distinguish between VISIBILITY C and CXX flags * set only required flags - C and/or CXX v2: add LLVM_CFLAGS back to AM_CFLAGS (add missing backslash) Signed-off-by: Emil Velikov <[email protected]>
* llvmpipe: (trivial) fix more fallout from the setup cleanup. | Roland Scheidegger | 2013-11-14 | 1 | -2/+4
| | | | Oops... Should have done some more testing.
* llvmpipe: (trivial) fix misplaced bld context assignment. | Roland Scheidegger | 2013-11-14 | 1 | -2/+1
| | | | Should fix polygon offset crashes...
* llvmpipe: clean up state setup code a bit | Roland Scheidegger | 2013-11-14 | 1 | -115/+59
| | | | | | | In particular get rid of home-grown vector helpers which didn't add much. And while here fix formatting a bit. No functional change. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm,llvmpipe: fix float->srgb conversion to handle NaNs | Roland Scheidegger | 2013-11-14 | 1 | -2/+2
| | | | | | | | | | | | | | | | | | | | | | | | d3d10 requires us to convert NaNs to zero for any float->int conversion. We don't really do that but mostly seems to work. In particular I suspect the very common float->unorm8 path only really passes because it relies on sse2 pack intrinsics which just happen to work by luck for NaNs (float->int conversion in hw gives integer indeterminate value, which just happens to be -0x80000000 hence gets converted to zero in the end after pack intrinsics). However, float->srgb didn't get so lucky, because we need to clamp before blending and clamping resulted in NaN behavior being undefined (and actually got converted to 1.0 by clamping with sse2). Fix this by using a zero/one clamp with defined nan behavior as we can handle the NaN for free this way. I suspect there's more bugs lurking in this area (e.g. converting floats to snorm) as we don't really use defined NaN behavior everywhere but this seems to be good enough. While here respecify nan behavior modes a bit, in particular the return_second mode didn't really do what we wanted. From the caller's perspective, we really wanted to say we need the non-nan result, but we already know the second arg isn't a NaN. So we use this now instead, which means that cpu architectures which actually implement min/max by always returning non-nan (that is adhering to ieee754-2008 rules) don't need to bend over backwards for nothing. Reviewed-by: Jose Fonseca <[email protected]>
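A scalar sketch of a zero/one clamp with defined NaN behavior, in the spirit of the fix above. The function name is hypothetical and the real change operates on LLVM vector IR, not on C floats. The idea is that every comparison against NaN is false, so ordering the operands carefully lets NaN fall through to zero without an extra test.

```c
#include <assert.h>
#include <math.h>

/* Clamp x to [0, 1]; NaN maps to 0 because (x > 0.0f) is false for NaN. */
static float clamp01_nan_to_zero(float x)
{
   float lo = (x > 0.0f) ? x : 0.0f;     /* NaN and negatives become 0 */
   float r  = (lo < 1.0f) ? lo : 1.0f;   /* values above 1 become 1 */
   assert(r >= 0.0f && r <= 1.0f);       /* result is never NaN */
   return r;
}
```

Swapping the comparison to `(x < 0.0f) ? 0.0f : x` would instead let NaN through, which is the kind of undefined-by-accident behavior the commit describes.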
* draw,llvmpipe: use exponent manipulation instead of exp2 for polygon offset | Roland Scheidegger | 2013-11-12 | 1 | -11/+15
| | | | | | | | | | Since we explicitly require an integer input we should avoid using exp2 math (even if we were using optimized versions); exponent manipulation turns the exp2 into an int sub (plus some casts). v2: fix bogus uint (needs to be int) math spotted by Matthew, fix comments Reviewed-by: Jose Fonseca <[email protected]>
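A sketch of the exponent trick, assuming IEEE-754 single precision and a hypothetical helper name. For a float depth buffer the minimum resolvable difference is taken here as 2^(e - 23), where e is the exponent of the largest z in the primitive. Rather than calling exp2, the code subtracts 23 from the biased exponent field and reinterprets the bits. Clamping the field to 1 for very small inputs is a choice of this sketch, not something stated in the commit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return 2^(e - 23) for the exponent e of zmax, using integer math only. */
static float float_depth_mrd(float zmax)
{
   uint32_t bits;
   int32_t exp_field;
   float r;

   assert(sizeof(float) == sizeof(uint32_t));
   memcpy(&bits, &zmax, sizeof bits);

   /* Biased exponent lives in bits 23..30; subtract the mantissa width. */
   exp_field = (int32_t)((bits >> 23) & 0xff) - 23;
   if (exp_field < 1)
      exp_field = 1;                /* sketch's choice: stay a normal float */

   bits = (uint32_t)exp_field << 23; /* sign 0, mantissa 0: exact power of two */
   memcpy(&r, &bits, sizeof r);
   return r;
}
```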
* draw,llvmpipe,util: add depth bias calculation for arb_depth_buffer_float | Matthew McClure | 2013-11-07 | 5 | -20/+86
| | | | | | | | | | | | | | | With this patch, the llvmpipe and draw modules will calculate the depth bias according to floating point depth buffer semantics described in the arb_depth_buffer_float specification, when the driver has a z buffer bound with a format type of UTIL_FORMAT_TYPE_FLOAT. By default, the driver will use the existing UNORM calculation for depth bias. A new function, draw_set_zs_format, was added to calculate the Minimum Resolvable Depth value and floating point depth sense for the draw module. Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: fix bogus layer clamping in setup | Roland Scheidegger | 2013-10-29 | 2 | -8/+25
| | | | | | | | | | | | | | | | | | | | | The layer coming from GS needs to be clamped (not sure if that's actually the correct error behavior but we need something) as the number can be higher than the amount of layers in the fb. However, this code was using the layer calculation from the scene, and this was actually calculated in lp_scene_begin_rasterization() hence too late (so setup was using the value from the _previous_ scene or just zero if it was the first scene). Since the value is used in both rasterization and setup, move calculation up to lp_scene_begin_binning() though it's a bit more inconvenient to calculate there. (Theoretically could move _all_ code which was in lp_scene_begin_rasterization() to there, because ever since we got rid of swizzled render/depth buffers our "map" functions preparing the fb data for render don't actually change the data in there at all, but it feels like it would be a hack.) v2: improve comments Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
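The ordering issue can be sketched as below, with hypothetical names standing in for the scene and setup code. The point is that the maximum layer has to be known when binning starts, because setup clamps against it before rasterization begins.

```c
#include <assert.h>

/* Hypothetical scene state; the real scene holds far more. */
struct scene_sketch { unsigned fb_max_layer; };

/* Compute the clamp bound at the start of binning, before setup runs. */
static void scene_begin_binning(struct scene_sketch *scene, unsigned fb_num_layers)
{
   assert(scene && fb_num_layers > 0);
   scene->fb_max_layer = fb_num_layers - 1;
}

/* Setup clamps the layer coming from the geometry shader to the bound. */
static unsigned setup_clamp_layer(const struct scene_sketch *scene, unsigned gs_layer)
{
   assert(scene);
   return gs_layer < scene->fb_max_layer ? gs_layer : scene->fb_max_layer;
}
```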
* util,llvmpipe: correctly set the minimum representable depth value | Matthew McClure | 2013-10-29 | 1 | -19/+12
| | | | | Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallium: add PIPE_CAP_MIXED_FRAMEBUFFER_SIZES | Ilia Mirkin | 2013-10-26 | 1 | -0/+1
| | | | | | | | | This CAP will determine whether ARB_framebuffer_object can be enabled. The nv30 driver does not allow mixing swizzled and linear zsbuf/cbuf textures. Signed-off-by: Ilia Mirkin <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* gallium: new, unified pipe_context::set_sampler_views() function | Brian Paul | 2013-10-23 | 1 | -29/+1
| | | | | | | | | | | | The new function replaces four old functions: set_fragment/vertex/ geometry/compute_sampler_views(). Note: at this time, it's expected that the 'start' parameter will always be zero. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Tested-by: Emil Velikov <[email protected]>
* llvmpipe: enable seamless cube filtering | Roland Scheidegger | 2013-10-21 | 1 | -1/+1
| | | | | Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* Revert "scons: Fix build when rtti is disabled" | José Fonseca | 2013-10-16 | 1 | -5/+4
| | | | | | | | | | This reverts commit 94d05bf87a21bd364e84f699a0064e5fba58a6f9 as it has a few problems: - it breaks Windows builds because env[LLVM_CXXFLAGS] is never set there - it is merging not only rtti, but the whole cxxflags (defines etc) which has proven to be a source of troubles (breaks debugging etc.)
* scons: Fix build when rtti is disabled | Alexander von Gluck IV | 2013-10-15 | 1 | -4/+5
| | | | | | | | | | | | * The rtti fix actually dug up a bug in the scons build scripts. * Autotools took the LLVM cpp and cxx flags, while scons only took the cpp flags. * This grabs the cxx flags and applies them where needed. We may want to make the same change for the llvm cpp flags in scons. * The only linux platform I can find with LLVM no-rtti is Ubuntu. * Fixes bug #70471 Tested-by: Vinson Lee <[email protected]>
* llvmpipe: Advertise PIPE_CAP_DEPTH_CLIP_DISABLE. | José Fonseca | 2013-10-15 | 1 | -1/+1
| | | | | | | | Actually implemented by draw module. Tested piglit ARB_depth_clamp tests, which pass 100%. Trivial.
* llvmpipe: increase fs shader variant instruction cache limit by factor 4 | Roland Scheidegger | 2013-10-12 | 1 | -2/+2
| | | | | | | | | | | | | | | | | The previous limit of 128*1024 was reported on IRC to cause frequent recompiles in some apps due to shader variant thrashing, leading to noticeable lags. Note that the LP_MAX_SHADER_VARIANTS limit (1024) was more or less impossible to reach, since even simple fragment shaders without texturing (glxgears) used more than twice 128 instructions, hence the instruction limit would have always been reached first (excluding things like trivial shaders not writing color). Even with the new limit it is VERY likely the instruction limit is hit first. Should help with such lags due to recompiles (though other shader types have their own limits, LP_MAX_SETUP_VARIANTS and DRAW_MAX_SHADER_VARIANTS; in particular the latter seems a bit small (128)). Reviewed-by: Brian Paul <[email protected]>
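The interplay of the two limits can be worked through with a small sketch. The macro and function names are hypothetical; only the numbers 1024, 128*1024 and the factor of four come from the commit message. For a shader of 600 instructions (an arbitrary illustrative size), the old instruction budget is exhausted after 218 variants and the new one after 873, both well below the 1024-variant cap.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_MAX_VARIANTS         1024
#define SKETCH_OLD_MAX_INSTRUCTIONS (128 * 1024)
#define SKETCH_NEW_MAX_INSTRUCTIONS (4 * SKETCH_OLD_MAX_INSTRUCTIONS)

/* How many variants of a given size fit before the instruction budget is hit. */
static unsigned variants_until_instr_limit(unsigned instrs_per_variant,
                                           unsigned instr_limit)
{
   assert(instrs_per_variant > 0);
   return instr_limit / instrs_per_variant;
}

/* Assumed eviction trigger: either limit being exceeded. */
static bool cache_over_limit(unsigned num_variants, unsigned num_instrs)
{
   return num_variants > SKETCH_MAX_VARIANTS ||
          num_instrs > SKETCH_NEW_MAX_INSTRUCTIONS;
}
```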
* llvmpipe: We don't use the draw pipeline for offset_point/line. | José Fonseca | 2013-10-09 | 1 | -2/+0
| | | | | | | | Unless the polygon fill mode is different from PIPE_POLYGON_MODE_FILL, so checking the polygon mode is sufficient. Testing done: no regression in polygon-mode-offset Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: abstract the code to set number of subpixel bits | Zack Rusin | 2013-10-09 | 3 | -10/+15
| | | | | | | | | | As we're moving towards expanding the number of subpixel bits and the width of the variables used in the computations we need to make this code a bit more centralized. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallium/swrast: don't export any private symbols | Marek Olšák | 2013-10-08 | 1 | -1/+2
| | | | Reviewed-by: Tom Stellard <[email protected]>
* llvmpipe: remove old bind_*_sampler_states() functions | Brian Paul | 2013-10-03 | 1 | -26/+0
|
* llvmpipe: implement pipe_context::bind_sampler_states() | Brian Paul | 2013-10-03 | 1 | -0/+1
|
* llvmpipe: consolidate C sources list into Makefile.sources | Emil Velikov | 2013-10-01 | 3 | -85/+46
| | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* llvmpipe: Remove unnecessary null check of shader. | Vinson Lee | 2013-09-30 | 1 | -1/+1
| | | | | | | | | shader has already been dereferenced earlier so cannot be null here. Fixes "Dereference before null check" defect reported by Coverity. Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* llvmpipe: count c_primitives before discarding null prims | Zack Rusin | 2013-09-25 | 1 | -7/+6
| | | | | | | | | We need to count the clipper primitives before the rasterizer discards one it considers to be null. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: we need to subdivide if fb is bigger in either direction | Zack Rusin | 2013-09-25 | 1 | -1/+1
| | | | | | | | | We need to subdivide triangles if either of the dimensions is larger than the max edge length, not when both of them are larger. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
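The one-line fix amounts to switching a logical AND to an OR. A minimal sketch, with a hypothetical function name and parameters:

```c
#include <assert.h>
#include <stdbool.h>

/* Subdivide when EITHER framebuffer dimension exceeds the maximum edge
 * length; requiring both (&&) would miss wide-but-short framebuffers. */
static bool needs_subdivision(int fb_width, int fb_height, int max_edge_length)
{
   assert(max_edge_length > 0);
   return fb_width > max_edge_length || fb_height > max_edge_length;
}
```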
* Revert "llvmpipe: increase number of subpixel bits to eight" | Zack Rusin | 2013-09-24 | 3 | -17/+11
| | | | | | | | | This reverts commit 755c11dc5e94f17097c186edaaa39d818396f14c. We agreed that this is band-aid that's not very useful and the proper solution is to rewrite the rasterization algo so that it operates on 64 bit values. Signed-off-by: Zack Rusin <[email protected]>
* llvmpipe: align the array used for subdivided vertices | Zack Rusin | 2013-09-23 | 1 | -1/+1
| | | | | | | | | | | When subdividing a triangle we're using a temporary array to store the new coordinates for the subdivided triangles. Unfortunately the array used for that was not aligned properly causing random crashes in the llvm jit code which was trying to load vectors from it. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
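A sketch of the kind of alignment guarantee involved, using C11 `alignas` and a hypothetical struct. The 16-byte figure is an assumption matching SSE-sized vector loads; the commit does not state which alignment or which macro the real code uses.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define SKETCH_VEC_ALIGN 16   /* assumed vector-load alignment */

/* Temporary storage for subdivided vertices: 3 vertices of 4 floats each,
 * forced onto a 16-byte boundary so aligned vector loads are safe. */
struct subdiv_verts_sketch {
   alignas(SKETCH_VEC_ALIGN) float v[3][4];
};

static_assert(alignof(struct subdiv_verts_sketch) >= SKETCH_VEC_ALIGN,
              "temporary vertex storage must be vector aligned");

/* Runtime check that a pointer satisfies the assumed alignment. */
static int is_vec_aligned(const void *p)
{
   return ((uintptr_t)p % SKETCH_VEC_ALIGN) == 0;
}
```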
* llvmpipe: increase number of subpixel bits to eight | Zack Rusin | 2013-09-23 | 3 | -11/+17
| | | | | | | | | | | | | Unfortunately d3d10 requires a lot higher precision (e.g. wgf11clipping tests for it). The smallest number of precision bits with which it passes is 8. That means that we need to decrease the maximum length of an edge that we can handle without subdivision by 4 bits. Abstracted the code a bit to make it easier to change once we switch to 64bit rasterization. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: add flush_resource context function | Marek Olšák | 2013-09-20 | 1 | -0/+7
| | | | | | | | | r600g needs explicit flushing before DRI2 buffers are presented on the screen. v2: add (stub) implementations for all drivers, fix frontbuffer flushing v3: fix galahad Signed-off-by: Marek Olšák <[email protected]>
* llvmpipe: Fix rendering to PIPE_FORMAT_R10G10B10A2_UNORM. | José Fonseca | 2013-09-20 | 1 | -6/+78
| | | | | | | We must take rounding in consideration when re-scaling to narrow normalized channels, such as 2-bit normalized alpha. Reviewed-by: Roland Scheidegger <[email protected]>
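To show why rounding matters for narrow channels, here is an integer sketch of rescaling a normalized value from one bit width to another. It is an illustration with a hypothetical function, not the gallivm code path. Going from 8 bits to 2 bits, the value 212 represents about 0.831, which is closer to 2/3 than to 1, yet a plain shift (`212 >> 6`) yields 3.

```c
#include <assert.h>

/* Rescale an unsigned normalized value between bit widths, rounding to the
 * nearest representable value instead of truncating. */
static unsigned rescale_unorm(unsigned x, unsigned src_bits, unsigned dst_bits)
{
   unsigned src_max, dst_max;
   assert(src_bits >= 1 && src_bits <= 16);
   assert(dst_bits >= 1 && dst_bits <= 16);
   src_max = (1u << src_bits) - 1;
   dst_max = (1u << dst_bits) - 1;
   assert(x <= src_max);
   /* Widths are capped at 16 bits so the product fits in 32 bits. */
   return (x * dst_max + src_max / 2) / src_max;
}
```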
* draw: clean up setting stream out information a bit | Roland Scheidegger | 2013-08-27 | 3 | -12/+17
| | | | | | | | | | | | | | | | | In particular no one is interested in the vertex count, so drop that, and also drop the duplicated num_primitives_generated / so.primitives_storage_needed variables in drivers. I am unable for now to figure out if primitives_storage_needed in SO stats (used for d3d10) should increase if SO is disabled, though the equivalent num_primitives_generated used for OpenGL definitely should increase. In any case we were only counting when SO is active both in softpipe and llvmpipe anyway so don't pretend there's an independent num_primitives_generated counter which would count always. (This means the PIPE_QUERY_PRIMITIVES_GENERATED count will still be wrong just as before, should eventually fix this by doing either separate counting for this query or adjust the code so it always counts this even if SO is inactive depending on what's correct for d3d10.) Reviewed-by: Brian Paul <[email protected]>
* llvmpipe: support nested/overlapping queries for all query types | Roland Scheidegger | 2013-08-27 | 3 | -18/+20
| | | | | | | There's just no way resetting the counters is working with nested/overlapping queries. Reviewed-by: Brian Paul <[email protected]>
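One common way to make nested or overlapping queries work, shown here as an assumption rather than as the commit's actual implementation, is to never reset the shared counter: each query snapshots it at begin and subtracts at end. All names below are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* A monotonically increasing counter shared by all queries. */
struct counter_sketch { uint64_t total; };

/* Per-query state: the counter value at begin, and the final result. */
struct query_sketch { uint64_t start; uint64_t result; };

static void query_begin(struct query_sketch *q, const struct counter_sketch *c)
{
   assert(q && c);
   q->start = c->total;   /* snapshot instead of resetting the counter */
   q->result = 0;
}

static void query_end(struct query_sketch *q, const struct counter_sketch *c)
{
   assert(q && c);
   q->result = c->total - q->start;
}
```

With this scheme an inner query ending does not disturb the outer query, since neither one modifies the shared counter.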
* gallivm: implement better control of per-quad/per-element/scalar lod | Roland Scheidegger | 2013-08-20 | 1 | -4/+4
| | | | | | | | | | | | | | | | There's a new debug value used to disable per-quad lod optimizations in fragment shader (ignored for vs/gs as the results are just too wrong typically). Also trying to detect if a supplied lod value is really a scalar (if it's coming from immediate or constant file) in which case sampler code can use this to stay on per-quad-lod path (in fact for explicit lod could simplify even further and use same lod for both quads in the avx case but this is not implemented yet). Still need to actually implement per-element lod bias (and derivatives), and need to handle per-element lod in size queries. v2: fix comments, prettify. Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: fix stencil bug if we have both stencil and depth tests | Roland Scheidegger | 2013-08-15 | 1 | -14/+13
| | | | | | | | | | | | | This is a very well hidden bug found by accident (only the fixed glean tstencil2 test so far seems to hit it). We must use new mask with combined s_pass values and orig_mask values for zpass/zfail stencil ops, otherwise both the sfail op and one of zpass/zfail op are applied (probably not hit in most tests because some of the ops tend to be KEEP usually). Note: this is a candidate for the 9.2 branch. Reviewed-by: Zack Rusin <[email protected]>
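The mask logic can be sketched with one bit per fragment. The struct and function names are hypothetical and the real code works on LLVM vector masks. The key point is that the zfail and zpass masks must be restricted to fragments that passed the stencil test, so that each live fragment gets exactly one of the three ops.

```c
#include <assert.h>
#include <stdint.h>

/* One bit per fragment: which stencil op applies to which fragment. */
struct stencil_op_masks_sketch { uint32_t sfail, zfail, zpass; };

static struct stencil_op_masks_sketch
split_stencil_masks(uint32_t orig_mask, uint32_t s_pass, uint32_t z_pass)
{
   struct stencil_op_masks_sketch m;
   uint32_t s_ok = orig_mask & s_pass;   /* live AND passed stencil */

   m.sfail = orig_mask & ~s_pass;        /* live, failed stencil */
   m.zfail = s_ok & ~z_pass;             /* passed stencil, failed depth */
   m.zpass = s_ok & z_pass;              /* passed both tests */

   /* The three masks partition the live fragments. */
   assert((m.sfail | m.zfail | m.zpass) == orig_mask);
   assert((m.sfail & (m.zfail | m.zpass)) == 0 && (m.zfail & m.zpass) == 0);
   return m;
}
```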
* llvmpipe: fix pipeline statistics with a null ps | Zack Rusin | 2013-08-14 | 8 | -8/+43
| | | | | | | | | | If the fragment shader is null then pixel shader invocations have to be equal to zero. And if we're running a null ps then clipper invocations and primitives should be equal to zero but only if both stencil and depth testing are disabled. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
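The two rules above reduce to a pair of predicates. This is a sketch with hypothetical names, assuming the decision depends only on the three flags mentioned in the message:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of the state relevant to the statistics. */
struct draw_state_sketch { bool has_fs; bool depth_test; bool stencil_test; };

/* Pixel shader invocations are counted only when a fragment shader exists. */
static bool counts_ps_invocations(const struct draw_state_sketch *s)
{
   assert(s);
   return s->has_fs;
}

/* Clipper invocations/primitives stay at zero only when the ps is null AND
 * both depth and stencil testing are disabled. */
static bool counts_clipper_stats(const struct draw_state_sketch *s)
{
   assert(s);
   return s->has_fs || s->depth_test || s->stencil_test;
}
```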
* gallivm: set non-existing values really to zero in size queries for d3d10 | Roland Scheidegger | 2013-08-09 | 1 | -2/+2
| | | | | | | | | | | My previous attempt at doing so double-failed miserably (minification of zero still gives one, and even if it would not the value was never written anyway). While here also rename the confusingly named int_vec bld as we have int vecs of different sizes, and rename need_nr_mips (as this also changes out-of-bounds behavior) to is_sviewinfo too. Reviewed-by: Zack Rusin <[email protected]>
* gallivm: use texture target from shader instead of static state for size query | Roland Scheidegger | 2013-08-09 | 1 | -0/+2
| | | | | | | | | | | | | | | | | | | d3d10 has no notion of distinct array resources either at the resource or sampler view level. However, shader dcl of resources certainly has, and d3d10 expects resinfo to return the values according to that - in particular a resource might have been a 1d texture with some array layers, then the sampler view might have only used 1 layer so it can be accessed both as 1d or 1d array texture (I think - the former definitely works). resinfo of a resource declared as array needs to return number of array layers but non-array resource needs to return 0 (and not 1). Hence fix this by passing the target from the shader decl to emit_size_query and use that (in case of OpenGL the target will come from the instruction itself). Could probably do the same for actual sampling, though it may not matter there (as the bogus components will essentially get clamped away), possibly could wreak havoc though if it REALLY doesn't match (which is of course an error but still). Reviewed-by: Zack Rusin <[email protected]>