summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/llvmpipe
Commit message (Collapse)AuthorAgeFilesLines
* llvmpipe: fix bogus layer clamping in setupRoland Scheidegger2013-10-292-8/+25
| | | | | | | | | | | | | | | | | | | | | The layer coming from GS needs to be clamped (not sure if that's actually the correct error behavior but we need something) as the number can be higher than the amount of layers in the fb. However, this code was using the layer calculation from the scene, and this was actually calculated in lp_scene_begin_rasterization() hence too late (so setup was using the value from the _previous_ scene or just zero if it was the first scene). Since the value is used in both rasterization and setup, move calculation up to lp_scene_begin_binning() though it's a bit more inconvenient to calculate there. (Theoretically could move _all_ code which was in lp_scene_begin_rasterization() to there, because ever since we got rid of swizzled render/depth buffers our "map" functions preparing the fb data for render don't actually change the data in there at all, but it feels like it would be a hack.) v2: improve comments Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* util,llvmpipe: correctly set the minimum representable depth valueMatthew McClure2013-10-291-19/+12
| | | | | Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallium: add PIPE_CAP_MIXED_FRAMEBUFFER_SIZESIlia Mirkin2013-10-261-0/+1
| | | | | | | | | This CAP will determine whether ARB_framebuffer_object can be enabled. The nv30 driver does not allow mixing swizzled and linear zsbuf/cbuf textures. Signed-off-by: Ilia Mirkin <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* gallium: new, unified pipe_context::set_sampler_views() functionBrian Paul2013-10-231-29/+1
| | | | | | | | | | | | The new function replaces four old functions: set_fragment/vertex/ geometry/compute_sampler_views(). Note: at this time, it's expected that the 'start' parameter will always be zero. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Tested-by: Emil Velikov <[email protected]>
* llvmpipe: enable seamless cube filteringRoland Scheidegger2013-10-211-1/+1
| | | | | Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* Revert "scons: Fix build when rtti is disabled"José Fonseca2013-10-161-5/+4
| | | | | | | | | | This reverts commit 94d05bf87a21bd364e84f699a0064e5fba58a6f9 as it has a few problems: - it breaks windows builds becuase env[LLVM_CXXFLAGS] is never set there - it is merging not only rtti, but the whole cxxflags (defines etc) which has proven to be a source of troubles (breaks debugging etc.)
* scons: Fix build when rtti is disabledAlexander von Gluck IV2013-10-151-4/+5
| | | | | | | | | | | | * The rtti fix actually dug up a bug in the scons build scripts. * Autotools took the LLVM cpp and cxx flags, while scons only took the cpp flags. * This grabs the cxx flags and applies them where needed. We may want to make the same change for the llvm cpp flags in scons. * The only linux platform I can find with LLVM no-rtti is Ubuntu. * Fixes bug #70471 Tested-by: Vinson Lee <[email protected]>
* llvmpipe: Advertise PIPE_CAP_DEPTH_CLIP_DISABLE.José Fonseca2013-10-151-1/+1
| | | | | | | | Actually implemented by draw module. Tested piglit ARB_depth_clamp tests, which pass 100%. Trivial.
* llvmpipe: increase fs shader variant instruction cache limit by factor 4Roland Scheidegger2013-10-121-2/+2
| | | | | | | | | | | | | | | | | The previous limit of of 128*1024 was reported to cause frequent recompiles in some apps due to shader variant thrashing on IRC in some apps leading to noticeable lags. Note that the LP_MAX_SHADER_VARIANTS limit (1024) was more or less impossible to reach, since even simple fragment shaders without texturing (glxgears) used more than twice than 128 instructions, hence the instruction limit would have always been reached first (excluding things like trivial shaders not writing color). Even with the new limit it is VERY likely the instruction limit is hit first. Should help with such lags due to recompiles (though other shader types have their own limits, LP_MAX_SETUP_VARIANTS and DRAW_MAX_SHADER_VARIANTS, in particular the latter seems a bit small (128)). Reviewed-by: Brian Paul <[email protected]>
* llvmpipe: We don't use the draw pipeline for offset_point/line.José Fonseca2013-10-091-2/+0
| | | | | | | | Unless the polygon fill mode is different from PIPE_POLYGON_MODE_FILL, so checking the the polygon mode is sufficient. Testing done: no regression in polygon-mode-offset Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: abstract the code to set number of subpixel bitsZack Rusin2013-10-093-10/+15
| | | | | | | | | | As we're moving towards expanding the number of subpixel bits and the width of the variables used in the computations we need to make this code a bit more centralized. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallium/swrast: don't export any private symbolsMarek Olšák2013-10-081-1/+2
| | | | Reviewed-by: Tom Stellard <[email protected]>
* llvmpipe: remove old bind_*_sampler_states() functionsBrian Paul2013-10-031-26/+0
|
* llvmpipe: implement pipe_context::bind_sampler_states()Brian Paul2013-10-031-0/+1
|
* llvmpipe: consolidate C sources list into Makefile.sourcesEmil Velikov2013-10-013-85/+46
| | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* llvmpipe: Remove unnecessary null check of shader.Vinson Lee2013-09-301-1/+1
| | | | | | | | | shader has already been dereferenced earlier so cannot be null here. Fixes "Dereference before null check" defect reported by Coverity. Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* llvmpipe: count c_primitives before discarding null primsZack Rusin2013-09-251-7/+6
| | | | | | | | | We need to count the clipper primitives before the rasterizer discards one it considers to be null. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: we need to subdivide if fb is bigger in either directionZack Rusin2013-09-251-1/+1
| | | | | | | | | We need to subdivide triangles if either of the dimensions is larger than the max edge length, not when both of them are larger. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* Revert "llvmpipe: increase number of subpixel bits to eight"Zack Rusin2013-09-243-17/+11
| | | | | | | | | This reverts commit 755c11dc5e94f17097c186edaaa39d818396f14c. We agreed that this is band-aid that's not very useful and the proper solution is to rewrite the rasterization algo so that it operates on 64 bit values. Signed-off-by: Zack Rusin <[email protected]>
* llvmpipe: align the array used for subdivived verticesZack Rusin2013-09-231-1/+1
| | | | | | | | | | | When subdiving a triangle we're using a temporary array to store the new coordinates for the subdivided triangles. Unfortunately the array used for that was not aligned properly causing random crashes in the llvm jit code which was trying to load vectors from it. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: increase number of subpixel bits to eightZack Rusin2013-09-233-11/+17
| | | | | | | | | | | | | Unfortunately d3d10 requires a lot higher precision (e.g. wgf11clipping tests for it). The smallest number of precision bits with which it passes is 8. That means that we need to decrease the maximum length of an edge that we can handle without subdivision by 4 bits. Abstracted the code a bit to make it easier to change once to switch to 64bit rasterization. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: add flush_resource context functionMarek Olšák2013-09-201-0/+7
| | | | | | | | | r600g needs explicit flushing before DRI2 buffers are presented on the screen. v2: add (stub) implementations for all drivers, fix frontbuffer flushing v3: fix galahad Signed-off-by: Marek Olšák <[email protected]>
* llvmpipe: Fix rendering to PIPE_FORMAT_R10G10B10A2_UNORM.José Fonseca2013-09-201-6/+78
| | | | | | | We must take rounding in consideration when re-scaling to narrow normalized channels, such as 2-bit normalized alpha. Reviewed-by: Roland Scheidegger <[email protected]>
* draw: clean up setting stream out information a bitRoland Scheidegger2013-08-273-12/+17
| | | | | | | | | | | | | | | | | In particular noone is interested in the vertex count, so drop that, and also drop the duplicated num_primitives_generated / so.primitives_storage_needed variables in drivers. I am unable for now to figure out if primitives_storage_needed in SO stats (used for d3d10) should increase if SO is disabled, though the equivalent num_primitives_generated used for OpenGL definitely should increase. In any case we were only counting when SO is active both in softpipe and llvmpipe anyway so don't pretend there's an independent num_primitives_generated counter which would count always. (This means the PIPE_QUERY_PRIMITIVES_GENERATED count will still be wrong just as before, should eventually fix this by doing either separate counting for this query or adjust the code so it always counts this even if SO is inactive depending on what's correct for d3d10.) Reviewed-by: Brian Paul <[email protected]>
* llvmpipe: support nested/overlapping queries for all query typesRoland Scheidegger2013-08-273-18/+20
| | | | | | | There's just no way resetting the counters is working with nested/overlapping queries. Reviewed-by: Brian Paul <[email protected]>
* gallivm: implement better control of per-quad/per-element/scalar lodRoland Scheidegger2013-08-201-4/+4
| | | | | | | | | | | | | | | | There's a new debug value used to disable per-quad lod optimizations in fragment shader (ignored for vs/gs as the results are just too wrong typically). Also trying to detect if a supplied lod value is really a scalar (if it's coming from immediate or constant file) in which case sampler code can use this to stay on per-quad-lod path (in fact for explicit lod could simplify even further and use same lod for both quads in the avx case but this is not implemented yet). Still need to actually implement per-element lod bias (and derivatives), and need to handle per-element lod in size queries. v2: fix comments, prettify. Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: fix stencil bug if we have both stencil and depth testsRoland Scheidegger2013-08-151-14/+13
| | | | | | | | | | | | | This is a very well hidden bug found by accident (only the fixed glean tstencil2 test so far seems to hit it). We must use new mask with combined s_pass values and orig_mask values for zpass/zfail stencil ops, otherwise both the sfail op and one of zpass/zfail op are applied (probably not hit in most tests because some of the ops tend to be KEEP usually). Note: this is a candidate for the 9.2 branch. Reviewed-by: Zack Rusin <[email protected]>
* llvmpipe: fix pipeline statistics with a null psZack Rusin2013-08-148-8/+43
| | | | | | | | | | If the fragment shader is null then pixel shader invocations have to be equal to zero. And if we're running a null ps then clipper invocations and primitives should be equal to zero but only if both stancil and depth testing are disabled. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: set non-existing values really to zero in size queries for d3d10Roland Scheidegger2013-08-091-2/+2
| | | | | | | | | | | My previous attempt at doing so double-failed miserably (minification of zero still gives one, and even if it would not the value was never written anyway). While here also rename the confusingly named int_vec bld as we have int vecs of different sizes, and rename need_nr_mips (as this also changes out-of-bounds behavior) to is_sviewinfo too. Reviewed-by: Zack Rusin <[email protected]>
* gallivm: use texture target from shader instead of static state for size queryRoland Scheidegger2013-08-091-0/+2
| | | | | | | | | | | | | | | | | | | d3d10 has no notion of distinct array resources neither at the resource nor sampler view level. However, shader dcl of resources certainly has, and d3d10 expects resinfo to return the values according to that - in particular a resource might have been a 1d texture with some array layers, then the sampler view might have only used 1 layer so it can be accessed both as 1d or 1d array texture (I think - the former definitely works). resinfo of a resource decleared as array needs to return number of array layers but non-array resource needs to return 0 (and not 1). Hence fix this by passing the target from the shader decl to emit_size_query and use that (in case of OpenGL the target will come from the instruction itself). Could probably do the same for actual sampling, though it may not matter there (as the bogus components will essentially get clamped away), possibly could wreak havoc though if it REALLY doesn't match (which is of course an error but still). Reviewed-by: Zack Rusin <[email protected]>
* gallivm: propagate scalar_lod to emit_size_query tooRoland Scheidegger2013-08-081-0/+2
| | | | | | | Clearly the returned values need to be per-element if the lod is per element. Does not actually change behavior yet. Reviewed-by: Zack Rusin <[email protected]>
* draw: fix slot detectionZack Rusin2013-08-062-2/+1
| | | | | | | | | | | | Nowadays -1 for slots means that the semantic is not present, so we need to store it in a signed variables, otherwise <0 comparisons are pointless. Fixes http://bugzilla.eng.vmware.com/show_bug.cgi?id=67811 (at least with softpipe, edgeflags don't work wit llvmpipe) Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: Do not need to free anything if there is no geometry shader.Vinson Lee2013-08-051-2/+5
| | | | | | | | | | If gs is null, then freeing state->shader.tokens would result in a null dereference. Fixes "Dereference after null check" defect reported by Coverity. Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* llvmpipe: fix frontface behavior againZack Rusin2013-08-021-3/+11
| | | | | | | Lets make sure the frontface is 1 for front and -1 for back. Discussed with Roland and Jose. Signed-off-by: Zack Rusin <[email protected]>
* llvmpipe: don't interpolate front face or prim idZack Rusin2013-08-021-15/+13
| | | | | | | | | | | | | | The loop was iterating over all the fs inputs and setting them to perspective interpolation, then after the loop we were creating extra output slots with the correct interpolation. Instead of injecting bogus extra outputs, just set the interpolation on front face and prim id correctly when doing the initial scan of fs inputs. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* draw: inject frontface info into wireframe outputsZack Rusin2013-08-026-4/+39
| | | | | | | | | | | | | | Draw module can decompose primitives into wireframe models, which is a fancy word for 'lines', unfortunately that decomposition means that we weren't able to preserve the original front-face info which could be derived from the original primitives (lines don't have a 'face'). To fix it allow draw module to inject a fake face semantic into outputs from which the backends can figure out the original frontfacing info of the primitives. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: make the front-face behavior match the gallium specZack Rusin2013-08-021-1/+4
| | | | | | | | | | | | The spec says that front-face is true if the value is >0 and false if it's <0. To make sure that we follow the spec, lets just subtract 0.5 from our value (llvmpipe did 1 for frontface and 0 otherwise), which will get us a positive num for frontface and negative for backface. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallium: Add PIPE_CAP_ENDIANNESSTom Stellard2013-07-221-0/+2
| | | | | | Cc: [email protected] [ Francisco Jerez: Fix "PIPE_ENDIAN_SMALL" in the documentation, define PIPE_ENDIAN_NATIVE. ]
* llvmpipe: Ensure FTZ/DAZ flags are set on deferred draw flushes.Zack Rusin2013-07-221-0/+8
| | | | Tested-by: José Fonseca <[email protected]>
* llvmpipe: Remove lp_rast_get_num_threads().José Fonseca2013-07-222-11/+0
| | | | | | Never called. Trivial.
* llvmpipe/tests: update arith test to check for edge casesZack Rusin2013-07-191-9/+19
| | | | | | | | | Test infs, zeros and nans with our arith functions to assure correct/defined behavior with those values. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: clamp inputs for srgb render buffersRoland Scheidegger2013-07-181-0/+35
| | | | | | | | | | | | | | | Usually with fixed point renderbuffers clamping is done as part of conversion. However, since we blend in float format, we essentially skip all conversion steps pre-blend but since this is still a fixed point renderbuffer we must still clamp the inputs in this case. Makes no difference for piglit though. Obviously we could skip this if fragment color clamping is enabled, but a) this is deprecated in OpenGL (d3d never had it) and b) we don't support it natively so it gets baked into the shader. Also add some comment about logic ops being broken for srgb, luckily no test tries to do that as there's no easy fix... Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Zack Rusin <[email protected]>
* llvmpipe: fix blending with SRC_ALPHA_SATURATE with some formats without alphaRoland Scheidegger2013-07-182-8/+26
| | | | | | | | | | | | | | | | | | We were fixing up the blend factor to ZERO, however this only works correctly with fixed point render buffers where the input values are clamped to 0/1 (because src_alpha_saturate is min(As, 1-Ad) so can be negative with unclamped inputs). Haven't seen any failure anywhere due to that with fixed point SNORM buffers (which clamp inputs to -1/1) but it should apply there as well (snorm blending is rare, even opengl 4.3 doesn't require snorm rendertargets at all, d3d10 requires them but they are not blendable). Doesn't look like piglit hits this though (some internal testing hits the float case at least). (With legacy OpenGL we could theoretically still use the fixup to zero if the fragment color clamp is enabled, but we can't detect that easily since we don't support native clamping hence it gets baked into the shader.) Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Zack Rusin <[email protected]>
* llvmpipe: support sRGB framebuffersRoland Scheidegger2013-07-162-14/+57
| | | | | | | | | | | | | | | Just use the new conversion functions to do the work. The way it's plugged in into the blend code is quite hacktastic but follows all the same hacks as used by packed float format already. Only support 4x8bit srgb formats (rgba/rgbx plus swizzle), 24bit formats never worked anyway in the blend code and are thus disabled, and I don't think anyone is interested in L8/L8A8. Would need even more hacks otherwise. Unless I'm missing something, this is the last feature except MSAA needed for OpenGL 3.0, and for OpenGL 3.1 as well I believe. v2: prettify a bit, use separate function for packing. Reviewed-by: Jose Fonseca <[email protected]>
* util: treat denorm'ed floats like zeroZack Rusin2013-07-092-0/+11
| | | | | | | | | | | | | The D3D10 spec is very explicit about treatment of denorm floats and the behavior is exactly the same for them as it would be for -0 or +0. This makes our shading code match that behavior, since OpenGL doesn't care and on a few cpu's it's faster (worst case the same). Float16 conversions will likely break but we'll fix them in a follow up commit. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallivm: do per-pixel lod calculations for explicit lodRoland Scheidegger2013-07-041-1/+2
| | | | | | | | | | | | | | | | | | | | | d3d10 requires per-pixel lod calculations for explicit lod, lod bias and explicit derivatives, and we should probably do it for OpenGL too - at least if they are used from vertex or geometry shaders (so doesn't apply to lod bias) this doesn't just affect neighboring pixels. Some code was already there to handle this so fix it up and enable it. There will no doubt be a performance hit unfortunately, we could do better if we'd knew we had a real vector shift instruction (with variable shift count) but this requires AVX2 on x86 (or a AMD Bulldozer family cpu). Don't do anything for lod bias and explicit derivatives yet, though no special magic should be needed for them neither. Likewise, the size query is still broken just the same. v2: Use information if lod is a (broadcast) scalar or not. The idea would be to base this on the actual value, for now just pretend it's a scalar in fs and not a scalar otherwise (so, per-pixel lod is only used in gs/vs but same code is generated for fs as before). Reviewed-by: Jose Fonseca <[email protected]>
* mesa,glsl,gallium: remove GLSLSkipStrictMaxVaryingLimitCheck and dependenciesMarek Olšák2013-07-021-1/+0
| | | | | | Not needed with do_dead_builtin_varyings. Reviewed-by: Ian Romanick <[email protected]>
* llvmpipe: fix timer query if there's no binsRoland Scheidegger2013-06-291-0/+10
| | | | | | | | | | | | | b04a295a4a0cd2defe352b3193b5fa79ca8fc9fc removed seemingly unnecessary code in get_query. Turns out this code could in fact be reached - while timestamps are always binned, if there are no bins (which happens if fb size is 0) then the rasterization query code filling this in is still never executed. So fix this up by filling in some timestamp, but do it at EndQuery time not GetQuery time which should be more appropriate. Makes piglit arb_timer_query-timestamp-get happy again. Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: handle offset_clampRoland Scheidegger2013-06-273-2/+24
| | | | | | | | | | | | | | | This was just ignored (unless for some reason like unfilled polys draw was handling this). I'm not convinced of that code, putting the float for the clamp in the key isn't really a good idea. Then again the other floats for depth bias are already in there too anyway (should probably have a jit_context for the setup function), so this is just a quick fix. Also, the "minimum resolvable depth difference" used isn't really right as it should be calculated according to the z values of the current primitive and not be a constant (of course, this only makes a difference for float depth buffers), at least for d3d10, so depth biasing is still not quite right. Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: remove never reached code for timestamp queries.Roland Scheidegger2013-06-271-2/+0
| | | | | timestamp queries are always binned in an active scene, therefore always have a result.