summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/llvmpipe
Commit message (Collapse)AuthorAgeFilesLines
* gallium: add a way to query min/max texture gather offsetsIlia Mirkin2014-04-101-0/+2
| | | | | | | | Defaults to providing the same offsets as MIN/MAX_TEXEL_OFFSET. For nvc0, the offset can be -32/31. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium: add support for LODQ opcodes.Dave Airlie2014-04-071-0/+1
| | | | | | | | | This opcode provide support for GL_ARB_texture_query_lod, Signed-off-by: Dave Airlie <[email protected]> [imirkin: rebase, docs update] Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* llvmpipe: remove no-op checks in sampler, sampler_view functionsBrian Paul2014-04-031-14/+0
| | | | | Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* mesa/soft/llvmpipe: add fake MSAA supportDave Airlie2014-04-021-0/+2
| | | | | | | | This adds a gallium cap that allows us to fake GL3.0 by not exposing MSAA on sw rendering. It also forces the extra extensions needed for GL3.2. Signed-off-by: Dave Airlie <[email protected]>
* llvmpipe: Fix llvmpipe_create_gs_state.Zack Rusin2014-03-261-3/+5
| | | | | | | Revert unintended behaviour change from commit b995a010e688bc4d4557e973e5e28091c378e881. Tested-by: José Fonseca <[email protected]>
* llvmpipe: Simplify vertex and geometry shaders.José Fonseca2014-03-255-70/+33
| | | | | | | | | | | | Eliminate lp_vertex_shader, as it added nothing over draw_vertex_shader. Simplify lp_geometry_shader, as most of the incoming state is unneeded. (We could also just use draw_geometry_shader if we were willing to peek inside the structure.) Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Zack Rusin <[email protected]>
* llvmpipe: add support for b5g6r5_srgbRoland Scheidegger2014-03-212-4/+36
| | | | | | | | | | | | | The conversion code for srgb was tuned for n x 4x8bit AoS -> 4 x nxfloat SoA (and vice versa), fix this to handle also 16bit 565-style srgb formats. Still not really all that generic, things like r10g10b10a2_srgb or r4g4b4a4_srgb wouldn't work (the latter trivial to fix, the former would not require more work to not crash but near certainly need some higher precision calculation) but not needed right now. The code is not fully optimized for this (could use more direct calculation instead of expanding to 8-bit range first) but should be good enough. Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: Tighten check for alpha-only formatsRichard Sandiford2014-03-201-1/+1
| | | | | | | | | | | The AoS version of ld_build_blend_factor was assuming that if the first channel was alpha, there were no rgb components. Fixes glean/blendFunc on System z. No piglit regressions on x86_64. The shortcut is still used in tests like spec/ARB_framebuffer_object/ fbo-alpha. Signed-off-by: Richard Sandiford <[email protected]>
* gallium: allow setting of the internal stream output offsetZack Rusin2014-03-071-6/+6
| | | | | | | | | | | | | | | | D3D10 allows setting of the internal offset of a buffer, which is in general only incremented via actual stream output writes. By allowing setting of the internal offset draw_auto is capable of rendering from buffers which have not been actually streamed out to. Our interface didn't allow. This change functionally shouldn't make any difference to OpenGL where instead of an append_bitmask you just get a real array where -1 means append (like in D3D) and 0 means do not append. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium: the other drivers don't support ARB_buffer_storageMarek Olšák2014-02-251-0/+1
| | | | Reviewed-by: Fredrik Höglund <[email protected]>
* gallium: add texture gather support to gallium (v3)Dave Airlie2014-02-251-0/+2
| | | | | | | | | | | | | | | | | | | | | This adds support to gallium for a TG4 instruction, and two CAPs. The first CAP is required for GL_ARB_texture_gather. The second CAP is required to expose GL_ARB_gpu_shader5. However so far we haven't found any hardware that natively exposes the textureGatherOffsets feature from GL, so just lower it for now. If hardware appears for this we can add another CAP to allow TG4 to take 4 offsets. v2: add component selection src and a cap to say hw can do it. (st can use to help control GL_ARB_gpu_shader5/GLSL 4.00). Add docs. v3: rename to SM5, add docs. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* gallium: add geometry shader output limitsGrigori Goronzy2014-02-091-0/+3
| | | | | | | | v2: adjust limits for radeonsi and llvmpipe v3: add documentation Cc: "10.1" <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* gallium: remove PIPE_CAP_MAX_COMBINED_SAMPLERSMarek Olšák2014-02-041-2/+0
| | | | | | | This can be derived from the shader caps. All GPUs from ATI/AMD, NVIDIA, and INTEL have separate texture slots for each shader stage.
* llvmpipe: fix denorm handling for r11g11b10_float format when blendingRoland Scheidegger2014-01-311-2/+15
| | | | | | | | The code re-enabling denorms for small float formats did not recognize this format due to format handling hacks (mainly, the lp_type doesn't have the floating bit set). Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: Set PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT to 64Siavash Eliasi2014-01-291-1/+2
| | | | | | | v2: Fixed setting switch cases prior to PIPE_CAP_MIN_MAP_BUFFER_ALIGNMENT incorrectly. Reviewed-by: Ian Romanick <[email protected]>
* llvmpipe: Use alignment of 64 instead of 16 for buffer allocationSiavash Eliasi2014-01-291-3/+3
| | | | | | v2: Changed allocation alignment of llvmpipe_displaytarget_layout. Reviewed-by: Ian Romanick <[email protected]>
* gallium: Use C11 thread abstractions.José Fonseca2014-01-231-1/+1
| | | | | | | Note that PIPE_ROUTINE now returns an int. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* gallium: remove PIPE_CAP_SCALED_RESOLVEMarek Olšák2014-01-231-2/+0
| | | | | | | If any driver doesn't support this, it can use a blit after resolving the samples. Reviewed-by: Brian Paul <[email protected]>
* llvmpipe: dump geometry shaders when using LP_DEBUG=tgsiDave Airlie2014-01-221-1/+4
| | | | | | | for consistency with vs and fs dumpers. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* s/Tungsten Graphics/VMware/José Fonseca2014-01-1734-79/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Tungsten Graphics Inc. was acquired by VMware Inc. in 2008. Leaving the old copyright name is creating unnecessary confusion, hence this change. This was the sed script I used: $ cat tg2vmw.sed # Run as: # # git reset --hard HEAD && find include scons src -type f -not -name 'sed*' -print0 | xargs -0 sed -i -f tg2vmw.sed # # Rename copyrights s/Tungsten Gra\(ph\|hp\)ics,\? [iI]nc\.\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./g /Copyright/s/Tungsten Graphics\(,\? [iI]nc\.\)\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./ s/TUNGSTEN GRAPHICS/VMWARE/g # Rename emails s/[email protected]/[email protected]/ s/[email protected]/[email protected]/g s/jrfonseca-at-tungstengraphics-dot-com/jfonseca-at-vmware-dot-com/ s/jrfonseca\[email protected]/[email protected]/g s/keithw\[email protected]/[email protected]/g s/[email protected]/[email protected]/g s/thomas-at-tungstengraphics-dot-com/thellstom-at-vmware-dot-com/ s/[email protected]/[email protected]/ # Remove dead links s@Tungsten Graphics (http://www.tungstengraphics.com)@Tungsten Graphics@g # C string src/gallium/state_trackers/vega/api_misc.c s/"Tungsten Graphics, Inc"/"VMware, Inc"/ Reviewed-by: Brian Paul <[email protected]>
* llvmpipe: handle NULL color buffer pointersBrian Paul2014-01-175-94/+156
| | | | | | | | | Fixes regression from 9baa45f78b8ca7d66280e36009b6a685055d7cd6 v2: incorporate a few small changes suggested by Roland. Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: fix large point rasterization with point_quad_rasterizationRoland Scheidegger2014-01-171-12/+19
| | | | | | | | | | The whole round-pointsize-to-int stuff must only be done with GL legacy rules (no point_quad_rasterization) or all the wrong edges are lit up. This was previously in a private branch (d3d pointsprite test complains loudly otherwise) and got lost in a merge. However, it should certainly apply to GL point sprite rasterization as well. Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: do constant buffer bounds checking in shadersZack Rusin2014-01-164-4/+20
| | | | | | | | | | | | | | | | | | | | It's possible to bind a smaller buffer as a constant buffer, than what the shader actually uses/requires. This could cause nasty crashes. This patch adds the architecture to pass the maximum allowable constant buffer index to the jit to let it make sure that the constant buffer indices are always within bounds. The behavior follows the d3d10 spec, which says the overflow should always return all zeros, and overflow is only defined as access beyond the size of the currently bound buffer. Accesses beyond the declared shader constant register size are not considered an overflow and expected to return garbage but consistent garbage (we follow the behavior which some wlk tests expect which is to return the actual values from the bound buffer). Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: Honour pipe_rasterizer::point_quad_rasterization.José Fonseca2014-01-091-10/+57
| | | | | | | | | | | | Commit eda21d2a3010d9fc5a68b55a843c5e44b2abf8dd fixed the rasterization of points for Direct3D but ended up breaking the rasterization of OpenGL non-sprite points, in particular conform's pntrast.c test. The only way to get both working is to properly honour pipe_rasterizer::point_quad_rasterization, and follow the weird OpenGL rule when it is false. Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: Fix the bottom_edge_rule adjustment for points.José Fonseca2014-01-081-4/+4
| | | | | | | | | The adjustment needs to be applied to the y coordinates and not the x coordinates, just like the equivalent code for lines and triangles in lp_setup_line.c and lp_setup_tri.c. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Zack Rusin <[email protected]>
* llvmpipe: Respect bottom_edge_rule when computing the rasterization bounding ↵José Fonseca2014-01-083-3/+3
| | | | | | | | | | | | | | boxes. This was inadvertently forgotten when replacing gl_rasterization_rules with lower_left_origin and half_pixel_center (commit 2737abb44efebfa10ac84b183c20fc5818d1514e). This makes a difference when lower_left_origin != half_pixel_center, e.g, D3D10. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Zack Rusin <[email protected]>
* llvmpipe: Basic implementation of pipe_context::set_sample_mask.José Fonseca2014-01-075-7/+20
| | | | | | | | | | | | | | | | | We don't support MSAA (ie, number of samples is always one) therefore sample_mask boils down to a synonym of the rasterizer_discard flag. Also, this change makes setup actually use the value received in lp_setup_set_rasterizer_discard instead of reaching out to llvmpipe upper layers to re-fetch it. Based on Si Chen's draft. With this patch `wgf11multisample Coverage passes 100%` on the UMD D3D10 state tracker. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Si Chen <[email protected]>
* llvmpipe: Implement alpha_to_coverage for non-MSAA framebuffers.Si Chen2014-01-073-1/+59
| | | | | | | | Implement Alpha to Coverage by discarding a fragment alpha component is less than 0.5. This is a joint work of Jose and Si. Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: use pipe_sampler_view_release() to avoid segfaultJonathan Liu2013-12-221-0/+6
| | | | | | | | | This fixes another case of faulting when freeing a pipe_sampler_view that belongs to a previously destroyed context. Cc: "10.0" <[email protected]> Signed-off-by: Jonathan Liu <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* llvmpipe: get rid of barycentric calculation of a0Roland Scheidegger2013-12-141-66/+4
| | | | | | | Didn't really work as well as hoped (in particular it was not generally more accurate), will solve this differently. Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: (trivial) get rid of triangle subdivision codeRoland Scheidegger2013-12-143-182/+1
| | | | | | | | This code was always problematic, and with 64bit rasterization we no longer need it at all. Reviewed-by: Zack Rusin <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* swrast* (gallium, classic): add MESA_copy_sub_buffer support (v3)Dave Airlie2013-12-131-3/+3
| | | | | | | | | | | | | | | | | | | | | | | This patches add MESA_copy_sub_buffer support to the dri sw loader and then to gallium state tracker, llvmpipe, softpipe and other bits. It reuses the dri1 driver extension interface, and it updates the swrast loader interface for a new putimage which can take a stride. I've tested this with gnome-shell with a cogl hacked to reenable sub copies for llvmpipe and the one piglit test. I could probably split this patch up as well. v2: pass a pipe_box, to reduce the entrypoints, as per Jose's review, add to p_screen doc comments. v3: finish off winsys interfaces, add swrast classic support as well. Reviewed-by: Jose Fonseca <[email protected]> Signed-off-by: Dave Airlie <[email protected]> swrast: add support for copy_sub_buffer
* llvmpipe: add plumbing for ARB_depth_clampMatthew McClure2013-12-113-35/+60
| | | | | | | | | | With this patch llvmpipe will adhere to the ARB_depth_clamp enabled state when clamping the fragment's zw value. To support this, the variant key now includes the depth_clamp state. key->depth_clamp is derived from pipe_rasterizer_state's (depth_clip == 0), thus depth clamp is only enabled when depth clip is disabled. Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* llvmpipe: add a very useful (disabled) debugging outputZack Rusin2013-12-101-0/+20
| | | | | | | | Disabled by default, but it's very useful when needed. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: fix blending with half-float formatsZack Rusin2013-12-101-5/+26
| | | | | | | | | | | | | The fact that we flush denorms to zero breaks our half-float conversion and blending. This patches enables denorms for blending. It's a little tricky due to the llvm bug that makes it incorrectly reorder the mxcsr intrinsics: http://llvm.org/bugs/show_bug.cgi?id=6393 Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Zack Rusin <[email protected]>
* llvmpipe: clamp fragment shader depth write to the current viewport depth range.Matthew McClure2013-12-0913-29/+255
| | | | | | | | | | | | | | | | | With this patch, generate_fs_loop will clamp any fragment shader depth writes to the viewport's min and max depth values. Viewport selection is determined by the geometry shader output for the viewport array index. If no index is specified, then the default viewport index is zero. Semantics for this path can be found in draw_clamp_viewport_idx and lp_clamp_viewport_idx. lp_jit_viewport was created to store viewport information visible to JIT code, and is validated when the LP_NEW_VIEWPORT dirty flag is set. lp_rast_shader_inputs is responsible for passing the viewport_index through the rasterizer stage to fragment stage (via lp_jit_thread_data). Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: José Fonseca <[email protected]>
* gallium: add support for AMD_vertex_shader_layerMarek Olšák2013-12-031-0/+2
|
* gallium: new shader cap bit for the amount of sampler viewsRoland Scheidegger2013-11-281-0/+5
| | | | | | | | | Ever since introducing separate sampler and sampler view max this was really missing. Every driver but llvmpipe reports the same number as number of samplers for now, so nothing should break. Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: support 8bit subpixel precisionZack Rusin2013-11-258-148/+321
| | | | | | | | | | | | | 8 bit precision is required by d3d10 but unfortunately requires 64 bit rasterizer. This commit implements 64 bit rasterization with full support for 8bit subpixel precision. It's a combination of all individual commits from the llvmpipe-rast-64 branch. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: (trivial) disable new accurate origin calculationRoland Scheidegger2013-11-221-1/+1
| | | | It looks like there's some bugs in it...
* llvmpipe: calculate more accurate interpolation value at originRoland Scheidegger2013-11-211-6/+82
| | | | | | | | | | | | | | | | | Some rounding errors could crop up when calculating a0. Use a more accurate method (barycentric interpolation essentially) to fix this, though to fix the REAL problem (which is that our interpolation will give very bad results with small triangles far away from the origin when they have steep gradients) this does absolutely nothing (actually makes it worse). (To fix the real problem, either would need to use a vertex corner (or some other point inside the tri) as starting point value instead of fb origin and pass that down to interpolation, or mimic what hw does, use barycentric interpolation (using the coordinates extracted from the rasterizer edge functions) - maybe another time.) Some (silly) tests though really want a high accuracy at fb origin and don't care much about anything else (Just. Don't. Ask.). Reviewed-by: Jose Fonseca <[email protected]>
* gallium/drivers: compact compiler flags into Automake.incEmil Velikov2013-11-161-7/+6
| | | | | | | | | | * minimise flags duplication * distingush between VISIBILITY C and CXX flags * set only required flags - C and/or CXX v2: add LLVM_CFLAGS back to AM_CFLAGS (add missing backslash) Signed-off-by: Emil Velikov <[email protected]>
* llvmpipe: (trivial) fix more fallout from the setup cleanup.Roland Scheidegger2013-11-141-2/+4
| | | | Oops... Should have done some more testing.
* llvmpipe: (trivial) fix misplaced bld context assignment.Roland Scheidegger2013-11-141-2/+1
| | | | Should fix polygon offset crashes...
* llvmpipe: clean up state setup code a bitRoland Scheidegger2013-11-141-115/+59
| | | | | | | In particular get rid of home-grown vector helpers which didn't add much. And while here fix formatting a bit. No functional change. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm,llvmpipe: fix float->srgb conversion to handle NaNsRoland Scheidegger2013-11-141-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | d3d10 requires us to convert NaNs to zero for any float->int conversion. We don't really do that but mostly seems to work. In particular I suspect the very common float->unorm8 path only really passes because it relies on sse2 pack intrinsics which just happen to work by luck for NaNs (float->int conversion in hw gives integer indeterminate value, which just happens to be -0x80000000 hence gets converted to zero in the end after pack intrinsics). However, float->srgb didn't get so lucky, because we need to clamp before blending and clamping resulted in NaN behavior being undefined (and actually got converted to 1.0 by clamping with sse2). Fix this by using a zero/one clamp with defined nan behavior as we can handle the NaN for free this way. I suspect there's more bugs lurking in this area (e.g. converting floats to snorm) as we don't really use defined NaN behavior everywhere but this seems to be good enough. While here respecify nan behavior modes a bit, in particular the return_second mode didn't really do what we wanted. From the caller's perspective, we really wanted to say we need the non-nan result, but we already know the second arg isn't a NaN. So we use this now instead, which means that cpu architectures which actually implement min/max by always returning non-nan (that is adhering to ieee754-2008 rules) don't need to bend over backwards for nothing. Reviewed-by: Jose Fonseca <[email protected]>
* draw,llvmpipe: use exponent manipulation instead of exp2 for polygon offsetRoland Scheidegger2013-11-121-11/+15
| | | | | | | | | | Since we explicitly require a integer input we should avoid using exp2 math (even if we were using optimized versions), which turns the exp2 into a int sub (plus some casts). v2: fix bogus uint (needs to be int) math spotted by Matthew, fix comments Reviewed-by: Jose Fonseca <[email protected]>
* draw,llvmpipe,util: add depth bias calculation for arb_depth_buffer_floatMatthew McClure2013-11-075-20/+86
| | | | | | | | | | | | | | | With this patch, the llvmpipe and draw modules will calculate the depth bias according to floating point depth buffer semantics described in the arb_depth_buffer_float specification, when the driver has a z buffer bound with a format type of UTIL_FORMAT_TYPE_FLOAT. By default, the driver will use the existing UNORM calculation for depth bias. A new function, draw_set_zs_format, was added to calculate the Minimum Resolvable Depth value and floating point depth sense for the draw module. Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: fix bogus layer clamping in setupRoland Scheidegger2013-10-292-8/+25
| | | | | | | | | | | | | | | | | | | | | The layer coming from GS needs to be clamped (not sure if that's actually the correct error behavior but we need something) as the number can be higher than the amount of layers in the fb. However, this code was using the layer calculation from the scene, and this was actually calculated in lp_scene_begin_rasterization() hence too late (so setup was using the value from the _previous_ scene or just zero if it was the first scene). Since the value is used in both rasterization and setup, move calculation up to lp_scene_begin_binning() though it's a bit more inconvenient to calculate there. (Theoretically could move _all_ code which was in lp_scene_begin_rasterization() to there, because ever since we got rid of swizzled render/depth buffers our "map" functions preparing the fb data for render don't actually change the data in there at all, but it feels like it would be a hack.) v2: improve comments Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* util,llvmpipe: correctly set the minimum representable depth valueMatthew McClure2013-10-291-19/+12
| | | | | Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>