path: root/src/gallium
Commit message | Author | Age | Files | Lines
* gallium/st: add pipe_context::generate_mipmap() | Charmaine Lee | 2016-01-14 | 19 | -0/+77
  This patch adds a new interface to support hardware mipmap generation. PIPE_CAP_GENERATE_MIPMAP is added to allow a driver to specify if this new interface is supported; if not supported, the state tracker will fall back to mipmap generation by rendering/texturing.
  v2: add PIPE_CAP_GENERATE_MIPMAP to the disabled section for all drivers
  v3: add format to the generate_mipmap interface to allow mipmap generation using a format other than the resource format
  v4: fix return type of trace_context_generate_mipmap()
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
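The capability-gated fallback described in this message can be sketched roughly as follows. This is a minimal Python model, not Mesa code: the `Driver` class, `render_mipmap_fallback()`, and the string return values are hypothetical, and it assumes the driver hook reports failure through a boolean return.

```python
class Driver:
    """Hypothetical stand-in for a Gallium driver; not Mesa's real API."""

    def __init__(self, caps, hw_ok=True):
        self.caps = caps      # capability name -> value
        self.hw_ok = hw_ok    # whether the (mock) hardware path succeeds

    def get_param(self, cap):
        # Unknown capabilities report 0, i.e. "not supported".
        return self.caps.get(cap, 0)

    def generate_mipmap(self, resource, fmt, base_level, last_level):
        # Assumed to return False when the hardware cannot do the job.
        return self.hw_ok


def render_mipmap_fallback(resource, fmt, base_level, last_level):
    """Placeholder for the rendering/texturing path."""
    return True


def generate_mipmap(driver, resource, fmt, base_level, last_level):
    """Try the hardware hook first, else fall back to rendering."""
    if driver.get_param("PIPE_CAP_GENERATE_MIPMAP"):
        if driver.generate_mipmap(resource, fmt, base_level, last_level):
            return "hardware"
    render_mipmap_fallback(resource, fmt, base_level, last_level)
    return "render"
```

A driver that never advertises the capability always takes the render path, which is why v2 only needs to add the cap to each driver's disabled section.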
* gallium/radeon: do not reallocate user memory buffers | Nicolai Hähnle | 2016-01-14 | 4 | -8/+43
  The whole point of AMD_pinned_memory is that applications don't have to map buffers via OpenGL - but they're still allowed to, so make sure we don't break the link between buffer object and user memory unless explicitly instructed to.
  Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: implement PIPE_CAP_INVALIDATE_BUFFER | Nicolai Hähnle | 2016-01-14 | 5 | -9/+22
  Reviewed-by: Marek Olšák <[email protected]>
* gallium/radeon: reset valid_buffer_range on PIPE_TRANSFER_DISCARD_WHOLE_RESOURCE | Nicolai Hähnle | 2016-01-14 | 1 | -0/+3
  This accommodates a streaming pattern where the discard flag is set when the application wraps back to the beginning of the buffer.
  Reviewed-by: Marek Olšák <[email protected]>
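The streaming pattern mentioned here can be modeled in a few lines. This is only an illustrative Python sketch under assumptions: `StreamBuffer` and its fields are hypothetical, and the idea that a driver can skip synchronization for writes outside the valid range is an assumption about why the reset helps, not something the message states.

```python
class StreamBuffer:
    """Hypothetical model of a buffer with a tracked valid range."""

    def __init__(self, size):
        self.size = size
        self.offset = 0      # application's write cursor
        self.valid = None    # (start, end) of bytes known to hold data

    def write(self, nbytes):
        """Append nbytes; wrap to the start with a discard when out of room."""
        if nbytes > self.size:
            raise ValueError("write larger than buffer")
        discard = self.offset + nbytes > self.size
        if discard:
            # Whole-resource discard: old contents are dead, so the
            # valid range starts over instead of staying at its maximum.
            self.offset = 0
            self.valid = None
        start, end = self.offset, self.offset + nbytes
        if self.valid is None:
            self.valid = (start, end)
        else:
            self.valid = (min(self.valid[0], start), max(self.valid[1], end))
        self.offset = end
        return start, discard
```

Without the reset, the valid range would keep covering almost the whole buffer after the first pass, so every later write would look like it overlaps live data.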
* gallium: add PIPE_CAP_INVALIDATE_BUFFER | Nicolai Hähnle | 2016-01-14 | 17 | -2/+23
  It makes sense to re-use pipe->invalidate_resource for the purpose of glInvalidateBufferData, but this function is already implemented in vc4 where it doesn't have the expected behavior. So add a capability flag to indicate that the driver supports the expected behavior.
  Reviewed-by: Marek Olšák <[email protected]>
* winsys/radeon: fix warnings about incompatible pointer types | Nicolai Hähnle | 2016-01-14 | 1 | -6/+6
  There was some confusion between pb_buffer and radeon_bo, as well as between radeon_drm_winsys and radeon_winsys.
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move POSITION and FACE fragment shader inputs to system values | Marek Olšák | 2016-01-13 | 3 | -45/+25
  And FACE becomes integer instead of float.
  Reviewed-by: Edward O'Callaghan <[email protected]>
* radeonsi: simplify gl_FragCoord behavior | Marek Olšák | 2016-01-13 | 1 | -23/+22
  It will become a system value, not an input.
  Reviewed-by: Edward O'Callaghan <[email protected]>
* llvmpipe: (trivial) use cast wrapper for __m128d to __m128 casts | Roland Scheidegger | 2016-01-13 | 1 | -2/+2
  some compiler was unhappy.
* llvmpipe: avoid most 64 bit math in rasterization | Roland Scheidegger | 2016-01-13 | 2 | -65/+143
  The trick here is to recognize that in the c + n * dcdx calculations, not only can the lower FIXED_ORDER bits not change (as the dcdx values have those all zero) but that this means the sign bit of the calculations cannot be different as well, that is sign(c + n*dcdx) == sign((c >> FIXED_ORDER) + n*(dcdx >> FIXED_ORDER)). That shaves off more than enough bits to never require 64bit masks. A shifted plane c value could still easily exceed 32 bits, however since we throw out planes which are trivial accept even before binning (and similarly don't even get to see tris for which there was a trivial reject plane) this is never a problem.
  The idea isn't all that revolutionary, in fact something similar was tried ages ago (9773722c2b09d5f0615a47cecf4347859474dc56) back when the values were only 32 bit anyway. I believe now it didn't quite work then because the adjustment needed for testing trivial reject / partial masks wasn't handled correctly.
  This still keeps the separate 32/64 bit paths for now, as the 32 bit one still looks minimally simpler (and also because if we'd pass in dcdx/dcdy/eo unscaled from setup which would be a good reason to ditch the 32 bit path, we'd need to change the special-purpose rasterization functions for small tris).
  This passes piglit triangle-rasterization (-fbo -auto -max_size -subpixelbits 8) and triangle-rasterization-overdraw (with some hacks to make it work correctly with large sizes) easily (full piglit as well of course, but most tests wouldn't use triangles large enough to be affected, that is tris with a bounding box over 128x128).
  The profiler says indeed time spent in rast_tri functions is reduced substantially, BUT of course only if the tris are large. I measured a 3% improvement in mesa gloss demo when supersized to twice the screen size...
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
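The sign identity quoted in this message can be checked numerically. Writing c = c_hi * 2^F + c_lo with 0 <= c_lo < 2^F and dcdx = d * 2^F, the full sum is (c_hi + n*d) * 2^F + c_lo, which is negative exactly when c_hi + n*d is negative. The Python sketch below brute-forces that claim; the `FIXED_ORDER` value, the value ranges, and the helper names are assumptions chosen for illustration, not taken from llvmpipe.

```python
import random

FIXED_ORDER = 8  # assumed number of subpixel bits; illustrative value only


def is_negative_full(c, dcdx, n):
    """Sign test on the full-precision edge function value."""
    return c + n * dcdx < 0


def is_negative_shifted(c, dcdx, n):
    """Same test after dropping the low FIXED_ORDER bits of both terms.

    Relies on >> being an arithmetic (floor) shift, as it is for
    Python integers.
    """
    return (c >> FIXED_ORDER) + n * (dcdx >> FIXED_ORDER) < 0


def check(trials=10000, seed=0):
    """Compare both tests on random inputs whose dcdx has zero low bits."""
    rng = random.Random(seed)
    for _ in range(trials):
        c = rng.randint(-2**40, 2**40)
        dcdx = rng.randint(-2**20, 2**20) << FIXED_ORDER
        n = rng.randint(0, 255)
        if is_negative_full(c, dcdx, n) != is_negative_shifted(c, dcdx, n):
            return False
    return True
```

Note that the identity concerns the sign bit only: a small positive full-precision value can become zero after shifting, which still counts as non-negative in both tests.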
* llvmpipe: scale up bounding box planes to subpixel precision | Roland Scheidegger | 2016-01-13 | 3 | -30/+30
  Otherwise some planes we get in rasterization have subpixel precision, others not. Doesn't matter so far, but will soon. (OpenGL actually supports viewports with subpixel accuracy, so could even do bounding box calcs with that).
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
* llvmpipe: add sse code for fixed position calculation | Roland Scheidegger | 2016-01-13 | 1 | -8/+50
  This takes quite a few fewer instructions, although it still does the two 64bit muls with scalar C code (they'd need way more shuffles, plus a fixup for the signed mul, so it totally doesn't seem worth it; x86 can do 32x32->64bit signed scalar muls natively just fine after all, even on 32bit).
  (This still doesn't have a very measurable performance impact in reality, although profiler seems to say time spent in setup indeed has gone down by 10% or so overall. Maybe good for a 3% or so improvement in openarena.)
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Jose Fonseca <[email protected]>
* draw: fix key comparison with uninitialized value | Roland Scheidegger | 2016-01-13 | 2 | -6/+6
  Discovered by accident, valgrind was complaining (could have possibly caused us to create redundant geometry shader variants).
  v2: convinced by Brian and Jose, just use memset for both gs and vs keys, just as easy and less error prone.
* st/omx: Avoid segfault in deconstructor if constructor fails | Tom St Denis | 2016-01-12 | 1 | -0/+3
  If the constructor fails before the LIST_INIT calls, the pointers will be null and the deconstructor will segfault.
  Signed-off-by: Tom St Denis <[email protected]>
  Reviewed-by: Leo Liu <[email protected]>
  Reviewed-by: Christian König <[email protected]>
* vl: use preferred format for deinterlacing | Christian König | 2016-01-12 | 1 | -1/+7
  Signed-off-by: Christian König <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* vl: improve motion adaptive deinterlacer | Christian König | 2016-01-12 | 2 | -22/+49
  Handle formats other than YV12 as well.
  Signed-off-by: Christian König <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* st/va: add BOB deinterlacing v2 | Christian König | 2016-01-12 | 2 | -11/+79
  Tested with MPV.
  v2: correctly handle compositor deinterlacing as well.
  Signed-off-by: Christian König <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* st/va: add NV12 -> NV12 post processing v2 | Christian König | 2016-01-12 | 2 | -37/+124
  Useful for mpv and GStreamer.
  v2: use common functionality for size adjustment.
  Signed-off-by: Indrajit-kumar Das <[email protected]>
  Signed-off-by: Christian König <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* st/va: use vl_video_buffer_adjust_size | Christian König | 2016-01-12 | 1 | -9/+4
  Use the new helper function instead of open coding it.
  Signed-off-by: Christian König <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* st/vdpau: use vl_video_buffer_adjust_size | Christian König | 2016-01-12 | 1 | -10/+3
  Use the new helper function instead of open coding it.
  Signed-off-by: Christian König <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* vl/buffers: extract vl_video_buffer_adjust_size helper | Christian König | 2016-01-12 | 2 | -8/+20
  Useful for the state trackers as well.
  Signed-off-by: Christian König <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* st/va: make the implementation thread safe v2 | Christian König | 2016-01-12 | 7 | -54/+199
  Otherwise we might crash with MPV.
  v2: minor cleanups suggested on the list.
  Signed-off-by: Christian König <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
  Reviewed-by: Julien Isorce <[email protected]>
  Tested-by: Julien Isorce <[email protected]>
* gallium/util: removed unused header-file | Erik Faye-Lund | 2016-01-12 | 2 | -53/+0
  This hasn't been in use since c476305 ("gallium/util: pregenerate half float tables"), where the last bit of run-time init using this was killed. So let's just get rid of the pointless header.
  Signed-off-by: Erik Faye-Lund <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* nvc0: do not force re-binding of compute constbufs on Fermi | Samuel Pitoiset | 2016-01-12 | 1 | -1/+1
  Re-binding compute constant buffers after launching a grid has no effect because they are not currently validated and because dirty_cp is not updated accordingly. This might also prevent weird future behaviour when UBOs are bound for compute.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: remove useless goto in nvc0_launch_grid() | Samuel Pitoiset | 2016-01-12 | 1 | -6/+4
  Trivial.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: the whole point of data array is to hand out regular registers | Ilia Mirkin | 2016-01-11 | 1 | -1/+1
  Fixes: 0d3051f75a (nv50/ir: Fix scratch allocation size and file)
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: Fix scratch allocation size and file | Pierre Moreau | 2016-01-09 | 2 | -3/+3
  Signed-off-by: Pierre Moreau <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: use a face sysval to avoid the useless back-and-forth conversion | Ilia Mirkin | 2016-01-08 | 5 | -9/+2
  Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno: add ir3_compiler to gitignore | Ilia Mirkin | 2016-01-08 | 1 | -0/+1
  Signed-off-by: Ilia Mirkin <[email protected]>
* gallium: add a RESQ opcode to query info about a resource | Ilia Mirkin | 2016-01-08 | 3 | -1/+14
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* gallium: add PIPE_CAP_SHADER_BUFFER_OFFSET_ALIGNMENT | Ilia Mirkin | 2016-01-08 | 16 | -13/+32
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* gallium: add PIPE_SHADER_CAP_MAX_SHADER_BUFFERS | Ilia Mirkin | 2016-01-08 | 13 | -0/+23
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* tgsi: update atomic op docs | Ilia Mirkin | 2016-01-08 | 1 | -46/+47
  Specify that the operation only applies to the x component, not per-component as previously specified. This is unnecessary for GL and creates additional complications for images which need to support these operations as well.
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* tgsi: add a is_store property | Ilia Mirkin | 2016-01-08 | 2 | -223/+224
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* tgsi: provide a way to encode memory qualifiers for SSBO | Ilia Mirkin | 2016-01-08 | 10 | -2/+180
  Each load/store on most hardware can specify what caching to do. Since SSBO allows individual variables to also have separate caching modes, allow loads/stores to have the qualifiers instead of attempting to encode them in declarations.
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* ureg: add buffer support to ureg | Ilia Mirkin | 2016-01-08 | 6 | -1/+69
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* tgsi: add ureg support for image decls | Ilia Mirkin | 2016-01-08 | 12 | -52/+153
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* nine: allow fragment shader POSITION and FACE to be system values | Marek Olšák | 2016-01-08 | 2 | -12/+46
  Reported-by: Axel Davy <[email protected]>
* vl: allow fragment shader POSITION to be a system value | Marek Olšák | 2016-01-08 | 1 | -4/+8
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* util/pstipple: allow fragment shader POSITION to be a system value | Marek Olšák | 2016-01-08 | 6 | -11/+34
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* tgsi/scan: update for POSITION and FACE sytem values | Marek Olšák | 2016-01-08 | 1 | -1/+4
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* gallium: add caps for POSITION and FACE system values | Marek Olšák | 2016-01-08 | 17 | -6/+48
  v2: document the integer behavior
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* tgsi/ureg: handle redundant declarations in ureg_DECL_system_value | Marek Olšák | 2016-01-08 | 1 | -1/+9
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Edward O'Callaghan <[email protected]>
* tgsi/ureg: remove index parameter from ureg_DECL_system_value | Marek Olšák | 2016-01-08 | 2 | -7/+6
  It can be trivially derived from the number of already declared system values. This allows ureg users not to worry about which index to choose.
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Edward O'Callaghan <[email protected]>
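The index derivation described here, combined with the redundant-declaration handling from the follow-up commit listed just above, can be modeled as below. This is a simplified Python sketch, not the ureg implementation: the `SystemValueDecls` class is hypothetical, and it keys declarations on the semantic name alone, which is an assumption.

```python
class SystemValueDecls:
    """Hypothetical model of a shader builder's system-value table."""

    def __init__(self):
        self._names = []  # declared semantic names, in declaration order

    def declare(self, semantic_name):
        """Return the register index for semantic_name.

        A new declaration gets the next free index, derived from how
        many system values already exist; a redundant declaration
        returns the index assigned the first time.
        """
        if semantic_name in self._names:
            return self._names.index(semantic_name)
        self._names.append(semantic_name)
        return len(self._names) - 1
```

With this shape, two independent helpers can both ask for POSITION without coordinating indices and still end up referring to the same register.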
* radeon, si: Use TGSI chan name defines in lp_build_emit_fetch() calls | Edward O'Callaghan | 2016-01-08 | 2 | -8/+8
  Signed-off-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/aux: Use TGSI chan name defines inplace of literals | Edward O'Callaghan | 2016-01-08 | 1 | -6/+7
  Signed-off-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* nvc0: add ARB_indirect_parameters support | Ilia Mirkin | 2016-01-07 | 5 | -6/+313
  I chose to make separate macros for this due to the additional complexity and extra scratch usage.
  Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: add support for real ARB_multi_draw_indirect | Ilia Mirkin | 2016-01-07 | 4 | -18/+47
  The draw groups are now split up into groups of 32 if there's a non-packed stride, or in groups of 400-500 if the draw data is packed.
  Signed-off-by: Ilia Mirkin <[email protected]>
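The grouping rule in this message amounts to chunking the draw list with a limit that depends on the data layout. The Python sketch below illustrates that idea only: `split_draws()` and its parameters are hypothetical, and the packed limit of 400 is an arbitrary pick from the 400-500 range the message gives, since the exact value is not stated.

```python
def split_draws(draw_count, packed, packed_limit=400, strided_limit=32):
    """Split draw_count draws into (first_draw, count) groups.

    packed_limit is an assumed value within the quoted 400-500 range;
    strided_limit is the group size quoted for a non-packed stride.
    """
    limit = packed_limit if packed else strided_limit
    return [
        (start, min(limit, draw_count - start))
        for start in range(0, draw_count, limit)
    ]
```

For example, 70 draws with a non-packed stride become three groups of 32, 32, and 6, while the same draws in packed form fit in a single group.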
* nvc0: adjust indirect draw macros to handle multiple draws at once | Ilia Mirkin | 2016-01-07 | 3 | -52/+101
  These are still invoked one at a time, but the underlying macro can handle multiple draws.
  Signed-off-by: Ilia Mirkin <[email protected]>
* gallium: add caps to expose support for multi indirect draws | Ilia Mirkin | 2016-01-07 | 16 | -0/+35
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>