summaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* gallivm: add smallfloat to float conversion not relying on cpu denorm handlingRoland Scheidegger2014-02-201-20/+65
| | | | | | | | | | | | | | | | | | | | | | The previous code relied on cpu denorm support for converting small float formats (such r11g11b10_float and r16_float) to floats, otherwise denorms are flushed to zero. We worked around that in llvmpipe blend code by reenabling denorms, but this did nothing for texture sampling. Now it would be possible to reenable it there too but I'm not really a fan of messing with fpu flags (and it seems we can't actually do it reliably with llvm in any case looking at some bug reports). (Not to mention if you actually have a lot of denorms in there, you can expect some order-of-magnitude slowdown with x86 cpus.) So instead use code which adjusts exponents etc. directly hence not relying on cpu denorm support for the rescaling mul. (We still need the fpu flag handling as we can't do float-to-smallfloat without using cpu denorms at least for now - I actually wanted to keep both the old and new code and using one or the other depending on from where it's called but that didn't work out as the parameter would have to be passed through too many layers than I'd like.) Reviewed-by: Zack Rusin <[email protected]> Reviewed-by: Si Chen <[email protected]>
* st/omx/enc: add multi scaling buffers for performance improvementLeo Liu2014-02-202-16/+29
| | | | | Signed-off-by: Leo Liu <[email protected]> Reviewed-by: Christian König <[email protected]>
* st/omx/dec/h264: fix prevFrameNumOffset handlingChristian König2014-02-201-0/+4
| | | | Signed-off-by: Christian König <[email protected]>
* freedreno: tweak ringbuffer sizes/countRob Clark2014-02-192-2/+2
| | | | | | | | Since we are now consuming two ringbuffers at a time, we probably want a pool larger than 4.. but we don't need each individual ringbuffer to be so large, so offset the pool size increase by reducing rb size. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: scheduling/legalize fixesRob Clark2014-02-193-2/+30
| | | | | | | | | | It seems the write-after-read hazard that applies to texture fetch instructions, also applies to sfu instructions. Also, cat5/cat6 instructions do not have a (ss) bit, so in these cases we need to insert a dummy nop instruction with (ss) bit set. Signed-off-by: Rob Clark <[email protected]>
* r600g,radeonsi: Consolidate logic for short-circuiting flushesMichel Dänzer2014-02-186-6/+8
| | | | | | | | | | | | | | | Fixes radeonsi emitting command streams to the kernel even when there have been no draw calls before a flush, potentially powering up the GPU needlessly. Incidentally, this also cuts the runtime of piglit gpu.py in about half on my Kaveri system, probably because an X11 client going away no longer always results in a command stream being submitted to the kernel via glamor. Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=65761 Cc: "10.1" [email protected] Reviewed-by: Marek Olšák <[email protected]>
* st/dri: remove #ifdef DRM_CAP_PRIME guardEmil Velikov2014-02-181-2/+0
| | | | | | | | Required for libdrm 2.4.37 and earlier. Both scons and automake require version 2.4.38 now so that guard is not longer needed. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* targets/vdpau: Don't link unused librariesKusanagi Kouichi2014-02-174-4/+5
| | | | | | libvdpau, libselinux and libexpat are not used. Signed-off-by: Kusanagi Kouichi <[email protected]>
* targets/vdpau: Always use c++ to linkKusanagi Kouichi2014-02-171-5/+1
| | | | | | | | | | | If built without llvm, the following error occurs with mplayer: Failed to open VDPAU backend .../libvdpau_r600.so: undefined symbol: _ZTVN10__cxxabiv117__class_type_infoE [vo/vdpau] Error when calling vdp_device_create_x11: 1 Cc: <[email protected]> Signed-off-by: Kusanagi Kouichi <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* st/xvmc: fix tests so that they passIlia Mirkin2014-02-163-10/+12
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Christian König <[email protected]>
* pipe-loader: add pipe loader for freedreno/msmRob Clark2014-02-162-0/+38
| | | | Signed-off-by: Rob Clark <[email protected]>
* st/xa: missing handle typeRob Clark2014-02-161-0/+1
| | | | | | | DRM_API_HANDLE_TYPE_SHARED is zero, so doesn't actually fix anything. But we shouldn't rely on SHARED handle type being zero. Signed-off-by: Rob Clark <[email protected]>
* st/xa: use pipe-loader to get screenRob Clark2014-02-167-51/+40
| | | | | | This lets multiple gallium drivers use XA. Signed-off-by: Rob Clark <[email protected]>
* pipe-loader: split out "client" versionRob Clark2014-02-164-10/+25
| | | | | | | | Build two versions of pipe-loader, with only the client version linking in x11 client side dependencies. This will allow the XA state tracker to use pipe-loader. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: use (ss) for WAR hazardsRob Clark2014-02-161-2/+19
| | | | | | | | | | | | | | | Seems texture sample instructions don't immediately consume there src(s). In fact, some shaders from blob compiler seem to indiciate that it does not even count the texture sample instructions when calculating number of delay slots to fill for non-sample instructions. (Although so far it seems inconclusive as to whether this is required.) In particular, when a src register of a previous texture sample instruction is clobbered, the (ss) bit is needed to synchronize with the tex pipeline to ensure it has picked up the previous values before they are overwritten. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: fix RA typoRob Clark2014-02-162-4/+4
| | | | | | | | | | Was supposed to be a '+', otherwise we end up with a negative offset and choosing registers below the assigned range. This seems to fix the scheduling mystery "solved" by adding in extra delay slots. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: handle kill properly (new compiler)Rob Clark2014-02-164-26/+105
| | | | | | | | | | | | | Since 'kill' does not produce a result, the new compiler was happily optimizing them out. We need to instead track 'kill's similar to outputs. But since there is no non-predicated kill instruction, (and for flattend if/else we do want them to be predicated), we need to track the topmost branch condition on the stack and use that as src arg to the kill. For a kill at the topmost level, we have to generate an immediate 1.0 to feed into the cmps.f for setting the predicate register. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: trans_cmp() sanityRob Clark2014-02-161-51/+35
| | | | | | | | | | | | | | | Thanks to figuring out 32bit float render target, and adding regdump test in fdre-a3xx, I can more easily play around with instructions to figure out range of inputs/outputs/etc. And from this I can conclude that cmps.f works more like expected and I can do something much more simple in trans_cmp() (compared to before which was more closely emulating the instruction sequence of the blob compiler). And using sel.b32 (binary 0/1) often makes more sense than sel.f32 (+/- float) or sel.u32 (+/- uint) as it can use the output directly from cmps.f without needing the 'add.s tmp0, tmp0, -1'. Signed-off-by: Rob Clark <[email protected]>
* freedreno: fix problems if no color buf boundRob Clark2014-02-162-2/+7
| | | | Signed-off-by: Rob Clark <[email protected]>
* gallium/pipebuffer: change pb_cache_manager_create() size_factor to floatBrian Paul2014-02-144-8/+8
| | | | | | | Requested by Marek. Reviewed-by: Marek Olšák <[email protected]> Cc: "10.1" <[email protected]>
* svga/winsys: Propagate surface shared information to the winsysThomas Hellstrom2014-02-145-26/+44
| | | | | | | | | | | | | | | | | The linux winsys needs to know whether a surface is shared. For guest-backed surfaces we need this information to avoid allocating a mob out of the mob cache for shared surfaces, but instead allocate a shared mob, that is never put in the mob cache, from the kernel. Also previously, all surfaces were given the "shareable" attribute when allocated from the kernel. This is too permissive for client-local surfaces. Now that we have the needed info, only set the "shareable" attribute if the client indicates that it needs to share the surface. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Jakob Bornecrantz <[email protected]> Reviewed-by: Brian Paul <[email protected]> Cc: "10.1" <[email protected]>
* svga/winsys: implement GBS supportBrian Paul2014-02-1419-323/+3064
| | | | | | | This is a squash commit of many commits by Thomas Hellstrom. Reviewed-by: Thomas Hellstrom <[email protected]> Cc: "10.1" <[email protected]>
* gallium/util: Add flush/map debug utility codeThomas Hellstrom2014-02-143-0/+530
| | | | | | Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: Brian Paul <[email protected]> Cc: "10.1" <[email protected]>
* gallium/pipebuffer: Add a cache buffer manager bypass maskThomas Hellstrom2014-02-143-5/+23
| | | | | | | | | | | | | In some situations, it may be desirable to bypass the cache at buffer creation but to insert the buffer in the cache at buffer destruction. One such situation is where we already have a kernel representation of a buffer that we want to use, but we also want to insert it in the cache when it's freed up. Signed-off-by: Thomas Hellstrom <[email protected]> Reviewed-by: José Fonseca <[email protected]> Reviewed-by: Brian Paul <[email protected]> Cc: "10.1" <[email protected]>
* pipebuffer, winsys: Add a size match parameter to the cached buffer managerThomas Hellstrom2014-02-143-4/+8
| | | | | | | | In some situations it's important to restrict the sizes of buffers that the cached buffer manager is allowed to return Signed-off-by: Thomas Hellstrom <[email protected]> Cc: "10.1" <[email protected]>
* svga: update texture code for GBSBrian Paul2014-02-142-64/+326
| | | | | Reviewed-by: Thomas Hellstrom <[email protected]> Cc: "10.1" <[email protected]>
* svga: update buffer code for GBSBrian Paul2014-02-142-42/+224
| | | | | Reviewed-by: Thomas Hellstrom <[email protected]> Cc: "10.1" <[email protected]>
* svga: add new helper functions for GBS buffersBrian Paul2014-02-141-0/+76
| | | | | Reviewed-by: Thomas Hellstrom <[email protected]> Cc: "10.1" <[email protected]>
* svga: remove a couple unneeded assertionsBrian Paul2014-02-142-2/+0
| | | | | Reviewed-by: Thomas Hellstrom <[email protected]> Cc: "10.1" <[email protected]>
* svga: adjust adjustment for point coordinatesBrian Paul2014-02-141-1/+4
| | | | | Reviewed-by: Thomas Hellstrom <[email protected]> Cc: "10.1" <[email protected]>
* svga: track which textures are rendered toBrian Paul2014-02-142-18/+38
| | | | | Reviewed-by: Thomas Hellstrom <[email protected]> Cc: "10.1" <[email protected]>
* svga: add helpers for tracking rendering to texturesBrian Paul2014-02-142-0/+68
| | | | | Reviewed-by: Thomas Hellstrom <[email protected]> Cc: "10.1" <[email protected]>
* svga: update shader code for GBSBrian Paul2014-02-148-20/+142
| | | | | Reviewed-by: Thomas Hellstrom <[email protected]> Cc: "10.1" <[email protected]>
* svga: update constant buffer code for GBSBrian Paul2014-02-145-69/+175
| | | | | Reviewed-by: Thomas Hellstrom <[email protected]> Cc: "10.1" <[email protected]>
* svga: add svga_have_gb_objects/dma() functionsBrian Paul2014-02-141-0/+14
| | | | | Reviewed-by: Thomas Hellstrom <[email protected]> Cc: "10.1" <[email protected]>
* svga: add new GBS commandsBrian Paul2014-02-142-5/+637
| | | | | | | And update some existing commands. Reviewed-by: Thomas Hellstrom <[email protected]> Cc: "10.1" <[email protected]>
* svga: update svga_winsys interface for GBSBrian Paul2014-02-146-13/+141
| | | | | | | | This adds new interface functions for guest-backed surfaces and adds a mobid parameter to the surface_relocation() function. Reviewed-by: Thomas Hellstrom <[email protected]> Cc: "10.1" <[email protected]>
* svga: update dumping code with new GBS commands, etcBrian Paul2014-02-141-44/+268
| | | | | Reviewed-by: Thomas Hellstrom <[email protected]> Cc: "10.1" <[email protected]>
* svga: split / update svga3d header filesBrian Paul2014-02-1417-1976/+4808
| | | | | | | | | | The old svga3d_reg.h file is split into separate header files and we add new items for guest-backed surfaces. Plus some minor code fixes because of renamed symbols. Reviewed-by: Thomas Hellstrom <[email protected]> Cc: "10.1" <[email protected]>
* st/vdpau: add support for DEINTERLACE_TEMPORALGrigori Goronzy2014-02-143-4/+73
| | | | Reviewed-by: Christian König <[email protected]>
* vl: add motion adaptive deinterlacerGrigori Goronzy2014-02-143-1/+569
| | | | Reviewed-by: Christian König <[email protected]>
* st/omx/enc: fix scaling src alignment issueLeo Liu2014-02-141-1/+15
| | | | | Signed-off-by: Leo Liu <[email protected]> Signed-off-by: Christian König <[email protected]>
* radeon: reverse DBG_NO_HYPERZ logicAlex Deucher2014-02-134-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | Change the flag to DBG_HYPERZ and reverse the logic so setting the flag enabled the feature. This disables hyperz on r600g and radeonsi by default. It can be enabled by setting the env var. There are just too many issues with certain apps so leave it disabled for now until we sort out the issues with the problematic apps. Bugs: https://bugs.freedesktop.org/show_bug.cgi?id=58660 https://bugs.freedesktop.org/show_bug.cgi?id=64471 https://bugs.freedesktop.org/show_bug.cgi?id=66352 https://bugs.freedesktop.org/show_bug.cgi?id=68799 https://bugs.freedesktop.org/show_bug.cgi?id=72685 https://bugs.freedesktop.org/show_bug.cgi?id=73088 https://bugs.freedesktop.org/show_bug.cgi?id=74428 https://bugs.freedesktop.org/show_bug.cgi?id=74803 https://bugs.freedesktop.org/show_bug.cgi?id=74863 https://bugs.freedesktop.org/show_bug.cgi?id=74892 https://bugzilla.kernel.org/show_bug.cgi?id=70411 Signed-off-by: Alex Deucher <[email protected]> Cc: "10.1" "10.0" <[email protected]> Acked-by: Marek Olšák <[email protected]>
* pipe-loader: Add support for render nodes v2Tom Stellard2014-02-131-3/+77
| | | | | | | v2: - Add missing call to pipe_loader_drm_release() - Fix render node macros - Drop render-node configure option
* pipe-loader: Add auth_x parameter to pipe_loader_drm_probe_fd()Tom Stellard2014-02-133-5/+11
| | | | | The caller can use this boolean parameter to tell the pipe-loader to authenticate with the X server when probing a file descriptor.
* st/omx/dec/h264: fix pic_order_cnt_type==2Christian König2014-02-131-1/+1
| | | | Signed-off-by: Christian König <[email protected]>
* st/omx: initial OpenMAX H264 encoder v7Christian König2014-02-135-8/+970
| | | | | | | | | | | | | | | v2 (chk): fix eos handling v3 (leo): implement scaling configuration support v4 (leo): fix bitrate bug v5 (chk): add workaround for bug in Bellagio v6 (chk): fix div by 0 if framerate isn't known, user separate pipe object for scale and transfer, always flush the transfer pipe before encoding v7 (chk): make suggested changes, cleanup a bit more, only advertise encoder on supported hardware Signed-off-by: Christian König <[email protected]> Signed-off-by: Leo Liu <[email protected]>
* radeon/vce: initial VCE support v8Christian König2014-02-136-1/+768
| | | | | | | | | | | | | | v2 (chk): revert feedback buffer hack v3 (slava): fixed bitstream size calculation v4 (chk): always create buffers in the right domain v5 (chk): flush async v6 (chk): rework fw interface add version check v7 (leo): implement cropping support v8 (chk): add hw checks Signed-off-by: Christian König <[email protected]> Signed-off-by: Leo Liu <[email protected]> Signed-off-by: Slava Grigorev <[email protected]>
* radeon/winsys: add VCE support v4Christian König2014-02-133-1/+31
| | | | | | | | v2: add fw version query v3: add README.VCE v4: avoid error msg when kernel doesn't support it Signed-off-by: Christian König <[email protected]>
* nv50: mark scissors/viewports dirty on context switchIlia Mirkin2014-02-131-0/+2
| | | | | | | | | | Commit 246ca4b001 ("nv50: implement multiple viewports/scissors, enable ARB_viewport_array") added dirty tracking to scissors/viewports. However it neglected to mark them all as dirty on a context switch. This fixes an apparent regression in webgl in chrome, but probably in any application that switches contexts. Signed-off-by: Ilia Mirkin <[email protected]>