From 216543ea547dd0572d9f2f0364f7a239a5aeafe1 Mon Sep 17 00:00:00 2001
From: Marek Olšák
Date: Sat, 28 Feb 2015 00:26:31 +0100
Subject: gallium: add FMA and DFMA opcodes (v3)

Needed by ARB_gpu_shader5.

v2: select DMAD for FMA with double precision
v3: add and select DFMA

Reviewed-by: Ilia Mirkin
---
 src/gallium/docs/source/screen.rst |  2 ++
 src/gallium/docs/source/tgsi.rst   | 26 ++++++++++++++++++++++++++
 2 files changed, 28 insertions(+)

(limited to 'src/gallium/docs')

diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst
index e0fd1a2dbac..26cc9ffc6f7 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -336,6 +336,8 @@ to be 0.
   is supported. If it is, DTRUNC/DCEIL/DFLR/DROUND opcodes may be used.
 * ``PIPE_SHADER_CAP_TGSI_DFRACEXP_DLDEXP_SUPPORTED``: Whether DFRACEXP and
   DLDEXP are supported.
+* ``PIPE_SHADER_CAP_TGSI_FMA_SUPPORTED``: Whether FMA and DFMA (doubles only)
+  are supported.
 
 
 .. _pipe_compute_cap:
diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index b0a975aa70a..7771136f167 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -272,6 +272,21 @@ This instruction replicates its result.
   dst.w = src0.w \times src1.w + (1 - src0.w) \times src2.w
 
 
+.. opcode:: FMA - Fused Multiply-Add
+
+Perform a * b + c with no intermediate rounding step.
+
+.. math::
+
+  dst.x = src0.x \times src1.x + src2.x
+
+  dst.y = src0.y \times src1.y + src2.y
+
+  dst.z = src0.z \times src1.z + src2.z
+
+  dst.w = src0.w \times src1.w + src2.w
+
+
 .. opcode:: DP2A - 2-component Dot Product And Add
 
 .. math::
@@ -1962,6 +1977,17 @@ source is an integer.
   dst.zw = src0.zw \times src1.zw + src2.zw
 
 
+.. opcode:: DFMA - Fused Multiply-Add
+
+Perform a * b + c with no intermediate rounding step.
+
+.. math::
+
+  dst.xy = src0.xy \times src1.xy + src2.xy
+
+  dst.zw = src0.zw \times src1.zw + src2.zw
+
+
 .. opcode:: DRCP - Reciprocal
 
 .. math::
-- 
cgit v1.2.3