Diffstat (limited to 'src/gallium/docs/source')
-rw-r--r-- src/gallium/docs/source/conf.py 4
-rw-r--r-- src/gallium/docs/source/context.rst 119
-rw-r--r-- src/gallium/docs/source/cso/blend.rst 45
-rw-r--r-- src/gallium/docs/source/cso/rasterizer.rst 101
-rw-r--r-- src/gallium/docs/source/cso/sampler.rst 10
-rw-r--r-- src/gallium/docs/source/distro.rst 60
-rw-r--r-- src/gallium/docs/source/exts/tgsi.py 17
-rw-r--r-- src/gallium/docs/source/glossary.rst 13
-rw-r--r-- src/gallium/docs/source/screen.rst 230
-rw-r--r-- src/gallium/docs/source/tgsi.rst 689
10 files changed, 968 insertions, 320 deletions
diff --git a/src/gallium/docs/source/conf.py b/src/gallium/docs/source/conf.py
index 9b0c86babdb..59c19ed98dd 100644
--- a/src/gallium/docs/source/conf.py
+++ b/src/gallium/docs/source/conf.py
@@ -16,13 +16,13 @@ import sys, os
# If extensions (or modules to document with autodoc) are in another directory,
# add these directories to sys.path here. If the directory is relative to the
# documentation root, use os.path.abspath to make it absolute, like shown here.
-#sys.path.append(os.path.abspath('.'))
+sys.path.append(os.path.abspath('exts'))
# -- General configuration -----------------------------------------------------
# Add any Sphinx extension module names here, as strings. They can be extensions
# coming with Sphinx (named 'sphinx.ext.*') or your custom ones.
-extensions = ['sphinx.ext.pngmath']
+extensions = ['sphinx.ext.pngmath', 'tgsi']
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
diff --git a/src/gallium/docs/source/context.rst b/src/gallium/docs/source/context.rst
index 21f5f9111a0..a7669575b95 100644
--- a/src/gallium/docs/source/context.rst
+++ b/src/gallium/docs/source/context.rst
@@ -33,7 +33,11 @@ This state describes how resources in various flavours (textures,
buffers, surfaces) are bound to the driver.
-* ``set_constant_buffer``
+* ``set_constant_buffer`` sets a constant buffer to be used for a given shader
+ type. ``index`` indicates which buffer to set (some APIs allow multiple
+ buffers to be set and a specific one to be bound later, though drivers
+ are mostly restricted to the first one right now).
+
* ``set_framebuffer_state``
* ``set_fragment_sampler_textures``
* ``set_vertex_sampler_textures``
@@ -47,11 +51,13 @@ These pieces of state are too small, variable, and/or trivial to have CSO
objects. They all follow simple, one-method binding calls, e.g.
``set_edgeflags``.
-* ``set_edgeflags``
* ``set_blend_color``
* ``set_clip_state``
* ``set_polygon_stipple``
-* ``set_scissor_state``
+* ``set_scissor_state`` sets the bounds for the scissor test, which culls
+ pixels before blending to render targets. If the :ref:`Rasterizer` does
+ not have the scissor test enabled, then the scissor bounds never need to
+ be set since they will not be used.
* ``set_viewport_state``
* ``set_vertex_elements``
@@ -72,12 +78,67 @@ stencil-only clears of packed depth-stencil buffers.
Drawing
^^^^^^^
-``draw_arrays``
+``draw_arrays`` draws a specified primitive.
+
+This command is equivalent to calling ``draw_arrays_instanced``
+with ``startInstance`` set to 0 and ``instanceCount`` set to 1.
-``draw_elements``
+``draw_elements`` draws a specified primitive using an optional
+index buffer.
+
+This command is equivalent to calling ``draw_elements_instanced``
+with ``startInstance`` set to 0 and ``instanceCount`` set to 1.
``draw_range_elements``
+XXX: this is (probably) a temporary entrypoint, as the range
+information should be available from the vertex_buffer state.
+Using this to quickly evaluate a specialized path in the draw
+module.
+
+``draw_arrays_instanced`` draws multiple instances of the same primitive.
+
+This command is equivalent to calling ``draw_elements_instanced``
+with ``indexBuffer`` set to NULL and ``indexSize`` set to 0.
+
+``draw_elements_instanced`` draws multiple instances of the same primitive
+using an optional index buffer.
+
+For instanceID in the range between ``startInstance``
+and ``startInstance``+``instanceCount``-1, inclusive, draw a primitive
+specified by ``mode`` and sequential numbers in the range between ``start``
+and ``start``+``count``-1, inclusive.
+
+If ``indexBuffer`` is not NULL, it specifies an index buffer with index
+byte size of ``indexSize``. The sequential numbers are used to look up
+the index buffer and the resulting indices in turn are used to fetch
+vertex attributes.
+
+If ``indexBuffer`` is NULL, the sequential numbers are used directly
+as indices to fetch vertex attributes.
+
+If a given vertex element has ``instance_divisor`` set to 0, it contains
+per-vertex data, and the effective vertex attribute address needs to be
+recalculated for every index.
+
+ attribAddr = ``stride`` * index + ``src_offset``
+
+If a given vertex element has a non-zero ``instance_divisor``, it contains
+per-instance data, and the effective vertex attribute address needs to be
+recalculated for every ``instance_divisor``-th instance.
+
+ attribAddr = ``stride`` * instanceID / ``instance_divisor`` + ``src_offset``
+
+In the above formulas, ``src_offset`` is taken from the given vertex element
+and ``stride`` is taken from a vertex buffer associated with the given
+vertex element.
+
+The calculated attribAddr is used as an offset into the vertex buffer to
+fetch the attribute data.
+
+The value of ``instanceID`` can be read in a vertex shader through a system
+value register declared with the INSTANCEID semantic name.
+
Queries
^^^^^^^
@@ -87,9 +148,51 @@ draws. Queries may be nested, though no state tracker currently
exercises this.
Queries can be created with ``create_query`` and deleted with
-``destroy_query``. To enable a query, use ``begin_query``, and when finished,
-use ``end_query`` to stop the query. Finally, ``get_query_result`` is used
-to retrieve the results.
+``destroy_query``. To start a query, use ``begin_query``, and when finished,
+use ``end_query`` to end the query.
+
+``get_query_result`` is used to retrieve the results of a query. If
+the ``wait`` parameter is TRUE, then the ``get_query_result`` call
+will block until the results of the query are ready (and TRUE will be
+returned). Otherwise, if the ``wait`` parameter is FALSE, the call
+will not block and the return value will be TRUE if the query has
+completed or FALSE otherwise.
+
+A common type of query is the occlusion query which counts the number of
+fragments/pixels which are written to the framebuffer (and not culled by
+Z/stencil/alpha testing or shader KILL instructions).
+
+
+Conditional Rendering
+^^^^^^^^^^^^^^^^^^^^^
+
+A drawing command can be skipped depending on the outcome of a query
+(typically an occlusion query). The ``render_condition`` function specifies
+the query which should be checked prior to rendering anything.
+
+If ``render_condition`` is called with ``query`` = NULL, conditional
+rendering is disabled and drawing takes place normally.
+
+If ``render_condition`` is called with a non-null ``query``, subsequent
+drawing commands will be predicated on the outcome of the query. If
+the query result is zero, subsequent drawing commands will be skipped.
+
+If ``mode`` is PIPE_RENDER_COND_WAIT the driver will wait for the
+query to complete before deciding whether to render.
+
+If ``mode`` is PIPE_RENDER_COND_NO_WAIT and the query has not yet
+completed, the drawing command will be executed normally. If the query
+has completed, drawing will be predicated on the outcome of the query.
+
+If ``mode`` is PIPE_RENDER_COND_BY_REGION_WAIT or
+PIPE_RENDER_COND_BY_REGION_NO_WAIT, rendering will be predicated as above
+for the non-REGION modes, but if an occlusion query returns a non-zero
+result, regions which were occluded may be omitted by subsequent
+drawing commands. This can result in better performance with some GPUs.
+Normally, if the occlusion query returns a non-zero result, subsequent
+drawing proceeds as usual, so fragments may be generated, shaded and
+processed even where they're known to be obscured.
+
Flushing
^^^^^^^^
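Review note on the ``draw_elements_instanced`` text added to context.rst: the addressing rules can be sketched in a few lines of Python. This is only an illustration of the prose, not driver code. The function names are hypothetical, and the per-instance formula is assumed to use integer division of ``instanceID`` by ``instance_divisor`` before multiplying by ``stride``, which the patch text does not state explicitly.

```python
def attrib_addr(stride, src_offset, index, instance_id, instance_divisor):
    """Byte offset into the vertex buffer for one vertex element."""
    if instance_divisor == 0:
        # Per-vertex data: the address is recalculated for every index.
        return stride * index + src_offset
    # Per-instance data: the address advances once every
    # instance_divisor instances (integer division is an assumption).
    return stride * (instance_id // instance_divisor) + src_offset


def fetch_offsets(start, count, start_instance, instance_count,
                  stride, src_offset, instance_divisor, index_buffer=None):
    """List the attribute offsets one instanced draw would fetch, in order."""
    offsets = []
    for instance_id in range(start_instance, start_instance + instance_count):
        for seq in range(start, start + count):
            # With an index buffer, the sequential number is looked up first;
            # without one, it is used directly as the index.
            index = index_buffer[seq] if index_buffer is not None else seq
            offsets.append(attrib_addr(stride, src_offset, index,
                                       instance_id, instance_divisor))
    return offsets
```

With ``instance_divisor`` set to 0 every instance fetches the same per-vertex offsets. With a non-zero divisor every vertex of an instance shares one offset.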
diff --git a/src/gallium/docs/source/cso/blend.rst b/src/gallium/docs/source/cso/blend.rst
index fd9e4a1e2d5..55c0f328859 100644
--- a/src/gallium/docs/source/cso/blend.rst
+++ b/src/gallium/docs/source/cso/blend.rst
@@ -6,9 +6,50 @@ Blend
This state controls blending of the final fragments into the target rendering
buffers.
-XXX it is unresolved what behavior should result if blend_enable is off.
+Blend Factors
+-------------
+
+The blend factors largely follow the same pattern as their counterparts
+in other modern and legacy drawing APIs.
+
+XXX blurb about dual-source blends
Members
-------
-XXX undocumented members
+independent_blend_enable
+ If enabled, blend state is different for each render target, and
+ for each render target set in the respective member of the rt array.
+ If disabled, blend state is the same for all render targets, and only
+ the first member of the rt array contains valid data.
+logicop_enable
+ Enables logic ops. Cannot be enabled at the same time as blending, and
+ is always the same for all render targets.
+logicop_func
+ The logic operation to use if logic ops are enabled. One of PIPE_LOGICOP.
+dither
+ Whether dithering is enabled.
+rt
+ Contains the per-rendertarget blend state.
+
+Per-rendertarget Members
+------------------------
+
+blend_enable
+ If blending is enabled, perform a blend calculation according to blend
+ functions and source/destination factors. Otherwise, the incoming fragment
+ color gets passed unmodified (but colormask still applies).
+rgb_func
+ The blend function to use for rgb channels. One of PIPE_BLEND.
+rgb_src_factor
+ The blend source factor to use for rgb channels. One of PIPE_BLENDFACTOR.
+rgb_dst_factor
+ The blend destination factor to use for rgb channels. One of PIPE_BLENDFACTOR.
+alpha_func
+ The blend function to use for the alpha channel. One of PIPE_BLEND.
+alpha_src_factor
+ The blend source factor to use for the alpha channel. One of PIPE_BLENDFACTOR.
+alpha_dst_factor
+ The blend destination factor to use for the alpha channel. One of PIPE_BLENDFACTOR.
+colormask
+ Bitmask of which channels to write. Combination of PIPE_MASK bits.
diff --git a/src/gallium/docs/source/cso/rasterizer.rst b/src/gallium/docs/source/cso/rasterizer.rst
index 00d65fc598a..24cc78c68de 100644
--- a/src/gallium/docs/source/cso/rasterizer.rst
+++ b/src/gallium/docs/source/cso/rasterizer.rst
@@ -7,32 +7,69 @@ The rasterizer state controls the rendering of points, lines and triangles.
Attributes include polygon culling state, line width, line stipple,
multisample state, scissoring and flat/smooth shading.
-
Members
-------
+bypass_vs_clip_and_viewport
+^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Whether the entire TCL pipeline should be bypassed. This implies that
+vertices are pre-transformed for the viewport, and will not be run
+through the vertex shader.
+
+.. note::
+
+ Implementations may still clip away vertices that are not in the viewport
+ when this is set.
+
flatshade
- If set, the provoking vertex of each polygon is used to determine the
- color of the entire polygon. If not set, fragment colors will be
- interpolated between the vertex colors.
- Note that this is separate from the fragment shader input attributes
- CONSTANT, LINEAR and PERSPECTIVE. We need the flatshade state at
+^^^^^^^^^
+
+If set, the provoking vertex of each polygon is used to determine the color
+of the entire polygon. If not set, fragment colors will be interpolated
+between the vertex colors.
+
+The actual interpolated shading algorithm is obviously
+implementation-dependent, but will usually be Gouraud for most hardware.
+
+.. note::
+
+ This is separate from the fragment shader input attributes
+ CONSTANT, LINEAR and PERSPECTIVE. The flatshade state is needed at
clipping time to determine how to set the color of new vertices.
- Also note that the draw module can implement flat shading by copying
- the provoking vertex color to all the other vertices in the primitive.
+
+ :ref:`Draw` can implement flat shading by copying the provoking vertex
+ color to all the other vertices in the primitive.
flatshade_first
- Whether the first vertex should be the provoking vertex, for most
- primitives. If not set, the last vertex is the provoking vertex.
+^^^^^^^^^^^^^^^
+
+Whether the first vertex should be the provoking vertex, for most primitives.
+If not set, the last vertex is the provoking vertex.
+
+There are several important exceptions to the specification of this rule.
+
+* ``PIPE_PRIMITIVE_POLYGON``: The provoking vertex is always the first
+ vertex. If the caller wishes to change the provoking vertex, they merely
+ need to rotate the vertices themselves.
+* ``PIPE_PRIMITIVE_QUAD``, ``PIPE_PRIMITIVE_QUAD_STRIP``: This option has no
+ effect; the provoking vertex is always the last vertex.
+* ``PIPE_PRIMITIVE_TRIANGLE_FAN``: When set, the provoking vertex is the
+ second vertex, not the first. This permits each segment of the fan to have
+ a different color.
+
+Other Members
+^^^^^^^^^^^^^
light_twoside
- If set, there are per-vertex back-facing colors. The draw module
+ If set, there are per-vertex back-facing colors. :ref:`Draw`
uses this state along with the front/back information to set the
final vertex colors prior to rasterization.
front_winding
Indicates the window order of front-facing polygons, either
PIPE_WINDING_CW or PIPE_WINDING_CCW
+
cull_mode
Indicates which polygons to cull, either PIPE_WINDING_NONE (cull no
polygons), PIPE_WINDING_CW (cull clockwise-winding polygons),
@@ -68,7 +105,7 @@ line_stipple_enable
line_stipple_pattern
16-bit bitfield of on/off flags, used to pattern the line stipple.
line_stipple_factor
- When drawinga stippled line, each bit in the stipple pattern is
+ When drawing a stippled line, each bit in the stipple pattern is
repeated N times, where N = line_stipple_factor + 1.
line_last_pixel
Controls whether the last pixel in a line is drawn or not. OpenGL
@@ -98,7 +135,7 @@ sprite_coord_mode
coordinate (0,0,0,1).
For PIPE_SPRITE_COORD_UPPER_LEFT, the upper-left vertex will have
coordinate (0,0,0,1).
- This state is needed by the 'draw' module because that's where each
+ This state is needed by :ref:`Draw` because that's where each
point vertex is converted into four quad vertices. There's no other
place to emit the new vertex texture coordinates which are required for
sprite rendering.
@@ -118,45 +155,9 @@ scissor
Whether the scissor test is enabled.
multisample
- Whether :ref:`MSAA` is enabled.
-
-bypass_vs_clip_and_viewport
- Whether the entire TCL pipeline should be bypassed. This implies that
- vertices are pre-transformed for the viewport, and will not be run
- through the vertex shader. Note that implementations may still clip away
- vertices that are not in the viewport.
+ Whether :term:`MSAA` is enabled.
gl_rasterization_rules
Whether the rasterizer should use (0.5, 0.5) pixel centers. When not set,
the rasterizer will use (0, 0) for pixel centers.
-
-Notes
------
-
-flatshade
-^^^^^^^^^
-
-The actual interpolated shading algorithm is obviously
-implementation-dependent, but will usually be Gourard for most hardware.
-
-bypass_vs_clip_and_viewport
-^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-When set, this implies that vertices are pre-transformed for the viewport, and
-will not be run through the vertex shader. Note that implementations may still
-clip away vertices that are not visible.
-
-flatshade_first
-^^^^^^^^^^^^^^^
-
-There are several important exceptions to the specification of this rule.
-
-* ``PIPE_PRIMITIVE_POLYGON``: The provoking vertex is always the first
- vertex. If the caller wishes to change the provoking vertex, they merely
- need to rotate the vertices themselves.
-* ``PIPE_PRIMITIVE_QUAD``, ``PIPE_PRIMITIVE_QUAD_STRIP``: This option has no
- effect; the provoking vertex is always the last vertex.
-* ``PIPE_PRIMITIVE_TRIANGLE_FAN``: When set, the provoking vertex is the
- second vertex, not the first. This permits each segment of the fan to have
- a different color.
diff --git a/src/gallium/docs/source/cso/sampler.rst b/src/gallium/docs/source/cso/sampler.rst
index e3f1757f57a..044ffffcb4f 100644
--- a/src/gallium/docs/source/cso/sampler.rst
+++ b/src/gallium/docs/source/cso/sampler.rst
@@ -12,8 +12,6 @@ with the traditional (S, T, R, Q) notation.
Members
-------
-XXX undocumented compare_mode, compare_func
-
wrap_s
How to wrap the S coordinate. One of PIPE_TEX_WRAP.
wrap_t
@@ -27,12 +25,16 @@ min_mip_filter
PIPE_TEX_FILTER.
mag_img_filter
The filter to use when magnifying texels. One of PIPE_TEX_FILTER.
+compare_mode
+ If set to PIPE_TEX_COMPARE_R_TO_TEXTURE, texture output is computed
+ according to compare_func, using r coord and the texture value as operands.
+ If set to PIPE_TEX_COMPARE_NONE, no comparison calculation is performed.
+compare_func
+ How the comparison is computed. One of PIPE_FUNC.
normalized_coords
Whether the texture coordinates are normalized. If normalized, they will
always be in [0, 1]. If not, they will be in the range of each dimension
of the loaded texture.
-prefilter
- XXX From the Doxy, "weird sampling state exposed by some APIs." Refine.
lod_bias
The bias to apply to the level of detail.
min_lod
diff --git a/src/gallium/docs/source/distro.rst b/src/gallium/docs/source/distro.rst
index 33e846e33d2..100afe33972 100644
--- a/src/gallium/docs/source/distro.rst
+++ b/src/gallium/docs/source/distro.rst
@@ -31,21 +31,6 @@ Wrapper driver.
LLVM Softpipe
^^^^^^^^^^^^^
-nVidia nv04
-^^^^^^^^^^^
-
-Deprecated.
-
-nVidia nv10
-^^^^^^^^^^^
-
-Deprecated.
-
-nVidia nv20
-^^^^^^^^^^^
-
-Deprecated.
-
nVidia nv30
^^^^^^^^^^^
@@ -61,10 +46,7 @@ VMWare SVGA
ATI r300
^^^^^^^^
-AMD/ATI r600
-^^^^^^^^^^^^
-
-Highly experimental.
+Testing-quality.
Softpipe
^^^^^^^^
@@ -106,20 +88,50 @@ Xorg XFree86 DDX
Auxiliary
---------
+OS
+^^
+
+The OS module contains the abstractions for basic operating system services:
+
+* memory allocation
+* simple message logging
+* obtaining run-time configuration options
+* threading primitives
+
+This is the bare minimum required to port Gallium to a new platform.
+
+The OS module already provides the implementations of these abstractions for
+the most common platforms. When targeting an embedded platform, no
+implementation will be provided -- these must be provided separately.
+
CSO Cache
^^^^^^^^^
+The CSO cache is used to accelerate preparation of state by saving
+driver-specific state structures for later use.
+
+.. _draw:
+
Draw
^^^^
+Draw is a software :term:`TCL` pipeline for hardware that lacks vertex shaders
+or other essential parts of pre-rasterization vertex preparation.
+
Gallivm
^^^^^^^
Indices
^^^^^^^
-Pipe Buffer Manager
-^^^^^^^^^^^^^^^^^^^
+Indices provides tools for translating or generating element indices for
+use with element-based rendering.
+
+Pipe Buffer Managers
+^^^^^^^^^^^^^^^^^^^^
+
+Each of these managers provides various services to drivers that are not
+fully utilizing a memory manager.
Remote Debugger
^^^^^^^^^^^^^^^
@@ -127,12 +139,12 @@ Remote Debugger
Runtime Assembly Emission
^^^^^^^^^^^^^^^^^^^^^^^^^
-Surface Context Tracker
-^^^^^^^^^^^^^^^^^^^^^^^
-
TGSI
^^^^
+The TGSI auxiliary module provides basic utilities for manipulating TGSI
+streams.
+
Translate
^^^^^^^^^
diff --git a/src/gallium/docs/source/exts/tgsi.py b/src/gallium/docs/source/exts/tgsi.py
new file mode 100644
index 00000000000..e92cd5c4d1b
--- /dev/null
+++ b/src/gallium/docs/source/exts/tgsi.py
@@ -0,0 +1,17 @@
+# tgsi.py
+# Sphinx extension providing formatting for TGSI opcodes
+# (c) Corbin Simpson 2010
+
+import docutils.nodes
+import sphinx.addnodes
+
+def parse_opcode(env, sig, signode):
+ opcode, desc = sig.split("-", 1)
+ opcode = opcode.strip().upper()
+ desc = " (%s)" % desc.strip()
+ signode += sphinx.addnodes.desc_name(opcode, opcode)
+ signode += sphinx.addnodes.desc_annotation(desc, desc)
+ return opcode
+
+def setup(app):
+ app.add_description_unit("opcode", "opcode", "%s (TGSI opcode)", parse_opcode)
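Review note on exts/tgsi.py: the signature handling in ``parse_opcode`` can be tried without Sphinx installed. The sketch below repeats only the string manipulation from the patch. The helper name ``split_opcode_signature`` is hypothetical and is not part of the extension.

```python
def split_opcode_signature(sig):
    """Split 'NAME - Description' the same way parse_opcode does."""
    # Split on the first hyphen only, so hyphens in the description survive.
    opcode, desc = sig.split("-", 1)
    opcode = opcode.strip().upper()
    desc = " (%s)" % desc.strip()
    return opcode, desc


if __name__ == "__main__":
    print(split_opcode_signature("RCP - Reciprocal"))
    print(split_opcode_signature("dp3 - 3-component Dot Product"))
```

Note that a signature with no hyphen makes the tuple unpacking raise ``ValueError``, so every ``.. opcode::`` directive needs the ``NAME - Description`` form.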
diff --git a/src/gallium/docs/source/glossary.rst b/src/gallium/docs/source/glossary.rst
index 6a9110ce786..0696cb5d277 100644
--- a/src/gallium/docs/source/glossary.rst
+++ b/src/gallium/docs/source/glossary.rst
@@ -8,3 +8,16 @@ Glossary
Multi-Sampled Anti-Aliasing. A basic anti-aliasing technique that takes
multiple samples of the depth buffer, and uses this information to
smooth the edges of polygons.
+
+ TCL
+ Transform, Clipping, & Lighting. The three stages of preparation in a
+ rasterizing pipeline prior to the actual rasterization of vertices into
+ fragments.
+
+ NPOT
+ Non-power-of-two. Usually applied to textures which have at least one
+ dimension which is not a power of two.
+
+ LOD
+ Level of Detail. Also spelled "LoD." The value that determines when the
+ switches between mipmaps occur during texture sampling.
diff --git a/src/gallium/docs/source/screen.rst b/src/gallium/docs/source/screen.rst
index 9631e6967ef..27f65522b69 100644
--- a/src/gallium/docs/source/screen.rst
+++ b/src/gallium/docs/source/screen.rst
@@ -3,6 +3,160 @@ Screen
A screen is an object representing the context-independent part of a device.
+Useful Flags
+------------
+
+.. _pipe_cap:
+
+PIPE_CAP
+^^^^^^^^
+
+Pipe capabilities help expose hardware functionality not explicitly required
+by Gallium. For floating-point values, use :ref:`get_paramf`, and for boolean
+or integer values, use :ref:`get_param`.
+
+The integer capabilities:
+
+* ``MAX_TEXTURE_IMAGE_UNITS``: The maximum number of samplers available.
+* ``NPOT_TEXTURES``: Whether :term:`NPOT` textures may have repeat modes,
+ normalized coordinates, and mipmaps.
+* ``TWO_SIDED_STENCIL``: Whether the stencil test can also affect back-facing
+ polygons.
+* ``GLSL``: Deprecated.
+* ``DUAL_SOURCE_BLEND``: Whether dual-source blend factors are supported. See
+ :ref:`Blend` for more information.
+* ``ANISOTROPIC_FILTER``: Whether textures can be filtered anisotropically.
+* ``POINT_SPRITE``: Whether point sprites are available.
+* ``MAX_RENDER_TARGETS``: The maximum number of render targets that may be
+ bound.
+* ``OCCLUSION_QUERY``: Whether occlusion queries are available.
+* ``TEXTURE_SHADOW_MAP``: XXX
+* ``MAX_TEXTURE_2D_LEVELS``: The maximum number of mipmap levels available
+ for a 2D texture.
+* ``MAX_TEXTURE_3D_LEVELS``: The maximum number of mipmap levels available
+ for a 3D texture.
+* ``MAX_TEXTURE_CUBE_LEVELS``: The maximum number of mipmap levels available
+ for a cubemap.
+* ``TEXTURE_MIRROR_CLAMP``: Whether mirrored texture coordinates with clamp
+ are supported.
+* ``TEXTURE_MIRROR_REPEAT``: Whether mirrored repeating texture coordinates
+ are supported.
+* ``MAX_VERTEX_TEXTURE_UNITS``: The maximum number of samplers addressable
+ inside the vertex shader. If this is 0, then the vertex shader cannot
+ sample textures.
+* ``TGSI_CONT_SUPPORTED``: Whether the TGSI CONT opcode is supported.
+* ``BLEND_EQUATION_SEPARATE``: Whether alpha blend equations may be different
+ from color blend equations, in :ref:`Blend` state.
+* ``SM3``: Whether the vertex shader and fragment shader support equivalent
+ opcodes to the Shader Model 3 specification. XXX oh god this is horrible
+* ``MAX_PREDICATE_REGISTERS``: XXX
+* ``MAX_COMBINED_SAMPLERS``: The total number of samplers accessible from
+ the vertex and fragment shader, inclusive.
+* ``MAX_CONST_BUFFERS``: Maximum number of constant buffers that can be bound
+ to any shader stage using ``set_constant_buffer``. If 0 or 1, the pipe will
+ only permit binding one constant buffer per shader, and the shaders will
+ not permit two-dimensional access to constants.
+* ``MAX_CONST_BUFFER_SIZE``: Maximum byte size of a single constant buffer.
+* ``INDEP_BLEND_ENABLE``: Whether per-rendertarget blend enabling and channel
+ masks are supported. If 0, then the first rendertarget's blend mask is
+ replicated across all MRTs.
+* ``INDEP_BLEND_FUNC``: Whether per-rendertarget blend functions are
+ available. If 0, then the first rendertarget's blend functions affect all
+ MRTs.
+* ``PIPE_CAP_TGSI_FS_COORD_ORIGIN_UPPER_LEFT``: Whether the TGSI property
+ FS_COORD_ORIGIN with value UPPER_LEFT is supported.
+* ``PIPE_CAP_TGSI_FS_COORD_ORIGIN_LOWER_LEFT``: Whether the TGSI property
+ FS_COORD_ORIGIN with value LOWER_LEFT is supported.
+* ``PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_HALF_INTEGER``: Whether the TGSI
+ property FS_COORD_PIXEL_CENTER with value HALF_INTEGER is supported.
+* ``PIPE_CAP_TGSI_FS_COORD_PIXEL_CENTER_INTEGER``: Whether the TGSI
+ property FS_COORD_PIXEL_CENTER with value INTEGER is supported.
+
+The floating-point capabilities:
+
+* ``MAX_LINE_WIDTH``: The maximum width of a regular line.
+* ``MAX_LINE_WIDTH_AA``: The maximum width of a smoothed line.
+* ``MAX_POINT_WIDTH``: The maximum width and height of a point.
+* ``MAX_POINT_WIDTH_AA``: The maximum width and height of a smoothed point.
+* ``MAX_TEXTURE_ANISOTROPY``: The maximum level of anisotropy that can be
+ applied to anisotropically filtered textures.
+* ``MAX_TEXTURE_LOD_BIAS``: The maximum :term:`LOD` bias that may be applied
+ to filtered textures.
+* ``GUARD_BAND_LEFT``, ``GUARD_BAND_TOP``, ``GUARD_BAND_RIGHT``,
+ ``GUARD_BAND_BOTTOM``: XXX
+
+XXX Is there a better home for this? vvv
+
+If 0 is returned, the driver is not aware of multiple constant buffers,
+supports binding of only one constant buffer, and does not support
+two-dimensional CONST register file access in TGSI shaders.
+
+If a value greater than 0 is returned, the driver can have multiple
+constant buffers bound to shader stages. The CONST register file can
+be accessed with two-dimensional indices, like in the example below.
+
+DCL CONST[0][0..7] # declare first 8 vectors of constbuf 0
+DCL CONST[3][0] # declare first vector of constbuf 3
+MOV OUT[0], CONST[0][3] # copy vector 3 of constbuf 0
+
+For backwards compatibility, one-dimensional access to CONST register
+file is still supported. In that case, the constbuf index is assumed
+to be 0.
+
+.. _pipe_buffer_usage:
+
+PIPE_BUFFER_USAGE
+^^^^^^^^^^^^^^^^^
+
+These flags control buffer creation. Buffers may only have one role, so
+care should be taken to not allocate a buffer with the wrong usage.
+
+* ``PIXEL``: This is the flag to use for all textures.
+* ``VERTEX``: A vertex buffer.
+* ``INDEX``: An element buffer.
+* ``CONSTANT``: A buffer of shader constants.
+
+Buffers are inevitably abstracting the pipe's underlying memory management,
+so many of their usage flags can be used to direct the way the buffer is
+handled.
+
+* ``CPU_READ``, ``CPU_WRITE``: Whether the user will map and, in the case of
+ the latter, write to, the buffer. The convenience flag ``CPU_READ_WRITE`` is
+ available to signify a read/write buffer.
+* ``GPU_READ``, ``GPU_WRITE``: Whether the driver will internally need to
+ read from or write to the buffer. The latter will only happen if the buffer
+ is made into a render target.
+* ``DISCARD``: When set on a map, the contents of the map will be discarded
+ beforehand. Cannot be used with ``CPU_READ``.
+* ``DONTBLOCK``: When set on a map, the map will fail if the buffer cannot be
+ mapped immediately.
+* ``UNSYNCHRONIZED``: When set on a map, any outstanding operations on the
+ buffer will be ignored. The interaction of any writes to the map and any
+ operations pending with the buffer are undefined. Cannot be used with
+ ``CPU_READ``.
+* ``FLUSH_EXPLICIT``: When set on a map, written ranges of the map require
+ explicit flushes using :ref:`buffer_flush_mapped_range`. Requires
+ ``CPU_WRITE``.
+
+.. _pipe_texture_usage:
+
+PIPE_TEXTURE_USAGE
+^^^^^^^^^^^^^^^^^^
+
+These flags determine the possible roles a texture may be used for during its
+lifetime. Texture usage flags are cumulative and may be combined to create a
+texture that can be used as multiple things.
+
+* ``RENDER_TARGET``: A colorbuffer or pixelbuffer.
+* ``DISPLAY_TARGET``: A sharable buffer that can be given to another process.
+* ``PRIMARY``: A frontbuffer or scanout buffer.
+* ``DEPTH_STENCIL``: A depthbuffer, stencilbuffer, or Z buffer. Gallium does
+ not explicitly provide for stencil-only buffers, so any stencilbuffer
+ validated here is implicitly also a depthbuffer.
+* ``SAMPLER``: A texture that may be sampled from in a fragment or vertex
+ shader.
+* ``DYNAMIC``: A texture that will be mapped frequently.
+
Methods
-------
@@ -18,22 +172,96 @@ get_vendor
Returns the screen vendor.
+.. _get_param:
+
get_param
^^^^^^^^^
Get an integer/boolean screen parameter.
+**param** is one of the :ref:`PIPE_CAP` names.
+
+.. _get_paramf:
+
get_paramf
^^^^^^^^^^
Get a floating-point screen parameter.
+**param** is one of the :ref:`PIPE_CAP` names.
+
+context_create
+^^^^^^^^^^^^^^
+
+Create a pipe_context.
+
+**priv** is private data of the caller, which may be put to various
+unspecified uses, typically to do with implementing swapbuffers
+and/or front-buffer rendering.
+
is_format_supported
^^^^^^^^^^^^^^^^^^^
See if a format can be used in a specific manner.
+**usage** is a bitmask of :ref:`PIPE_TEXTURE_USAGE` flags.
+
+Returns TRUE if all usages can be satisfied.
+
+.. note::
+
+ ``PIPE_TEXTURE_USAGE_DYNAMIC`` is not a valid usage.
+
+.. _texture_create:
+
texture_create
^^^^^^^^^^^^^^
-Given a template of texture setup, create a BO-backed texture.
+Given a template of texture setup, create a buffer and texture.
+
+texture_blanket
+^^^^^^^^^^^^^^^
+
+Like :ref:`texture_create`, but use a supplied buffer instead of creating a
+new one.
+
+texture_destroy
+^^^^^^^^^^^^^^^
+
+Destroy a texture. The buffer backing the texture is destroyed if it has no
+more references.
+
+buffer_map
+^^^^^^^^^^
+
+Map a buffer into memory.
+
+**usage** is a bitmask of :ref:`PIPE_BUFFER_USAGE` flags.
+
+Returns a pointer to the map, or NULL if the mapping failed.
+
+buffer_map_range
+^^^^^^^^^^^^^^^^
+
+Map a range of a buffer into memory.
+
+The returned map is always relative to the beginning of the buffer, not the
+beginning of the mapped range.
+
+.. _buffer_flush_mapped_range:
+
+buffer_flush_mapped_range
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Flush a range of mapped memory into a buffer.
+
+The buffer must have been mapped with ``PIPE_BUFFER_USAGE_FLUSH_EXPLICIT``.
+
+**usage** is a bitmask of :ref:`PIPE_BUFFER_USAGE` flags.
+
+buffer_unmap
+^^^^^^^^^^^^
+
+Unmap a buffer from memory.
+
+Any pointers into the map should be considered invalid and discarded.
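Review note on the PIPE_BUFFER_USAGE text added to screen.rst: the map-time constraints it lists (``DISCARD`` and ``UNSYNCHRONIZED`` cannot be combined with ``CPU_READ``, and ``FLUSH_EXPLICIT`` requires ``CPU_WRITE``) can be written down as a small checker. This is a sketch under stated assumptions: the bit values are placeholders rather than the real ``PIPE_BUFFER_USAGE_*`` constants, and ``map_usage_errors`` is a hypothetical helper, not a Gallium function.

```python
from enum import IntFlag


class BufferUsage(IntFlag):
    # Placeholder bit values, NOT the real PIPE_BUFFER_USAGE_* constants.
    CPU_READ = 1
    CPU_WRITE = 2
    DISCARD = 4
    DONTBLOCK = 8
    UNSYNCHRONIZED = 16
    FLUSH_EXPLICIT = 32


def map_usage_errors(usage):
    """Return the documented flag constraints that a map usage violates."""
    errors = []
    if usage & BufferUsage.DISCARD and usage & BufferUsage.CPU_READ:
        errors.append("DISCARD cannot be used with CPU_READ")
    if usage & BufferUsage.UNSYNCHRONIZED and usage & BufferUsage.CPU_READ:
        errors.append("UNSYNCHRONIZED cannot be used with CPU_READ")
    if usage & BufferUsage.FLUSH_EXPLICIT and not usage & BufferUsage.CPU_WRITE:
        errors.append("FLUSH_EXPLICIT requires CPU_WRITE")
    return errors
```

An empty list means the combination is consistent with the rules as written in the patch. It says nothing about whether a particular driver accepts it.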
diff --git a/src/gallium/docs/source/tgsi.rst b/src/gallium/docs/source/tgsi.rst
index ef068448e83..c292cd37d5c 100644
--- a/src/gallium/docs/source/tgsi.rst
+++ b/src/gallium/docs/source/tgsi.rst
@@ -6,6 +6,23 @@ for describing shaders. Since Gallium is inherently shaderful, shaders are
an important part of the API. TGSI is the only intermediate representation
used by all drivers.
+Basics
+------
+
+All TGSI instructions, known as *opcodes*, operate on arbitrary-precision
+floating-point four-component vectors. An opcode may have up to one
+destination register, known as *dst*, and between zero and three source
+registers, called *src0* through *src2*, or simply *src* if there is only
+one.
+
+Some instructions, like :opcode:`I2F`, permit re-interpretation of vector
+components as integers. Other instructions permit using registers as
+two-component vectors with double precision; see :ref:`Double Opcodes`.
+
+When an instruction has a scalar result, the result is usually copied into
+each of the components of *dst*. When this happens, the result is said to be
+*replicated* to *dst*. :opcode:`RCP` is one such instruction.
+
Instruction Set
---------------
@@ -13,7 +30,7 @@ From GL_NV_vertex_program
^^^^^^^^^^^^^^^^^^^^^^^^^
-ARL - Address Register Load
+.. opcode:: ARL - Address Register Load
.. math::
@@ -26,7 +43,7 @@ ARL - Address Register Load
dst.w = \lfloor src.w\rfloor
-MOV - Move
+.. opcode:: MOV - Move
.. math::
@@ -39,7 +56,7 @@ MOV - Move
dst.w = src.w
-LIT - Light Coefficients
+.. opcode:: LIT - Light Coefficients
.. math::
@@ -52,33 +69,25 @@ LIT - Light Coefficients
dst.w = 1
-RCP - Reciprocal
-
-.. math::
+.. opcode:: RCP - Reciprocal
- dst.x = \frac{1}{src.x}
+This instruction replicates its result.
- dst.y = \frac{1}{src.x}
+.. math::
- dst.z = \frac{1}{src.x}
+ dst = \frac{1}{src.x}
- dst.w = \frac{1}{src.x}
+.. opcode:: RSQ - Reciprocal Square Root
-RSQ - Reciprocal Square Root
+This instruction replicates its result.
.. math::
- dst.x = \frac{1}{\sqrt{|src.x|}}
-
- dst.y = \frac{1}{\sqrt{|src.x|}}
-
- dst.z = \frac{1}{\sqrt{|src.x|}}
+ dst = \frac{1}{\sqrt{|src.x|}}
- dst.w = \frac{1}{\sqrt{|src.x|}}
-
-EXP - Approximate Exponential Base 2
+.. opcode:: EXP - Approximate Exponential Base 2
.. math::
@@ -91,7 +100,7 @@ EXP - Approximate Exponential Base 2
dst.w = 1
-LOG - Approximate Logarithm Base 2
+.. opcode:: LOG - Approximate Logarithm Base 2
.. math::
@@ -104,7 +113,7 @@ LOG - Approximate Logarithm Base 2
dst.w = 1
-MUL - Multiply
+.. opcode:: MUL - Multiply
.. math::
@@ -117,7 +126,7 @@ MUL - Multiply
dst.w = src0.w \times src1.w
-ADD - Add
+.. opcode:: ADD - Add
.. math::
@@ -130,33 +139,25 @@ ADD - Add
dst.w = src0.w + src1.w
-DP3 - 3-component Dot Product
-
-.. math::
+.. opcode:: DP3 - 3-component Dot Product
- dst.x = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
+This instruction replicates its result.
- dst.y = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
+.. math::
- dst.z = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
+ dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
- dst.w = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z
+.. opcode:: DP4 - 4-component Dot Product
-DP4 - 4-component Dot Product
+This instruction replicates its result.
.. math::
- dst.x = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
-
- dst.y = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
-
- dst.z = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
+ dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
- dst.w = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src0.w \times src1.w
-
-DST - Distance Vector
+.. opcode:: DST - Distance Vector
.. math::
@@ -169,7 +170,7 @@ DST - Distance Vector
dst.w = src1.w
-MIN - Minimum
+.. opcode:: MIN - Minimum
.. math::
@@ -182,7 +183,7 @@ MIN - Minimum
dst.w = min(src0.w, src1.w)
-MAX - Maximum
+.. opcode:: MAX - Maximum
.. math::
@@ -195,7 +196,7 @@ MAX - Maximum
dst.w = max(src0.w, src1.w)
-SLT - Set On Less Than
+.. opcode:: SLT - Set On Less Than
.. math::
@@ -208,7 +209,7 @@ SLT - Set On Less Than
dst.w = (src0.w < src1.w) ? 1 : 0
-SGE - Set On Greater Equal Than
+.. opcode:: SGE - Set On Greater Equal Than
.. math::
@@ -221,7 +222,7 @@ SGE - Set On Greater Equal Than
dst.w = (src0.w >= src1.w) ? 1 : 0
-MAD - Multiply And Add
+.. opcode:: MAD - Multiply And Add
.. math::
@@ -234,7 +235,7 @@ MAD - Multiply And Add
dst.w = src0.w \times src1.w + src2.w
-SUB - Subtract
+.. opcode:: SUB - Subtract
.. math::
@@ -247,7 +248,7 @@ SUB - Subtract
dst.w = src0.w - src1.w
-LRP - Linear Interpolate
+.. opcode:: LRP - Linear Interpolate
.. math::
@@ -260,7 +261,7 @@ LRP - Linear Interpolate
dst.w = src0.w \times src1.w + (1 - src0.w) \times src2.w
-CND - Condition
+.. opcode:: CND - Condition
.. math::
@@ -273,7 +274,7 @@ CND - Condition
dst.w = (src2.w > 0.5) ? src0.w : src1.w
-DP2A - 2-component Dot Product And Add
+.. opcode:: DP2A - 2-component Dot Product And Add
.. math::
@@ -286,7 +287,7 @@ DP2A - 2-component Dot Product And Add
dst.w = src0.x \times src1.x + src0.y \times src1.y + src2.x
-FRAC - Fraction
+.. opcode:: FRAC - Fraction
.. math::
@@ -299,7 +300,7 @@ FRAC - Fraction
dst.w = src.w - \lfloor src.w\rfloor
-CLAMP - Clamp
+.. opcode:: CLAMP - Clamp
.. math::
@@ -312,9 +313,9 @@ CLAMP - Clamp
dst.w = clamp(src0.w, src1.w, src2.w)
-FLR - Floor
+.. opcode:: FLR - Floor
-This is identical to ARL.
+This is identical to :opcode:`ARL`.
.. math::
@@ -327,7 +328,7 @@ This is identical to ARL.
dst.w = \lfloor src.w\rfloor
-ROUND - Round
+.. opcode:: ROUND - Round
.. math::
@@ -340,45 +341,33 @@ ROUND - Round
dst.w = round(src.w)
-EX2 - Exponential Base 2
-
-.. math::
+.. opcode:: EX2 - Exponential Base 2
- dst.x = 2^{src.x}
+This instruction replicates its result.
- dst.y = 2^{src.x}
+.. math::
- dst.z = 2^{src.x}
+ dst = 2^{src.x}
- dst.w = 2^{src.x}
+.. opcode:: LG2 - Logarithm Base 2
-LG2 - Logarithm Base 2
+This instruction replicates its result.
.. math::
- dst.x = \log_2{src.x}
-
- dst.y = \log_2{src.x}
+ dst = \log_2{src.x}
- dst.z = \log_2{src.x}
- dst.w = \log_2{src.x}
+.. opcode:: POW - Power
-
-POW - Power
+This instruction replicates its result.
.. math::
- dst.x = src0.x^{src1.x}
-
- dst.y = src0.x^{src1.x}
-
- dst.z = src0.x^{src1.x}
+ dst = src0.x^{src1.x}
- dst.w = src0.x^{src1.x}
-
-XPD - Cross Product
+.. opcode:: XPD - Cross Product
.. math::
@@ -391,7 +380,7 @@ XPD - Cross Product
dst.w = 1
-ABS - Absolute
+.. opcode:: ABS - Absolute
.. math::
@@ -404,48 +393,36 @@ ABS - Absolute
dst.w = |src.w|
-RCC - Reciprocal Clamped
+.. opcode:: RCC - Reciprocal Clamped
+
+This instruction replicates its result.
XXX cleanup on aisle three
.. math::
- dst.x = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
-
- dst.y = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
+ dst = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
- dst.z = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
- dst.w = (1 / src.x) > 0 ? clamp(1 / src.x, 5.42101e-020, 1.884467e+019) : clamp(1 / src.x, -1.884467e+019, -5.42101e-020)
+.. opcode:: DPH - Homogeneous Dot Product
-
-DPH - Homogeneous Dot Product
+This instruction replicates its result.
.. math::
- dst.x = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
-
- dst.y = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
-
- dst.z = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
+ dst = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
- dst.w = src0.x \times src1.x + src0.y \times src1.y + src0.z \times src1.z + src1.w
+.. opcode:: COS - Cosine
-COS - Cosine
+This instruction replicates its result.
.. math::
- dst.x = \cos{src.x}
-
- dst.y = \cos{src.x}
-
- dst.z = \cos{src.x}
+ dst = \cos{src.x}
- dst.w = \cos{src.x}
-
-DDX - Derivative Relative To X
+.. opcode:: DDX - Derivative Relative To X
.. math::
@@ -458,7 +435,7 @@ DDX - Derivative Relative To X
dst.w = partialx(src.w)
-DDY - Derivative Relative To Y
+.. opcode:: DDY - Derivative Relative To Y
.. math::
@@ -471,32 +448,32 @@ DDY - Derivative Relative To Y
dst.w = partialy(src.w)
-KILP - Predicated Discard
+.. opcode:: KILP - Predicated Discard
discard
-PK2H - Pack Two 16-bit Floats
+.. opcode:: PK2H - Pack Two 16-bit Floats
TBD
-PK2US - Pack Two Unsigned 16-bit Scalars
+.. opcode:: PK2US - Pack Two Unsigned 16-bit Scalars
TBD
-PK4B - Pack Four Signed 8-bit Scalars
+.. opcode:: PK4B - Pack Four Signed 8-bit Scalars
TBD
-PK4UB - Pack Four Unsigned 8-bit Scalars
+.. opcode:: PK4UB - Pack Four Unsigned 8-bit Scalars
TBD
-RFL - Reflection Vector
+.. opcode:: RFL - Reflection Vector
.. math::
@@ -508,145 +485,171 @@ RFL - Reflection Vector
dst.w = 1
-Considered for removal.
+.. note::
+
+ Considered for removal.
-SEQ - Set On Equal
+.. opcode:: SEQ - Set On Equal
.. math::
dst.x = (src0.x == src1.x) ? 1 : 0
+
dst.y = (src0.y == src1.y) ? 1 : 0
+
dst.z = (src0.z == src1.z) ? 1 : 0
+
dst.w = (src0.w == src1.w) ? 1 : 0
-SFL - Set On False
+.. opcode:: SFL - Set On False
+
+This instruction replicates its result.
.. math::
- dst.x = 0
- dst.y = 0
- dst.z = 0
- dst.w = 0
+ dst = 0
+
+.. note::
+
+ Considered for removal.
-Considered for removal.
-SGT - Set On Greater Than
+.. opcode:: SGT - Set On Greater Than
.. math::
dst.x = (src0.x > src1.x) ? 1 : 0
+
dst.y = (src0.y > src1.y) ? 1 : 0
- dst.z = (src0.z > src1.z) ? 1 : 0
- dst.w = (src0.w > src1.w) ? 1 : 0
+ dst.z = (src0.z > src1.z) ? 1 : 0
-SIN - Sine
+ dst.w = (src0.w > src1.w) ? 1 : 0
-.. math::
- dst.x = \sin{src.x}
+.. opcode:: SIN - Sine
- dst.y = \sin{src.x}
+This instruction replicates its result.
- dst.z = \sin{src.x}
+.. math::
- dst.w = \sin{src.x}
+ dst = \sin{src.x}
-SLE - Set On Less Equal Than
+.. opcode:: SLE - Set On Less Equal Than
.. math::
dst.x = (src0.x <= src1.x) ? 1 : 0
+
dst.y = (src0.y <= src1.y) ? 1 : 0
+
dst.z = (src0.z <= src1.z) ? 1 : 0
+
dst.w = (src0.w <= src1.w) ? 1 : 0
-SNE - Set On Not Equal
+.. opcode:: SNE - Set On Not Equal
.. math::
dst.x = (src0.x != src1.x) ? 1 : 0
+
dst.y = (src0.y != src1.y) ? 1 : 0
+
dst.z = (src0.z != src1.z) ? 1 : 0
+
dst.w = (src0.w != src1.w) ? 1 : 0
-STR - Set On True
+.. opcode:: STR - Set On True
+
+This instruction replicates its result.
.. math::
- dst.x = 1
- dst.y = 1
- dst.z = 1
- dst.w = 1
+ dst = 1
-TEX - Texture Lookup
+.. opcode:: TEX - Texture Lookup
TBD
-TXD - Texture Lookup with Derivatives
+.. opcode:: TXD - Texture Lookup with Derivatives
TBD
-TXP - Projective Texture Lookup
+.. opcode:: TXP - Projective Texture Lookup
TBD
-UP2H - Unpack Two 16-Bit Floats
+.. opcode:: UP2H - Unpack Two 16-Bit Floats
TBD
- Considered for removal.
+.. note::
+
+ Considered for removal.
-UP2US - Unpack Two Unsigned 16-Bit Scalars
+.. opcode:: UP2US - Unpack Two Unsigned 16-Bit Scalars
TBD
- Considered for removal.
+.. note::
+
+ Considered for removal.
-UP4B - Unpack Four Signed 8-Bit Values
+.. opcode:: UP4B - Unpack Four Signed 8-Bit Values
TBD
- Considered for removal.
+.. note::
-UP4UB - Unpack Four Unsigned 8-Bit Scalars
+ Considered for removal.
+
+.. opcode:: UP4UB - Unpack Four Unsigned 8-Bit Scalars
TBD
- Considered for removal.
+.. note::
+
+ Considered for removal.
-X2D - 2D Coordinate Transformation
+.. opcode:: X2D - 2D Coordinate Transformation
.. math::
dst.x = src0.x + src1.x \times src2.x + src1.y \times src2.y
+
dst.y = src0.y + src1.x \times src2.z + src1.y \times src2.w
+
dst.z = src0.x + src1.x \times src2.x + src1.y \times src2.y
+
dst.w = src0.y + src1.x \times src2.z + src1.y \times src2.w
-Considered for removal.
+.. note::
+
+ Considered for removal.
From GL_NV_vertex_program2
^^^^^^^^^^^^^^^^^^^^^^^^^^
-ARA - Address Register Add
+.. opcode:: ARA - Address Register Add
TBD
- Considered for removal.
+.. note::
-ARR - Address Register Load With Round
+ Considered for removal.
+
+.. opcode:: ARR - Address Register Load With Round
.. math::
@@ -659,26 +662,28 @@ ARR - Address Register Load With Round
dst.w = round(src.w)
-BRA - Branch
+.. opcode:: BRA - Branch
pc = target
- Considered for removal.
+.. note::
+
+ Considered for removal.
-CAL - Subroutine Call
+.. opcode:: CAL - Subroutine Call
push(pc)
pc = target
-RET - Subroutine Call Return
+.. opcode:: RET - Subroutine Call Return
pc = pop()
Potential restrictions:
* Only occurs at end of function.
-SSG - Set Sign
+.. opcode:: SSG - Set Sign
.. math::
@@ -691,7 +696,7 @@ SSG - Set Sign
dst.w = (src.w > 0) ? 1 : (src.w < 0) ? -1 : 0
-CMP - Compare
+.. opcode:: CMP - Compare
.. math::
@@ -704,7 +709,7 @@ CMP - Compare
dst.w = (src0.w < 0) ? src1.w : src2.w
-KIL - Conditional Discard
+.. opcode:: KIL - Conditional Discard
.. math::
@@ -713,7 +718,7 @@ KIL - Conditional Discard
endif
-SCS - Sine Cosine
+.. opcode:: SCS - Sine Cosine
.. math::
@@ -726,12 +731,12 @@ SCS - Sine Cosine
dst.y = 1
-TXB - Texture Lookup With Bias
+.. opcode:: TXB - Texture Lookup With Bias
TBD
-NRM - 3-component Vector Normalise
+.. opcode:: NRM - 3-component Vector Normalise
.. math::
@@ -744,7 +749,7 @@ NRM - 3-component Vector Normalise
dst.w = 1
-DIV - Divide
+.. opcode:: DIV - Divide
.. math::
@@ -757,35 +762,31 @@ DIV - Divide
dst.w = \frac{src0.w}{src1.w}
-DP2 - 2-component Dot Product
+.. opcode:: DP2 - 2-component Dot Product
-.. math::
+This instruction replicates its result.
- dst.x = src0.x \times src1.x + src0.y \times src1.y
-
- dst.y = src0.x \times src1.x + src0.y \times src1.y
-
- dst.z = src0.x \times src1.x + src0.y \times src1.y
+.. math::
- dst.w = src0.x \times src1.x + src0.y \times src1.y
+ dst = src0.x \times src1.x + src0.y \times src1.y
-TXL - Texture Lookup With LOD
+.. opcode:: TXL - Texture Lookup With LOD
TBD
-BRK - Break
+.. opcode:: BRK - Break
TBD
-IF - If
+.. opcode:: IF - If
TBD
-BGNFOR - Begin a For-Loop
+.. opcode:: BGNFOR - Begin a For-Loop
dst.x = floor(src.x)
dst.y = floor(src.y)
@@ -798,25 +799,31 @@ BGNFOR - Begin a For-Loop
Note: The destination must be a loop register.
The source must be a constant register.
- Considered for cleanup / removal.
+.. note::
+
+ Considered for cleanup.
+
+.. note::
+
+ Considered for removal.
-REP - Repeat
+.. opcode:: REP - Repeat
TBD
-ELSE - Else
+.. opcode:: ELSE - Else
TBD
-ENDIF - End If
+.. opcode:: ENDIF - End If
TBD
-ENDFOR - End a For-Loop
+.. opcode:: ENDFOR - End a For-Loop
dst.x = dst.x + dst.z
dst.y = dst.y - 1.0
@@ -827,30 +834,48 @@ ENDFOR - End a For-Loop
Note: The destination must be a loop register.
- Considered for cleanup / removal.
+.. note::
-ENDREP - End Repeat
+ Considered for cleanup.
+
+.. note::
+
+ Considered for removal.
+
+.. opcode:: ENDREP - End Repeat
TBD
-PUSHA - Push Address Register On Stack
+.. opcode:: PUSHA - Push Address Register On Stack
push(src.x)
push(src.y)
push(src.z)
push(src.w)
- Considered for cleanup / removal.
+.. note::
+
+ Considered for cleanup.
+
+.. note::
+
+ Considered for removal.
-POPA - Pop Address Register From Stack
+.. opcode:: POPA - Pop Address Register From Stack
dst.w = pop()
dst.z = pop()
dst.y = pop()
dst.x = pop()
- Considered for cleanup / removal.
+.. note::
+
+ Considered for cleanup.
+
+.. note::
+
+ Considered for removal.
From GL_NV_gpu_program4
@@ -858,7 +883,7 @@ From GL_NV_gpu_program4
Support for these opcodes indicated by a special pipe capability bit (TBD).
-CEIL - Ceiling
+.. opcode:: CEIL - Ceiling
.. math::
@@ -871,7 +896,7 @@ CEIL - Ceiling
dst.w = \lceil src.w\rceil
-I2F - Integer To Float
+.. opcode:: I2F - Integer To Float
.. math::
@@ -884,7 +909,7 @@ I2F - Integer To Float
dst.w = (float) src.w
-NOT - Bitwise Not
+.. opcode:: NOT - Bitwise Not
.. math::
@@ -897,7 +922,7 @@ NOT - Bitwise Not
dst.w = ~src.w
-TRUNC - Truncate
+.. opcode:: TRUNC - Truncate
.. math::
@@ -910,7 +935,7 @@ TRUNC - Truncate
dst.w = trunc(src.w)
-SHL - Shift Left
+.. opcode:: SHL - Shift Left
.. math::
@@ -923,7 +948,7 @@ SHL - Shift Left
dst.w = src0.w << src1.x
-SHR - Shift Right
+.. opcode:: SHR - Shift Right
.. math::
@@ -936,7 +961,7 @@ SHR - Shift Right
dst.w = src0.w >> src1.x
-AND - Bitwise And
+.. opcode:: AND - Bitwise And
.. math::
@@ -949,7 +974,7 @@ AND - Bitwise And
dst.w = src0.w & src1.w
-OR - Bitwise Or
+.. opcode:: OR - Bitwise Or
.. math::
@@ -962,7 +987,7 @@ OR - Bitwise Or
dst.w = src0.w | src1.w
-MOD - Modulus
+.. opcode:: MOD - Modulus
.. math::
@@ -975,20 +1000,20 @@ MOD - Modulus
dst.w = src0.w \bmod src1.w
-XOR - Bitwise Xor
+.. opcode:: XOR - Bitwise Xor
.. math::
- dst.x = src0.x ^ src1.x
+ dst.x = src0.x \oplus src1.x
- dst.y = src0.y ^ src1.y
+ dst.y = src0.y \oplus src1.y
- dst.z = src0.z ^ src1.z
+ dst.z = src0.z \oplus src1.z
- dst.w = src0.w ^ src1.w
+ dst.w = src0.w \oplus src1.w
-SAD - Sum Of Absolute Differences
+.. opcode:: SAD - Sum Of Absolute Differences
.. math::
@@ -1001,17 +1026,17 @@ SAD - Sum Of Absolute Differences
dst.w = |src0.w - src1.w| + src2.w
-TXF - Texel Fetch
+.. opcode:: TXF - Texel Fetch
TBD
-TXQ - Texture Size Query
+.. opcode:: TXQ - Texture Size Query
TBD
-CONT - Continue
+.. opcode:: CONT - Continue
TBD
@@ -1020,12 +1045,12 @@ From GL_NV_geometry_program4
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-EMIT - Emit
+.. opcode:: EMIT - Emit
TBD
-ENDPRIM - End Primitive
+.. opcode:: ENDPRIM - End Primitive
TBD
@@ -1034,62 +1059,171 @@ From GLSL
^^^^^^^^^^
-BGNLOOP - Begin a Loop
+.. opcode:: BGNLOOP - Begin a Loop
TBD
-BGNSUB - Begin Subroutine
+.. opcode:: BGNSUB - Begin Subroutine
TBD
-ENDLOOP - End a Loop
+.. opcode:: ENDLOOP - End a Loop
TBD
-ENDSUB - End Subroutine
+.. opcode:: ENDSUB - End Subroutine
TBD
-NOP - No Operation
+.. opcode:: NOP - No Operation
Do nothing.
-NRM4 - 4-component Vector Normalise
-
-.. math::
-
- dst.x = \frac{src.x}{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}
+.. opcode:: NRM4 - 4-component Vector Normalise
- dst.y = \frac{src.y}{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}
+This instruction replicates its result.
- dst.z = \frac{src.z}{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}
+.. math::
- dst.w = \frac{src.w}{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}
+ dst = \frac{src.x}{src.x \times src.x + src.y \times src.y + src.z \times src.z + src.w \times src.w}
ps_2_x
^^^^^^^^^^^^
-CALLNZ - Subroutine Call If Not Zero
+.. opcode:: CALLNZ - Subroutine Call If Not Zero
TBD
-IFC - If
+.. opcode:: IFC - If
TBD
-BREAKC - Break Conditional
+.. opcode:: BREAKC - Break Conditional
TBD
+.. _doubleopcodes:
+
+Double Opcodes
+^^^^^^^^^^^^^^^
+
+.. opcode:: DADD - Add Double
+
+.. math::
+
+ dst.xy = src0.xy + src1.xy
+
+ dst.zw = src0.zw + src1.zw
+
+
+.. opcode:: DDIV - Divide Double
+
+.. math::
+
+ dst.xy = src0.xy / src1.xy
+
+ dst.zw = src0.zw / src1.zw
+
+.. opcode:: DSEQ - Set Double on Equal
+
+.. math::
+
+ dst.xy = src0.xy == src1.xy ? 1.0F : 0.0F
+
+ dst.zw = src0.zw == src1.zw ? 1.0F : 0.0F
+
+.. opcode:: DSLT - Set Double on Less Than
+
+.. math::
+
+ dst.xy = src0.xy < src1.xy ? 1.0F : 0.0F
+
+ dst.zw = src0.zw < src1.zw ? 1.0F : 0.0F
+
+.. opcode:: DFRAC - Double Fraction
+
+.. math::
+
+ dst.xy = src.xy - \lfloor src.xy\rfloor
+
+ dst.zw = src.zw - \lfloor src.zw\rfloor
+
+
+.. opcode:: DFRACEXP - Convert Double Number to Fractional and Integral Components
+
+.. math::
+
+ dst0.xy = frexp(src.xy, dst1.xy)
+
+ dst0.zw = frexp(src.zw, dst1.zw)
+
+.. opcode:: DLDEXP - Multiply Double Number by Integral Power of 2
+
+.. math::
+
+ dst.xy = ldexp(src0.xy, src1.xy)
+
+ dst.zw = ldexp(src0.zw, src1.zw)
+
+.. opcode:: DMIN - Minimum Double
+
+.. math::
+
+ dst.xy = min(src0.xy, src1.xy)
+
+ dst.zw = min(src0.zw, src1.zw)
+
+.. opcode:: DMAX - Maximum Double
+
+.. math::
+
+ dst.xy = max(src0.xy, src1.xy)
+
+ dst.zw = max(src0.zw, src1.zw)
+
+.. opcode:: DMUL - Multiply Double
+
+.. math::
+
+ dst.xy = src0.xy \times src1.xy
+
+ dst.zw = src0.zw \times src1.zw
+
+
+.. opcode:: DMAD - Multiply And Add Doubles
+
+.. math::
+
+ dst.xy = src0.xy \times src1.xy + src2.xy
+
+ dst.zw = src0.zw \times src1.zw + src2.zw
+
+
+.. opcode:: DRCP - Reciprocal Double
+
+.. math::
+
+ dst.xy = \frac{1}{src.xy}
+
+ dst.zw = \frac{1}{src.zw}
+
+.. opcode:: DSQRT - Square Root Double
+
+.. math::
+
+ dst.xy = \sqrt{src.xy}
+
+ dst.zw = \sqrt{src.zw}
+
Explanation of symbols used
------------------------------
@@ -1137,25 +1271,41 @@ Keywords
discard Discard fragment.
- dst First destination register.
+ pc Program counter.
- dst0 First destination register.
+ target Label of target instruction.
- pc Program counter.
- src First source register.
+Other tokens
+---------------
- src0 First source register.
- src1 Second source register.
+Declaration
+^^^^^^^^^^^
- src2 Third source register.
- target Label of target instruction.
+Declares a register that will be referenced as an operand in Instruction
+tokens.
+File field contains register file that is being declared and is one
+of TGSI_FILE.
-Other tokens
----------------
+UsageMask field specifies which of the register components can be accessed
+and is one of TGSI_WRITEMASK.
+
+Interpolate field is only valid for fragment shader INPUT register files.
+It specifies the way input is being interpolated by the rasteriser and is one
+of TGSI_INTERPOLATE.
+
+If Dimension flag is set to 1, a Declaration Dimension token follows.
+
+If Semantic flag is set to 1, a Declaration Semantic token follows.
+
+CylindricalWrap bitfield is only valid for fragment shader INPUT register
+files. It specifies which register components should be subject to cylindrical
+wrapping when interpolating by the rasteriser. If TGSI_CYLINDRICAL_WRAP_X
+is set to 1, the X component should be interpolated according to cylindrical
+wrapping rules.
Declaration Semantic
@@ -1187,9 +1337,8 @@ are the Cartesian coordinates, and ``w`` is the homogenous coordinate and used
for the perspective divide, if enabled.
As a vertex shader output, position should be scaled to the viewport. When
-used in fragment shaders, position will ---
-
-XXX --- wait a minute. Should position be in [0,1] for x and y?
+used in fragment shaders, position will be in window coordinates. The convention
+used depends on the FS_COORD_ORIGIN and FS_COORD_PIXEL_CENTER properties.
XXX additionally, is there a way to configure the perspective divide? it's
accelerated on most chipsets AFAIK...
@@ -1266,3 +1415,85 @@ TGSI_SEMANTIC_EDGEFLAG
""""""""""""""""""""""
XXX no clue
+
+
+Properties
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+
+ Properties are general directives that apply to the whole TGSI program.
+
+FS_COORD_ORIGIN
+"""""""""""""""
+
+Specifies the fragment shader TGSI_SEMANTIC_POSITION coordinate origin.
+The default value is UPPER_LEFT.
+
+If UPPER_LEFT, the position will be (0,0) at the upper left corner and
+increase downward and rightward.
+If LOWER_LEFT, the position will be (0,0) at the lower left corner and
+increase upward and rightward.
+
+OpenGL defaults to LOWER_LEFT, and is configurable with the
+GL_ARB_fragment_coord_conventions extension.
+
+DirectX 9/10 use UPPER_LEFT.
+
+FS_COORD_PIXEL_CENTER
+"""""""""""""""""""""
+
+Specifies the fragment shader TGSI_SEMANTIC_POSITION pixel center convention.
+The default value is HALF_INTEGER.
+
+If HALF_INTEGER, the fractional part of the position will be 0.5.
+If INTEGER, the fractional part of the position will be 0.0.
+
+Note that this does not affect the set of fragments generated by
+rasterization, which is instead controlled by gl_rasterization_rules in the
+rasterizer.
+
+OpenGL defaults to HALF_INTEGER, and is configurable with the
+GL_ARB_fragment_coord_conventions extension.
+
+DirectX 9 uses INTEGER.
+DirectX 10 uses HALF_INTEGER.
+
+
+
+Texture Sampling and Texture Formats
+------------------------------------
+
+This table shows how texture image components are returned as (x,y,z,w) tuples
+by TGSI texture instructions, such as :opcode:`TEX`, :opcode:`TXD`, and
+:opcode:`TXP`. For reference, OpenGL and Direct3D conventions are shown as
+well.
+
++--------------------+--------------+--------------------+--------------+
+| Texture Components | Gallium | OpenGL | Direct3D 9 |
++====================+==============+====================+==============+
+| R | XXX TBD | (r, 0, 0, 1) | (r, 1, 1, 1) |
++--------------------+--------------+--------------------+--------------+
+| RG | XXX TBD | (r, g, 0, 1) | (r, g, 1, 1) |
++--------------------+--------------+--------------------+--------------+
+| RGB | (r, g, b, 1) | (r, g, b, 1) | (r, g, b, 1) |
++--------------------+--------------+--------------------+--------------+
+| RGBA | (r, g, b, a) | (r, g, b, a) | (r, g, b, a) |
++--------------------+--------------+--------------------+--------------+
+| A | (0, 0, 0, a) | (0, 0, 0, a) | (0, 0, 0, a) |
++--------------------+--------------+--------------------+--------------+
+| L | (l, l, l, 1) | (l, l, l, 1) | (l, l, l, 1) |
++--------------------+--------------+--------------------+--------------+
+| LA | (l, l, l, a) | (l, l, l, a) | (l, l, l, a) |
++--------------------+--------------+--------------------+--------------+
+| I | (i, i, i, i) | (i, i, i, i) | N/A |
++--------------------+--------------+--------------------+--------------+
+| UV | XXX TBD | (0, 0, 0, 1) | (u, v, 1, 1) |
+| | | [#envmap-bumpmap]_ | |
++--------------------+--------------+--------------------+--------------+
+| Z | XXX TBD | (z, z, z, 1) | (0, z, 0, 1) |
+| | | [#depth-tex-mode]_ | |
++--------------------+--------------+--------------------+--------------+
+
+.. [#envmap-bumpmap] http://www.opengl.org/registry/specs/ATI/envmap_bumpmap.txt
+.. [#depth-tex-mode] the default is (z, z, z, 1) but may also be (0, 0, 0, z)
+ or (z, z, z, z) depending on the value of GL_DEPTH_TEXTURE_MODE.
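
As a reading aid for the table above (not part of the patch), the Gallium column can be sketched as a small lookup in Python. The names ``GALLIUM_SWIZZLE`` and ``sample_result`` are hypothetical and exist only for this illustration; rows the table marks as "XXX TBD" are deliberately left undefined rather than guessed.

```python
# Gallium column of the table above. For each of x, y, z, w an entry names
# either a texel channel (a string) or a constant (a float).
# None marks the rows the table lists as "XXX TBD".
GALLIUM_SWIZZLE = {
    "R": None,
    "RG": None,
    "RGB": ("r", "g", "b", 1.0),
    "RGBA": ("r", "g", "b", "a"),
    "A": (0.0, 0.0, 0.0, "a"),
    "L": ("l", "l", "l", 1.0),
    "LA": ("l", "l", "l", "a"),
    "I": ("i", "i", "i", "i"),
    "UV": None,
    "Z": None,
}


def sample_result(layout, texel):
    """Return the (x, y, z, w) tuple for a texel of the given layout.

    `texel` maps channel names (e.g. "l", "a") to their values.
    """
    rule = GALLIUM_SWIZZLE[layout]
    if rule is None:
        # The table does not define these rows yet; do not invent a value.
        raise NotImplementedError("Gallium result for %s is still TBD" % layout)
    return tuple(texel[c] if isinstance(c, str) else c for c in rule)
```

For instance, ``sample_result("LA", {"l": 0.25, "a": 0.5})`` expands the luminance into x, y and z and passes alpha through in w, matching the LA row.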