| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
New conversion code to handle conversion from/to r11g11b10 AoS to/from
SoA floats, and also add code for conversion from rgb9e5 AoS to float SoA
(which works pretty much the same as r11g11b10 except for the packing).
(This code should also be used for texture sampling instead of
relying on u_format conversion but it's not yet, so rgb9e5 is unused.)
Unfortunately a crazy amount of hacks is necessary to get the conversion
code running in llvmpipe's generate_unswizzled_blend, which isn't well
suited for formats where the storage representation has nothing to do
with what's needed for blending (moreover, the conversion will convert
from packed AoS values, which is the storage format, to float SoA values,
because this is much more natural for the conversion, and likewise from
SoA values to packed AoS values - but the "blend" (which includes
trivial things like partial mask) works on AoS values, so incoming fs
values will go SoA->AoS, values from destination will go packed
AoS->SoA->AoS, then do blend, then AoS->SoA->packed AoS which probably
isn't the most efficient way though the shuffles are probably bearable).
Passes piglit fbo-blending-formats (with GL_EXT_packed_float parameter),
still need to verify Inf/NaNs (where most of the complexity in the
conversion comes from actually).
v2: drop the (very bogus) rgb9e5 part, and do component extraction
in the helper code for r11g11b10 to float conversion, making the code
slightly more compact (suggested by Jose), now that there are no other
callers left this works quite well. (Could do the same for the
opposite way but it's less than ideal there, final part of packing
needs to be done in caller anyway and there'd be another conditional.)
v3: minor style and comment fixes. Also fix a potential issue with
negative zero being potentially returned by max(src, zero) as we
don't have well-defined min/max behavior (fortunately no additonal cost).
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
This helps minimize confusion / effort when moving between branches or
helping others.
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
Fixes random failures with piglit glsl-max-varyings.
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
|
| |
The card spews an error if I use all 128 generic slots.
Apparently the real limit isn't just dictated by the address space
layout.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes it possible to identify gl_TexCoord and gl_PointCoord
for drivers where sprite coordinate replacement is restricted.
The new PIPE_CAP_TGSI_TEXCOORD decides whether these varyings
should be hidden behind the GENERIC semantic or not.
With this patch only nvc0 and nv30 will request that they be used.
v2: introduce a CAP so other drivers don't have to bother with
the new semantic
v3: adapt to introduction gl_varying_slot enum
|
|
|
|
|
|
|
|
|
| |
Doesn't exist on the asic and will cause a CS rejection
if VM is disabled.
Note: this is a candidate for the 9.1 branch.
Signed-off-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
| |
Not using HiS yet, but matches what we do on evergreen+.
Signed-off-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
| |
NOTE: This is a candidate for the 9.1 branch.
Tested-by: Vincent Lejeune <[email protected]>
Signed-off-by: Maarten Lankhorst <[email protected]>
|
|
|
|
|
|
| |
Some fixes for clearing only depth or only stencil.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Fixing 16 piglit tests.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
|
| |
v2: remove superfluous mask, use buffer_size instead of constant
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
|
| |
Cleanup the code and implement indirect addressing.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To further improve the optimization of source and destination
indirect addressing we need the ability to store a reference
to the declaration of the addressed operands.
Since most of the fields in tgsi_src_register doesn't apply for
an indirect addressing operand replace it with a separate
tgsi_ind_register structure and so make room for extra information.
v2: rename Declaration to ArrayID, put the ArrayID into () instead of []
Signed-off-by: Christian König <[email protected]>
|
|
|
|
|
|
| |
Nobody seems to be using it, and only nv50 had a partial implementation.
Signed-off-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ported from downstream:
http://anonscm.debian.org/gitweb/?p=pkg-xorg/lib/mesa.git;a=blob;f=debian/patches/119-libllvmradeon-link.patch;h=ee47f8a07dbf33c32f8b57faed923680ed6648fb;hb=refs/heads/ubuntu%2B1
Fixes a regression introduced with
f70c3853513637fa6ed38e75f73d472a9fa61213
NOTE: This is a candidate for the 9.1 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=62434
Signed-off-by: Maarten Lankhorst <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Needs to be set for depth, stencil, and fmask just
like other blocks.
v2: drop additional cayman bits for now
Signed-off-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On cayman, 128bpp surfaces require non_disp ordering for hw
access to both linear and tiled surfaces. When we use the 3D
engine we can set the non_disp ordering on both the tiled and
linear sides (via CB or texture), but when we use the DMA
engine, we can only set the non_disp ordering on the tiled
side, so after a L2T operation with the DMA engine, the data
ends up in the wrong order on the tiled side.
v2: cayman/TN only
v3: fix comments
Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=60802
Note: this is a candidate for the 9.1 branch.
Signed-off-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
v2: Andreas Boll <[email protected]>
- Fix formatting - use one CFLAG per line
NOTE: This is a candidate for the 9.1 branch.
Signed-off-by: Maarten Lankhorst <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59238
Reviewed-by: Andreas Boll <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There were two different NUM_ENTRIES #defines for the framebuffer
tile cache and the texture tile cache. Rename the later to fix
the warnings:
In file included from sp_flush.c:40:0:
sp_tex_tile_cache.h:76:0: warning: "NUM_ENTRIES" redefined
sp_tile_cache.h:78:0: note: this is the location of the previous definition
In file included from sp_context.c:50:0:
sp_tex_tile_cache.h:76:0: warning: "NUM_ENTRIES" redefined
sp_tile_cache.h:78:0: note: this is the location of the previous definition
Also, replace occurances of NUM_ENTRIES with Element() macro to
be safer.
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- each softpipe_tex_tile_cache 50*64*64*4*4 = 3,276,800 bytes
- each softpipe_context has 3*32 softpipe_tex_tile_cache, i.e, each softpipe
context is 314,572,800 bytes, i.e, 300MB
That is, in a 32bits process (around 3GB virtual memory max), we can
only fit 10 contexts.
This change is a short-term hack to shrink the context size. Longer
term we'll need to change how the texture cache works.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
instead just warn when creating the surface, rendering will simply happen
to first layer.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can't handle them yet, however we can safely just warn (we will
just render to first layer, which is fine since we can't handle
rendertarget system value neither).
Also make behavior more predictable with buffer surfaces
(it would sometimes hit bogus asserts because of the union in the surface,
instead create the surface but assert when trying to set a buffer
in the framebuffer).
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Trivial. Matches softpipe's code.
|
|
|
|
| |
Signed-off-by: Tom Stellard <[email protected]>
|
|
|
|
| |
All the functions in this file are now implemented in C.
|
|
|
|
|
|
|
|
| |
Just delete unused kernels rather than marking them as internal and
running the GlobalDCE pass.
Also implement this function in C and inline it into
radeon_llvm_get_kernel_module()
|
| |
|
| |
|
|
|
|
| |
Also make the function static since it is not used anywhere else.
|
| |
|
|
|
|
|
| |
Fixes autotools build failure. Not sure if there are more, as I have
difficulties in building the full tree.
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
In cases where the vertex element size is smaller than the vertex buffer
stride, the previous calculation could end up 1 too low. This would result
in the GPU using index 0 instead of the maximum index for those elements,
which would be visible as intermittent distorted triangles.
NOTE: This is a candidate for the 9.1 branch.
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
| |
When doing a blit with the 3D engine, the rasterizer or zsa cso may
be NULL.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
| |
And remove non-working code for indirect sampler/resource selection.
Will be added back later.
Includes code from "nv50/ir/tgsi: Resource indirect indexing" by
Francisco Jerez (when mixing the R and S handles we can only specify
them via a register, i.e. indirectly, unless we upload all the used
handle combinations to c[] space, which we don't for now).
|