| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
That confuses Gallium's memory debugging code where CALLOC/MALLOC
must be matched with FREE, not free().
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
To prevent segfaults in the AA line module, the code will check for a
valid pointer to the aaline_stage in the draw context.
Fixes segfault from backtrace:
* aaline_stage_from_pipe
aaline_delete_fs_state
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Squashed commit of the following:
commit 0857a7e105bfcbc4d1431b2cc56612094c747ca3
Author: Richard Sandiford <[email protected]>
Date: Tue Jun 18 12:25:07 2013 -0400
gallivm: Fix lp_build_rgba8_to_fi32_soa for big endian
Reviewed-by: Adam Jackson <[email protected]>
Signed-off-by: Richard Sandiford <[email protected]>
commit 0d65131649a8aa140e2db228ba779d685c4333e3
Author: Richard Sandiford <[email protected]>
Date: Tue Jun 18 12:25:07 2013 -0400
gallivm: Fix big-endian machines
This adds a bit-shift count to the format table, and adds the concept of
vector or bitwise alignment on gathers.
Reviewed-by: Adam Jackson <[email protected]>
Signed-off-by: Richard Sandiford <[email protected]>
commit 9740bda9b7dc894b629ed38be9b51059ce90818f
Author: Richard Sandiford <[email protected]>
Date: Tue Jun 18 12:25:07 2013 -0400
llvmpipe: Fix convert_to_blend_type on big-endian
Reviewed-by: Adam Jackson <[email protected]>
Signed-off-by: Richard Sandiford <[email protected]>
commit ae037c2de0f029e4e99371c0de25560484f0d8df
Author: Richard Sandiford <[email protected]>
Date: Tue Jun 18 12:25:06 2013 -0400
util: Convert color pack to packed formats
This fixes them on big-endian.
Reviewed-by: Adam Jackson <[email protected]>
Signed-off-by: Richard Sandiford <[email protected]>
commit 5b05ac0c89ae092ea8ba5bba9f739708d7396b5c
Author: Richard Sandiford <[email protected]>
Date: Tue Jun 18 12:25:06 2013 -0400
graw-xlib: Convert to packed formats
Reviewed-by: Adam Jackson <[email protected]>
Signed-off-by: Richard Sandiford <[email protected]>
commit 51396e7d098cb6ff794391cf11afe4dbf86dbea0
Author: Richard Sandiford <[email protected]>
Date: Tue Jun 18 12:25:06 2013 -0400
format: Convert to packed formats
Reviewed-by: Adam Jackson <[email protected]>
Signed-off-by: Richard Sandiford <[email protected]>
commit 417b60bc66eb450e68a92ab0e47f76e292b385e6
Author: Adam Jackson <[email protected]>
Date: Tue Jun 18 12:25:06 2013 -0400
st/dri: Convert to packed formats
Reviewed-by: Adam Jackson <[email protected]>
Signed-off-by: Richard Sandiford <[email protected]>
commit 0934b2e022a5e0847d312c40734e2b44cac52fd8
Author: Richard Sandiford <[email protected]>
Date: Tue Jun 18 12:25:06 2013 -0400
st/xlib: Convert to packed formats
Reviewed-by: Adam Jackson <[email protected]>
Signed-off-by: Richard Sandiford <[email protected]>
commit a307ea3c3716a706963acce7966b5e405ba11db9
Author: Richard Sandiford <[email protected]>
Date: Tue Jun 18 12:25:06 2013 -0400
gbm: Convert to packed formats
Reviewed-by: Adam Jackson <[email protected]>
Signed-off-by: Richard Sandiford <[email protected]>
commit 53eebdd253e1960a645ea278f31d7ef6a6cf4aeb
Author: Richard Sandiford <[email protected]>
Date: Tue Jun 18 12:25:06 2013 -0400
tests: Convert to packed formats
Reviewed-by: Adam Jackson <[email protected]>
Signed-off-by: Richard Sandiford <[email protected]>
commit 2f77fe3ee524945eacd546efcac34f7799fb3124
Author: Adam Jackson <[email protected]>
Date: Tue Jun 18 13:07:37 2013 -0400
gallium: Document packed formats
Signed-off-by: Adam Jackson <[email protected]>
commit 1f1017159ce951f922210a430de9229f91f62714
Author: Richard Sandiford <[email protected]>
Date: Tue Jun 18 12:25:06 2013 -0400
gallium: Introduce 32-bit packed format names
These are for interacting with buffers natively described in terms of
bit shifts, like X11 visuals:
uint32_t xyzw8888 = (x << 0) | (y << 8) | (z << 16) | (w << 24);
Define these in terms of (endian-dependent) aliases to the array-style
format names.
Reviewed-by: Adam Jackson <[email protected]>
Signed-off-by: Richard Sandiford <[email protected]>
commit 6cc7ab1ee66ed668da78c1d951dfd7782b4e786a
Author: Adam Jackson <[email protected]>
Date: Mon Jun 3 12:10:32 2013 -0400
gallium: Document format name conventions
v2:
- Fix a channel name thinko (Michel Dänzer)
- Elaborate on SCALED versus INT
- Add links to DirectX and FOURCC docs
Signed-off-by: Adam Jackson <[email protected]>
commit df4d269e7fb62051a3c029b84147465001e5776e
Author: Adam Jackson <[email protected]>
Date: Tue Jun 18 12:25:06 2013 -0400
gallivm: Remove all notion of byte-swapping
Signed-off-by: Adam Jackson <[email protected]>
Signed-off-by: Adam Jackson <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes the bytestream parsing of mpeg-1 stream, but still leaves
open a number of issues with the interpretation:
- IDCT mismatch control is not correct for MPEG-1.
- Slices do not have to start and end on the same horizontal row of macroblocks.
- picture_coding_type = 4 (D-pictures) is not handled.
- full_pel_*_vector is not handled.
Signed-off-by: Maarten Lankhorst <[email protected]>
|
|
|
|
|
|
| |
Not used yet but there's a couple of places in llvmpipe which should use this
(occlusion count is currently very inefficent if there's no cpu popcnt
instruction).
|
|
|
|
|
|
| |
This is pretty complicated code with few/any comments. Here's a first stab.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 41966fdb3b71c0b70aeb095e0eb3c5626c144a3a.
While it's a lot cleaner it causes regressions because
the draw interface is always called from the draw functions
of the drivers (because the buffers need to be mapped) which
means that the stream output buffers endup being cleared on
every draw rather than on setting.
Signed-off-by: Zack Rusin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For conditional rendering this makes it possible to skip rendering
if either the predicate is true or false, as supported by d3d10
(in fact previously it was sort of implied skip rendering if predicate
is false for occlusion predicate, and true for so_overflow predicate).
There's no cap bit for this as presumably all drivers could do it trivially
(but this patch does not implement it for the drivers using true
hw predicates, nvxx, r600, radeonsi, no change is expected for OpenGL
functionality).
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
I noticed this code didn't work as advertised while doing some passing around
of TGSI shaders and trying to reparse them, and things failing.
This seems to fix it here for at least the small test case I hacked into a
graw test.
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gl can use elts without setting indices, in which case
our eltMax was set to 0 and always invoking the overflow
condition. So by default set eltMax to maximum, it will
be curbed by draw_set_indexes (if it ever comes) and if
not then it will let gl's glVertexPointer/glDrawArrays
work correctly. Fixes piglit's
triangle-rasterization-overdraw test.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Moves clearing of the draw so target buffers to the draw
module. They had to be cleared in the drivers before
which was quite messy.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
There are strict limits on those registers. Define the maximums
and use them instead of magic numbers. Also allows us to add
some extra sanity checks.
Suggested by Brian.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We don't need the clamped variable, because we can just
return early. We should also do the regular culling after
the distance culling passes.
All spotted by Brian.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Problem: The IEEE float optimized version of UNCLAMPED_FLOAT_TO_UBYTE
in macros.h computed incorrect results for inputs in the range
0x3f7f0000 (=0.99609375) to 0x3f7f7f80 (=0.99803924560546875)
inclusive. 0x3f7f7f80 is the IEEE float value that results in 254.5
when multiplied by 255. With rounding mode "round to closest even
integer", this is the largest float in the range 0.0-1.0 that is
converted to 254 by the generic implementation of
UNCLAMPED_FLOAT_TO_UBYTE. The IEEE float optimized version
incorrectly defined the cut-off for mapping to 255 as 0x3f7f0000
(=255.0/256.0). The same bug was present in the function
float_to_ubyte in u_math.h.
Fix: The proposed fix replaces the incorrect cut-off value by
0x3f800000, which is the IEEE float representation of 1.0f. 0x3f7f7f81
(or any value in between) would also work, but 1.0f is probably
cleaner.
The patch does not regress piglit on llvmpipe and on i965 on sandy
bridge.
Tested-by Stéphane Marchesin <[email protected]>
Reviewed-by Stéphane Marchesin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
There isn't any difference between 32_FLOAT and 32_*INT in vertex fetching.
Both of them don't do any format conversion.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
We can use the fragment shader TGSI property WRITES_ALL_CBUFS.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Use new util_fill_box helper for util_clear_render_target.
(Also fix off-by-one map error.)
v2: handle non-zero z correctly in new helper
Reviewed-by: Jose Fonseca <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Works similarly to clip distance. If the cull distance is negative
for all vertices against a specific plane then the primitive
is culled.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cull distance is analogous to clip distance. If a register is
given this semantic, then the values in it are assumed to be a
float32 distance to a plane. Primitives will be completely
discarded if the plane distance for all of the vertices in
the primitive are < 0.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need to figure out the number of invocations of the clipper
before the emit, because in the emit we are after clipping
where the number of primitives will be equal to number of clipper
invocations minus the clipped primitives. So our computations
were always off by the number of clipped primitives.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Draw depended on clip_plane_enable being set in the rasterizer
to use clipdistance registers for clipping. That's really
unfriendly because it requires that rasterizer state to have
variants for every shader out there. Instead of depending on
the rasterizer lets extract the info from the available state:
if a shader writes clipdistance then we need to use it and we
need to clip using a number of planes equal to the number
of writen clipdistance components. This way clipdistances
just work.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
we were always fetching the info from the vertex shader, but if
geometry shader is present it should be used as the source of
that info.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Adam Jackson <[email protected]>
Signed-off-by: Richard Sandiford <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
draw_vertex_buffer declared the size field to be a size_t, but the LLVM
code used an int32 instead. This caused problems on big-endian 64-bit
targets, because the first 32-bit chunk of the 64-bit size_t was always 0.
In one sense size_t seems like a good choice for a size, so one fix
would have been to try to get the LLVM code to use the equivalent of
size_t too. However, in practice, the size is taken from things like ~0
or width0, both of which are int-sized, so it seemed simpler to make the
size field int-sized as well.
Reviewed-by: Adam Jackson <[email protected]>
Signed-off-by: Richard Sandiford <[email protected]>
|
|
|
|
|
|
|
|
| |
Without this, llvmpipe ends up giving a zero size to all uncompressed textures
on non-x86 systems, since align() cannot handle a 0 alignment.
Reviewed-by: Adam Jackson <[email protected]>
Signed-off-by: Richard Sandiford <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
lp_build_add and lp_build_sub have fallback code for cases
that cannot be handled by known intrinsics. For UNORM formats,
this code was using modulo rather than saturating arithmetic.
This fixes some rendering issues for a gnome session on System z.
It also fixes various piglit tests on z, such as
spec/ARB_color_buffer_float/GL_RGBA8-render.
The patch deliberately doesn't tackle the more complicated
SNORM case.
Tested against piglit on x86_64 and System z with no regressions.
Reviewed-by: Adam Jackson <[email protected]>
Signed-off-by: Richard Sandiford <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We use 128bit vector interleave for untwiddling in the blend code (with
256bit vectors). llvm generates terrible code for this for some reason,
so instead of generating a shuffle for 2 128bit vectors use a
extract/insert shuffle instead (it only seems to matter we're not using
128bit wide vectors for the shuffle). This decreases instruction count of
the blend code generated for a rgba8 render target without blending from
169 to 113 with llvm 3.1 and from 136 to 114 in llvm 3.2/3.3, and I got
a ~8% (llvm 3.1) and ~5% (3.2/3.3) performance improvement in gears.
(The generated code is still not terribly good as we could actually avoid
the interleaving completely but llvm can't know this.)
Reviewed-by: Jose Fonseca <[email protected]>
|
| |
|
|
|
|
|
|
|
| |
These functions must clear all bound layers, not just the first.
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
Checking if array_size is greater than 1 is not enough for single-layered
array textures.
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
Moved draw_arrays() to st_draw_feedback.c and removed draw_arrays_instanced().
draw_arrays() was used by nobody else. Now there's just one "draw" entrypoint
into the draw module.
Signed-off-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This change came from the discovery that the STATIC_ASSERT to check that
the number of register file strings didn't actually work.
Similar changes could be made for the other string arrays in tgsi_string.c
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
| |
|
|
|
|
|
|
| |
Also report if a shader writes the layer semantic
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
| |
Signed-off-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The main change is to use MCJIT rather than the old JIT, which will never
be supported for System z. The endianness part is by example since the
patch was tested on a glibc system.
Signed-off-by: Richard Sandiford <[email protected]>
Signed-off-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's no good reason why it can't handle 2x4f->1x8ub, 1x4f->1x4ub and
1x8f->1x8ub cases, there might be legitimate reasons why we don't have
enough input vectors for a full destination vector, and using pack
intrinsics should still be much better than using generic conversion
(it looks like convert_alpha from the blend code might hit this though
I suspect it could be avoided).
v2: add another test vector format to lp_test_conv so this gets tested.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
| |
The code was designed to handle no-op concat but failed (unless the
caller was using same pointer for src and dst).
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Surprising this bug survived so long, we were missing a clamp (in the
linear filtering version).
(Valgrind complained a lot about invalid reads with piglit texwrap,
I've also seen spurios failures in this test which might have
happened due to this. Valgrind probably didn't complain before the
alignment reduction in llvmpipe to 4x4 since the test is using tiny
textures so the reads were still always well within allocated area.)
While here, also do an effective clamp (after half subtraction)
of [0,length-0.5] instead of [0, length-1] which saves an instruction
(the filtering weight could be different due to this, but only if
both texels point to the same max texel so it doesn't matter).
(Both changes are borrowed from PIPE_TEX_CLAMP_TO_EDGE case.)
Note: This is a candidate for the stable branches.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we've changed draw_find_shader_output to return -1 instead
of 0 on non found attribs we broke the default behavior of
draw, which was to always redirect those to the first (0th) slot.
To preserve that behavior if draw_emit_vertex_attr notices a
mismatched vertex attrib, it just redirects it to the first slot
(instead of trying to use negative index in an array).
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Viewport index should only be used on a per primitive basis, so
instead of fetching it from each vertex, potentially making each
vertex in a primitive use a different viewport index, which is
obviously broken, make sure that we only fetch from the first
vertex in the primitive making the viewport index the same
for the entire primtive.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: José Fonseca<[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
If the viewport index is larger than the PIPE_MAX_VIEWPORTS,
then the first (0-th) viewport should be used.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: José Fonseca<[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
draw_find_shader_output like most of the code in draw used to
depend on position always being at output slot 0. which meant
that any other attribute being at 0 could signify an error.
unfortunately position can be at any of the output slots, thus
other attributes can occupy slot 0 and we need to mark the ones
which were not found by something else. This commit changes
draw_find_shader_output so that it returns -1 if it can't
find the given attribute and adjust the code that depended
on it returning >0 whenever it correctly found an attrib.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: José Fonseca<[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This adds support for multiple viewports to the draw module.
Multiple viewports depend on the presence of geometry shaders
which can write the viewport index.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: José Fonseca<[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Gallium supported only a single viewport/scissor combination. This
commit changes the interface to allow us to add support for multiple
viewports/scissors.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: José Fonseca<[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|