| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Generated by running:
git grep -l INLINE src/gallium/ | xargs sed -i 's/\bINLINE\b/inline/g'
git grep -l INLINE src/mesa/state_tracker/ | xargs sed -i 's/\bINLINE\b/inline/g'
git checkout src/gallium/state_trackers/clover/Doxyfile
and manual edits to
src/gallium/include/pipe/p_compiler.h
src/gallium/README.portability
to remove mentions of the inline define.
Signed-off-by: Ilia Mirkin <[email protected]>
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
8 bit precision is required by d3d10 but unfortunately
requires a 64 bit rasterizer. This commit implements
64 bit rasterization with full support for 8bit subpixel
precision. It's a combination of all individual commits
from the llvmpipe-rast-64 branch.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now dead code.
Also had to remove the show_tiles/show_subtiles because now the color
buffers are always stored in their native format, so there is no longer
an easy way to paint the tile sizes.
Depth-stencil buffers are still swizzled.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
Tested with custom rasterisation test tool added to piglit suite, reduced errors
Signed-off-by: José Fonseca <[email protected]>
|
|
|
|
| |
It is a typo that went unnoticed.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the tile.
llvmpipe has a few special rasterization paths for triangles contained in
16x16 blocks, but it allows the 16x16 block to be aligned only to a 4x4
grid.
Some 16x16 blocks could actually extend past the edge of the tile
if the triangle is 16 pixels in one dimension but 4 in the other, causing
a buffer overflow.
The fix consists of budging the 16x16 blocks back inside the tile.
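A minimal sketch of the clamping idea described above, not the actual llvmpipe code. The 64-pixel tile size, the constant names and the helper `clamp_block_origin` are assumptions made for illustration; the commit itself only states that the 16x16 block is moved back inside the tile.

```c
#include <assert.h>

/* Assumed sizes for illustration; not taken from the commit message. */
#define TILE_SIZE  64
#define BLOCK_SIZE 16

/* Hypothetical helper: given a block origin (in pixels, relative to the
 * tile, aligned to the 4x4 grid), return an origin moved back far enough
 * that [origin, origin + BLOCK_SIZE) lies entirely inside the tile. */
static int clamp_block_origin(int origin)
{
   const int max_origin = TILE_SIZE - BLOCK_SIZE;

   assert(origin >= 0 && origin < TILE_SIZE);
   return origin > max_origin ? max_origin : origin;
}
```

Applied to both x and y, an origin of 52 becomes 48, so the block covers pixels 48..63 instead of running past the tile edge. Origins that already fit are returned unchanged.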
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
| |
Apply Jose's suggestions for a small but measurable improvement in
isosurf.
|
| |
|
|
|
|
| |
MSVC doesn't accept more than 3 __m128i arguments.
|
| |
|
|
|
|
|
|
|
|
| |
There was actually a large quantity of scalar code in these functions
previously. This tries to move more into intrinsics.
Introduce an sse2 mm_mullo_epi32 replacement to avoid sse4 dependency
in the new rasterization code.
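For readers unfamiliar with the trick, here is one common way to emulate the SSE4.1 `_mm_mullo_epi32` with SSE2 only. It is a sketch of the general technique and may differ from the replacement this commit introduces; the names `mullo_epi32_sse2` and `mullo4` are hypothetical.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Low 32 bits of each 32x32 product, using only SSE2.
 * _mm_mul_epu32 multiplies lanes 0 and 2 into two 64-bit products, so the
 * odd lanes are shifted down and multiplied separately. The low 32 bits of
 * an unsigned product equal those of the signed product. */
static inline __m128i mullo_epi32_sse2(__m128i a, __m128i b)
{
   __m128i even = _mm_mul_epu32(a, b);                    /* lanes 0, 2 */
   __m128i odd  = _mm_mul_epu32(_mm_srli_si128(a, 4),
                                _mm_srli_si128(b, 4));    /* lanes 1, 3 */

   /* Gather the low halves of both 64-bit products into lanes 0 and 1. */
   even = _mm_shuffle_epi32(even, _MM_SHUFFLE(0, 0, 2, 0));
   odd  = _mm_shuffle_epi32(odd,  _MM_SHUFFLE(0, 0, 2, 0));

   /* Interleave back to the original lane order: p0, p1, p2, p3. */
   return _mm_unpacklo_epi32(even, odd);
}

/* Hypothetical convenience wrapper working on plain arrays. */
static void mullo4(const int32_t a[4], const int32_t b[4], int32_t out[4])
{
   assert(a && b && out);
   __m128i va = _mm_loadu_si128((const __m128i *)a);
   __m128i vb = _mm_loadu_si128((const __m128i *)b);
   _mm_storeu_si128((__m128i *)out, mullo_epi32_sse2(va, vb));
}
```

The sketch needs an x86 target with SSE2. Results wrap modulo 2^32, as with the SSE4.1 instruction.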
|
| |
|
|
|
|
| |
Should fix fdo 30168.
|
| |
|
|
|
|
|
| |
Keep step array as a set of four m128i's and reuse throughout the
rasterization.
|
|
|
|
| |
Fragment shader can extract the correct bits for each quad.
|
| |
|
| |
|
| |
|
|
|
|
|
| |
Rasterize lines directly by treating them as 4-sided polygons.
Still need to check the exact pixel rasterization.
|
|
|
|
|
| |
Check for these and route them to a dedicated handler with one fewer
level of recursive rasterization.
|
|
|
|
|
| |
No need to calculate these values any longer, nor to store them in the
bin data. Improves isosurf a bit more, 115->123 fps.
|
|
|
|
|
|
|
|
|
|
|
|
| |
For 16 and 64 pixel levels, calculate a mask which is linear in x and
y (ie not in the swizzle layout).
When iterating over full and partial masks, figure out position by
manipulating the bit number set in the mask, rather than relying on
position arrays.
Similarly, calculate the lower-level c values from dcdx, dcdy and the
position rather than relying on the step array.
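The approach above can be sketched as follows. This is an illustration under stated assumptions, not the code from the commit: it assumes a 16-bit mask covering a 4x4 grid of blocks with bit number `y * 4 + x`, and an edge function of the form `c + x * dcdx + y * dcdy` (the real sign convention may differ). All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Index of the lowest set bit; the mask must be nonzero. */
static int lowest_bit(uint32_t mask)
{
   int n = 0;

   assert(mask != 0);
   while (!(mask & 1u)) {
      mask >>= 1;
      n++;
   }
   return n;
}

/* Walk the set bits of a linear 4x4 mask and recover each block's pixel
 * position from the bit number alone, with no position lookup array.
 * Returns the number of blocks visited. */
static int positions_from_mask(uint32_t mask, int block_size,
                               int xs[16], int ys[16])
{
   int count = 0;

   while (mask) {
      int bit = lowest_bit(mask);

      xs[count] = (bit & 3) * block_size;   /* column within the 4x4 grid */
      ys[count] = (bit >> 2) * block_size;  /* row within the 4x4 grid */
      count++;
      mask &= mask - 1;                     /* clear the bit just handled */
   }
   return count;
}

/* Edge function value at (x, y), derived from the value at the origin
 * instead of a precomputed step array. Sign convention is assumed. */
static int c_at(int c0, int dcdx, int dcdy, int x, int y)
{
   return c0 + x * dcdx + y * dcdy;
}
```

With a linear layout the position is a shift and a mask of the bit number, which is what makes the position arrays unnecessary.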
|
|
|
|
| |
No noticeable slowdown with isosurf.
|
|
|
|
|
| |
isosurf 95->115 fps just by exchanging the two inner loops in this
function...
|
|
|
|
|
|
|
|
|
| |
Move this code back out to C for now, will generate separately.
Shader now takes a mask parameter instead of C0/C1/C2/etc.
Shader does not currently use that parameter and rasterizes whole
pixel stamps always.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
| |
Need to compute two masks here for full and partial 16x16 blocks.
Gives a further good improvement for isosurf particularly:
isosurf 97 -> 108
gears 597 -> 611
|
|
|
|
|
|
|
|
| |
Some nice speedups:
gears: 547 -> 597
isosurf: 83 -> 98
Others like gloss unchanged. Could do further work in this direction.
|
|
|
|
| |
Saves a few more cycles.
|
|
|
|
|
|
|
|
| |
Currently counting number of tris, how many tiles of each size are
fully covered, partially covered or empty, etc.
Set LP_DEBUG=counters to enable. Results are printed upon context
destruction.
|
|
|
|
|
| |
It's a little faster to just do the in/out testing in the shader
jit code.
|
| |
|
|
|
|
|
|
|
| |
When we know that a 4x4 pixel block is entirely inside of a triangle
use the jit function which omits the in/out test code.
Results in a few percent speedup in many tests.
|
|
|
|
|
| |
Since changing the in/out test we can just use INT_MIN to be sure the
comparison against the step values always passes.
|
|
|
|
|
|
|
|
| |
Instead of:
s = c + step
m = s > 0
Do:
m = step > c (with negated c)
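A scalar sketch of why the two forms agree, assuming no integer overflow in `c + step` and that `c` can be negated. The function names are hypothetical and the real code operates on vectors rather than a loop.

```c
#include <assert.h>
#include <limits.h>

/* Original form: add, then compare against zero. */
static unsigned mask_add_then_compare(int c, const int step[4])
{
   unsigned m = 0;

   for (int i = 0; i < 4; i++) {
      int s = c + step[i];
      if (s > 0)
         m |= 1u << i;
   }
   return m;
}

/* Rewritten form: compare step directly against the pre-negated c.
 * c + step > 0 is the same as step > -c, so the add disappears. */
static unsigned mask_compare_only(int neg_c, const int step[4])
{
   unsigned m = 0;

   for (int i = 0; i < 4; i++) {
      if (step[i] > neg_c)
         m |= 1u << i;
   }
   return m;
}

/* Returns nonzero when both forms produce the same coverage mask. */
static int masks_agree(int c, const int step[4])
{
   assert(c != INT_MIN);  /* -c would overflow */
   return mask_add_then_compare(c, step) == mask_compare_only(-c, step);
}
```

The negation is done once per edge, so each pixel costs one comparison instead of an add plus a comparison.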
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The test to determine which of the pixels in a 2x2 quad are inside the
triangle is now done in the fragment shader rather than in the calling C
code. This is a little faster but there are a few more things to do.
Note that the step[] array elements are in a different order now. Rather
than being in row-major order for the 4x4 grid, they're in "quad-major"
order. The setup of the step arrays is a little more complicated now.
So is the coarse/intermediate tile test code, but some lookup tables
help with that.
Next steps:
- early-cull 2x2 quads which are totally outside the triangle.
- skip the in/out test for fully contained quads
- make the in/out comparison code tighter/faster.
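To make the "quad-major" ordering concrete, here is one plausible index mapping for a 4x4 grid. The exact layout used by the commit is not given in the message, so the ordering below (quads left to right then top to bottom, pixels within a quad likewise) is an assumption, and the function names are hypothetical.

```c
#include <assert.h>

/* Map a quad-major index (0..15) to pixel coordinates in a 4x4 grid.
 * Assumed layout: indices 0-3 are the top-left 2x2 quad, 4-7 the
 * top-right, 8-11 the bottom-left, 12-15 the bottom-right. */
static void quad_major_to_xy(int i, int *x, int *y)
{
   int quad  = i >> 2;  /* which 2x2 quad */
   int pixel = i & 3;   /* which pixel inside that quad */

   assert(i >= 0 && i < 16);
   *x = (quad & 1) * 2 + (pixel & 1);
   *y = (quad >> 1) * 2 + (pixel >> 1);
}

/* Row-major index of the same pixel, for comparison. */
static int row_major_index(int x, int y)
{
   return y * 4 + x;
}
```

Under this layout the four values for one quad sit next to each other in the step array, so a single 4-wide load covers a whole quad.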
|
|
|
|
|
| |
Some of the state is per-thread. Put that state in new lp_rasterizer_task
struct.
|
| |
|
|
|
|
| |
And remove unused BLOCKSIZE.
|
|
|
|
| |
Make this a little easier to understand.
|
|
|
|
|
| |
The compiler will still do the multiplies with shifts.
It's just a bit easier to follow the logic with multiplies.
|
| |
|
| |
|