| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
The drivers have been changed so that they behave as if all of the flags
were set. This is already implicit in most hardware drivers and required
for multiple contexts.
Some state trackers were also abusing the PIPE_FLUSH_RENDER_CACHE flag
to decide whether flush_frontbuffer should be called.
New flag ST_FLUSH_FRONT has been added to st_api.h as a replacement.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Michel Hermier reported libdrm segfault (and kernel crash) on nv40 using
gallium :
http://www.mail-archive.com/[email protected]/msg06563.html
It turns out these were caused by some missing WAIT_RING (or wrong
computation of the WAIT_RING sizes). Unlike all other libdrm_nouveau users,
nvfx gallium tried to use a mininum calls of WAIT_RING, one WAIT_RING could
apply to many methods for different code paths and spread across several
functions. This made it too tricky to find out what the missing or wrong
WAIT_RING was.
By restoring BEGIN_RING, we force one WAIT_RING per method, and it's much
easier to check if the free size required in the pushbuffer is correct. As
curro said, "let's keep it simple for the maintainers until the big
bottlenecks are gone"
Benchmarked on nv35 with openarena, nexuiz and ut2004 and no performance
regression.
The core of this patch was made with Coccinelle, with minor manual fixes
made on top.
Tested-by: Michel Hermier <[email protected]>
Signed-off-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This is the new register generation toolkit in use by nouveau.
As far as I know, this is the best register description toolkit in
existence, and you should use it too for your hardware :)
Thanks to Marcin Kościelnicki for inventing it and performing
invaluable reverse engineering work of nVidia chips.
|
|
|
|
|
|
|
|
| |
The old swtnl code was broken by the new shader linkage support for
GLSL.
This is a rewrite of swtnl support, which should instead work properly,
be faster and more closer to the much more tested hardware pipeline.
|
| |
|
|
|
|
|
| |
We probably want to reorganize the remaining files too, but that's
for later, maybe.
|
| |
|
| |
|
| |
|
|
|
|
|
|
| |
Update all drivers to use draw_set_index_buffer,
draw_set_mapped_index_buffer, and draw_vbo. Remove
draw_set_mapped_element_buffer and draw_set_mapped_element_buffer_range.
|
|
|
|
| |
Should improve performance, possibly significantly.
|
|
|
|
|
|
|
|
|
|
| |
Before, we were discarding the compiled vertex program on each
vertex program change.
Now we compile the program as if there were 6 clip planes and
dynamically patch in an "end program" bit at the right place.
Also, nv30 should now work.
|
|
|
|
|
|
|
| |
This version should hopefully be much clearer and thus less likely
to be subtly broken.
Also fixes point sprites on nv40 and possibly some other bugs too.
|
|
|
|
|
| |
Move declarations before code.
Fix void pointer arithmetic.
|
| |
|
|
|
|
|
|
| |
For some reason nv30 seems to like to reset the viewport, even though
attempts to isolate where exactly it does that have currently been
inconclusive.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a full rewrite of the drawing and buffer management logic.
It offers a lot of improvements:
1. A copy of buffers is now always kept in system memory. This is
necessary to allow software processing of them, which is necessary
or improves performance in many cases.
2. Support for pushing vertices on the FIFO, with index lookup if necessary.
3. "Smart" draw code that tries to intelligently choose the cheapest
way to draw something: whether to use inline vertices or hardware
vertex buffer, and whether to use hardware index buffers
4. Support for all vertex formats supported by the hardware
5. Usage of translate to push vertices, supporting all formats that are
sensible to use as vertex formats
6. Support for base vertex
7. Usage of Ben Skeggs' primitive splitter originally for nv50, allowing
correct splitting of line loops, triangle fans, etc.
8. Support for instancing
9. Precomputation using the vertex elements CSO
Thanks to Ben Skeggs for his primitive splitter originally for nv50.
Thanks to Christoph Bumiller for his nv50 push code, that was the basis
of this work, even though I changed his code dramatically, in particular
to replace his ad-hoc vertex data emitter with translate.
The changes could also go into nv50 too, but there are substantial
differences due to the additional nv50 hardware features.
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds support for creating temporary surfaces to allow
rendering to surfaces that cannot be rendered to.
It uses the _second_ version of the render temporary infrastructure.
This is necessary for swizzled 3D textures and small mipmaps of
swizzled 2D textures.
This version of the patch creates a resource to use as a temporary
instead of a raw BO, making the code simpler.
|
|
|
|
|
|
|
|
| |
Conflicts:
src/gallium/auxiliary/draw/draw_context.c
src/gallium/auxiliary/draw/draw_pipe_aaline.c
src/gallium/drivers/llvmpipe/lp_context.c
|
|
|
|
|
| |
This makes the code faster due to the lack of indirect calls and also
makes it much easier to understand what is actually going on.
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
| |
Rather than emitting relocations on flush notifications, emit them
in nvfx_state_start.
|
|
|
|
|
| |
The bulk files cannot be unified, but the frontend can and allows to
share some code and simplify state_emit.c
|
|
|
|
| |
Move the remaining content to the common header.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
vertprog.c is similar but has substantial differences:
1. nv40 supports clip planes
2. nv40 uses a more advanced register allocator
3. Some register setup is different
4. Constants with the same name have different values
This patch unifies the two files.
nv30 gains clip plane support and the nv40 register allocator.
A new NVFX_VP(x) macro is introduced that at runtime resolved to
either the nv30 or the nv40 constant value.
nv30 clip planes are not tested and might not work
|
|
|
|
|
|
| |
The files are identical, except for swtnl support which is commented
out on nv30 and restart being initialized on nv30 to avoid a compiler
warning.
|
|
|
|
|
|
|
|
|
|
|
|
| |
nv30_draw.c is a stub.
This patch makes both nv30 and nv40 use the nv40 swtnl path.
Note that this doesn't actually work on nv30 because the vertex program is
encoded in the nv40-only layout.
However, swtnl was unimplemented before on nv30, so this is not a regression.
Furthermore, a patch to fix this is available near the end of the patchset.
|
|
|
|
|
|
|
|
|
| |
The files are mostly the same except:
1. On NV40, some TGSI instructions are emulated with several hardware ones
2. Some instructions such as DDX/DDY, and STR were missing from nv30
3. NV40 has more sophisticated register management
nv30 now supports all instructions and uses the nv40 register management.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The files are significantly different due to:
1. nv30 support 2 render targets, nv40 4
2. z-buffer pitch is set differently
3. nv30 has a limitation of colour_bits >= zeta_bits. This may not
actually exist in the driver though
4. nv30 points color0 at depth in the depth-only case
5. nv30 sets NV34TCL_VIEWPORT_TX_ORIGIN to 0. This is probably
unnecessary
This patch attempts to unify the two files and preserve the existing
behavior.
|
|
|
|
| |
The files are identical, except for an extra comment in nv30.
|
|
|
|
|
|
|
| |
The files are identical, except for the fact that the nv40 version
forgets to unreference the stateobj.
Unified to the correct nv30 version.
|
| |
|
| |
|
| |
|
| |
|
|
The files are the same except for swtnl support on nv40 and for
texture cache flushing on nv40.
Unify them, and use a macro to define 4 versions of render_states,
for all combinations of nvfx and hwtnl/swtnl.
|