| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Remove the redundant public _mesa_prim_name[] array.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
ideally this should be set once in the beginning of CS but there's
no way to change values there while in the middle of rendering.
For now reemitting SQ setup seems to work probably due to
r700WaitForIdleClean after each render
currently does not to try to decrease values once increased
fixes hangs in glsl-vs-vec4-indexing-temp-src-in-nested-loop-combined
glsl-vs-vec4-indexing-temp-dst-in-nested-loop-combined for my rv740
maybe more for other chips
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When --enable-shared-glapi is specified, libGL will share libglapi with
OpenGL ES instead of defining its own copy of glapi. This makes sure an
app will get only one copy of glapi in its address space.
The new option is disabled by default. When enabled, libGL and libglapi
must be built from the same source tree and distributed together. This
requirement comes from the fact that the dispatch offsets used by these
libraries are re-assigned whenever GLAPI XMLs are changed.
For GLX, indirect rendering for has_different_protocol() functions is
tricky. A has_different_protocol() function is assigned only one
dispatch offset, yet each entry point needs a different protocol opcode.
It cannot be supported by the shared glapi. The fix to this is to make
glXGetProcAddress handle such functions specially before calling
_glapi_get_proc_address.
Note that these files are automatically generated/re-generated
src/glx/indirect.c
src/glx/indirect.h
src/mapi/glapi/glapi_mapi_tmp.h
|
| |
|
|
|
|
|
|
| |
I don't have evidence for this amounting to any improvement,
but it does codify a bit more what we understand so far about
the pipeline.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a bunch of unnecessary barriers due to the scheduler not
knowing what that arbitrary register description refers to when trying
to reason about its dependencies.
The result is rescheduling in the convolution kernel shader in
Lightsmark, which results in avoiding register spilling and increasing
the performance of the first scene from 6-7 fps midway through the
panning to 11fps. The register spilling was a regression from Mesa
7.9 to Mesa 7.10.
|
|
|
|
|
|
|
| |
Improves performance of my GLSL demo by 5.1% (+/- 1.4%, n=7). It also
reschedules the giant multiply tree at the end of
glsl-fs-convolution-1 so that we end up not spilling registers,
producing the expected level of performance.
|
| |
|
|
|
|
|
| |
Drivers should override the default range/precision info as needed.
No drivers do this yet.
|
|
|
|
| |
This is a candidate for 7.9 and 7.10
|
| |
|
|
|
|
|
|
| |
(really not sure why I'm doing this).
This is a candidate for 7.9 and 7.10 branches.
|
|
|
|
|
|
|
|
|
| |
Can happen during swrast fallbacks if a buffer is somehow bound as
a render target and a texture.
Fixes gnome-shell on nv20, and gets it mostly working on nv10.
Signed-off-by: Ben Skeggs <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
sw clears were being used and not getting the correct offsets in the span
code.
also not emitting correct offsets for CB draws to texture levels.
(I've no idea why I'm playing with r100).
This is a candidate for 7.9 and 7.10
|
| |
|
|
|
|
|
|
| |
Fixes piglit glsl-fs-texture2d-branching. I couldn't come up with a
testcase that didn't involve dead code, but it's still worthwhile to
fix I think.
|
|
|
|
|
|
| |
This fixes https://bugs.freedesktop.org/show_bug.cgi?id=33247
There might still be some issues with drawing multiple instances
with VBO splitting to investigate someday.
|
|
|
|
|
|
|
|
| |
This revealed a bug in ra_get_spill_benefit where we only considered
the benefit of the first adjacency we were to remove, explaining some
of the ugly spilling I've seen in shaders. Because of the reduced
spilling, it reduces the runtime of glsl-fs-convolution-1 36.9% +/-
0.9% (n=5).
|
| |
|
|
|
|
| |
Reduces runtime of glsl-fs-convolution-1 another 13.9% +/- 0.6% (n=5).
|
|
|
|
|
|
| |
This was recommended in the original paper, but I figued "make it run"
before "make it fast". Now we make it fast. Reduces the runtime of
glsl-fs-convolution-1 by 12.7% +/- 0.6% (n=5).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Our use of the register allocator in i965 is somewhat unusual.
Whereas most architectures would have a smaller set of registers with
fewer register classes and reuse that across compilation, we have 1,
2, and 4-register classes (usually) and a variable number up to 128
registers per compile depending on how many setup parameters and push
constants are present. As a result, when compiling large numbers of
programs (as with glean texCombine going through ff_fragment_shader),
we spent much of our CPU time in computing the q[] array. By keeping
a separate list of what the conflicts are for a particular reg, we
reduce glean texCombine time 17.0% +/- 2.3% (n=5).
We don't expect this optimization to be useful for 915, which will
have a constant register set, but it would be useful if we were switch
to this register allocator for Mesa IR.
|
|
|
|
| |
Hopefully better than previous - this passes more mipgen tests
|
|
|
|
| |
border color is RGBA for samples - this passes texenv tests
|
|
|
|
|
| |
use introduced STATE_FB_WPOS_Y_TRANSFORM variable (thanks Marek)
this gets coords also right when using fbo
|
|
|
|
|
|
|
| |
Fixes texrect-many regression with ff_fragment_shader -- as we added
refs to the subsequent texcoord scaling paramters, the array got
realloced to a new address while our params[] still pointed at the old
location.
|
| |
|
|
|
|
| |
Fixes a VTK regression after adding GL_ARB_draw_instanced.
|
| |
|
|
|
|
|
| |
primcount is also a parameter to glMultiDrawElements(). Use numInstances
to avoid confusion between these things.
|
| |
|
| |
|
| |
|
|
|
|
|
|
| |
This uses a sampler view to access the texture with the alternate format.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
We just choose the texture format depending on the srgb decode bit
for the sRGB formats.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This implements the extension by choosing a different set of texture
fetch functions when the texture parameter changes.
Signed-off-by: Dave Airlie <[email protected]>
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|