| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is necessary because i965 will need to call vbo_bind_array() when
cleaning up after a buffer resolve meta-op.
Detailed Explanation
--------------------
The vbo module tracks vertex attributes separately from the gl_context.
Specifically, the vbo module maintins vertex attributes in
vbo_exec_context::array::inputs, which is synchronized with
gl_context::Array::ArrayObj::VertexAttrib by vbo_bind_array().
vbo_draw_arrays() calls vbo_bind_array() to perform the synchronization
before calling the real draw call, vbo_context::draw_arrays.
Intel hardware accomplishes buffer resolves with a meta-op. Frequently,
that meta-op must be performed within glDraw* in the moment immediately
before the draw occurs (The hardware designers hate us...). After
performing the meta-op, but before calling vbo_bind_array(), the
gl_context's vertex attributes will have been restored to their original
state (that is, their state before the meta-op began), but the vbo
module's vertex attribute are those used in the last meta-op. Therefore we
must manually synchronize the two with vbo_bind_array() before continuing
with the original draw command (that is, the one requested with glDraw*).
See brw_predraw_resolve_buffers(), which will be added in a future commit.
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This hook allows the driver to prepare for a glBegin/glEnd.
i965 will use the hook to avoid avoid recursive calls to FLUSH_VERTICES
during a buffer resolve meta-op.
Detailed Justification
----------------------
When vertices are queued during a glBegin/glEnd block, those vertices must
of course be drawn before any rendering state changes. To enusure this,
Mesa calls FLUSH_VERTICES as a prehook to such state changes. Therefore,
FLUSH_VERTICES itself cannot change rendering state without falling into
a recursive trap.
This precludes meta-ops, namely i965 buffer resolves, from occuring while
any vertices are queued. To avoid that situation, i965 must satisfy the
following condition: that it queues no vertex if a buffer needs resolving.
To satisfy this, i965 will use the PrepareExecBegin hook to resolve all
buffers on entering a glBegin/glEnd block.
--------
v2: Don't add dd_function_table::CleanupExecEnd. Anholt and I discovered
that hook to be unnecessary.
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In some cases, Intel hardware requires that depth and stencil buffers be
separate. To accommodate swrast, i965 resorts to hackery that causes
a segfault in the fastpaths of draw_depth_stencil_pixels() and
read_depth_stencil_pixels().
The hack is that i965 sets framebuffer->Attachment[BUFFER_DEPTH].Renderbuffer
and framebuffer->Attachment[BUFFER_STENCIL].Renderbuffer to a dummy
renderbuffer for which the GetRow accessors and friends are null. The real
buffers are located at framebuffer->_DepthBuffer and framebuffer->_Stencilbuffer.
To fix the segault, this patch skips the fastpath if
framebuffer->Attachment[BUFFER_DEPTH].Renderbuffer->GetRow is null.
Note: This is a candidate for the 7.11 branch.
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
| |
When i965 uses (in the near future) meta-ops to perform buffer resolves,
the meta-op stack exceeds depth 2. I bumped it to 8 because... 8 is bigger
than 2, but not too big.
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
If this flag is set, then _mesa_meta_begin/end will save/restore the state of
GL_SELECT and GL_FEEDBACK render modes.
Intel's future buffer resolve meta-ops will require this, since buffer resolves
may occur when the GL_RENDER_MODE is GL_SELECT.
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
| |
This is required in order for meta-ops to save/restore the GL_RENDER_MODE
state.
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I initially produced the patch using this bash command:
for file in {intel,i915,i965}/*.{c,cpp,h}; do [ ! -h $file ] && sed -i
's/GLboolean/bool/g' $file && sed -i 's/GL_TRUE/true/g' $file && sed -i
's/GL_FALSE/false/g' $file; done
Then I manually added #include <stdbool.h> to fix compilation errors,
and converted a few functions back to GLboolean that were used in core
Mesa's function pointer table to avoid "incompatible pointer" warnings.
Finally, I cleaned up some whitespace issues introduced by the change.
Signed-off-by: Kenneth Graunke <[email protected]>
Acked-by: Chad Versace <[email protected]>
Acked-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
It was previously under gpu_shader4, but I'm pretty sure everyone's
going to be doing GLSL 1.30 first (since gpu_shader4 is basically 1.30
plus a bunch of extra stuff).
Fixes piglit glsl-1.30/texel-offset-limits.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
When saving the active program in _mesa_meta_begin, it was actually
saving the fragment program instead. This means that if the
application binds a program that only has a vertex shader then when
the meta saved state is restored it will forget the bound program.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41969
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
| |
This is a step towards providing a direct route for drivers accepting
GLSL IR for codegen. Perhaps more importantly, it runs the fixed
function fragment program through the GLSL IR optimization. Having
seen how easy it is to make ugly fixed function texenv code that can
do unnecessary work, this may improve real applicatinos.
|
|
|
|
|
|
|
|
| |
On converting fixed function programs to generate GLSL, the linker
became cranky that we were trying to make something that wasn't a
linked vertex+fragment program. Given that the Mesa GLES2 drivers
also support desktop GL with EXT_sso, just telling the linker to shut
up seems like the easiest solution.
|
|
|
|
|
| |
These will be used by the FF VS/FS to represent the current attributes
when they don't have an active vertex array.
|
|
|
|
|
| |
This is a slight simplification on the way to actually generating GLSL
fragment shaders.
|
|
|
|
|
| |
Signed-off-by: Thomas Hellstrom <[email protected]>
Reviewed-by: Jakob Bornecrantz <[email protected]>
|
| |
|
| |
|
|
|
|
| |
Fix a typo that should result in the same code.
|
|
|
|
| |
TARGET was not defined, so make checked directory instead of file
|
|
|
|
| |
Dependency on target directory caused unnecessary relink. Remove them.
|
| |
|
|
|
|
|
|
|
|
|
| |
As pointed out by Michel Dänzer, gcc -lstdc++ doesn't work on all systems,
because it may require other libraries which are only pulled in implicitly
by g++. And libstdc++ is available only with GNU compiler.
Use c++ compiler for linking and remove redundant LDFLAGS += -lstdc++
all over the tree.
|
|
|
|
|
|
|
| |
Scalar instruction that need to write to the xyz components of a
register must reserve the RGB instruction slot for a REPL_ALPHA
instruction. With this commit, the scheduler will attempt to free
the RGB slot by moving the write to the w component of a register.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
save_CompressedTex(Sub)Image
Introuduce a simple function called copy_data to do the image data copy
stuff for all the save_CompressedTex*Image function. The function check
the NULL data case to avoid some potential segfault. This also would
make the code a bit simpler and less redundance.
Signed-off-by: Yuanhan Liu <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
It complicates more than it simplifies, now that there's only one negate
bit on TGSI registers.
|
|
|
|
|
|
|
| |
Instead of separate ifloor / fract calls.
No change for SSE4.1 code, but less FP<->SI conversions on non SSE4.1
systems.
|
|
|
|
|
|
|
|
| |
draw_elements_immediate"
This reverts commit 5506f6ef966b8883e575a3f60ce96ad42ee6ffd2.
It breaks more things than it fixes.
|
|
|
|
|
|
|
|
| |
Fix is in {read,draw}_depth_stencil_pixels(). If depthRb == stencilRb,
then it is redundant to check depthRb->x *and* stencilRb->x.
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For glReadPixels, the user supplied pixels have format
GL_UNSIGNED_INT_24_8. But, when the depthstencil buffer's format was
MESA_FORMAT_S8_Z24, the fastpath read from the buffer without reordering
the depth and stencil bits. To fix this, this patch just skips the
fastpath when the format is not MESA_FORMAT_Z24_S8.
The problem and fix for glWritePixels is analagous.
Fixes the Piglit tests below on i965/gen6 and causes no regressions.
general/depthstencil-default_fb-drawpixels-24_8
general/depthstencil-default_fb-readpixels-24_8
EXT_packed_depth_stencil/fbo-depthstencil-GL_DEPTH24_STENCIL8-drawpixels-24_8
EXT_packed_depth_stencil/fbo-depthstencil-GL_DEPTH24_STENCIL8-readpixels-24_8
Note: This is a candidate for the stable branches.
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
| |
This is required for an accurate implementation of d3d1x's
CheckFormatSupport query.
It also seems generally useful for state trackers, which could
choose alternative rendering paths or formats if blending would
come at a significant performance loss.
|
|
|
|
|
| |
The scheduler and the register allocator are now smart enough to handle
it.
|
|
|
|
|
|
|
|
|
|
|
| |
The texture semaphore allows for prefetching of texture data. On my
RV515, this increases the FPS of Lightsmark by 33% (This is with the
reg_rename pass enabled, which is enabled in the next commit).
There is a new env variable now called RADEON_TEX_GROUP, which allows
you to specify the maximum number of texture lookups to do at once.
The default is 8, but different values could produce better results
for various application / card combinations.
|
| |
|
| |
|
|
|
|
|
|
| |
We no longer emit full instructions immediately after they have been
merged. Instead merged instructions are added to the ready list and
the scheduler can commit them whenever it wants.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is supported by the pseudo-code on pages 27 and 28 (pages 41 and
42 of the PDF) of the OpenGL 2.1 spec. The last part of the
implementation of ArrayElement is:
if (generic attribute array 0 enabled) {
if (generic vertex attribute 0 array normalization flag is set, and
type is not FLOAT or DOUBLE)
VertexAttrib[size]N[type]v(0, generic vertex attribute 0 array element i);
else
VertexAttrib[size][type]v(0, generic vertex attribute 0 array element i);
} else if (vertex array enabled) {
Vertex[size][type]v(vertex array element i);
}
Page 23 (page 37 of the PDF) of the same spec says:
"Setting generic vertex attribute zero specifies a vertex; the
four vertex coordinates are taken from the values of attribute
zero. A Vertex2, Vertex3, or Vertex4 command is completely
equivalent to the corresponding VertexAttrib* command with an
index of zero."
Fixes piglit test attribute0.
NOTE: This is a candidate for stable branches.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
This should fix a bug added by f5bfe54a.
Might also fix:
https://bugs.freedesktop.org/show_bug.cgi?id=41715
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Don't allow any "CPU" buffers to be allocated by the pb_fenced
buffer manager, since we can't protect against failures during
buffer validation.
Also, add an extra slab buffer manager to allocate buffers from
the kernel if there is a failure to allocate from our big buffer pool.
The reason we use a slab manager for this, is to avoid allocating
many very small buffers from the kernel.
v2: Increased VMW_MAX_BUFFER_SIZE and fixed some comments.
Signed-off-by: Thomas Hellstrom <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
Reviewed-by: Jakob Bornecrantz <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Returns a configuration that makes the dri state-tracker-manager
throttle.
Also disable kernel-based throttling.
Signed-off-by: Thomas Hellstrom <[email protected]>
Reviewed-by: Jakob Bornecrantz <[email protected]>
|
|
|
|
|
|
|
|
| |
Hooks up throttling if there is a configuration function present and
it indicates that throttling is desired.
Signed-off-by: Thomas Hellstrom <[email protected]>
Reviewed-by: Jakob Bornecrantz <[email protected]>
|
|
|
|
|
|
|
|
| |
Adds a possibility for the state tracker manager to query the
target for a specific configuration.
Signed-off-by: Thomas Hellstrom <[email protected]>
Reviewed-by: Jakob Bornecrantz <[email protected]>
|
|
|
|
|
|
|
| |
Needed for throttling.
Signed-off-by: Thomas Hellstrom <[email protected]>
Reviewed-by: Jakob Bornecrant <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
But don't hook it up just yet until we figure out a good way to do that.
Also, we should, in the future, add driconf options to control what
throttling reasons should be honored, and the number of outstanding
swaps allowed.
Signed-off-by: Thomas Hellstrom <[email protected]>
Reviewed-by: Jakob Bornecrantz <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The X server has limited throttle support on the server side,
but doing this in the client has some benefits:
1) X server throttling is per client. Client side throttling can be done
per drawable.
2) It's easier to control the throttling based on what client is run,
for example using "driconf".
3) X server throttling requires drm swap complete events.
So implement a dri2 throttling extension intended to be used by direct
rendering clients.
Signed-off-by: Thomas Hellstrom <[email protected]>
Reviewed-by: Jakob Bornecrantz <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
If no pixels pass the clip test, return false.
|
| |
|
|
|
|
| |
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=41768
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change releases the stw_framebuffer::mutex past creation of
the pbuffer stw_framebuffer. Without this change the pbuffers
lock is never released. Since on win32 mutexes are recursive, this
does not hurt as long as all actions on a context are done from
the same thread. But if, for example, context creation happens in
a different thread than usage, every access to the context will
block for ever.
Signed-off-by: José Fonseca <[email protected]>
|