| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch add the support of gl_PointCoord gl builtin variable for
platform gen4 and gen5(ILK).
Unlike gen6+, we don't have a hardware support of gl_PointCoord, means
hardware will not calculate the interpolation coefficient for you.
Instead, you should handle it yourself in sf shader stage.
But badly, gl_PointCoord is a FS instead of VS builtin variable, thus
it's not included in c.vue_map generated in VS stage. Thus the current
code doesn't aware of this attribute. And to handle it correctly, we
need add it to c.vue_map manually to let SF shader generate the needed
interpolation coefficient for FS shader. SF stage has it's own copy of
vue_map, thus I think it's safe to do it manually.
Since handling gl_PointCoord for gen4 and gen5 platforms is somehow a
little special, I added a lot of comments and hope I didn't overdo it ;)
v2: add a /* _NEW_BUFFERS */ comment to note the state flag dependency
and also add the _NEW_BUFFERS dirty mask (Eric).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45975
Piglit: glsl-fs-pointcoord and fbo-gl_pointcoord
NOTE: This is a candidate for stable release branches.
Signed-off-by: Yuanhan Liu <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have to do fallback when the 'Clipped Drawing Rectangle X/Y Max'
exceed the hardware's limit no matter the drawing rectangle offset
changed or not.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=46665
NOTE: This is a candidate for stable release branches.
Signed-off-by: Yuanhan Liu <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's a GNU extension that isn't supported by clang right now:
http://gcc.gnu.org/onlinedocs/gcc-4.6.3/gcc/Nested-Functions.html
http://clang.llvm.org/docs/UsersManual.html#c_unimpl_gcc
With this, clang now compiles the nouveau classic driver.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44061
(Types changed from e.g. 'unsigned char' to 'GLubyte' so that the types can
be concatenated to form a unique function name without any whitespace
interfering.)
[ Francisco Jerez: give meaningful names to the dispatch functions. ]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's even a comment in the code containing the right swizzling
computations!
Previously this has not been noticed because we need to manually
enabled swizzling on snb/ivb (kernel 3.4 will do that) and we
don't use the separate stencil on ilk (where the bios enables
swizzling). This fixes
piglit ./bin/fbo-stencil readpixels GL_DEPTH32F_STENCIL8 -auto
on recent drm-intel-next kernels.
Also remove the comment about ivb, it's stale now.
Swizzling detection is done by allocating a temporary x-tiled
buffer object. Unfortunately kernels before v3.2 lie on snb/ivb
because they claim that swizzling is enable, but it isn't. The
kernel commit that fixes this for backport to pre-v3.2 is
commit acc83eb5a1e0ae7dbbf89ca2a1a943ade224bb84
Author: Daniel Vetter <[email protected]>
Date: Mon Sep 12 20:49:16 2011 +0200
drm/i915: fix swizzling on gen6+
But if the kernel doesn't lie, this now works on swizzling and
not swizzling machines.
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
That is when building with --disable-opengl.
Fix for commit cb045880b113b0042d8dfb7e4cdf76e6cc76c1d1.
CC: Paul Berry <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
| |
That is when building with --disable-opengl.
Fix for commit c5f4024a793f1209b1693aed9a46be9374ba4741.
CC: Chad Versace <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current code would ignore the point size specified by gl_PointSize
builtin variable in vertex shader on Pineview. This patch servers as
fixing that.
This patch fixes the following issues on Pineview:
webglc: https://cvs.khronos.org/svn/repos/registry/trunk/public/webgl/sdk/tests/conformance/rendering/point-size.html
piglit: glsl-vs-point-size
NOTE: This is a candidate for stable release branches.
v2: pick Eric's nice tip for fixing this issue in hardware rendering.
v3: the last arg of EMIT_ATTR specify the size in _byte_. (Eric)
Signed-off-by: Yuanhan Liu <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This is a direct port of the i915 patch in
a856da63247a4b403f6350914f732e14d1530ed1.
Fixes glean's pbo test.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41372
Reviewed-by: Eric Anholt <[email protected]>
NOTE: This is a candidate for release branches.
|
|
|
|
|
|
|
|
|
|
|
| |
We were looking at the size of batch.map for how big the batchbuffer
was, but on 865 we just use a single-page batchbuffer due to hardware
limits.
v2: Removed check for sizeof map < bo->size, since that's always false.
[change by anholt]
NOTE: This is a candidate for release branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41495
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to prevent an overflow of the batch buffer when emitting
triangles, we need to limit the initial primitive to fit within the
current batch. To do we need to measure the remaining space and thence
compute the maximum number of vertices that fit into that space.
Reported-by: Kurt Roeckx <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41495
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
NOTE: This is a candidate for release branches.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The hardware, like i915, uses an inclusive bounds on min and max for
the drawing rectangle, but we were providing a number for exclusive.
The number of bits used by the hardware only covers this value going
up to the maximum size, so when we programmed 2048 as the maximum
inclusive X, it saw a maximum X of 0 and clipped all rendering. This
caused rendering failures in gnome-shell.
Fixes piglit fbo-maxsize.
v2: dropped changes to the blitter, which does use an exclusive x2, y2.
[change by anholt]
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45558
Reviewed-by: Eric Anholt <[email protected]>
NOTE: This is a candidate for release branches.
|
|
|
|
| |
swtnl doesn't handle this extension.
|
|
|
|
|
|
|
|
|
| |
This is a direct port of fc4fba52cf7e9616c70dd76b4d6bdba6582e157b from
i915, and fixes GPU hangs when running piglit.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41372
Reviewed-by: Eric Anholt <[email protected]>
NOTE: This is a candidate for release branches.
|
|
|
|
|
|
|
|
|
|
|
| |
We currently don't support gl_PrimitiveID, and I believe asking the
hardware to generate it results in vertex cache invalidations.
This could result in slowdowns for applications that use gl_InstanceID,
which would be counter-productive. Just turn it off for now.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
visit(ir_variable *) sets dst_reg::writemask to the appropriate channel
for system values. Unfortunately, visit(ir_dereference_variable *) then
calls swizzle_for_size, which for a float, sets the swizzle to .x.
This works for gl_VertexID, since we store it in the .x component (see
brw_draw_upload.c:732 - VID), but fails for gl_InstanceID (IID) since we
store it in the .y channel.
To fix this, avoid calling swizzle_for_size on ir_var_system_values.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Kernels prior to 271d81b84171d84723357ae6d172ec16b0d8139c (March 2011)
don't support relocations outside of the target buffer object. Rather
than guarding this with a I915_PARAM_HAS_RELAXED_DELTA check, just
smash the bound to 0xfffff001 like we do on Ironlake.
This effectively gives us no upper bound check, just like we did prior
to commit 271d81b84171d84723357ae6d172ec16b0d8139c.
Daniel Vetter would also like to mention that this relies on the guard
page at the end of the GTT.
NOTE: This is a candidate for release branches.
Fixes a regression since 271d81b84171d84723357ae6d172ec16b0d8139c.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=46766
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Daniel Vetter <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
All users of the shine table outside of the tnl module
are gone. Move the implementation into the tnl module and
prefix the public functions with _tnl.
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Mathias Froehlich <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We need to allocate new space every time to avoid blocking on the last
HiZ op completing. There are two easy ways to do this:
brw_state_batch() and intel_upload_data(). brw_state_batch() is
simpler and avoids another buffer allocation.
Improves Unigine Tropics performance 0.376416% +/- 0.148722% (n=7).
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
| |
|
| |
|
| |
|
|
|
|
|
| |
Define new MAX_VIEWPORT_WIDTH/HEIGHT and MAX_RENDERBUFFER_SIZE values
instead.
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 980f6f1 (mesa: move gl_texture_image::Width/Height/DepthScale
fields to swrast) moved the initialization of the Width, Height, and
DepthScale fields to _swrast_alloc_texture_image_buffer(). However,
i915 doesn't call this function because it performs its own buffer
allocation. As a result, the Width, Height, and DepthScale fields
weren't getting initialized properly, and some operations requiring
swrast would fail.
This patch ensures that Width, Height, and DepthScale are properly
initialized by separating the code that sets them into a new function,
_swrast_init_texture_image(), which is called by
intel_alloc_texture_image_buffer() as well as
_swrast_alloc_texture_image_buffer(). It also moves the
initialization of _IsPowerOfTwo into this function.
Fixes piglit test fbo/fbo-cubemap on i915.
Partially fixes https://bugs.freedesktop.org/show_bug.cgi?id=41216
This is a candidate for the 8.0 branch.
Reviewed-and-tested-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
To indicate support for the format query.
Reviewed-by: Kristian Høgsberg <[email protected]>
Signed-off-by: Jesse Barnes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
GBM needs the buffer format in order to communicate with DRM and clients
for things like scanout.
So track the DRI format requested in the various back ends and use it to
return the DRI format back to GBM when requested. GBM will then map
this into the GBM surface type (which is in turn based on the DRM fb
format list).
Signed-off-by: Jesse Barnes <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Tested-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
In the gen6 GS case, we were under-counting and so other state would
get smashed. In the VS case, we were over-counting, so everything was
fine.
Reviewed-by: Kenneth Graunke <[email protected]>
Tested-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This was copy and paste from the VS where I had similar code. We're
only looking at things derived from BRW_NEW_VERTEX_PROGRAM in this
block.
Reviewed-by: Kenneth Graunke <[email protected]>
Tested-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
I obviously didn't test on gen6 before pushing.
Reviewed-by: Kenneth Graunke <[email protected]>
Tested-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes GPU hangs in OilRush, Trine, and Amnesia: The Dark Descent,
which all use MRT (multiple render targets).
NOTE: This is a candidate for release branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38720
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=40059
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45216
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
It was concerned that the 4 pad bytes on LP64 were uninitialized.
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Improves VS state change microbenchmark performance by 7.08729% +/-
1.22289% (n=10) on gen7, because we don't upload the 64 dwords of
unused binding table any more.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
This is a step toward making the samplers/binding tables reflect
sampler uniform mappings instead of embedding those in the programs.
No significant performance difference on the microbenchmark (n=10).
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Improves VS state change microbenchmark performance 2.38246% +/-
1.15046% (n=20).
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
We always say no. Improves VS state change microbenchmark performance
7.68747% +/- 1.40826% (n=10).
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Improves VS state change microbenchmark performance 1.78817% +/-
0.556878% (n=25).
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
With this and the previous patch, 640x480 nexuiz is running 0.169118%
+/- 0.0863696% faster (n=121). On a VS state change microbenchmark,
performance is increased 8.28645% +/- 0.460478% (n=52).
v2: Fix CACHE_NEW_VS comment.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
This reduces recomputation of state based on non-clipping-related
transform changes, and is a step toward removing VUE map
recomputation.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
If you're resorting to the dummy shader, you've probably already turned
off SIMD16 mode. But if you didn't, it would die in a fire.
We could either fail to compile in SIMD16 mode...or just fix it.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
The dummy FB write failed to specify EOT and a message length, causing
the GPU to hang. Now we can enjoy "everyone's favorite color" again.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's a serious trap for drivers: RenderTexture() does not indicate
that the texture is currently bound to the draw buffer, despite
FinishRenderTexture() signaling that the texture is just now being
unbound from the draw buffer.
We were acting as if RenderTexture() *was* the start of rendering and
that we could make texturing incoherent with the current contents of
the renderbuffer. This caused intel oglconform sRGB
Mipmap.1D_textures to fail, because we got a call to TexImage() and
thus RenderTexture() on a texture bound to a framebuffer that wasn't
the draw buffer, so we skipped validating the new image into the
texture object used for rendering.
We can't (easily) make RenderTexture() indicate the start of drawing,
because both our driver and gallium are using it as the moment to set
up the renderbuffer wrapper used for things like MapRenderbuffer().
Instead, postpone the setup of the workaround render target miptree
until update_renderbuffer time, so that we no longer need to skip
validation of miptrees used as render targets. As a bonus, this
should make GL_NV_texture_barrier possible.
(This also fixes a regression in the gen4 small-mipmap rendering since
3b38b33c1648b07e75dc4d8340758171e109c598, which switched
set_draw_offset from image->mt to irb->mt but didn't move the irb->mt
replacement up before set_draw_offset).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44961
NOTE: This is a candidate for the 8.0 branch.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I recently discovered this text in the BSpec. It seems wise to comply,
though I haven't observed it to fix anything yet.
Fixes a regression in glean/fbo since 28cfa1fa213fe.
NOTE: This is a candidate for stable release branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45221
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit dc7f449d1ac53a66e6efb56ccf2a5953418a26ca introduced a new method
for avoiding MOVs: try to rewrite the destination of the instruction
that produced the RHS so it writes into the LHS.
Unfortunately, this is not safe for swizzled texturing operations, as
they return a set of four contiguous registers. Consider the following:
(assign (x)
(var_ref vec_ctor_x)
(swiz x (tex vec4 (var_ref m_sampY) (var_ref m_cordY) 0 1 ())))
In this case, the source and destination registers are equal, since
reg_offset is 0 for both. Yet, this is only a partial move: the texture
operation generates four registers, and the LHS only covers one.
Fixes color distortion in XBMC when using GLSL shaders.
NOTE: This is a candidate for the 8.0 branch (with the previous commit).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=44333
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
|