| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Now that we can handle it both for sampling and as depth/stencil enable it.
Passes nearly all additional piglit tests which are now performed, with two
exceptions (one being a framebuffer blit which fails for all other formats
including stencil too as we don't support stencil blits, the other reporting
a unexpected GL error so doesn't look to be llvmpipe's fault).
|
|
|
|
|
|
|
| |
We need to split up the depth and stencil values in this case, and there's
some new logic required to handle float depth and stencil simultaneously.
Also make sure we get the 64bit zs clear values and masks propagated
correctly.
|
|
|
|
|
|
|
|
|
| |
We do rendering to linear color buffers for quite some time, and since
switching to linear depth buffers all the tiled/linear logic was unused.
So get rid of (most) of it - there's still some LAYOUT_NONE things and
late allocation of resources which probably could be simplified.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The code avoided first_layer parameter in the sampler interface (and needing
to do another calculation at runtime) by fixing up the base texture pointer
instead. Unfortunately, this didn't actually work as we have mip-first
texture layout so fixing up the base ptr by a fixed amount is very wrong if
there are mipmaps present. The wrong offsets caused misrendering and crashes.
Fix this by just adjusting the individual mip level offsets instead.
Spotted by Jose.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change was meant as a stepping stone to use PMADDUBSW SSSE3
instruction, but actually this refactoring by itself yields a 10%
speedup on texture intensive shaders (e.g, Google Earth's ocean water
w/o S3TC on a Ivy Bridge machine), while giving yielding exactly the
same results, whereas PMADDUBSW only gave an extra 5%, at the expense of
2bits of precision in the interpolation.
I belive that the speedup of this change comes from the reduced register
pressure (as 8.8 fixed point intermediates take twice the space of 8bit
unorm).
Also, not dealing with 8.8 simplifies lp_bld_sample_aos.c code
substantially -- it's no longer necessary to have code duplicated for
low and high register halfs.
Note about lp_build_sample_mipmap(): the path for num_quads > 1 is never
executed (as it is faster on AVX to split the 256bit wide texture
computation into two 128bit chunks, in order to leverage integer
opcodes). This path might be useful in the future, so in order to
verify this change did not break that path I had to apply this change:
@@ -1662,11 +1662,11 @@ lp_build_sample_soa(struct gallivm_state *gallivm,
/*
* we only try 8-wide sampling with soa as it appears to
* be a loss with aos with AVX (but it should work).
* (It should be faster if we'd support avx2)
*/
- if (num_quads == 1 || !use_aos) {
+ if (/* num_quads == 1 || ! */ use_aos) {
if (num_quads > 1) {
if (mip_filter == PIPE_TEX_MIPFILTER_NONE) {
LLVMValueRef index0 = lp_build_const_int32(gallivm, 0);
/*
and then run texfilt mesademo:
LP_NATIVE_VECTOR_WIDTH=256 ./texfilt
Ran whole piglit without regressions.
Reviewed-by: Roland Scheidegger <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Pass in the size of the index buffer, when available, and use it
to handle out of bounds conditions. The behavior in the case of
an overflow needs to be the same as with other overflows in the
vertex processing pipeline meaning that a vertex should still
be generated but all attributes in it set to zero.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We would crash when stride was bigger than the size of the buffer.
The correct behavior is to just fetch zero's in this case.
Unfortunatly with user_buffer's there's no way to validate the size
because currently we're just not getting it. Adjust the draw interface
to pass the size along the mapped buffer, which works perfectly
for buffer backed vertex_buffers and, in future, it will allow
us to plumb user_buffer sizes through the same interface.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
v2: fix typo 65535 -> 65536
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It should be unsigned, not enum pipe_flush_flags.
Fixed a build error:
src/gallium/state_trackers/egl/android/native_android.cpp:426:29: error:
invalid conversion from 'int' to 'pipe_flush_flags' [-fpermissive]
v2: replace all occurrences of enum pipe_flush_flags by unsigned
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
[olv: document the parameter now that the type is unsigned]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Eliminating this we no longer need to copy between linear and swizzled layout.
This is probably not quite ideal since it's a bit more work for now, could do
some optimizations by moving depth testing outside the fragment shader loop
(but tricky for early depth test as we don't have neither the mask nor the
interpolated z in the right order handy).
The large amount of tile/untile code is no longer needed will be deleted
in next commit.
No piglit regressions.
v2: change a forgotten LAYOUT_NONE to LAYOUT_LINEAR.
v3: fix (bogus) uninitialized variable warnings, add comments, fix a bad type
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
That is, when llvmpipe is run in single-threaded mode.
Trivial.
Tested with
LP_NUM_THREADS=0 glean --run results --overwrite --quick --tests occluQry
|
|
|
|
|
|
|
| |
Fixes a crash when one of the so targets is null.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
On the mesa-users list, Burlen Loring reported a speed-up with 16 cores
and his test/app.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
half_pixel_center.
Squashed commit of the following:
commit 04c5fa2cbb8e89d6f2fa5a75af1cca03b1f6b852
Author: José Fonseca <[email protected]>
Date: Tue Apr 23 17:37:18 2013 +0100
gallium: s/lower_left_origin/bottom_edge_rule/
commit 4dff4f64fa83b9737def136fffd161d55e4f1722
Author: José Fonseca <[email protected]>
Date: Tue Apr 23 17:35:04 2013 +0100
gallium: Move diagram to docs.
commit 442a63012c8c3c3797f45e03f2ca20ad5f399832
Author: James Benton <[email protected]>
Date: Fri May 11 17:50:55 2012 +0100
gallium: Replace gl_rasterization_rules with lower_left_origin and half_pixel_center.
This change is necessary to achieve correct results when using OpenGL
FBOs.
Reviewed-by: Marek Olšák <[email protected]>
|
| |
|
|
|
|
|
|
|
| |
Because we don't support, and the u_format fallback doesn't work for
zs formats.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
Prevents assertion failures inside the driver for such state combinations.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Zack Rusin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is the only sane solution for nv50 and nvc0 (really, trust me),
but since on other hardware the border colour is tightly coupled with
texture state they'd have to undo the swizzle, so I've added a cap.
The dependency of update_sampler on the texture updates was
introduced to avoid doing the apply_depthmode to the swizzle twice.
v2: Moved swizzling helper to u_format.c, extended the CAP to
provide more accurate information.
|
|
|
|
|
|
|
| |
Tested with graw/fs-fragcoord 2/3, and piglit
glsl-arb-fragment-coord-conventions.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
| |
No longer used.
If we ever want the old behavior we can run a loop unroller pass.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
Never used.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a basic implementation of the pipeline statistics in the
draw module. The interface is similar to the stream output statistics
and also requires that the callers explicitly enable it.
Included is the implementation of the interface in llvmpipe and
softpipe. Only softpipe enables the pipeline statistics capability
though because llvmpipe is lacking gathering of the fragment shading
and rasterization statistics.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We were missing the implementation of PIPE_QUERY_SO_STATISTICS
query, this change implements it on top of the existing
facilities.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
| |
At least on llvm 3.2 this appears to work fine. Tested on an Athlon XP
2600+, which has sse and 3dnow but not sse2.
Reviewed-by: Jose Fonseca <[email protected]>
Signed-off-by: Adam Jackson <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When geometry shaders are present, one needs to be able to create
an empty geometry shader with stream output that needs to be
resolved later and attached to the currently bound vertex shader.
Lets add support for it to llvmpipe and draw. draw allows attaching
independent stream output info to any vertex shader and llvmpipe
resolves at draw time which vertex shader the given empty geometry
shader should be linked to.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We need to reset the internal state of the so buffers or we'll
keep appending even though we're not supposed to.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we're drawing to a surface that's 2048 x 2048 pixels or larger there's
danger of fixed-point overflow in the triangle rasterization code. That
leads to various rendering glitches.
Rather than implement some intricate changes to the rasterization code,
simply subdivide triangles into smaller subtriangles to avoid the issue.
Only do this when the drawing surface is larger than 2048 by 2048.
Reviewed-by: José Fonseca <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
| |
We weren't correctly propagating the samplers and sampler views
when they were related to geometry shaders.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This commits implements code generation of the geometry shaders in
the SOA paths. All the code is there but bugs are likely present.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
| |
Fixes assign instead of compare defects reported by Coverity.
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The blit-based paths for TexImage, GetTexImage, and ReadPixels aren't very
fast with software rasterizer. Now Gallium drivers have the ability to turn
them off.
Reviewed-by: Brian Paul <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
New conversion code to handle conversion from/to r11g11b10 AoS to/from
SoA floats, and also add code for conversion from rgb9e5 AoS to float SoA
(which works pretty much the same as r11g11b10 except for the packing).
(This code should also be used for texture sampling instead of
relying on u_format conversion but it's not yet, so rgb9e5 is unused.)
Unfortunately a crazy amount of hacks is necessary to get the conversion
code running in llvmpipe's generate_unswizzled_blend, which isn't well
suited for formats where the storage representation has nothing to do
with what's needed for blending (moreover, the conversion will convert
from packed AoS values, which is the storage format, to float SoA values,
because this is much more natural for the conversion, and likewise from
SoA values to packed AoS values - but the "blend" (which includes
trivial things like partial mask) works on AoS values, so incoming fs
values will go SoA->AoS, values from destination will go packed
AoS->SoA->AoS, then do blend, then AoS->SoA->packed AoS which probably
isn't the most efficient way though the shuffles are probably bearable).
Passes piglit fbo-blending-formats (with GL_EXT_packed_float parameter),
still need to verify Inf/NaNs (where most of the complexity in the
conversion comes from actually).
v2: drop the (very bogus) rgb9e5 part, and do component extraction
in the helper code for r11g11b10 to float conversion, making the code
slightly more compact (suggested by Jose), now that there are no other
callers left this works quite well. (Could do the same for the
opposite way but it's less than ideal there, final part of packing
needs to be done in caller anyway and there'd be another conditional.)
v3: minor style and comment fixes. Also fix a potential issue with
negative zero being potentially returned by max(src, zero) as we
don't have well-defined min/max behavior (fortunately no additonal cost).
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes it possible to identify gl_TexCoord and gl_PointCoord
for drivers where sprite coordinate replacement is restricted.
The new PIPE_CAP_TGSI_TEXCOORD decides whether these varyings
should be hidden behind the GENERIC semantic or not.
With this patch only nvc0 and nv30 will request that they be used.
v2: introduce a CAP so other drivers don't have to bother with
the new semantic
v3: adapt to introduction gl_varying_slot enum
|
|
|
|
|
|
|
| |
instead just warn when creating the surface, rendering will simply happen
to first layer.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Trivial. Matches softpipe's code.
|
|
|
|
|
|
|
| |
It fails on 32-bit systems (I only tested on 64-bit). Power of two
size isn't required, so just remove the assertion.
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
| |
Use 0 instead.
Reviewed-by: Roland Scheidegger <[email protected]>
|
| |
|
|
|
|
|
|
| |
Note: This is a candidate for the stable branches.
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We advertise a max texture/surfaces size of 8K x 8K but the old values
for these limits didn't actually allow us to handle that surface size.
For 8K x 8K we'll have 16384 bins. Each bin needs at least one cmd_block
object which was 2192 bytes in size. Since 16384 * 2192 exceeded
LP_SCENE_MAX_SIZE we'd silently fail in lp_scene_new_data_block() and not
draw the complete scene.
By reducing CMD_BLOCK_MAX to 29 we get nice 512-byte cmd_blocks. And
by increasing LP_SCENE_MAX_SIZE to 9 MB we can allocate enough command
blocks for 8K x 8K, plus a few regular data blocks.
Fixes the (improved) piglit fbo-maxsize test.
Note: This is a candidate for the stable branches.
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since c8eb2d0e829d0d2aea6a982620da0d3cfb5982e2 llvmpipe checks if it's
actually legal to create a surface. The opengl state tracker doesn't quite
obey this so for now just warn instead of assert.
Also warn instead of disabled assert when creating sampler views
(same reasoning).
Addresses https://bugs.freedesktop.org/show_bug.cgi?id=61647.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
texel offsets should have been the last missing feature for 130, and in
fact 140 as well (last there were texture buffers). In any case we still
don't do OpenGL 3.0 (missing MSAA which will be difficult,
plus EXT_packed_float, ARB_depth_buffer_float and EXT_framebuffer_sRGB).
v2: bump to 140 instead - we have everything except we crash when not writing
to gl_Position (but softpipe crashes as well) so let's just say this is a bug
instead. Also (by Dave Airlie's suggestion) update llvm-todo.txt.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Now that buffers can be used as textures or render targets
make sure they aren't skipped.
Fix suggested by Jose Fonseca.
v2: added a couple of assertions so we can actually guarantee
we check the resources and don't skip them. Also added some comments
that this is actually a lie due to the way the opengl buffer api works.
|