| Commit message | Author | Age | Files | Lines |
|
The _mesa_dlist_alloc() function is only guaranteed to return a pointer
with 4-byte alignment. On 64-bit systems which don't support unaligned
loads (e.g. SPARC or MIPS) this could lead to a bus error in the VBO code.
The solution is to add a new _mesa_dlist_alloc_aligned() function which
will return a pointer to an 8-byte aligned address on 64-bit systems.
This is accomplished by inserting a 4-byte NOP instruction in the display
list when needed.
The only place this actually matters is the VBO code where we need to
allocate a 'struct vbo_save_vertex_list' which needs to be 8-byte
aligned (just as if it were malloc'd).
The gears demo and others hit this bug.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88662
Cc: "10.4" <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
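
The padding scheme described in this message can be modelled in a few lines. This is a minimal sketch, not Mesa's implementation: the display list is modelled as a Python list of 4-byte slots, `NOP` and `alloc_aligned` are hypothetical names, and the start of the list is assumed to be 8-byte aligned.

```python
NOP = "NOP"          # hypothetical marker for a 4-byte no-op slot
SLOT_BYTES = 4       # each display-list slot is assumed to be 4 bytes wide


def alloc_aligned(dlist, payload_slots, align_bytes=8):
    """Append payload_slots slots so that the payload starts on an
    align_bytes boundary, inserting NOP slots first when needed.

    Returns the byte offset of the payload within dlist.
    """
    offset = len(dlist) * SLOT_BYTES
    while offset % align_bytes != 0:
        dlist.append(NOP)            # 4 bytes of padding
        offset += SLOT_BYTES
    dlist.extend(["payload"] * payload_slots)
    return offset


dlist = ["opcode"]                   # one 4-byte slot is already in the list
payload_offset = alloc_aligned(dlist, payload_slots=2)
print(payload_offset, dlist)
```

With 4-byte slots and 8-byte alignment, at most one NOP slot is ever inserted, which matches the single 4-byte NOP mentioned in the message.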
|
Fixes build with Windows SDK 7.0.7600.
Tested with u_atomic_test, both on x86 and x86_64.
Reviewed-by: Roland Scheidegger <[email protected]>
|
The intrinsics are universally available, whereas older Windows SDKs (e.g.
7.0.7600) don't have the non-intrinsic entrypoint.
Reviewed-by: Roland Scheidegger <[email protected]>
|
According to the SKL bspec the 3DSTATE_CONSTANT_* commands only take
effect on the next corresponding 3DSTATE_BINDING_TABLE_POINTER_*
command. This patch just makes it set the BRW_NEW_SURFACES state when
uploading the push constants to ensure the binding tables will be
updated.
This fixes the fbo-blending-formats Piglit test and possibly others.
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
When color buffers alone are concerned the depth is not needed.
No regression on BDW where meta blit is used instead of blorp. I
also disabled blorp temporarily for fbo-blits on IVB and saw no
regressions there either.
I also compared several graphics benchmarks on BDW and saw neither
regressions nor improvements.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
Implementing an idea from Ken, on i965 the shader program for 2D
blits becomes significantly simpler.
Before:
pln(8) g6<1>F g4<0,1,0>F g2<8,8,1>F { align1 1Q compacted };
pln(8) g7<1>F g4.4<0,1,0>F g2<8,8,1>F { align1 1Q compacted };
send(8) g2<1>UW g6<8,8,1>F
sampler (1, 0, 0, 1) mlen 2 rlen 4 { align1 1Q };
mov(8) g123<1>F g2<8,8,1>F { align1 1Q compacted };
mov(8) g124<1>F g3<8,8,1>F { align1 1Q compacted };
mov(8) g125<1>F g4<8,8,1>F { align1 1Q compacted };
mov(8) g126<1>F g5<8,8,1>F { align1 1Q compacted };
mov(8) g127<1>F g2<8,8,1>F { align1 1Q compacted };
nop ;
sendc(8) null g123<8,8,1>F
render RT write SIMD8 LastRT Surface = 0 mlen 5 rlen 0 { align1 1Q EOT };
After:
pln(8) g6<1>F g4<0,1,0>F g2<8,8,1>F { align1 1Q compacted };
pln(8) g7<1>F g4.4<0,1,0>F g2<8,8,1>F { align1 1Q compacted };
send(8) g124<1>UW g6<8,8,1>F
sampler (1, 0, 0, 1) mlen 2 rlen 4 { align1 1Q };
sendc(8) null g124<8,8,1>F
render RT write SIMD8 LastRT Surface = 0 mlen 4 rlen 0 { align1 1Q EOT };
v2 (Matt): Removed unintended white-space change
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
Currently all blit programs are unconditionally compiled with
gl_FragDepth.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
total instructions in shared programs: 5998190 -> 5997603 (-0.01%)
instructions in affected programs: 54276 -> 53689 (-1.08%)
helped: 293
Reviewed-by: Kenneth Graunke <[email protected]>
|
total instructions in shared programs: 5998321 -> 5998287 (-0.00%)
instructions in affected programs: 4520 -> 4486 (-0.75%)
helped: 8
Reviewed-by: Kenneth Graunke <[email protected]>
|
Reviewed-by: Kenneth Graunke <[email protected]>
|
the search
Reviewed-by: Kenneth Graunke <[email protected]>
|
This allows you to match on an unknown value but only if it is of a given
type. 90% of the uses of this are for matching only booleans, but adding
the generality of arbitrary types is no more complex.
nir_algebraic.py doesn't handle this yet but that's ok because the C
language will ensure that the default type on all variables is void.
Reviewed-by: Kenneth Graunke <[email protected]>
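
Type-restricted matching of an unknown value might look roughly like the following. This is only an illustrative sketch: `Var`, `Value`, and `match_var` are hypothetical names, types are modelled as plain strings, and none of it is taken from the NIR search code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Var:
    """Pattern variable; required_type=None matches a value of any type."""
    name: str
    required_type: Optional[str] = None


@dataclass(frozen=True)
class Value:
    """A value in the program being searched, tagged with its type."""
    ident: str
    type: str


def match_var(var, value, bindings):
    """Try to bind var to value; return True on success."""
    if var.required_type is not None and value.type != var.required_type:
        return False                       # wrong type: no match
    bound = bindings.setdefault(var.name, value)
    return bound == value                  # a variable must always bind the same value


bindings = {}
print(match_var(Var("a", "bool"), Value("ssa_1", "float"), bindings))
print(match_var(Var("a", "bool"), Value("ssa_2", "bool"), bindings))
```

Leaving `required_type` unset gives the untyped behaviour, which mirrors the message's point that an unspecified type places no restriction on the match.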
|
There are some algebraic transformations that we want to do but only if
certain things are constants. For instance, we may want to replace
a * (b + c) with (a * b) + (a * c) as long as a and either b or c is constant.
While this generates more instructions, some of it will get constant
folded.
nir_algebraic.py doesn't handle this yet, but that's ok because the C
language will make sure that false is the default for now.
Reviewed-by: Kenneth Graunke <[email protected]>
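
The constant-conditioned rewrite from this message can be sketched on a toy expression tree. The representation is an assumption made for illustration: nodes are tuples such as `("mul", a, b)`, constants are Python numbers, variables are strings, and `distribute`, `fold`, and `is_const` are hypothetical helpers rather than NIR functions.

```python
def is_const(expr):
    """Constants are modelled as plain Python numbers."""
    return isinstance(expr, (int, float))


def fold(op, x, y):
    """Constant-fold a binary node when both operands are constants."""
    if is_const(x) and is_const(y):
        return x * y if op == "mul" else x + y
    return (op, x, y)


def distribute(expr):
    """Rewrite a * (b + c) to (a * b) + (a * c), but only when a and at
    least one of b, c are constants; otherwise return expr unchanged."""
    if not (isinstance(expr, tuple) and expr[0] == "mul"):
        return expr
    _, a, rhs = expr
    if not (isinstance(rhs, tuple) and rhs[0] == "add"):
        return expr
    _, b, c = rhs
    if is_const(a) and (is_const(b) or is_const(c)):
        return fold("add", fold("mul", a, b), fold("mul", a, c))
    return expr


print(distribute(("mul", 2, ("add", 3, "x"))))    # one product folds to a constant
print(distribute(("mul", "y", ("add", 3, "x"))))  # left alone: a is not constant
```

In the first call the product of the two constants folds away, which is the case where the extra multiply pays for itself.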
|
This allows us to indicate a concept of an invalid type.
Reviewed-by: Kenneth Graunke <[email protected]>
|
Since the address reg holds integer values, ARL/ARR do an implicit float-to-int
conversion, so clarify that. Thus it is also incorrect to say that FLR really
does the same as ARL.
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
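
The distinction drawn in this message can be illustrated with a tiny model. It is a sketch under one assumption: ARL floors its operand before the float-to-int conversion. The functions `flr` and `arl` are hypothetical Python stand-ins, not TGSI code.

```python
import math


def flr(x):
    """FLR as modelled here: floor, with the result staying a float."""
    return float(math.floor(x))


def arl(x):
    """ARL as modelled here: floor, then convert to an integer, since the
    address register holds integer values."""
    return int(math.floor(x))


print(type(flr(-1.5)).__name__, flr(-1.5))
print(type(arl(-1.5)).__name__, arl(-1.5))
```

Both functions agree numerically, but only `arl` produces an integer, which is why FLR cannot be described as doing the same thing as ARL.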
|
We end up with these from TGSI-to-NIR because the pass generating the
comparisons doesn't know if the arg is actually a bool input or not. vc4
results:
total instructions in shared programs: 41801 -> 41508 (-0.70%)
instructions in affected programs: 4253 -> 3960 (-6.89%)
Reviewed-by: Matt Turner <[email protected]>
|
Reviewed-by: Jason Ekstrand <[email protected]>
|
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
This will be used by tgsi_to_nir, which needs to get vec4 types for
declaring shader input/output variables.
v2: Add a missing space.
Reviewed-by: Matt Turner <[email protected]> (v2)
Reviewed-by: Jason Ekstrand <[email protected]>
|
This patch advertises support for GL_OES_texture_*float* extensions
when using i965 drivers.
Signed-off-by: Kevin Rogovin <[email protected]>
Signed-off-by: Kalyan Kondapally <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
This patch adds the support needed to accept HALF_FLOAT_OES as a valid type
for TexImage*D and TexSubImage*D when the Texture Float extensions are supported.
Signed-off-by: Kevin Rogovin <[email protected]>
Signed-off-by: Kalyan Kondapally <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
This patch series adds support for the following GLES2 Texture Float extensions:
1) GL_OES_texture_float,
2) GL_OES_texture_half_float,
3) GL_OES_texture_float_linear,
4) GL_OES_texture_half_float_linear.
This patch adds the basic infrastructure and the boolean flags needed to advertise
support for these extensions; support is disabled by default. The next patch
in the series introduces support for the HALF_FLOAT_OES token.
v4: take assert away and make valid_filter_for_float conditional (Tapani),
fix the alphabetical order (Emil)
Signed-off-by: Kevin Rogovin <[email protected]>
Signed-off-by: Kalyan Kondapally <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
It now emits vector MOVs instead of a series of individual MOVs, which
should be useful to any vector backends. This pushes the problem of
src/dest aliasing of channels on a scalar chip to the backend, but if
there are any vector operations in your shader then you needed to be
handling this already.
Fixes fs-swap-problem with my scalarizing patches.
v2: Rename to insert_mov(), and add a comment about what it does.
v3: Rewrite the comment.
Reviewed-by: Connor Abbott <[email protected]> (v3)
|
The code was exactly the same, except util/ has c++ guards and a struct
simple_node declaration.
Reviewed-by: Marek Olšák <[email protected]>
|
The idea is that after a remove_from_list(), you might want to be able to
do a remove_from_list() on it again or an is_empty_list(). This is
apparently relied on by r300g.
Reviewed-by: Marek Olšák <[email protected]>
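
The behaviour described in this message can be sketched with a circular doubly-linked list in which a removed node is re-linked to itself. This is a model only: `Node` and `insert_at_head` are illustrative names, and the Python functions stand in for the C list helpers rather than reproduce them.

```python
class Node:
    """Node of a circular doubly-linked list; a lone node links to itself."""

    def __init__(self):
        self.prev = self
        self.next = self


def is_empty_list(head):
    return head.next is head


def insert_at_head(head, node):
    node.prev = head
    node.next = head.next
    head.next.prev = node
    head.next = node


def remove_from_list(node):
    node.next.prev = node.prev
    node.prev.next = node.next
    # Re-link the removed node to itself so that a second remove_from_list()
    # or an is_empty_list() on it is harmless.
    node.prev = node
    node.next = node


head, item = Node(), Node()
insert_at_head(head, item)
remove_from_list(item)
remove_from_list(item)               # safe: item only points at itself
print(is_empty_list(head), is_empty_list(item))
```

Without the two re-linking assignments, the second removal would write through stale `prev`/`next` pointers into a list the node no longer belongs to.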
|
We have two copies of it in the tree; I'm going to delete one.
Reviewed-by: Marek Olšák <[email protected]>
|
v2:
- Only emit write SPI_TMPRING_SIZE once per packet.
- Use context global scratch buffer.
v3:
- Patch shaders using WRITE_DATA packet instead of map/unmap.
- Emit ICACHE_FLUSH, CS_PARTIAL_FLUSH, PS_PARTIAL_FLUSH, and
VS_PARTIAL_FLUSH when patching shaders.
v4:
- Code cleanups.
- Remove unnecessary multiplies.
v5:
- Patch shaders in system memory and re-upload to vram.
Reviewed-by: Michel Dänzer <[email protected]>
|
This moves scratch buffer allocation from si_launch_grid() to
si_create_compute_state(). This helps to reduce the overhead of
launching a kernel and also fixes a bug in the code that would cause
the scratch buffer to be too small if a kernel with smaller scratch size
was launched before a kernel with a larger scratch size.
Reviewed-by: Michel Dänzer <[email protected]>
|
Reviewed-by: Michel Dänzer <[email protected]>
|
Reviewed-by: Michel Dänzer <[email protected]>
|
Reviewed-by: Michel Dänzer <[email protected]>
|
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
This reverts commits d6eb572905e39c36168b8f5da240af961f9dde0a and
58e8468d113c7d3d4a59ea4a8d70fd45b78e85e6.
This is no longer necessary as we aren't using it in NIR anymore. Also, it
broke the build on some strange systems so let's put it back in querymatrix
where it came from.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88852
Acked-by: Matt Turner <[email protected]>
|
This reverts commit d7d340fb2f68c46bd5a0008ecf53c6693e29c916.
We have an isnormal() implementation available, the only problem was that
we had the wrong return type (fixed in a later patch).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88806
Acked-by: Matt Turner <[email protected]>
|
The problem is that the fallbacks we have at the moment don't work in C++.
While we could theoretically fix the fallbacks it would also raise the
issue of correctly detecting the fpclassify function. So, for now, we'll
just disable it until we actually have a C++ user.
Reported-by: Tom Stellard <[email protected]>
Tested-by: Tom Stellard <[email protected]>
Tested-by: EdB <[email protected]>
|
Signed-off-by: Sven Arvidsson <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=87076
Signed-off-by: Marek Olšák <[email protected]>
|
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88806
Reviewed-by: Ian Romanick <[email protected]>
|
Reviewed-by: Ian Romanick <[email protected]>
|
I haven't actually seen this bug in the wild, but it's possible that
someone could ask to do a S3TC PBO download or something. This protects us
from accidentally creating a render target with a compressed or otherwise
non-renderable format.
Reviewed-by: Kenneth Graunke <[email protected]>
|
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88792
Reviewed-by: Kenneth Graunke <[email protected]>
|
This patch adds two error messages that point the user directly to a
misspelled or impossible swizzle field for a format.
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
[ Francisco Jerez: As discussed on the mailing list, this is intended
to produce more useful debug output in cases where the compilation
terminates unexpectedly. ]
Reviewed-by: Francisco Jerez <[email protected]>
|
[ Francisco Jerez: As we're at it make debug_options[] local to its
only user and remove temporary. ]
Reviewed-by: Francisco Jerez <[email protected]>
|
GLSL 1.50 specifies a fragment shader may have a primitive id
input without a geometry shader present.
On r600 hw there is a special GS scenario for this, you have
to enable GS_SCENARIO_A and pass the primitive id through
the vertex shader which operates in GS_A mode.
This is a first pass attempt at this, and passes the piglit
tests that test for this.
v1.1: clean up debug print + no need to assign
key value to setup output.
v2: add r600 support
Reviewed-by: Glenn Kennard <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
In order to detect that a pixel shader has a prim id
input when we have no geometry shader we need to reorder
the shader selection so the pixel shader is selected
first, then the vertex shader key can take into account
the primitive id input requirement and lack of geom shader.
Reviewed-by: Glenn Kennard <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
Fixes reading beyond allocated memory:
==1936== Invalid read of size 1
==1936== at 0x4C2C1B4: strlen (vg_replace_strmem.c:412)
==1936== by 0x9E00C30: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.20)
==1936== by 0x5B44FAE: clover::compile_program_llvm(clover::compat::string const&, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&, pipe_shader_ir, clover::compat::string const&, clover::compat::string const&, clover::compat::string&) (invocation.cpp:698)
==1936== by 0x5B39A20: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63)
==1936== by 0x5B20152: clBuildProgram (program.cpp:182)
==1936== by 0x400F41: main (hello_world.c:109)
==1936== Address 0x56fee1f is 0 bytes after a block of size 15 alloc'd
==1936== at 0x4C28C20: malloc (vg_replace_malloc.c:296)
==1936== by 0x5B398F0: alloc (compat.hpp:59)
==1936== by 0x5B398F0: vector<std::basic_string<char> > (compat.hpp:98)
==1936== by 0x5B398F0: string<std::basic_string<char> > (compat.hpp:327)
==1936== by 0x5B398F0: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63)
==1936== by 0x5B20152: clBuildProgram (program.cpp:182)
==1936== by 0x400F41: main (hello_world.c:109)
Reviewed-by: Francisco Jerez <[email protected]>
|
Fixes writing beyond the allocated buffer:
==31855== Invalid write of size 1
==31855== at 0x50AB2A9: vsprintf (iovsprintf.c:43)
==31855== by 0x508F6F6: sprintf (sprintf.c:32)
==31855== by 0xB59C7EC: r600_get_compute_param (r600_pipe_common.c:526)
==31855== by 0x5B2B7DE: get_compute_param<char> (device.cpp:37)
==31855== by 0x5B2B7DE: clover::device::ir_target() const (device.cpp:201)
==31855== by 0x5B398E0: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63)
==31855== by 0x5B20152: clBuildProgram (program.cpp:182)
==31855== by 0x400F41: main (hello_world.c:109)
==31855== Address 0x56fed5f is 0 bytes after a block of size 15 alloc'd
==31855== at 0x4C29180: operator new(unsigned long) (vg_replace_malloc.c:324)
==31855== by 0x5B2B7C2: allocate (new_allocator.h:104)
==31855== by 0x5B2B7C2: allocate (alloc_traits.h:357)
==31855== by 0x5B2B7C2: _M_allocate (stl_vector.h:170)
==31855== by 0x5B2B7C2: _M_create_storage (stl_vector.h:185)
==31855== by 0x5B2B7C2: _Vector_base (stl_vector.h:136)
==31855== by 0x5B2B7C2: vector (stl_vector.h:278)
==31855== by 0x5B2B7C2: get_compute_param<char> (device.cpp:35)
==31855== by 0x5B2B7C2: clover::device::ir_target() const (device.cpp:201)
==31855== by 0x5B398E0: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63)
==31855== by 0x5B20152: clBuildProgram (program.cpp:182)
==31855== by 0x400F41: main (hello_world.c:109)
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Tom Stellard <[email protected]>
|