| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Resolving a multisampled depth texture into
a single sampled texture is supported on >= SM4.1
hw. It is possible some previous hw support it.
The ability was tested on radeonsi and nvc0.
Apparently is is also supported for radeon >= r700.
This patch adds the MULTISAMPLE_Z_RESOLVE cap and
add it to the drivers. It is advertised for drivers
for which it is sure the ability is supported.
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Nothing special needs to be done.
Even though llvmpipe copies constant (ie uniform) buffers internally, the
application is supposed to flush and sync, so all should work.
All bufferstorage piglit tests pass.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
As ffsll doesn't exist in MSVC yet, and u_bit_scan64 is only used by
radeonsi which is never built with MSVC.
This is just a stop-gap fix to unbreak MSVC build until we refactor these
mathematical portability wrappers into src/util.
Trivial.
|
|
|
|
|
|
| |
Trivial.
(Fixing MSVC will be far less so, as _BitScanForward64 is only supported on x64.)
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
I will need this for polygon stippling.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
| |
We need a slot for the stipple texture and the pixel shader already uses
32 textures (16 API slots + 16 FMASK slots).
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
| |
The hardware obeys swizzles even if the resource is NULL.
This will be used by set_polygon_stipple.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
This used to hang.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
This will allow supporting NULL textures.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Also remove unused "tokens".
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
| |
Fixes piglit ARB_base_instance/arb_base_instance-drawarrays.
Cc: 10.3 10.4 <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The FILLED_SIZE counter is uninitialized at the beginning, so we can't use it.
Instead, use offset = 0, which is what we always do when not appending.
This unexpectedly fixes spec/ARB_texture_multisample/sample-position/*.
Yes, the test does use transform feedback.
Cc: 10.3 10.4 <[email protected]>
Reviewed-by: Glenn Kennard <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
| |
For drivers that use higher slots not to crash in tgsi_shader_info.
Reviewed-by: Glenn Kennard <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
E.g. r600g can use slot 17, which is outside of the API range.
Reviewed-by: Glenn Kennard <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
Same as u_bit_scan, but for uint64_t.
Reviewed-by: Glenn Kennard <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Glenn Kennard <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88930
Cc: 10.4 <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
Unclear circumstances lead to undefined symbols on x86.
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=536916
Cc: [email protected]
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Glenn Kennard <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This limits the style changes to modes inherited from prog-mode. The
main reason to do this is to avoid setting fill-column for people
using Emacs to edit commit messages because 78 characters is too many
to make it wrap properly in git log. Note that makefile-mode also
inherits from prog-mode so the fill column should continue to apply
there.
v2: Apply to all the .dir-locals.el files, not just the one in the
root directory.
Acked-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I could have done this in the bit that generates the ANDs and ORs, but
it's probably generally useful. Sadly, I still need this even if I move
to NIR, because I can't yet express my read of the destination color in
NIR, which I would need to move my blend/logicop/colormask handling into
NIR.
total uniforms in shared programs: 13497 -> 13455 (-0.31%)
uniforms in affected programs: 101 -> 59 (-41.58%)
total instructions in shared programs: 40797 -> 40296 (-1.23%)
instructions in affected programs: 1639 -> 1138 (-30.57%)
|
|
|
|
|
| |
Since the VPM reads have to be in order, it's useful to see their indices
in the dump.
|
|
|
|
|
|
|
|
|
| |
since the address reg holds integer values, ARL/ARR do an implicit float-to-int
conversion, so clarify that. Thus it is also incorrect to say that FLR really
does the same as ARL.
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
| |
|
|
|
|
|
|
|
| |
The code was exactly the same, except util/ has c++ guards and a struct
simple_node declaration.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
We have two copies of it in the tree, I'm going to delete one.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2:
- Only emit write SPI_TMPRING_SIZE once per packet.
- Use context global scratch buffer.
v3:
- Patch shaders using WRITE_DATA packet instead of map/unmap.
- Emit ICACHE_FLUSH, CS_PARTIAL_FLUSH, PS_PARTIAL_FLUSH, and
VS_PARTIAL_FLUSH when patching shaders.
v4:
- Code cleanups.
- Remove unnecessary multiplies.
v5:
- Patch shaders in system memory and re-upload to vram.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This moves scratch buffer allocation from si_launch_grid() to
si_create_compute_state(). This helps to reduce the overhead of
launching a kernel and also fixes a bug in the code that would cause
the scratch buffer to be too small if a kernel with smaller scratch size
was launched before a kernel with a larger scratch size.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
| |
[ Francisco Jerez: As discussed on the mailing list, this is intended
to produce more useful debug output in cases where the compilation
terminates unexpectedly. ]
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
| |
[ Francisco Jerez: As we're at it make debug_options[] local to its
only user and remove temporary. ]
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GLSL 1.50 specifies a fragment shader may have a primitive id
input without a geometry shader present.
On r600 hw there is a special GS scenario for this, you have
to enable GS_SCENARIO_A and pass the primitive id through
the vertex shader which operates in GS_A mode.
This is a first pass attempt at this, and passes the piglit
tests that test for this.
v1.1: clean up debug print + no need to assign
key value to setup output.
v2: add r600 support
Reviewed-by: Glenn Kennard <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
In order to detect that a pixel shader has a prim id
input when we have no geometry shader we need to reorder
the shader selection so the pixel shader is selected
first, then the vertex shader key can take into account
the primitive id input requirement and lack of geom shader.
Reviewed-by: Glenn Kennard <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes reading beyond allocated memory:
==1936== Invalid read of size 1
==1936== at 0x4C2C1B4: strlen (vg_replace_strmem.c:412)
==1936== by 0x9E00C30: std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(char const*, std::allocator<char> const&) (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.20)
==1936== by 0x5B44FAE: clover::compile_program_llvm(clover::compat::string const&, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&, pipe_shader_ir, clover::compat::string const&, clover::compat::string const&, clover::compat::string&) (invocation.cpp:698)
==1936== by 0x5B39A20: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63)
==1936== by 0x5B20152: clBuildProgram (program.cpp:182)
==1936== by 0x400F41: main (hello_world.c:109)
==1936== Address 0x56fee1f is 0 bytes after a block of size 15 alloc'd
==1936== at 0x4C28C20: malloc (vg_replace_malloc.c:296)
==1936== by 0x5B398F0: alloc (compat.hpp:59)
==1936== by 0x5B398F0: vector<std::basic_string<char> > (compat.hpp:98)
==1936== by 0x5B398F0: string<std::basic_string<char> > (compat.hpp:327)
==1936== by 0x5B398F0: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63)
==1936== by 0x5B20152: clBuildProgram (program.cpp:182)
==1936== by 0x400F41: main (hello_world.c:109)
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes writing beyond the allocated buffer:
==31855== Invalid write of size 1
==31855== at 0x50AB2A9: vsprintf (iovsprintf.c:43)
==31855== by 0x508F6F6: sprintf (sprintf.c:32)
==31855== by 0xB59C7EC: r600_get_compute_param (r600_pipe_common.c:526)
==31855== by 0x5B2B7DE: get_compute_param<char> (device.cpp:37)
==31855== by 0x5B2B7DE: clover::device::ir_target() const (device.cpp:201)
==31855== by 0x5B398E0: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63)
==31855== by 0x5B20152: clBuildProgram (program.cpp:182)
==31855== by 0x400F41: main (hello_world.c:109)
==31855== Address 0x56fed5f is 0 bytes after a block of size 15 alloc'd
==31855== at 0x4C29180: operator new(unsigned long) (vg_replace_malloc.c:324)
==31855== by 0x5B2B7C2: allocate (new_allocator.h:104)
==31855== by 0x5B2B7C2: allocate (alloc_traits.h:357)
==31855== by 0x5B2B7C2: _M_allocate (stl_vector.h:170)
==31855== by 0x5B2B7C2: _M_create_storage (stl_vector.h:185)
==31855== by 0x5B2B7C2: _Vector_base (stl_vector.h:136)
==31855== by 0x5B2B7C2: vector (stl_vector.h:278)
==31855== by 0x5B2B7C2: get_compute_param<char> (device.cpp:35)
==31855== by 0x5B2B7C2: clover::device::ir_target() const (device.cpp:201)
==31855== by 0x5B398E0: clover::program::build(clover::ref_vector<clover::device> const&, char const*, clover::compat::vector<clover::compat::pair<clover::compat::string, clover::compat::string> > const&) (program.cpp:63)
==31855== by 0x5B20152: clBuildProgram (program.cpp:182)
==31855== by 0x400F41: main (hello_world.c:109)
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88783
Signed-off-by: Jan Vesely <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previous code semantic was:
. if ff ps will not run a ff stage, then do not output texture coords for this stage
for vs
. if XYZRHW is used (position_t), use only the mode where input coordinates are copied
to the outputs.
Problem is when apps don't give texture inputs. When apps precise PASSTHRU, it means
copy texture coord input to texture coord output if there is such input. The case
where there is no texture coord input wasn't handled correctly.
Drivers like r300 dislike when vs has inputs that are not fed.
Moreover if the app uses ff vs with a programmable ps, we shouldn't look at
what are the parameters of the ff ps to decide to output or not texture
coordinates.
The new code semantic is:
. if XYZRHW is used, restrict to PASSTHRU
. if PASSTHRU is used and no texture input is declared, then do not output
texture coords for this stage
The case where ff ps needs a texture coord input and ff vs doesn't output
it is not handled, and should probably be a runtime error.
This fixes 3Dmark05, which uses ff vs with programmable ps.
Reviewed-by: Tiziano Bacocco <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
|
|
|
|
|
|
|
| |
declaration
Reviewed-by: Tiziano Bacocco <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the shader does indirect addressing on the constants,
we allocate a temporary constant buffer to which we copy
the constants from the app given user constants and
the constants filled in the shader.
This patch makes this buffer be allocated once.
Reviewed-by: Ilia Mirkin <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
Signed-off-by: Tiziano Bacocco <[email protected]>
Cc: "10.4" <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
Cc: "10.4" <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Tiziano Bacocco <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
Cc: "10.4" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Relative addressing needs the constant buffer to get all
the correct constants, even those defined by the shader.
The code to copy the shader constants to the constant buffer
was enabled only for debug build. Enable it always.
Cc: "10.4" <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: David Heidelberg <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Since constant indirect adressing is not allowed for ps,
we can remove our code to handle that.
Reviewed-by: Ilia Mirkin <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
Cc: "10.4" <[email protected]>
|
|
|
|
|
|
|
|
|
| |
relative adressing for constants is possible only for vs float
constants.
Reviewed-by: Ilia Mirkin <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
Cc: "10.4" <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
Cc: "10.4" <[email protected]>
|