| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
The new name for the intrinsic was introduced in LLVM r258558.
v2: use ternary operator instead of preprocessor
Reviewed-by: Michel Dänzer <[email protected]> (v1)
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
This fixes a VM fault and possible lockup in high memory pressure situations.
Cc: "11.1" <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The key for a geometry shader would be interpreted as the key for a vertex
shader further down the line, which really doesn't make sense.
This does not affect the contents of shader->key because geometry shaders
don't have any key entries anyway.
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Hence remove the misleading branch on is_gs_copy_shader.
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
It is only used during shader creation now, so no need to keep it around
afterwards.
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
We now have an explicit parameter that contains the same information, and
this will allow us to get rid of is_gs_copy_shader in the si_shader struct.
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Specifically, when the API switches from using a GS to not using a GS and then
back to using the same GS again, we do not have to re-send all the GS state,
but we do have to send VGT_GS_MODE. So make VGT_GS_MODE consistently be a part
of the VS state.
This fixes a rendering bug in Dolphin, but surely other applications are
affected as well.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93648
Cc: "11.0 11.1" <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
Cc: "11.0 11.1" <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Like other resources, the indirect draw buffer must be unwrapped.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
Set RADEON_ALL_BOS=1 to use it.
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
This is just a copy-n-paste and rename of vc4 Android makefiles.
Signed-off-by: Rob Herring <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It's a subset of ETC2. Tested.
For more information, see page 42 and onward:
http://www.graphicshardware.org/previous/www_2007/presentations/strom-etc2-gh07.pdf
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
| |
Requested by Matt Arsenault.
Reviewed-by: Tom Stellard <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
v2: account for LDS usage in PS
the limit is per SIMD, not per CU
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
v2: take the number of CUs into account
v3: change in LS allocation
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
v2: After more discussion with hw teams, the kernel already contains the
optimal settings allowing us to use all CUs.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
This reverts commit 843855bbf0da2204ce536623ba957bfa83fdbd52.
It became redundant due to Marek's earlier pushed 8667a1ae which achieves
the same thing.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a fragment shader is used that has no outputs but does conditional
discard (KILL_IF), all fragments are killed without this patch.
By comparing various register settings, my conclusion is that the exec mask
is either not properly forwarded to the DB by NULL exports or ends up being
unused, at least when there is _only_ a NULL export (the ISA documentation
claims that NULL exports can be used to override a previously exported exec
mask).
Of the various approaches I have tried to work around the problem, this one
seems to be the least invasive one.
v2: take discard by alpha test into account as well
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93761
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Tested and working.
|
|
|
|
|
|
|
| |
We always get per-sample input position.
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
radeon sets this correctly, but not amdgpu
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
because not using SPI_SHADER_32_ABGR doubles fill rate.
We should also get optimal performance if alpha isn't needed or blending
isn't enabled.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
This avoids the fp16 packing instructions.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This does change the behavior slightly:
If a shader writes COLOR[i] and that color buffer isn't bound,
the shader will export MRT_NULL instead and discard the IR tree that
calculates the output. The only exception is alpha-to-coverage, which
requires an alpha export.
v2: - update a comment about 16BPC
- account for MRTZ when when fixing alpha-test/kill
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
most likely useless, but doesn't hurt
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By using whole static libraries the android buildsystem provides
whole-archive (alike) solution. This means that we don't need to worry
about the order of the static libraries and any reverse, recursive or
circular dependencies that they have between one another.
Without this the linker will discard any unused hunks of one library
and we'll end up with unresolved symbols as those are required by
another static library. This issue has become more prominent with the
introduction of pipe-loader.
Whole static libraries has been used in i915/i965 for a very long
time, so we might do the same.
v2:
- Better commit message (Ilia)
- Keep external dependencies as [normal] static libs (Mauro)
Cc: [email protected]
Cc: Mauro Rossi <[email protected]>
Reported-by: Mauro Rossi <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Christian Gmeiner <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Writes string to cmdstream in payload of a no-op packet.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Since the GREMEDY extensions are normally only exposed by the gremedy
debugger (and could possibly trigger debug paths in the app), we don't
expose the extension by default, but instead only with
ST_DEBUG=gremedy.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The buffers are referenced from r600_update_driver_const_buffers()
-> r600_set_constant_buffer() -> u_upload_data(), but nothing
ever releases the reference. Similar case with driver_consts.
Found using valgrind.
Signed-off-by: Grazvydas Ignotas <[email protected]>
Cc: <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
Take reading shader outputs into account, and use setFlagsDef for the
carry since we rely on having i->flagsDef being set.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
| |
|
| |
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Doing that is clearly a bug. We can't quite assert as st/mesa may hit this,
but increase at least visibility of it a bit.
(For the non-refcounted objects it would be illegal too, but we can't detect
that unless we'd store the context ourselves. Plus, those don't tend to cause
random crashes at context or object destruction time... So just sampler views,
surfaces and so targets for now.)
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I removed this mistakenly in 2dbc20e45689e09766552517a74e2270e49817b5. I
actually thought it should not be necessary and a piglit run didn't show
any differences, but this shouldn't have been in there.
draw_prepare_shader_outputs() is in fact dependent on NEW_RASTERIZER.
The new polygon-mode-facing test indeed shows why this is necessary, there's
lots of invalid reads and writes with valgrind (also crashes without
valgrind), because the pre-pipeline vertex size doesn't match the
post-pipeline vertex size (note this won't help much with stages which don't
have the prepare hook which can grow the vertex size, in particular the wide
point stage, but this isn't used by llvmpipe). The test still won't pass, of
course, but it is only usage of uninitialized values now, which is much
less dangerous...
(Albeit I'm pretty sure for i915 it really is not needed anymore as it
doesn't care about the extra outputs and doesn't call
draw_prepare_shader_outputs().)
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
| |
Fixes: 31fde8fa (nv50/ir: flip shl(add, imm) into add(shl, imm))
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
It was accidentally using the store opcode.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
This is enough for the plain TGSI BARRIER implementation.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we have a d24x8 format, there is no stencil. Therefore, we can always
clear these bits too, which means this will be some kind of memset rather
than read-modify-write.
This is good for some 7% increase or so in gears with huge window size -
seems to have a bigger effect if things aren't in caches. Of course, any
real app won't spend nearly as much time comparatively in clearing
depth buffer in the first place, so the speedup will be much lower.
Reviewed-by: Jose Fonseca <[email protected]>
|