| Commit message (Collapse) | Author | Age | Files | Lines |
| |
We were incorrectly computing the buffer offset when using
instancing. The buffer offset is always equal to:
start_instance * stride + (instance_num / instance_divisor) * stride
We were completely ignoring the start instance, quite often
producing instances that were completely wrong. E.g. if
start instance = 5 and instance divisor = 2, then on the first
iteration the offset should be 5 * stride, not (5/2) * stride as
we currently have, and if start instance = 1 and instance
divisor = 3, then on the first iteration it should be 1 * stride,
not 0 as we currently have.
This fixes it and adjusts all the code to the changes.
Signed-off-by: Zack Rusin <[email protected]>
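The offset formula above can be sketched in C as follows. This is only an illustration of the arithmetic in the message; the function name and parameter list are hypothetical and are not taken from the translate code.

```c
#include <assert.h>

/* Hypothetical helper (not the actual Mesa function): computes the
 * per-instance buffer offset described in the commit message.
 * instance_num counts instances from 0 within the current draw. */
static unsigned
instance_buffer_offset(unsigned start_instance, unsigned instance_num,
                       unsigned instance_divisor, unsigned stride)
{
   /* The formula divides by the divisor, so it must be non-zero. */
   assert(instance_divisor != 0);

   /* The start instance is applied in full; only the running instance
    * number is divided by the divisor. */
   return start_instance * stride +
          (instance_num / instance_divisor) * stride;
}
```

With start instance 5 and divisor 2, the first iteration (instance_num = 0) gives 5 * stride, matching the first example in the message.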
|
|
|
|
|
|
| |
fetch_rgba_float is NULL for integer formats, and vice-versa.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
Fixes "same on both sides" defect reported by Coverity.
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
| |
These were leaking.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is for GL_ARB_vertex_type_2_10_10_10_rev.
I just took the code from u_format_table.c. It's based on pack_rgba_float.
I had no other choice. The u_format hooks are not exactly compatible
with translate. The cleanup of it is left for future work.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
The conversion is limited to only a few cases, because converting to any other
type shouldn't happen in any driver.
Reviewed-by: Dave Airlie <[email protected]>
|
| |
|
| |
|
|
|
|
|
|
| |
Causes crash and stack corruption.
Needs more investigation. Disable for now.
|
|
|
|
|
|
| |
This fixes piglit's draw-instanced-divisor test for softpipe on both
the generic and SSE paths. This is temporary until we have the
correct per-array max_index information.
|
| |
|
| |
|
|
|
|
| |
Signed-off-by: Brian Paul <[email protected]>
|
| |
|
| |
|
|
|
|
| |
Fixes #29771.
|
| |
|
|
|
|
| |
Changed by me to use movd instead of movss to avoid penalties.
|
| |
|
|
|
|
| |
Initialize variables on error paths.
|
|
|
|
| |
According to Vinson, enabling it causes no regressions.
|
| |
|
|
|
|
|
| |
We were putting the source pointer in a register used as a temporary,
breaking all paths that don't read the data in a single instruction.
|
|
|
|
| |
Fixes MSVC build.
|
|
|
|
| |
Assuming the side-effect of x86_make_reg is also unnecessary.
|
|
|
|
| |
Non-portable.
| |
NOTE: Win64 is untested, and is thus currently disabled.
If you have such a system, please enable it and report whether it works.
To enable it, change src/gallium/auxiliary/translate/translate.c
Changes in v5:
- On Win64, preserve %xmm6 and %xmm7 as required by the ABI
- Use _WIN64 instead of WIN64
Changes in v4:
- Use x86_target() and x86_target_caps()
- Enable translate_sse in x86-64, but not in Win64
Changes in v3:
- Win64 support (untested)
- Use u_cpu_detect.h constants instead of #ifs
Changes in v2:
- Minimize #ifs
- Give a name to magic number CHANNELS_0001
- Add support for CPUs without SSE (only memcpy and swizzles, like non SSE2)
- Fixed comments
translate_sse is currently so limited that it is useless in
essentially all cases.
In particular, it only supports some float32 and unorm8
formats and doesn't work on x86-64.
This commit rewrites it to support:
1. Dumb memory copy for any pair of identical formats
2. All formats that are swizzles of each other
3. Converting 32/64-bit floats and all 8/16/32-bit integers to 32-bit float
4. Converting unorm8/snorm8 to snorm16 and uscaled8/sscaled8 to sscaled16
5. Support for x86-64 (doesn't take advantage of it in any way though)
This new translate can even be useful to translate index buffers for
cards that lack 8-bit index support.
It passes the testsuite I wrote, but note that this is a major change, and more
testing would be great.
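As an illustration of the index-buffer use case mentioned above, widening 8-bit indices to 16-bit ones amounts to the scalar loop below. This is a plain C sketch, not the SSE code the commit generates; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: widen 8-bit indices to 16-bit indices for
 * hardware that cannot consume 8-bit index buffers directly. */
static void
widen_indices_u8_to_u16(const uint8_t *src, uint16_t *dst, size_t count)
{
   size_t i;

   assert(count == 0 || (src != NULL && dst != NULL));

   for (i = 0; i < count; i++)
      dst[i] = src[i];   /* zero-extension, index values are preserved */
}
```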
|
|
|
|
|
| |
Currently, only 32-bit indices are supported, but for some use cases
translate needs to support all index types.
|
|
|
|
|
|
|
|
|
|
| |
Currently translate_sse puts two trivial wrappers in the translate vtable.
These slow it down and enlarge the source code for no gain, except perhaps
the ability to set a breakpoint there, so remove them.
Breakpoints can be set on the caller of the translate functions, with no
loss of functionality.
|
|
|
|
| |
This moves the common code into a separate ALWAYS_INLINE function.
| |
Changes in v3:
- If we can do a copy, don't try to get an emit func, as that can assert(0)
Changes in v2:
- Add comment regarding copy_size
When used in GPU drivers, translate can be used to simultaneously
perform a gather operation, and convert away from unsupported formats.
In this use case, input and output formats will often be identical: clearly
it would make sense to use a memcpy in this case.
Instead, translate insists on converting to and from 32-bit floating point
numbers.
This is not only extremely expensive, but it also loses precision for
32/64-bit integers and 64-bit floating point numbers.
This patch changes translate_generic to just use memcpy if the formats are
identical, non-blocked, and with an integral number of bytes per pixel (note
that all sensible vertex formats are like this).
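The memcpy condition described above can be sketched as follows. The format description struct and the function names are hypothetical simplifications for illustration; they are not the structures used by translate_generic.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified format description. */
struct fmt_desc {
   int id;                 /* format identifier */
   unsigned block_width;   /* 1 for non-blocked formats */
   unsigned block_height;  /* 1 for non-blocked formats */
   unsigned block_bits;    /* bits per block (per pixel if non-blocked) */
};

/* Returns the number of bytes to memcpy per element, or 0 if the
 * conversion path has to be used instead. */
static unsigned
copy_size_for(const struct fmt_desc *in, const struct fmt_desc *out)
{
   assert(in != NULL && out != NULL);

   if (in->id != out->id)
      return 0;   /* formats differ */
   if (in->block_width != 1 || in->block_height != 1)
      return 0;   /* blocked format */
   if (in->block_bits % 8 != 0)
      return 0;   /* not a whole number of bytes per pixel */
   return in->block_bits / 8;
}

/* Gathers count elements from a strided source into a packed
 * destination. Returns false if a plain copy is not possible. */
static bool
gather_copy(const struct fmt_desc *in, const struct fmt_desc *out,
            const unsigned char *src, unsigned src_stride,
            unsigned char *dst, unsigned count)
{
   unsigned size = copy_size_for(in, out);
   unsigned i;

   if (size == 0)
      return false;

   for (i = 0; i < count; i++)
      memcpy(dst + i * size, src + i * src_stride, size);
   return true;
}
```

Because the bytes are copied verbatim, no precision is lost for 32/64-bit integers or 64-bit floats, which is the point made in the message.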
|
| |
Currently translate asserts on unsupported output formats, making
it impossible to use for some purposes, such as testing whether it
actually works on all formats it supports.
Removing the assert was met with opposition, so this change allows
clients to ask whether an output format is supported, and they are thus
able to avoid attempting to use it.
Since this is just an addition to the API, no adverse effect is
possible, and it makes the testsuite work again.
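A minimal sketch of how such a query lets a client avoid the assert is shown below. The enum, the supported set, and all function names are hypothetical and only illustrate the pattern; they are not the API added by this change.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical format enum, for illustration only. */
enum sketch_format {
   SKETCH_FORMAT_R32_FLOAT,
   SKETCH_FORMAT_R8G8B8A8_UNORM,
   SKETCH_FORMAT_EXOTIC,
   SKETCH_FORMAT_COUNT
};

/* Query: can this format be produced as output? */
static bool
sketch_is_output_format_supported(enum sketch_format format)
{
   switch (format) {
   case SKETCH_FORMAT_R32_FLOAT:
   case SKETCH_FORMAT_R8G8B8A8_UNORM:
      return true;
   default:
      return false;
   }
}

/* Stand-in for the emit path, which asserts on unsupported formats. */
static void
sketch_emit(enum sketch_format format)
{
   assert(sketch_is_output_format_supported(format));
   /* the actual conversion would go here */
}

/* A test loop that walks every format but only exercises supported
 * ones, so the assert in sketch_emit() is never hit. */
static unsigned
sketch_count_testable_formats(void)
{
   unsigned n = 0;
   int f;

   for (f = 0; f < SKETCH_FORMAT_COUNT; f++) {
      if (!sketch_is_output_format_supported((enum sketch_format)f))
         continue;
      sketch_emit((enum sketch_format)f);
      n++;
   }
   return n;
}
```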
|
|
|
|
|
|
|
|
|
|
|
| |
supported"
This reverts commit 16b45ca7cefb3432b4133fe9d0b1dbfe3f286131.
José Fonseca asked for a revert.
Note that the testsuite will now segfault since it attempts to test
all possible formats.
|
|
|
|
|
|
|
| |
translate was attempting to output A8R8G8B8_UNORM as if it were
R8G8B8A8_UNORM.
Now the tests just added pass.
|
|
|
|
| |
This gives the caller a chance to recover (or crash anyway otherwise).
|
|
|
|
| |
Plus more debug code and do clamping in generic_run().
|
| |
|
|
|
|
|
| |
PIPE_FORMAT_R10G10B10X2_USCALED and half floats were not supported, so
just rely on u_format for (almost) universal format support.
|
| |
| |
The mapping for vertex_array_bgra:
(gl -> st -> translate)
GL_RGBA -> PIPE_FORMAT_R8G8B8A8 (RGBA) -> no swizzle (XYZW)
GL_BGRA -> PIPE_FORMAT_A8R8G8B8 (ARGB) -> ZYXW (BGRA again??)
It's pretty clear that PIPE_FORMAT_A8R8G8B8 here is wrong. This commit
fixes the pipe format and removes obvious workarounds in util/translate.
Tested with: softpipe, llvmpipe, r300g.
Signed-off-by: José Fonseca <[email protected]>
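To make the ZYXW notation in the mapping above concrete, the sketch below applies a four-component swizzle to one element. The helper and the enum are hypothetical; the example assumes the source bytes are laid out in B, G, R, A order, which is an assumption for illustration and not a statement about any particular pipe format.

```c
#include <assert.h>
#include <stdint.h>

/* Component selectors: X, Y, Z, W pick source component 0..3. */
enum { SWZ_X = 0, SWZ_Y = 1, SWZ_Z = 2, SWZ_W = 3 };

/* Hypothetical helper: dst[i] takes the source component named by
 * swizzle[i]. */
static void
apply_swizzle(const uint8_t src[4], const unsigned swizzle[4],
              uint8_t dst[4])
{
   unsigned i;

   for (i = 0; i < 4; i++) {
      assert(swizzle[i] < 4);
      dst[i] = src[swizzle[i]];
   }
}
```

With the swizzle { SWZ_Z, SWZ_Y, SWZ_X, SWZ_W } and input bytes in B, G, R, A order, the output comes out in R, G, B, A order; an identity swizzle (XYZW) leaves the data untouched.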
|
| |
|
| |
|
|\
| |
| |
| |
| |
| | |
Conflicts:
src/gallium/auxiliary/tgsi/tgsi_dump.c
src/gallium/include/pipe/p_shader_tokens.h
|
| | |
|
| |
| |
| |
| |
| | |
Makes integration of gallium into out-of-tree components much easier. No
practical change for components in this tree.
|
| |
| |
| |
| | |
It's all screaming for integer support -- fake it with float for now.
|
| | |
|
| | |
|
| | |
|
| |
| |
| |
| |
| | |
Modify the translate module to respect instance divisors and accept
instance id as a parameter to calculate input vertex offset.
|