| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Acked-by: Kristian Høgsberg <[email protected]>
|
|
|
|
| |
Acked-by: Kristian Høgsberg <[email protected]>
|
|
|
|
| |
Acked-by: Kristian Høgsberg <[email protected]>
|
|
|
|
| |
Acked-by: Kristian Høgsberg <[email protected]>
|
|
|
|
| |
Acked-by: Kristian Høgsberg <[email protected]>
|
|
|
|
| |
Acked-by: Kristian Høgsberg <[email protected]>
|
|
|
|
| |
Acked-by: Kristian Høgsberg <[email protected]>
|
|
|
|
| |
Acked-by: Kristian Høgsberg <[email protected]>
|
|
|
|
| |
Acked-by: Kristian Høgsberg <[email protected]>
|
|
|
|
| |
Acked-by: Kristian Høgsberg <[email protected]>
|
|
|
|
| |
Acked-by: Kristian Høgsberg <[email protected]>
|
|
|
|
| |
Acked-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This new macro uses a for loop to create an actual code block in which to
place the macro setup code. One advantage of this is that you syntactically
use braces instead of parentheses. Another is that the code in the block
doesn't even get executed if anv_batch_emit_dwords fails.
Acked-by: Kristian Høgsberg <[email protected]>
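The for-loop trick described above can be sketched as follows. This is a minimal illustration only: `struct batch`, `struct cmd`, `batch_emit_dwords`, `cmd_pack` and `batch_emit` are hypothetical stand-ins, not the actual anv definitions. The loop body runs at most once, and not at all when the allocation returns NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's batch and command types. */
struct batch {
   uint32_t data[64];
   size_t   used;        /* dwords already emitted */
};

struct cmd {
   uint32_t opcode;
   uint32_t value;
};

/* Reserve n dwords in the batch; returns NULL when the batch is full. */
static void *
batch_emit_dwords(struct batch *b, size_t n)
{
   if (b->used + n > 64)
      return NULL;
   uint32_t *p = &b->data[b->used];
   b->used += n;
   return p;
}

/* Write the unpacked command into the reserved dwords. */
static void
cmd_pack(void *dst, const struct cmd *c)
{
   uint32_t *dw = dst;
   dw[0] = c->opcode;
   dw[1] = c->value;
}

/* The init clause declares the command struct and reserves space.
 * The condition skips the body when the reservation failed.
 * The increment clause packs the struct and ends the loop. */
#define batch_emit(b, name)                                   \
   for (struct cmd name = { 0 },                              \
        *_dst = batch_emit_dwords((b), 2);                    \
        _dst != NULL;                                         \
        cmd_pack(_dst, &name), _dst = NULL)

/* Caller side: braces instead of a parenthesized argument list. */
static void
emit_example(struct batch *b)
{
   batch_emit(b, c) {
      c.opcode = 0x10;
      c.value  = 42;
   }
}
```

Because the setup code is an ordinary block, it can contain declarations, conditionals and loops of its own, which a parenthesized macro argument makes awkward.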
|
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Cc: "11.1 11.2" <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This is only valid for other atomic operations (including CAS). This
fixes an invalid opcode error from dmesg. While we are at it, make sure
to initialize global addr to 0 for other atomic operations.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Cc: "11.1 11.2" <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
| |
mcpu=generic doesn't enable sse2, and anvil definitely needs it
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Signed-off-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
After some investigation, it seems that disabling the UNK02C4
command avoids a read fault with texelFetch() from a compute shader.
I have no clue what this method actually does, but this keeps the
GPU from hanging with basic-texelFetch.shader_test without introducing
any compute-related regressions.
Signed-off-by: Samuel Pitoiset <[email protected]>
Acked-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Normally, we split uniforms at the end but in Vulkan, we bail because we
don't want pull constants. However, we still need them split because
pack_uniforms relies on it.
I really don't like this patch, not because it doesn't work (it does), but
because now that we're using MOV_INDIRECT, uniform numbers and sizes don't
really matter anymore. In the FS backend, uniform splitting and packing is
handled all at once (actual re-assignment of locations happens later) and
we really should do it that way in vec4 eventually as well.
Reviewed-by: Iago Toral Quiroga <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
|
|
|
|
|
|
|
|
|
| |
This was actually caught by Ken in review the first time around but somehow
didn't get fixed before the patches were pushed. :-(
Reviewed-by: Iago Toral Quiroga <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
|
|
|
|
|
|
|
|
| |
We shouldn't be reading the const_index directly
Reviewed-by: Iago Toral Quiroga <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
|
|
|
|
|
|
| |
Signed-off-by: Jason Ekstrand <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
|
|
|
|
|
|
|
|
|
|
| |
All of the code that did something special based on vec4 vs. scalar is
bogus. In the backend, everything is now in units of bytes and the vec4
backend can handle full std140 packing so we don't need to do anything
special anymore.
Signed-off-by: Jason Ekstrand <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
|
|
|
|
|
|
|
|
|
|
| |
Code was using an incorrect address for the base pointer.
v2: use swr_resource_data() utility function.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94979
Reviewed-by: Bruce Cherniak <[email protected]>
Tested-by: Markus Wick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for OpenCL global memory buffers. Note that this has only
been tested with regular loads and stores and likely needs more work
for e.g. atomic ops.
Tested with piglit on a gf119 and a gk107:
./piglit run -o shader -t '.*arb_shader_storage_buffer_object.*' results/shader
[9/9] pass: 9 /
./piglit run -o shader -t '.*arb_compute_shader.*' results/shader
[20/20] skip: 4, pass: 16 |
Signed-off-by: Hans de Goede <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some of the lowering steps we currently do for FILE_MEMORY_GLOBAL only
apply to buffers, making it impossible to use FILE_MEMORY_GLOBAL for
OpenCL global buffers.
This commit changes the buffer code to use FILE_MEMORY_BUFFER at the
ir_from_tgsi and lowering steps, freeing use of FILE_MEMORY_GLOBAL
for use with OpenCL global buffers.
Note that after lowering, buffer accesses use the FILE_MEMORY_GLOBAL
register file.
Tested with piglit on a gf119 and a gk107:
./piglit run -o shader -t '.*arb_shader_storage_buffer_object.*' results/shader
[9/9] pass: 9 /
./piglit run -o shader -t '.*arb_compute_shader.*' results/shader
[20/20] skip: 4, pass: 16 |
Signed-off-by: Hans de Goede <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
| |
v2: - set interop_version
- simplify the offset_after macro
v2.1: - use version numbers, remove offset_after
- set "out_driver_data_written"
v2.2: - set buf_offset & buf_size for GL_ARRAY_BUFFER too
- add whandle.offset to buf_offset
- disable the minmax cache for GL_TEXTURE_BUFFER
|
|
|
|
| |
v2: - use const
|
|
|
|
| |
v2: - use const
|
|
|
|
| |
v2: - use const
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: - use "enum" to define stuff
v3: - more comments, define MESA_GLINTEROP_UNSUPPORTED
v4: - add mesa_glinterop_device_info::interop_version
- more comments
- remove #define MESA_GLINTEROP_VERSION
- use const for "in"
v4.1: - use version numbers for structures
- add "out_driver_data_written"
v4.2: - buf_offset & buf_size affect GL_ARRAY_BUFFER too, this is required
for sharing suballocations within a larger buffer
|
|
|
|
|
|
|
|
|
|
| |
When creating EGL images we do a bytes-to-pixels conversion by dividing
by 4 regardless of the pixel format. This does not work for RGB565. In
this patch, we avoid the useless conversion and use the proper API when
the conversion cannot be avoided.
Signed-off-by: Nicolas Dufresne <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
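To illustrate why a fixed divide-by-4 breaks for RGB565, here is a minimal sketch of a format-aware conversion. The enum and both helpers are hypothetical names chosen for the example; the patch itself relies on an existing API that is not named in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical format enum, for illustration only. */
enum pixel_format {
   FMT_ARGB8888,
   FMT_RGB565,
};

/* Bytes occupied by one pixel of the given format (0 if unknown). */
static uint32_t
bytes_per_pixel(enum pixel_format f)
{
   switch (f) {
   case FMT_ARGB8888: return 4;
   case FMT_RGB565:   return 2;
   }
   return 0;
}

/* Convert a stride in bytes to a stride in pixels using the real
 * pixel size instead of assuming 4 bytes per pixel. */
static uint32_t
stride_bytes_to_pixels(uint32_t stride_bytes, enum pixel_format f)
{
   uint32_t bpp = bytes_per_pixel(f);
   assert(bpp != 0);
   return stride_bytes / bpp;
}
```

With a 1024-byte stride, an RGB565 surface is 512 pixels wide, whereas dividing by 4 would report 256.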
|
|
|
|
|
|
|
|
| |
This code is already duplicated twice and will be useful again. This
will also help when adding formats.
Signed-off-by: Nicolas Dufresne <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This *seems* like a hw bug, and maybe only applies to certain a4xx
variants/revisions. But setting the SRGB bit in sampler view state
(texconst0) causes invalid alpha for ASTC textures. Work around this
by doing the srgb->linear conversion in the shader instead.
This fixes 392 dEQP tests: dEQP-GLES3.functional.texture.*astc*srgb*
(The remaining fails seem to be a bug w/ ASTC + linear filtering, also
possibly a420.0 specific.)
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
No need for it not to be const, and lets caller declare it const if
desired.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
The separate FS/VS entrypoints are no longer used since a3ed98f. So
just inline them.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Provide an improved lowering for LRP, which can be implemented in two
MAD instructions with a bit of rearranging of the equation, rather
than the literal implementation of two multiplies, an add and a
subtract.
Signed-off-by: Russell King <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
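The rearrangement can be sketched in scalar C. This assumes the usual definition LRP(a, x, y) = a*x + (1 - a)*y, which expands to a*x + (y - a*y); the helper names are hypothetical and the exact operand order used by the patch is not shown in this message.

```c
#include <assert.h>

/* Stand-in for a MAD instruction: a * b + c. */
static float
mad_f(float a, float b, float c)
{
   return a * b + c;
}

/* Literal form: two multiplies, an add and a subtract. */
static float
lrp_literal(float a, float x, float y)
{
   return a * x + (1.0f - a) * y;
}

/* Rearranged form: two MADs.
 *   tmp = -a * y + y      (equals (1 - a) * y)
 *   dst =  a * x + tmp
 */
static float
lrp_two_mad(float a, float x, float y)
{
   float tmp = mad_f(-a, y, y);
   return mad_f(a, x, tmp);
}
```

The two forms are algebraically equal, though the results may differ in the last bits for some inputs because the rounding happens at different points.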
|
|
|
|
|
|
|
|
|
| |
Improve XPD lowering to consume fewer instructions by using the
MAD instruction to perform the multiply and subtraction together.
Signed-off-by: Russell King <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
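As a rough sketch of the idea, a cross product can be written as one MUL followed by one MAD with a negated source. The struct and function names are hypothetical, and the swizzles in the comments are one plausible choice rather than a quote from the patch.

```c
#include <assert.h>

struct vec3 {
   float x, y, z;
};

/* Cross product as MUL + MAD:
 *   MUL tmp.xyz, a.zxy, b.yzx
 *   MAD dst.xyz, a.yzx, b.zxy, -tmp
 */
static struct vec3
xpd_mul_mad(struct vec3 a, struct vec3 b)
{
   /* MUL: the products that get subtracted. */
   struct vec3 tmp = { a.z * b.y, a.x * b.z, a.y * b.x };

   /* MAD: multiply and subtract in a single instruction. */
   struct vec3 dst = {
      a.y * b.z + -tmp.x,
      a.z * b.x + -tmp.y,
      a.x * b.y + -tmp.z,
   };
   return dst;
}
```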
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for lowering TRUNC using the following sequence:
FRC tmpA, |src|
SUB tmpA, |src|, tmpA
CMP dst, -tmpA, tmpA
Note that this is incompatible with FRC lowering.
Signed-off-by: Russell King <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
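The sequence can be modelled in scalar C as below. Two assumptions are made: `frc_f` is a hypothetical stand-in for the hardware FRC instruction (valid only for magnitudes that fit in a `long`), and the final CMP is taken to select on the sign of the original source.

```c
#include <assert.h>

/* Stand-in for FRC: x - floor(x), always in [0, 1).
 * The integer cast limits this model to values that fit in a long. */
static float
frc_f(float x)
{
   float t = (float)(long)x;      /* rounds toward zero */
   float f = x - t;
   return f < 0.0f ? f + 1.0f : f;
}

static float
trunc_lowered(float src)
{
   float abs_src = src < 0.0f ? -src : src;   /* |src| source modifier */

   float tmp = frc_f(abs_src);                /* FRC tmpA, |src|        */
   tmp = abs_src - tmp;                       /* SUB tmpA, |src|, tmpA  */

   /* CMP: pick -tmpA for negative inputs, tmpA otherwise (assumed). */
   return src < 0.0f ? -tmp : tmp;
}
```

Working on |src| makes the FRC/SUB pair compute a floor of a non-negative value, which equals truncation; the sign is restored at the end.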
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for lowering FLR and CEIL to FRC/SUB and FRC/ADD
instructions for GPUs that support FRC but not FLR or CEIL. Since
these use FRC, it is invalid to ask for FLR or CEIL to be lowered
along with FRC, so add an assert to catch this invalid configuration.
We also need to deal with FLR instructions emitted by the lowering
code. Fix these up with the FRC+SUB equivalent when FLR lowering is
enabled.
Signed-off-by: Russell King <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
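A scalar sketch of the two lowerings and of the configuration check follows. It assumes FLR(x) = x - FRC(x) and CEIL(x) = x + FRC(-x); the config struct, its field names and `frc_f` are hypothetical stand-ins, not the actual lowering API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lowering switches, for illustration only. */
struct lowering_config {
   bool lower_frc;
   bool lower_flr;
   bool lower_ceil;
};

/* FLR and CEIL are lowered to FRC, so FRC itself must stay native. */
static void
check_config(const struct lowering_config *c)
{
   assert(!(c->lower_frc && (c->lower_flr || c->lower_ceil)));
}

/* Stand-in for FRC: x - floor(x), always in [0, 1).
 * The integer cast limits this model to values that fit in a long. */
static float
frc_f(float x)
{
   float t = (float)(long)x;      /* rounds toward zero */
   float f = x - t;
   return f < 0.0f ? f + 1.0f : f;
}

/* FLR via FRC + SUB: src - frc(src). */
static float
flr_lowered(float src)
{
   return src - frc_f(src);
}

/* CEIL via FRC + ADD: src + frc(-src). */
static float
ceil_lowered(float src)
{
   return src + frc_f(-src);
}
```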
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: Use chip_class instead of family.
v3: Check kernel version for SI.
v4: Preemptively allow amdgpu winsys for SI.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
si_shader_create corrects the SGPR count with si_fix_num_sgprs. We then
recompute the rsrc1 register to use the new SGPR count.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: Also depend on atomic counters.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|