| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
Change the type of anv_render_pass_attachment::format from VkFormat to
const struct anv_format*. This eliminates the repetitive lookups into the
VkFormat -> anv_format table when looping over attachments during
anv_cmd_buffer_clear_attachments().
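
A minimal sketch of the idea behind this change, using hypothetical stand-in types (fmt_id, struct fmt_info, struct attachment) rather than the real VkFormat and anv definitions. The format is translated once when the attachment is set up, so later loops only dereference the stored pointer:

```c
#include <stddef.h>

/* Hypothetical stand-ins; the real VkFormat and struct anv_format differ. */
typedef enum { FMT_UNDEFINED, FMT_R8G8B8A8_UNORM, FMT_S8_UINT, FMT_COUNT } fmt_id;

struct fmt_info {
    fmt_id id;            /* the enum value this entry describes */
    int has_stencil;
    int bytes_per_pixel;
};

static const struct fmt_info fmt_table[FMT_COUNT] = {
    [FMT_UNDEFINED]      = { FMT_UNDEFINED,      0, 0 },
    [FMT_R8G8B8A8_UNORM] = { FMT_R8G8B8A8_UNORM, 0, 4 },
    [FMT_S8_UINT]        = { FMT_S8_UINT,        1, 1 },
};

/* Counts table lookups so the saving can be observed. */
static int lookup_count;

static const struct fmt_info *fmt_lookup(fmt_id id)
{
    lookup_count++;
    return &fmt_table[id];
}

struct attachment {
    const struct fmt_info *format;   /* resolved once, at creation */
};

static void attachment_init(struct attachment *att, fmt_id id)
{
    att->format = fmt_lookup(id);
}

/* Loops over attachments without touching the translation table. */
static int count_stencil_attachments(const struct attachment *atts, size_t n)
{
    int count = 0;
    for (size_t i = 0; i < n; i++)
        count += atts[i].format->has_stencil;
    return count;
}
```

With this layout, calling count_stencil_attachments() repeatedly leaves lookup_count unchanged, since the only lookups happen in attachment_init().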
|
|
|
|
|
|
| |
Stop creating a temporary VkImageCreateInfo with overridden
format=VK_FORMAT_S8_UINT. Instead, just pass the format override
directly to anv_image_make_surface().
|
|
|
|
|
|
|
| |
Stencil formats are often a special case. To reduce the number of lookups
into the VkFormat-to-anv_format translation table when working with
stencil, expose the table's entry for VK_FORMAT_S8_UINT as global
variable anv_format_s8_uint.
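
As a rough illustration, assuming a hypothetical table layout (fmt_id, struct fmt_info and fmt_table are stand-ins, not the real anv code), a single table entry can be exposed as a global pointer so that stencil-specific code compares pointers instead of repeating the lookup:

```c
/* Hypothetical stand-ins for VkFormat and struct anv_format. */
typedef enum { FMT_UNDEFINED, FMT_D24_UNORM, FMT_S8_UINT, FMT_COUNT } fmt_id;

struct fmt_info {
    fmt_id id;
    int has_depth;
    int has_stencil;
};

static const struct fmt_info fmt_table[FMT_COUNT] = {
    [FMT_UNDEFINED] = { FMT_UNDEFINED, 0, 0 },
    [FMT_D24_UNORM] = { FMT_D24_UNORM, 1, 0 },
    [FMT_S8_UINT]   = { FMT_S8_UINT,   0, 1 },
};

static const struct fmt_info *fmt_lookup(fmt_id id)
{
    return &fmt_table[id];
}

/* Global alias for the stencil entry (name is illustrative only). */
const struct fmt_info *const fmt_s8_uint = &fmt_table[FMT_S8_UINT];

/* Stencil special case: a pointer comparison, no table lookup. */
static int is_stencil_only(const struct fmt_info *f)
{
    return f == fmt_s8_uint;
}
```

Because the global points into the table itself, it always agrees with what a lookup of the stencil format would return.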
|
|
|
|
|
|
| |
Change type of anv_surface_view::format from VkFormat to const struct
anv_format*. This reduces the number of lookups in the VkFormat ->
anv_format table.
|
|
|
|
|
|
|
| |
This moves the translation of VkFormat to anv_format from
anv_fill_buffer_surface_state() to its caller.
A prep commit to reduce more VkFormat -> anv_format translations.
|
|
|
|
|
|
| |
Change type of anv_image::format from VkFormat to const struct
anv_format*. This reduces the number of lookups in the VkFormat ->
anv_format table.
|
|
|
|
|
|
| |
Store the original VkFormat as anv_format::vk_format. This will be used
to reduce format indirection, such as lookups into the VkFormat ->
anv_format translation table.
|
| |
|
|
|
|
|
|
| |
These are now available in intel_aubdump from intel-gpu-tools.
Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
|
|
|
|
|
|
|
| |
We need to make sure we use the VkImage infrastructure for creating
dmabuf images.
Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
|
|
|
|
| |
This prevents make from stomping on nir_spirv.h
|
|\ |
|
| |
| |
| |
| |
| |
| |
| |
| | |
The address calculations are all different (e.g. see GP), there appear
to be syncs in programs, and probably a bunch of other differences.
Just disable it for now.
Signed-off-by: Ilia Mirkin <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
NIR instruction count results on i965:
total instructions in shared programs: 1261954 -> 1261937 (-0.00%)
instructions in affected programs: 455 -> 438 (-3.74%)
One in yofrankie, two in tropics. Apparently i965 had also optimized all
of these out anyway.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| | |
There are so many flags in textures that the CSE pass would have a hard
time referencing the correct set when figuring out if two texture ops are
the same. By zeroing, we can avoid that fragility.
Reviewed-by: Jason Ekstrand <[email protected]>
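
A small sketch of why zeroing helps, assuming a hypothetical struct tex_instr (not the real NIR texture instruction) and plain calloc() in place of whatever allocator the compiler actually uses. When every flag starts at zero, an equality check can compare all fields unconditionally instead of tracking which flags are meaningful for a given op:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical, simplified texture instruction. */
struct tex_instr {
    int op;
    bool is_array;
    bool is_shadow;
    unsigned sampler_index;
};

/* Zero-initialize so flags that are never set still hold a defined value. */
static struct tex_instr *tex_instr_create(int op)
{
    struct tex_instr *tex = calloc(1, sizeof(*tex));
    if (tex)
        tex->op = op;
    return tex;
}

/* The comparison a CSE-style pass might do: every field, no per-op cases. */
static bool tex_instrs_equal(const struct tex_instr *a,
                             const struct tex_instr *b)
{
    return a->op == b->op &&
           a->is_array == b->is_array &&
           a->is_shadow == b->is_shadow &&
           a->sampler_index == b->sampler_index;
}
```

Two instructions created with the same op compare equal until a caller explicitly sets a flag on one of them.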
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This massively reduces our dependency on VC4-specific optimization passes.
shader-db:
total uniforms in shared programs: 32077 -> 32067 (-0.03%)
uniforms in affected programs: 149 -> 139 (-6.71%)
total instructions in shared programs: 98208 -> 98182 (-0.03%)
instructions in affected programs: 2154 -> 2128 (-1.21%)
|
| |
| |
| |
| |
| |
| |
| |
| | |
In order to move more of our lowering into NIR, we need the ability to
reference various pipeline state (like texture rectangle scaling factors
or blend colors), so we just set those up as a load_uniform with a big
offset to indicate that it's not within the shader's uniform storage and
is one of our state values.
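
One way to picture the "big offset" convention, as a sketch only: the base value, the enum and the helper names below are hypothetical and are not taken from the vc4 driver. Offsets at or above a reserved base are treated as pipeline state rather than as locations in the shader's own uniform storage:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical base; chosen to sit far above any real uniform offset. */
#define STATE_UNIFORM_BASE 0x10000000u

/* Hypothetical examples of pipeline state a shader might need. */
enum state_kind {
    STATE_BLEND_CONST_R,
    STATE_BLEND_CONST_G,
    STATE_TEXRECT_SCALE_X,
    STATE_TEXRECT_SCALE_Y,
};

/* Offset to place in a load_uniform-style instruction for a state value. */
static uint32_t state_uniform_offset(enum state_kind kind)
{
    return STATE_UNIFORM_BASE + (uint32_t)kind;
}

/* True when the offset names pipeline state, not shader uniform storage. */
static bool offset_is_state(uint32_t offset)
{
    return offset >= STATE_UNIFORM_BASE;
}

/* Recover which state value an offset refers to; caller checks first. */
static enum state_kind offset_to_state(uint32_t offset)
{
    return (enum state_kind)(offset - STATE_UNIFORM_BASE);
}
```

The backend can then branch on offset_is_state() when it lowers the load, handling ordinary uniforms and state values through the same instruction.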
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Avoids regressions in vc4 when trying to do our blending in NIR.
v2: Add the other unpack ops I meant to when writing the original commit
message.
Reviewed-by: Matt Turner <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We may find a cause to do more undef optimization in the future, but for
now this fixes up things after if flattening. vc4 was handling this
internally most of the time, but a GLB2.7 shader that did a conditional
discard and assign gl_FragColor in the else was still emitting some extra
code.
total instructions in shared programs: 100809 -> 100795 (-0.01%)
instructions in affected programs: 37 -> 23 (-37.84%)
v2: Use nir_instr_rewrite_src() to update def/use on src[0] (by Thomas
Helland).
v3: Make sure to flag metadata dirties, and copy the swizzle and abs/neg
over to src[0], too (by anholt).
Reviewed-by: Thomas Helland <[email protected]> (v2)
Tested-by: Thomas Helland <[email protected]> (v2)
|
| |
| |
| |
| |
| |
| |
| | |
Fixes fs-simple-texture-size.shader_test
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "10.6" <[email protected]>
|
| |
| |
| |
| | |
Signed-off-by: Ilia Mirkin <[email protected]>
|
| |
| |
| |
| | |
Signed-off-by: Ilia Mirkin <[email protected]>
|
| |
| |
| |
| | |
Signed-off-by: Ilia Mirkin <[email protected]>
|
| | |
|
| | |
|
| |
| |
| |
| |
| |
| |
| | |
The bug was misunderstood. Besides that, the bug affects a DB feature we
don't use yet.
Reviewed-by: Michel Dänzer <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Add a context buffer to fix an H265 UVD decode issue.
Fix an H265 corruption issue caused by an incorrectly assigned ref_pic_list.
v2: disable interlace for HEVC
add CZ sps flag workaround
fix coding style
Signed-off-by: Christian König <[email protected]>
Signed-off-by: Boyuan Zhang <[email protected]>
Reviewed-by: Leo Liu <[email protected]>
|
| |
| |
| |
| |
| |
| | |
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
VCE dual instances encode in parallel, so they need two frames, each with
its own parameters, in one IB. The master instance checks the task info
to find another frame and assigns it to the slave instance.
Signed-off-by: Leo Liu <[email protected]>
Signed-off-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| | |
Since VCE 3.0, with dual instances, we need stack frames for them.
Signed-off-by: Leo Liu <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
| |
| |
| |
| |
| |
| |
| | |
We need a negative offset for FW 50.
Signed-off-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
The config task has its own task ID, so extract the configuration
functions into a config task.
v2 (chk): calculate offset automatically
Signed-off-by: Leo Liu <[email protected]>
Signed-off-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
|
| |
| |
| |
| |
| | |
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
| |
| |
| |
| |
| | |
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
| |
| |
| |
| |
| | |
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| | |
v2: rebase by Marek
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
| |
| |
| |
| |
| | |
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
| |
| |
| |
| |
| | |
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
| |
| |
| |
| |
| |
| | |
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| | |
v2: -make tonga use new h264 performance HW decoder;
-integrate it scaling buffer to msg_fb buffer
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
v2: (leo) add checking for driver backend
v3: (leo) change variable name from use_amdgpu to use_vm
v4: rebase by Marek
Signed-off-by: Christian König <[email protected]>
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
v2: (leo) add checking for driver backend
v3: (leo) change variable name from use_amdgpu to use_vm
v4: rebase by Marek
Signed-off-by: Christian König <[email protected]>
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
v2: incorporate comments from Marek
v3: add missing fiji case in winsys init
use tonga raster config (double check this)
v4: rebase on harvest patch
Reviewed-by: Marek Olšák <[email protected]> (v3)
Reviewed-by: Christian König <[email protected]> (v3)
Reviewed-by: David Zhang <[email protected]> (v3)
Signed-off-by: Alex Deucher <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
v2: fix tonga chip check
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Christian König <[email protected]>
Reviewed-by: David Zhang <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Properly calculate the PA_SC_RASTER_CONFIG[_1] settings
for harvest chips.
v2: - fix default raster config settings for CZ and KV
- Suggestions from Michel
v3: - handle multiple packers properly for CI+
- GRBM_GFX_INDEX is privileged on VI+
Reviewed-by: Michel Dänzer <[email protected]> (v2)
Signed-off-by: Alex Deucher <[email protected]>
|
| |
| |
| |
| | |
Reviewed-by: Alex Deucher <[email protected]>
|
| |
| |
| |
| |
| |
| |
| | |
Need to take into account the number of RBs.
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This enables the second RB on ASICs that support it, which
should boost performance.
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
Cc: [email protected]
|
| |
| |
| |
| | |
Reviewed-by: Christian König <[email protected]>
|
| | |
|