| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
These will be used in the following patch to allow copying directly
to the param list when packing is enabled.
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently everything is padded to 4 components. Making the list
more flexible will allow us to do uniform packing.
V2 (suggestions from Nicolai):
- always pass existing calls to _mesa_add_parameter() true for padd_and_align
- fix bindless param value offsets
- remove left over wip logic from pad and align code
- zero out param value padding
- whitespace fix
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Will be used to determine whether to take packing code paths or not.
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
As you're debugging register allocation, you may have changed the
intervals and not recomputed yet. Just skip the dump in that case.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Our register spilling support is nice to have since vc4 couldn't at all,
but we're still very restricted due to needing to not spill during a TMU
operation, or during the last segment of the program (which would be nice
to spill a value of, when there's a long-lived value being passed through
with little modification from the start to the end).
We could do better by emitting unspills for the last-segment values just
before the last thrsw, since the last segment is probably not the maximum
interference area.
Fixes GTF uniform_buffer_object_arrays_of_all_valid_basic_types and 3
others.
|
|
|
|
| |
The point was to get the MOV, which the MOV_dest already returned.
|
|
|
|
| |
This is nice for debugging when you've made a bad instruction.
|
|
|
|
|
| |
This will let me do lowering late in compilation using the same
instruction builder as we use in nir_to_vir.
|
|
|
|
| |
Anywhere we want to multiply, we probably want this.
|
| |
|
|
|
|
| |
Otherwise our start/ends ips won't line up with the actual instructions.
|
|
|
|
| |
This will be used for detecting last thread segment in register spilling.
|
|
|
|
|
| |
These helpers will be used in register spilling to determine where to add
a last thrsw if needed, and might help refactor QPU scheduling.
|
|
|
|
|
| |
The QPU scheduling code calling this function already separately checked
this signal.
|
|
|
|
| |
This will be reused in register spilling.
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have some cases where in subpass we want the layer but having
it be 0 and loaded in the frag shader without the vertex shader
exporting it is fine.
So don't export the layer if we don't have a value to put in it.
Fixes: d4c74aed7a8 (radv/multiview: mark layer_input if we have input attachments.)
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
All of the shaders that had loops changed were in Tomb Raider. The one
shader that lost SIMD16 is one of those.
Skylake
total instructions in shared programs: 14391653 -> 14390468 (<.01%)
instructions in affected programs: 111891 -> 110706 (-1.06%)
helped: 501
HURT: 0
helped stats (abs) min: 1 max: 155 x̄: 2.37 x̃: 1
helped stats (rel) min: 0.05% max: 21.54% x̄: 1.61% x̃: 1.01%
95% mean confidence interval for instructions value: -3.23 -1.50
95% mean confidence interval for instructions %-change: -1.77% -1.45%
Instructions are helped.
total cycles in shared programs: 532793024 -> 532776598 (<.01%)
cycles in affected programs: 987682 -> 971256 (-1.66%)
helped: 348
nnHURT: 41
helped stats (abs) min: 1 max: 3074 x̄: 54.91 x̃: 18
helped stats (rel) min: 0.05% max: 32.24% x̄: 3.36% x̃: 1.68%
HURT stats (abs) min: 1 max: 422 x̄: 65.39 x̃: 24
HURT stats (rel) min: 0.09% max: 39.29% x̄: 9.50% x̃: 2.02%
95% mean confidence interval for cycles value: -64.08 -20.38
95% mean confidence interval for cycles %-change: -2.78% -1.23%
Cycles are helped.
total loops in shared programs: 4854 -> 4829 (-0.52%)
loops in affected programs: 27 -> 2 (-92.59%)
helped: 18
HURT: 0
LOST: 1
GAINED: 0
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a shader only writes to an output via a constant initializer we
need to lower it before we call nir_remove_dead_variables so that
this pass sees the stores from the initializer and doesn't kill the
output.
Fixes test failures in new work-in-progress CTS tests:
dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output.float
This is ported from anv:
99b57daf4a anv/pipeline: lower constant initializers on output variables earlier
from Iago Toral Quiroga <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
For each view bit we need to emit a timestamp query.
Fixes: dEQP-VK.multiview.queries*
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For multiview we need to emit a number of sequential queries
depending on the view mask.
This avoids dEQP-VK.multiview.queries.15 waiting forever
on the CPU for query results that are never coming.
We only really want to emit one query,
and the rest should be blank (amdvlk does the same),
so we emit begin/end pairs for all the others except
the first query.
v2: fix tests
v3: split out patch.
Fixes: dEQP-VK.multiview.queries*
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
This just splits out the begin/end query hw emissions,
it makes it easier to add multiview support for queries.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
This fixes:
dEQP-VK.multiview.input_attachments*
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since the intermediate states of active_stages are not used,
i.e. active_stages is read only after all stages were set into it,
just set its value before compiling the shaders.
This will allow to conditionally run certain passes based on what
other shaders are being used, e.g. a certain pass might only be
applicable to the vertex shader if there's no geometry or tessellation
shader being used.
v2: Use vk_to_mesa_shader_stage. (Lionel)
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
v2: Add Fixes tag. (Lionel)
Fixes: e50d4807a35e679 ("anv: Compile TCS/TES shaders.")
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change allows the disk shader cache to work with programs loaded
with ProgramBinary. Drivers check for LINKING_SKIPPED, and if set,
then they try to use the shader cache.
Since the program loaded by ProgramBinary is similar to loading the
shader from the disk cache, this is probably more appropriate.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, we only look in the disk shader cache if we see that the
shader program is in the cache during the link step.
If the shader cache entry isn't found during the program link, there
are still some (fairly unlikely) scenarios where later it might be
useful to search the cache for gen binary programs.
1. If the cache evicts the serialized glsl cache, there might still be
valid gen program entries in the disk cache.
2. If two applications are running in parallel, then it is possible
that one may write out the cached gen program item which the other
application can then make use of.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the shader cache is used, this can be generated. In fact, the
shader cache uses this sha1 to lookup the serialized GL shader
program.
If a GL shader program is restored with ProgramBinary, the shaders are
not available, and therefore the correct sha1 cannot be generated. If
this is restored, then we can use the shader cache to restore the
binary programs to the program that was loaded with ProgramBinary.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We used this to prevent usage of the disk shader cache when transform
feedback was enabled via the GL API. This is no longer used.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105444
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105444
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105444
Suggested-by: Timothy Arceri <[email protected]>
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
We've seen this fail internally.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The fragment shader was trying to read this, but nothing
was exporting it from the vertex shader. This handles
it like the prim id export.
Fixes:
dEQP-VK.multiview.secondary_cmd_buffer.*
dEQP-VK.multiview.index.fragment_shader.*
v1.1: updated to use 0x1 (Samuel)
Fixes: e3265c10c89 (radv: Implement multiview draws.)
Reviewed-by: Samuel Pitoiset <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There was a missing absolute value when
checking if the determinant was big enough.
Fixes: https://github.com/iXit/Mesa-3D/issues/292
Signed-off-by: Axel Davy <[email protected]>
Reviewed-by: Patrick Rudolph <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
CC: "17.3 18.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Makes the conversion explicit.
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=102542
Signed-off-by: Axel Davy <[email protected]>
Reviewed-by: Patrick Rudolph <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
CC: "17.3 18.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Stateblocks with NINESBT_ALL should track all textures.
For better performance they have a faster path which
copies all the required.
This path was only tracking ps textures.
Fixes: https://github.com/iXit/Mesa-3D/issues/303
Signed-off-by: Axel Davy <[email protected]>
Reviewed-by: Patrick Rudolph <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
CC: "17.3 18.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
An incorrect formula was used to compute bound_samplers_mask_vs.
Since s is above always 8 for vs and the variable is encoded on 8 bits,
it was always 0.
This resulted in commiting the samplers every call when
there was at least one texture read in the vs shader.
Signed-off-by: Axel Davy <[email protected]>
Reviewed-by: Patrick Rudolph <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
| |
No need to bother the linker about them.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
| |
It seems to be a leftover from u_format_table.py.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
We only get VK_SUCCESS if it was initialized, but apparently my compiler
doesn't track that far.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
We only have a cfg != NULL if we went through one of the paths that set
it, but my compiler doesn't figure that out.
Reviewed-by: Lionel Landwerlin <[email protected]>
Fixes: 6411defdcd6f ("intel/cs: Re-run final NIR optimizations for each SIMD size")
|
|
|
|
|
|
|
|
|
| |
This is a legitimate warning: if anv's blorp_alloc_binding_table() throws
an error from anv_cmd_buffer_alloc_blorp_binding_table(), we silently
continue to use this undefined value. The rest of this code doesn't seem
very allocation-error-proof, though, either.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously the unit test filled out a minimal devinfo struct. A previous
patch caused the test to begin assert failing because the devinfo was
not complete. Avoid this by using the real mechanism to create devinfo.
Note that we have to drop icl from the table, since we now rely on the
name -> PCI ID translation done by gen_device_name_to_pci_device_id(),
and ICL's PCI IDs are not upstream yet.
Fixes: f89e735719a6 ("intel/compiler: Check for unsupported register sizes.")
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
| |
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
| |
Similar to previous patch, make xcb 1.13 optional.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
|