| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This make shader-db's report.py work on Haswell and earlier platforms.
The problem is that the script would detect the "sends" output for
scalar shaders and expect in in vec4 shaders too. When it didn't find
it, the script would fail with:
Traceback (most recent call last):
File "./report.py", line 351, in <module>
main()
File "./report.py", line 182, in main
before_count = before[p][m]
KeyError: 'sends'
Fixes: f192741ddd8 ("intel/compiler: Report the number of non-spill/fill SEND messages")
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 8125d7960b ("intel/dev: Add preliminary device info for Tigerlake")
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
|
|
|
|
|
|
|
|
|
|
|
|
| |
These reworks were combined into this patch:
* Matt Turner: i965: Disable NoDDChk/NoDDClr test on Gen12+
* Francisco Jerez: intel/eu/validate/gen12: Disable
qword_low_power_no_depctrl eu_validate test.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
| |
Reworks:
* adjust 64-bit support, hiz (Jason Ekstrand)
* sim-id (Lionel Landwerlin)
* adjust threads, urb size (Rafael Antognolli)
* adjust urb size (Kenneth Graunke)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
|
| |
|
| |
|
|
|
|
| |
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On gen11 and older, compressed images are tiled and aligned to 4K. On
gen12 this 4K alignment restriction was removed. However, only aligning
the fast clear color buffer to 64B (a cacheline, as it's on the
documentation) is causing some bugs where the fast clear color is not
converted during the fast clear operation. Aligning things to 4K seems
to fix it.
v2: Assert that image->planes[plane].offset is 4K aligned (Nanley)
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
|
|
|
|
| |
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
|
|
|
|
|
|
| |
TGL uses different data (and even a different format!) for each source.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
|
|
|
|
|
|
|
| |
TGL will have separate tables for src0 and src1, so the shared function
will no longer make sense.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
|
|
|
|
| |
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The EU compaction unit test fuzzes the compaction code by flipping bits.
We use a simple skip_bits() function with a list of reserved bits to
ignore, but for more complex cases like invalid combinations of register
file:type, we need either machinery to check validity or for these
functions to simply inform us whether a combination was valid.
enum brw_reg_type a 4-bit field in brw_reg, so rather than expanding it
with an "INVALID" value, just return -1 and let the caller check for
that.
Scott suggested redefining unreachable() within the unit test to
longjmp() which would allow driver code like this to still use it and
allow the test to handle expected failures like this. If that plan works
out, I plan to revert this.
|
|
|
|
|
|
|
|
| |
This shaves around 4-5% off of a CPU-limited example running with the
Dawn WebGPU implementation.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In 0e4a75f917, Ken added a flag brw_stage_prog_data which indicates
whether any UBO pulls ever occur. Unfortunately, he neglected to set
the bit in the vec4 back-end. This was fine at the time because the
optimization was intended for iris which does not support gen7 and using
the vec4 back-end on Gen8+ requires an environment variable. We want to
use this in Vulkan which does support Gen7 so we want the information
from the vec4 back-end as well as scalar.
Fixes: 0e4a75f917 "intel/compiler: Record whether any pull constant..."
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
| |
v2: (Nanley Chery)
- Fix commit title
- Fix comment
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When set, the stencil buffer is filled with the true stencil values and
we have to disable stencil buffer clear enable bit.
v2: 1) Refactor code little bit (Nanley Chery)
2) Fix assertion (Nanley Chery)
v3: 1) Remove unncessary assignment (Nanley Chery)
2) Fix GEN_GEN check (Nanley Chery)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Enable stencil compression enable and control surface enable bit if
stencil buffer lossless compression is enabled.
v2: Remove unnecessary GEN_GEN check (Nanley Chery)
v3: (Nanley Chery)
- Change commit subject tag from intel/isl to intel
- Keep assignment order correct
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
|
|
|
|
|
|
|
|
| |
On Gen12+, Stencil buffer's lossless compression should be resolved
with WM_HZ_OP packet.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
|
|
|
|
|
|
|
|
|
|
| |
We never saw any failures regarding this typo but it's good to assign
correct stencil view while constructing blorp_params.
Fixes: 0cabf93b80d0 "intel/blorp: Add an entrypoint for clearing depth and stencil"
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
|
|
|
|
|
| |
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
|
|
|
|
|
|
|
|
|
|
| |
The original value of 256 was under the assumption that you're a batch
buffer which is likely going to have a large number of relocations.
However, pipeline objects on Gen7 will have at most 6 relocations (one
per shader stage and one for the workaround BO) so this is a lot of
per-pipeline wasted space.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The old relocation list code always allocated 256 relocations and a hash
set up-front without knowing whether or not we really need them. In
particular, in the softpin case, this is two fairly large allocations
that we don't need to be making. Also, for pipeline objects on haswell
where we don't have softpin, we don't need relocations unless scratch is
used so this is extra data per-pipeline. Instead, we should do it
on-demand. This shaves 3.5% off of a cpu-limited example running with
the Dawn WebGPU implementation.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
|
|
|
|
|
|
|
|
|
| |
For gen12 we set the streamout buffers using 4 separate
commands instead of 3DSTATE_SO_BUFFER.
Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
|
|
|
|
|
|
|
|
| |
For gen12 we set the streamout buffers using 4 separate
commands instead of 3DSTATE_SO_BUFFER.
Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
|
|
|
|
|
|
|
|
|
| |
Add depth bounds testing to the list of supported
physical device features.
Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
|
|
|
|
|
|
|
| |
The bias for the 3DSTATE_DEPTH_BOUNDS instruction
should be 2 not 1.
Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
|
|
|
|
|
|
|
|
|
| |
A few equations/programming changes for ICL.
v2: Fix a couple of issues in naming and floating/integer operations (Ken)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
|
| |
The anv_batch_bo contents are linked one to another, and when printing
we have to start with the first of those. Since in `u_vector` new
elements are added to the head, to get the first element we need the
vector's tail.
Fixes: 32ffd90002b ("anv: add support for INTEL_DEBUG=bat")
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
|
|
|
|
|
|
|
|
|
|
| |
The existing "fallback" code didn't actually do anything, so this
removes it, and instead we just always fallback to `iris` for future
PCI IDs.
Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
| |
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
|
|
|
|
|
|
|
|
|
|
| |
GEN12 adds the ability to losslessly compress each sample plane in a
multisampled buffer that uses MCS compression.
v2: Remove unnecessary assertion (Nanley Chery)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Depending on MCS_CSS or MCS we can emit blorp blit shaders.
As we support MCS_CSS and MCS, it makes sense to use
isl_aux_usage_has_mcs function.
v2: Fix commit message (Nanley Chery)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
|
|
|
|
|
|
|
|
|
| |
If aux for MCS is already configured, don't configure again.
v2: Fix missing period in commit message (Nanley Chery)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
|
|
|
|
|
|
|
|
|
| |
Store the converted depth value into two dwords. Avoids regressing the
piglit test "fbo-depth-array depth-clear", when HIZ_CCS sampling is
enabled in a later commit.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
| |
Write through to the CCS if the surface is used as a texture and can be
sampled by the HW with CCS.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
| |
Prevent the piglit test,
amd_vertex_shader_layer-layered-depth-texture-render, from regressing in
in a future commit.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
| |
Prepare this function to be used in iris and to handle new Gen12 behavior.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
| |
Add a helper to determine if an ISL surface supports the write-through
mode of HIZ_CCS.
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
| |
Allow it in depth buffer instructions but disable it for blits.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
| |
Avoid unexpected behavior if the caller happens to pass in a HiZ aux
usage.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
| |
Add an extra aux parameter which will be filled out with CCS if the
first two isl_surf parameters fit the requirements for HiZ_CCS.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
| |
Return false more often to reduce the burden on the caller.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
| |
While this format isn't listed in BSpec: 53911, other documentation and
empirical evidence suggest that it's fine to remap it to R32_FLOAT. I've
filed a bug for the BSpec page.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
| |
v2. Remove undocumented CCS_E-only mode for depth. (Nanley)
Co-authored-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|
|
|
|
|
|
|
|
|
| |
v2 (Nanley):
* Maintain a chronological ordering for HiZ alignments. Suggested by
Ken.
Co-authored-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
|