| Commit message (Collapse) | Author | Age | Files | Lines |
| |
Reviewed-by: Kenneth Graunke <[email protected]>
| |
v2 (Nanley):
* Maintain a chronological ordering for HiZ alignments. Suggested by
Ken.
Co-authored-by: Nanley Chery <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Fixes: 2cddc953cd0 ("iris: some initial HiZ bits")
Reviewed-by: Kenneth Graunke <[email protected]>
| |
We'll start doing slow depth clears more often on HIZ_CCS buffers in a
future commit. Reduce the performance impact by making them use less
bandwidth.
From the Depth Test section of the BSpec:
This function is enabled by the Depth Test Enable state variable. If
enabled, the pixel's ("source") depth value is first computed. After
computation the pixel's depth value is clamped to the range defined
by Minimum Depth and Maximum Depth in the selected CC_VIEWPORT state.
Then the current ("destination") depth buffer value for this pixel is
read.
and from the Depth Buffer Updates section of the BSpec:
If depth testing is disabled or the depth test passed, the incoming
pixel's depth value is written to the Depth Buffer.
Taken together, it's clear that depth testing isn't necessary to perform
a depth buffer clear. Mark Janes and I analyzed this patch with
frameretrace and a depthrange piglit test. I disabled HiZ to ensure we'd
get slow depth clears. We've observed the bandwidth consumption by the
depth buffer access to be cut ~50% on BDW and SKL during depth clears.
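As a rough illustration of the reasoning above, a slow depth clear only needs depth writes enabled. The struct and function names below are hypothetical and are not the actual BLORP or genxml fields:

```c
#include <stdbool.h>

/* Hypothetical, simplified depth/stencil state; the field names are
 * illustrative and not the actual BLORP or genxml fields. */
struct sketch_depth_state {
   bool depth_test_enable;
   bool depth_write_enable;
};

/* State for a slow depth clear: every pixel's depth is written
 * unconditionally, so the destination value never needs to be read. */
static struct sketch_depth_state
sketch_depth_clear_state(void)
{
   struct sketch_depth_state state = {
      .depth_test_enable = false,
      .depth_write_enable = true,
   };
   return state;
}
```

With the test disabled, the "destination" read described in the first BSpec quote is skipped, which is the likely source of the bandwidth saving.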
On a more graphically intensive workload, the Shadowmapping Sascha
benchmark, I took the average of 3 runs on a BDW with a display
resolution of about 1920x1200 (minus some desktop environment
decorations). I measured a 22.61% FPS improvement when HiZ is disabled.
v2. The BSpec doesn't mandate this behavior; update the comment accordingly.
(Ken)
Fixes: bc4bb5a7e30 ("intel/blorp: Emit more complete DEPTH_STENCIL state")
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
| |
In ISL:
Update the format table to add CCS_E support for some 8BPP formats,
some 16BPP formats, and R10G10B10A2_UNORM_SRGB.
In the helper for determining CCS_E support, we return false for some
16BPP formats because they aren't properly handled in blorp_copy().
In BLORP:
Allow the new and non-problematic formats for CCS_E-enabled copies.
v2. Update other fields for A1B5G5R5_UNORM and A4B4G4R4_UNORM in table.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]> (v1)
| |
The CCS could be described in a number of ways, but this format was
chosen to minimize churn in the drivers. We may decide on a different
direction in the future.
v2. Increase alignment for display surfaces. (Nanley)
Reviewed-by: Jordan Justen <[email protected]> (v1)
Acked-by: Kenneth Graunke <[email protected]>
| |
Use a helper that will automatically handle Gen12's CCS tiling when
creating a CCS isl_surf.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
| |
The Gen12 CCS is not Y-tiled.
Reviewed-by: Sagar Ghuge <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
| |
In the function which translates ISL tilings to i915 tilings, map ISL's
HiZ and CCS tilings to Y instead of NONE (linear). The HW docs describe
HiZ and pre-Gen12 CCS surfaces as being Y-tiled in memory.
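A minimal sketch of such a translation, assuming simplified stand-in enums (the names below are hypothetical and are not the real ISL or i915 definitions):

```c
/* Hypothetical, simplified enums; the real ISL and i915 definitions
 * have more entries and different names. */
enum sketch_isl_tiling {
   SK_ISL_LINEAR,
   SK_ISL_X,
   SK_ISL_Y0,
   SK_ISL_HIZ,
   SK_ISL_CCS, /* pre-Gen12 CCS only */
};

enum sketch_i915_tiling {
   SK_I915_NONE,
   SK_I915_X,
   SK_I915_Y,
};

static enum sketch_i915_tiling
sketch_tiling_to_i915(enum sketch_isl_tiling tiling)
{
   switch (tiling) {
   case SK_ISL_LINEAR:
      return SK_I915_NONE;
   case SK_ISL_X:
      return SK_I915_X;
   case SK_ISL_Y0:
   case SK_ISL_HIZ:
   case SK_ISL_CCS:
      /* HiZ and pre-Gen12 CCS are laid out as Y-tiled in memory. */
      return SK_I915_Y;
   }
   /* Unknown values fall back to linear in this sketch. */
   return SK_I915_NONE;
}
```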
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
| |
v2 (Nanley):
* Avoid driver churn for now.
* Include some media compression changes.
Co-authored-by: Nanley Chery <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
| |
v2. Avoid driver churn for now. (Nanley)
Co-authored-by: Nanley Chery <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
| |
v2. Add media compression. (Nanley)
Co-authored-by: Nanley Chery <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
| |
Avoid the compiler warnings for the new enums that will be introduced in
a future commit.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
| |
The isl_surf structs for Gen12's CCS won't describe how many slices in
the main surface can be compressed. All slices will be compressible if
CCS is enabled, so look up the main surface's logical dimension.
v2. Add a space before a `?`. (Jordan)
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
| |
This isn't accurate enough for HiZ which can have a discontiguous range
of supported aux slices. This also won't work with the plan to represent
Gen12 CCS as a single slice surface.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
| |
Update their dimensions according to the BSpec.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
| |
From "Render Target Fast Clear" description for Gen12:
"SW must store clear color using MI_STORE_DATA_IMM with
ForceWriteCompletionCheck bit set."
From Instruction_MI_STORE_DATA_IMM, bitfield 10 (when set to 1):
"Following the last write from this command, Command Streamer
will wait for all previous writes are completed and in global
observable domain before moving to next command."
We use 4 SDIs to store the clear color (one per channel). From the
description, it looks to me that setting that flag only on the last SDI
should be enough.
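A rough sketch of that idea follows. The struct is a hypothetical stand-in for one MI_STORE_DATA_IMM command, and the layout of four contiguous 32-bit channels is an assumption made for illustration:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one MI_STORE_DATA_IMM command. */
struct sketch_sdi_cmd {
   uint64_t address;
   uint32_t data;
   bool force_write_completion_check;
};

/* Fill `cmds` with four stores, one per clear-color channel, and set
 * the completion-check flag only on the final one. Assumes the four
 * channels are contiguous 32-bit values starting at `base_address`. */
static void
sketch_build_clear_color_stores(struct sketch_sdi_cmd cmds[4],
                                uint64_t base_address,
                                const uint32_t color[4])
{
   for (int i = 0; i < 4; i++) {
      cmds[i].address = base_address + 4u * (uint64_t)i;
      cmds[i].data = color[i];
      cmds[i].force_write_completion_check = (i == 3);
   }
}
```

Since the flag makes the Command Streamer wait for all previous writes, setting it on the last store should also cover the three stores before it.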
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
| |
Gen12's CCS requires that the main surface have a pitch aligned to 512B.
v2. Provide a BSpec citation. (Ken)
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
| |
There's no longer a clear-only compression mode of CCS on Gen12+.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Sagar Ghuge <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
| |
There's no longer a clear-only compression mode of CCS on Gen12+.
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Clear-only compression no longer exists on TGL.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Sagar Ghuge <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
| |
Clear-only compression no longer exists on TGL.
v2. Add BSpec reference. (Sagar)
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Sagar Ghuge <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
| |
The format of the CCS has changed.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
| |
The format of the CCS has changed.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
| |
add_aux_state_tracking_buffer() actually checks the aux usage when
determining how many dwords to allocate for state tracking. Move the
function call to the point after the CCS_E aux usage is assigned.
Fixes: de3be618016 ("anv/cmd_buffer: Rework aux tracking")
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Avoid failing the `info->use_clear_address` assertion in ISL on Gen12+.
Fixes: 6c9f9a82d78 ("intel/genxml,isl: Add gen12 render surface state changes")
Reported-by: Caio Marcelo de Oliveira Filho <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
| |
In gen12 we use the 3DSTATE_DEPTH_BOUNDS instruction
to enable depth bounds testing.
Signed-off-by: Plamena Manolova <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
| |
In gen12 we use the 3DSTATE_DEPTH_BOUNDS instruction
to enable depth bounds testing.
Signed-off-by: Plamena Manolova <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
| |
In gen12 we add the 3DSTATE_DEPTH_BOUNDS instruction
which enables support for depth bounds testing.
Signed-off-by: Plamena Manolova <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Better be safe, even if we could technically avoid this for
some fields.
Cc: <[email protected]>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1999
Signed-off-by: Danylo Piliaiev <[email protected]>
Tested-by: Witold Baryluk <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
| |
Just use the inlined function directly. The new function was introduced
in addcf410.
Reviewed-by: Eric Engestrom <[email protected]>
| |
This makes it clear that it's a boolean test and not an action
(e.g. "empty the list").
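To illustrate the naming point, here is a minimal sketch of a circular list with an `is_empty`-style predicate. All names are hypothetical and are not the actual identifiers touched by this commit:

```c
#include <stdbool.h>

/* Hypothetical minimal circular doubly linked list head. */
struct sketch_list_head {
   struct sketch_list_head *prev;
   struct sketch_list_head *next;
};

static inline void
sketch_list_init(struct sketch_list_head *head)
{
   head->prev = head;
   head->next = head;
}

/* Insert `item` right after `head`. */
static inline void
sketch_list_add(struct sketch_list_head *item, struct sketch_list_head *head)
{
   item->prev = head;
   item->next = head->next;
   head->next->prev = item;
   head->next = item;
}

/* Reads as a question: returns true when the list has no items. */
static inline bool
sketch_list_is_empty(const struct sketch_list_head *head)
{
   return head->next == head;
}
```

A call such as `if (sketch_list_is_empty(&head))` reads unambiguously as a test, whereas a name without `is` could be mistaken for a function that clears the list.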
Reviewed-by: Eric Engestrom <[email protected]>
| |
Just use the inlined function directly. The macro was replaced with
the function in ebe304fa540f.
Reviewed-by: Eric Engestrom <[email protected]>
| |
Just use the inlined function directly. The macro was replaced with
the function in ebe304fa540f.
Reviewed-by: Eric Engestrom <[email protected]>
| |
Just use the inlined function directly. The macro was replaced with
the function in ebe304fa540f.
Reviewed-by: Eric Engestrom <[email protected]>
| |
Just use the inlined function directly. The macro was replaced with
the function in ebe304fa540f.
Reviewed-by: Eric Engestrom <[email protected]>
| |
Just use the inlined function directly. The macro was replaced with
the function in ebe304fa540f.
Reviewed-by: Eric Engestrom <[email protected]>
| |
Just use the inlined function directly. The macro was replaced with
the function in ebe304fa540f.
Reviewed-by: Eric Engestrom <[email protected]>
| |
When building with "-flto" brw::block_data definitions
were colliding.
Signed-off-by: Danylo Piliaiev <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
| |
Acked-by: Jordan Justen <[email protected]>
| |
Acked-by: Jordan Justen <[email protected]>
| |
Acked-by: Jordan Justen <[email protected]>
| |
Acked-by: Jordan Justen <[email protected]>
| |
Some implementations don't support the lineWidth-feature, so let's
avoid setting invalid state to them. But since we don't have a fallback
for this, inform the user.
Acked-by: Jordan Justen <[email protected]>
| |
Acked-by: Jordan Justen <[email protected]>
| |
The driver can report a minimum alignment for UBOs, and that can be
larger than 64, which we've currently been using. Let's play ball, and
use the reported value instead.
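A minimal sketch of aligning a UBO offset to a driver-reported value instead of a hard-coded 64. The helper name is hypothetical, and the alignment is assumed to be a non-zero power of two:

```c
#include <assert.h>
#include <stdint.h>

/* Round `offset` up to the next multiple of `alignment`.
 * Assumption: `alignment` is a non-zero power of two, so the
 * mask trick below is valid. */
static uint64_t
sketch_align_offset(uint64_t offset, uint64_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (offset + alignment - 1) & ~(alignment - 1);
}
```

For example, with a reported alignment of 256, an offset that is only 64-byte aligned would be moved up to the next multiple of 256.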
Acked-by: Jordan Justen <[email protected]>
| |
There are two things that go wrong in this code on some drivers:
1. Rounding off the line-width to granularity can push it outside the
legal range.
2. A granularity of 0.0 results in NaN, because we divide by zero.
So let's make this code a bit more robust.
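One way to handle both cases, as a sketch only: the helper name and parameters are hypothetical, and the width is assumed to be non-negative so that a simple cast can do the rounding.

```c
/* Hypothetical helper: snap a requested line width to the device's
 * granularity, then clamp it into the supported range.
 * Assumes `width` is non-negative. */
static float
sketch_sanitize_line_width(float width, float granularity,
                           float min_width, float max_width)
{
   /* A granularity of 0.0 would turn the computation below into NaN
    * (division by zero followed by a multiply by zero), so skip the
    * rounding step entirely in that case. */
   if (granularity > 0.0f)
      width = (float)(int)(width / granularity + 0.5f) * granularity;

   /* Rounding may have pushed the value out of range; clamp last. */
   if (width < min_width)
      width = min_width;
   if (width > max_width)
      width = max_width;
   return width;
}
```

Clamping after rounding addresses the first problem, and the granularity check addresses the second.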
Acked-by: Jordan Justen <[email protected]>
| |
Acked-by: Jordan Justen <[email protected]>
| |
Acked-by: Jordan Justen <[email protected]>
|