| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
We used to use a meta path because blorp didn't support 16x MSAA. Now it
does, so we don't need the meta paths anymore.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
We used to use a meta path because blorp didn't support 16x MSAA. Now it
does, so we don't need the meta paths anymore.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The helper was initially created to allow us to set reasonable defaults as
we mutated the brw_blorp_prog_data structure in preparation for NIR. Now
that everything is going through brw_blorp_compile_nir_shader() which fully
fills out the brw_blorp_prog_data structure, we don't need the helper.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
NIR gets kind of awkward when you have a 3-component vector with two floats
and one int. This led to us accidentally going through float for the
sample index. It doesn't hurt anything but it also isn't needed.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
The original code-flow tried to map original blorp. This puts things more
where they belong and simplifies some of the logic.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
| |
Many of the more complex cases still fall back to the old shader builder.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95373
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
There's no reason to be passing a whole struct around just for a single
boolean. We can create it later when we actually need to use it as a key.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
This array allows the push constants to be re-arranged on upload. The
actual arrangement will, eventually, come from the back-end compiler.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
This is what BLORP does. Making them match cuts down on the noise when
looking at AUB diffs.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The hardware packets organize kernel pointers and GRF start by slots that
don't map directly to dispatch width. This means that all of the state
setup code has to re-arrange the data from prog_data into these slots.
This logic has been duplicated 4 times in the GL driver and one more time
in the Vulkan driver. Let's just put it all in brw_fs.cpp.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
This better matches gen8 state setup
Acked-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Now that we have a persample_shading bit in prog_data we can reduce the
amount the state setup code needs to be looking at the GL state. In
particular, it no longer pulls anything directly out of the
gl_fragment_program and no longer depends on NEW_FRAGMENT_PROGRAM.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit reworks and simplifies the way we handle persample shading in
the shader key and prog_data. The previous approach had three different
key bits that had slightly different and hard-to-decern meanings while the
new bits are far more clear. This commit changes it to two easily
understood bits that communicate everything we need:
1) key->persample_interp: means that the user has requested persample
interpolation through the API. This is equivalent to having
SAMPLE_SHADING enabled and having MIN_SAMPLE_SHADING_VALUE set high
enough that you actually get multiple per-sample invocations.
2) key->multisample_fbo: means that the shader will be running on an
actual multi-sampled framebuffer.
This commit also adds a new "persample_dispatch" bit to prog_data which
indicates that the shader should be run in persample mode. This way the
state setup code doesn't have to look at the fragment program or GL state
and can just pull that data out of the prog_data.
In theory, this shuffle could mean more recompiles. However, in practice,
we were shoving enough state into the key before that we were probably
hitting a recompile on every per-sample shader anyway.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 5310bca024f77da40ea6f4c275455f9cb0528f9e added a new "double df"
field to the brw_reg struct, adding an extra 4 bytes of data that isn't
usually initialized (or may contain irrelevant garbage if the struct is
mutated). This means that it's no longer safe to memcmp().
Instead, add a brw_regs_equal() function which ignores the extra df bits
unless they matter. To keep the implementation cheap, we wrap the first
set of fields in a union/struct so that we can use a single DWord
comparison.
v2: Drop unnecessary casts (caught by Francisco Jerez).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
| |
I'll fix this up on Monday, so leave the docs changes in place.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This fixes a few dEQP tests like
dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_offset.no_qualifiers.default_framebuffer
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: don't bother with cull dist varyings except to assert.
Signed-off-by: Tobias Klausmann <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2: make too large array a compile error
v3: squash mesa/prog patch to avoid static compiler errors in bisect
Signed-off-by: Tobias Klausmann <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
airlied:
v2: rename LowerClipDistance to LowerCombinedClipCullDistnace.
I don't think we want any other behaviour with any current hw.
Signed-off-by: Tobias Klausmann <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I tried first creating the auxiliary buffer the same time with the
color buffer. That, however, led me into a situation where we would
later create the rest of the mip-levels and the compression would
need to be disabled (it is only supported for single level buffers).
Here we try to create it on demand just before the hardware starts
to render. This is similar what we do with fast clear buffers,
their creation is deferred until the first clear.
This setup also gives the opportunity to detect if the miptree
represents the temporaty texture used internally in the mesa core.
This texture is mostly written by cpu and therefore enabling
compression for it doesn't make much sense.
Note that a heuristic is included. Floating point formats are not
enabled yet as they are only seen to hurt performance.
Some highlights with window system driver kept fixed to default
and only the application driver changing:
Manhattan: 8.32152% +/- 0.355881%
Offscreen: 9.09713% +/- 0.340763%
Glb trex: 8.46231% +/- 0.460624%
Offscreen: 9.31872% +/- 0.463743%
v2 (Ben): Re-use msaa layout type for single sampled case.
v3: Moved the deferred allocation of mcs to brw_try_draw_prims() and
brw_blorp_blit_miptrees() instead.
v4: (Ken): Drop MIPTREE_LAYOUT_ACCELERATED_UPLOAD when allocating mcs.
Do not enable for scanout buffers
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2: Add support for blorp and removed the support for meta
v3 (Ben): Add assertion on compressed non-fast clear - must
be partial clear.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Blorp blits use sampling engine which is capable of resolving
on the fly. Buffers are still resolved for blitter engine. Current
understanding is that blitter doesn't understand lossless compression.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Until now mcs was associated to single sampled buffers only for
fast clear purposes and it was therefore the responsibility of the
clear logic to allocate the aux buffer when needed. Now that normal
3D render or blorp blit may render with mcs enabled also, they need
to prepare the mcs just as well.
v2: Do not enable for scanout buffers
v3 (Ben):
- Fix typo in commit message.
- Check for gen < 9 and return early in brw_predraw_set_aux_buffers()
- Check for gen < 9 and return early in intel_miptree_prepare_mcs()
v4: Check for msaa_layput and number of samples to determine if
lossless compression is to used. Otherwise one cannot distuingish
between fast clear with and without compression.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Consider later on adding specific disable flags such as
MIPTREE_LAYOUT_DISABLE_AUX_MCS = 1 << 3, /* CCS_D */
MIPTREE_LAYOUT_DISABLE_AUX_CCS_E = 1 << 4,
MIPTREE_LAYOUT_DISABLE_AUX = MIPTREE_LAYOUT_DISABLE_AUX_MCS |
MIPTREE_LAYOUT_DISABLE_AUX_CCS_E,
and equivalent boolean/enums into miptree.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
v2: Check explicitly against base type of GL_FLOAT instead of
using _mesa_is_format_integer_color(). Otherwise we miss
GL_UNSIGNED_NORMALIZED.
v3 (Ben): Also call intel_miptree_supports_non_msrt_fast_clear()
in order to really check everything.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2 (Ben): Use combination of msaa_layout and number of samples
instead of introducing explicit type for lossless
compression (intel_miptree_is_lossless_compressed()).
v3 (Ben): Do not set fast claer state in surface state setup.
Moved into brw_postdraw_set_buffers_need_resolve()
using a separate patch.
v4: Support for blorp
v5 (Ben): Re-use gen8_get_aux_mode()
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|