| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
| |
For the handful of registers that depend on the union of program/
framebuffer/rasterizer state.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
|
|
|
|
|
|
|
|
|
| |
Move the program state which we can't pre-bake to a streaming state
object, rather than emitting directly in the draw cmdstream.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
|
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This lets us move PC_PRIMITIVE_CNTL into the rasterizr stateobj, rather
than unconditionally emitting it directly in the cmdstream on every
draw.
This also starts adding some tracking about previous draw state, so that
following patches can limit some of the register writes we currently
emit on every draw.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
|
|
|
|
|
|
|
|
|
|
|
|
| |
All but one of the reg values is only used in the stateobj, so we can
inline the register value setup and stateobj construction. While we
are at it, switch over to the new register builders.
Prep work for next patch.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The overhead does seem to matter when you have a high enough # of draw
calls that effect few bins/pixels, because these writes would happen
unconditionally (ie. not part of a state-group).
Possibly we could keep these if we moved them into a state-group so the
register writes would be no-ops on bins with no geometry. OTOH I
usually end up adding in a WFI when using them scratch reg values to
track down a crash. (So add a WFI to mitigate the annoyance of needing
to use a debug build to get scratch regs to locate the position of a
crash/hang in the cmdstream.)
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3435>
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
| |
Also add unit tests for u_vector.
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3453>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3453>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
time.
This involves permuting the registers of barycentric vectors to have
the standard X[0-n] Y[0-n] layout at NIR translation time.
Barycentrics are converted to the format expected by the PLN
instruction in the lower_barycentrics() pass run after the
optimization loop.
Main reason is correctness of SIMD32 fragment shaders. The
shuffle_from_pln_layout() and shuffle_to_pln_layout() helpers used
during NIR translation are busted for SIMD32. This leads to serious
corruption at present with INTEL_DEBUG=do32, especially on Gen11+
where these helpers are hit more frequently due to the lack of a
hardware PLN instruction.
Of course one could have chosen to fix those helpers instead, but
there is another far more subtle issue that was reported during review
of the SIMD32 fragment shader codegen changes: The SIMD splitting pass
currently handles SIMD32 barycentric vectors as if they had the
standard X[0-n] Y[0-n] layout, even though they are interleaved for
the PLN instruction, which causes incorrect execution masks to be
applied to the MOVs unzipping barycentric vectors in cases where a
LINTERP instruction occurs under non-uniform control flow.
I'm not aware of any conformance regressions due to the latter issue
at present, but for our peace of mind let's move the conversion to the
PLN layout into the lower_barycentrics() pass run after
lower_simd_width().
This leads to the following shader-db improvements (including SIMD32
shaders) in combination with the previous back-end preparation changes
-- Without them (especially the copy propagation changes) this would
lead to a massive number of regressions. On ICL:
total instructions in shared programs: 20662316 -> 20466903 (-0.95%)
instructions in affected programs: 10538474 -> 10343061 (-1.85%)
helped: 68775
HURT: 6
total spills in shared programs: 8938 -> 8748 (-2.13%)
spills in affected programs: 376 -> 186 (-50.53%)
helped: 9
HURT: 5
total fills in shared programs: 8965 -> 8663 (-3.37%)
fills in affected programs: 965 -> 663 (-31.30%)
helped: 9
HURT: 6
LOST: 146
GAINED: 43
On SKL:
total instructions in shared programs: 18725867 -> 18614912 (-0.59%)
instructions in affected programs: 3876590 -> 3765635 (-2.86%)
helped: 27492
HURT: 2
LOST: 191
GAINED: 417
On SNB:
total instructions in shared programs: 14573613 -> 13980646 (-4.07%)
instructions in affected programs: 5199074 -> 4606107 (-11.41%)
helped: 29998
HURT: 0
LOST: 21
GAINED: 30
Results are somewhat less impressive but still significant without
SIMD32 fragment shaders enabled. On ICL:
total instructions in shared programs: 16148728 -> 16061659 (-0.54%)
instructions in affected programs: 6114788 -> 6027719 (-1.42%)
helped: 42046
HURT: 6
total spills in shared programs: 8218 -> 8028 (-2.31%)
spills in affected programs: 376 -> 186 (-50.53%)
helped: 9
HURT: 5
total fills in shared programs: 8953 -> 8651 (-3.37%)
fills in affected programs: 965 -> 663 (-31.30%)
helped: 9
HURT: 6
LOST: 0
GAINED: 3
On SKL:
total instructions in shared programs: 14927994 -> 14926738 (-0.01%)
instructions in affected programs: 168850 -> 167594 (-0.74%)
helped: 711
HURT: 2
On SNB:
total instructions in shared programs: 10770538 -> 10734403 (-0.34%)
instructions in affected programs: 2702172 -> 2666037 (-1.34%)
helped: 17818
HURT: 0
All of the hurt shaders are either spilling slightly more or emitting
additional NOP instructions due to the SIMD16 POW workaround for
Gen8-9 combined with differences in scheduling.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The goal is to represent barycentrics with the standard vector layout
during optimization and particularly SIMD lowering. Instead of
emitting the barycentric layout conversions at NIR translation time,
do it later as a lowering pass. For the moment this is only applied
to PI messages, but we'll give the same treatment to LINTERP
instructions too.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We're about to change the layout of barycentric vectors, which will
involve permuting the GRFs of barycentrics fetched from the thread
payload. Make room for this in a function separate from the generic
fetch_payload_reg(), since the permutation will only be applicable to
barycentric vectors. This allows simplifying fetch_payload_reg(),
since there was no need for handling multiple-component payload
registers except for barycentrics.
This causes some minor shader-db noise due to the new helper emitting
a LOAD_PAYLOAD instruction unconditionally, but it will be cleaned up
shortly.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
workaround.
This prevents regressions on SNB due to the redundant MOVs lying
around in cases where fetch_payload_reg() returns a VGRF (currently
only in SIMD32 but soon in pretty much all cases). The MOVs can't be
register-coalesced due to their source being a FIXED_GRF, and they
can't be copy-propagated either due to the unlit centroid workaround
partial writes. They can be copy-propagated just fine into a SEL
instruction though.
On SNB this prevents the following shader-db regressions (including
SIMD32 programs) in combination with the interpolation rework part of
this series:
total instructions in shared programs: 13996898 -> 14001982 (0.04%)
instructions in affected programs: 197461 -> 202545 (2.57%)
helped: 0
HURT: 1251
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is mainly meant to avoid shader-db regressions on SNB as we start
using VGRFs for barycentrics more frequently. Currently the
aligned_pairs_class is only useful in SIMD8 mode, because in SIMD16
mode barycentric vectors are typically 4 GRFs. This is not a problem
on Gen4-5, because on those platforms all VGRF allocations are
pair-aligned in SIMD16 mode. However on Gen6 we end up using either
the fast or the slow path of LINTERP rather non-deterministically
based on the behavior of the register allocator.
Fix it by repurposing aligned_pairs_class to hold PLN-aligned
registers of whatever the natural size of a barycentric vector is in
the current dispatch width.
On SNB this prevents the following shader-db regressions (including
SIMD32 programs) in combination with the interpolation rework part of
this series:
total instructions in shared programs: 13983257 -> 14527274 (3.89%)
instructions in affected programs: 1766255 -> 2310272 (30.80%)
helped: 0
HURT: 11608
LOST: 26
GAINED: 13
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
mitigation.
This avoids regressions on SNB due to the bank conflict mitigation
pass moving a VGRF-allocated barycentric vector to a misaligned
location, which would prevent the PLN instruction from being used.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
LINTERP use.
Previously we would hardcode fs_visitor::delta_xy barycentrics to be
allocated from aligned_pairs_class on hardware with PLN source
alignment restrictions (pre-Gen7). Instead allocate any registers
consumed by LINTERP from aligned_pairs_class, even if some barycentric
vector had ended up in a temporary.
On SNB this prevents the following shader-db regressions (including
SIMD32 programs) in combination with the interpolation rework part of
this series:
total instructions in shared programs: 13983257 -> 14527274 (3.89%)
instructions in affected programs: 1766255 -> 2310272 (30.80%)
helped: 0
HURT: 11608
LOST: 26
GAINED: 13
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is particularly useful in cases where register coalaesce is
unlikely to succeed because the LOAD_PAYLOAD isn't a plain copy --
E.g. when a LOAD_PAYLOAD is shuffling the contents of a barycentric
vector in order to transform it into the PLN layout.
This prevents the following shader-db regressions (including SIMD32
programs) in combination with the interpolation rework part of this
series. On SKL:
total instructions in shared programs: 18596672 -> 18976097 (2.04%)
instructions in affected programs: 7937041 -> 8316466 (4.78%)
helped: 39
HURT: 67427
LOST: 466
GAINED: 220
On SNB:
total instructions in shared programs: 13993866 -> 14202963 (1.49%)
instructions in affected programs: 7611309 -> 7820406 (2.75%)
helped: 624
HURT: 52943
LOST: 6
GAINED: 18
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In cases where a LOAD_PAYLOAD instruction copies a single block of
sequential GRF registers into the destination (see
is_identity_payload()), splitting the block copy into a number of ACP
entries (one for each LOAD_PAYLOAD source) is undesirable, because
that prevents copy propagation into any instructions which read
multiple components at once with the same source (the barycentric
source of the LINTERP instruction is going to be the overwhelmingly
most common example).
Technically it would also be possible to do this for VGRF sources, but
there is little benefit from that since register coalesce already
covers many of those cases -- There is no way for a block of
FIXED_GRFs to be coalesced into a VGRF though.
This prevents the following shader-db regressions (including SIMD32
programs) in combination with the interpolation rework part of this
series. On SKL:
total instructions in shared programs: 18595160 -> 18828562 (1.26%)
instructions in affected programs: 13374946 -> 13608348 (1.75%)
helped: 7
HURT: 108977
total spills in shared programs: 9116 -> 9106 (-0.11%)
spills in affected programs: 404 -> 394 (-2.48%)
helped: 7
HURT: 9
total fills in shared programs: 8994 -> 9176 (2.02%)
fills in affected programs: 898 -> 1080 (20.27%)
helped: 7
HURT: 9
LOST: 469
GAINED: 220
On SNB:
total instructions in shared programs: 13996898 -> 14096222 (0.71%)
instructions in affected programs: 8088546 -> 8187870 (1.23%)
helped: 2
HURT: 66520
total spills in shared programs: 2985 -> 2961 (-0.80%)
spills in affected programs: 632 -> 608 (-3.80%)
helped: 2
HURT: 0
total fills in shared programs: 3144 -> 3128 (-0.51%)
fills in affected programs: 1515 -> 1499 (-1.06%)
helped: 2
HURT: 0
LOST: 0
GAINED: 4
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This will be useful for eliminating redundant copies from the FS
thread payload, particularly in SIMD32 programs. For the moment we
only allow FIXED_GRFs with identity strides in order to avoid dealing
with composing the arbitrary bidimensional strides that FIXED_GRF
regions potentially have, which are rarely used at the IR level
anyway.
This enables the following commit allowing block-propagation of
FIXED_GRF LOAD_PAYLOAD copies, and prevents the following shader-db
regressions (including SIMD32 programs) in combination with the
interpolation rework part of this series. On ICL:
total instructions in shared programs: 20484665 -> 20529650 (0.22%)
instructions in affected programs: 6031235 -> 6076220 (0.75%)
helped: 5
HURT: 42073
total spills in shared programs: 8748 -> 8925 (2.02%)
spills in affected programs: 186 -> 363 (95.16%)
helped: 5
HURT: 9
total fills in shared programs: 8663 -> 8960 (3.43%)
fills in affected programs: 647 -> 944 (45.90%)
helped: 5
HURT: 9
On SKL:
total instructions in shared programs: 18937442 -> 19128162 (1.01%)
instructions in affected programs: 8378187 -> 8568907 (2.28%)
helped: 39
HURT: 68176
LOST: 1
GAINED: 4
On SNB:
total instructions in shared programs: 14094685 -> 14243499 (1.06%)
instructions in affected programs: 7751062 -> 7899876 (1.92%)
helped: 623
HURT: 53586
LOST: 7
GAINED: 25
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
source.
This involves indexing the ACP tables used internally by
fs_copy_prop_dataflow::setup_initial_values() by reg_space() instead
of register number. Both are nearly equivalent for virtual GRFs
(barring the single bit of entropy lost in the hash), and this makes
handling FIXED_GRFs straightforward.
Because we're only going to support FIXED_GRFs for the source of a
copy, this change is only strictly necessary during the second pass
that checks for source interference, but we also apply the same change
to the first pass for consistency.
Note that this shouldn't change the behavior of the copy propagation
pass until we start inserting FIXED_GRF entries into the ACP. Even
then FIXED_GRF writes are extremely rare so this change will hardly
ever have an effect, but they aren't completely non-existing so we
need to handle them for correctness.
No functional nor shader-db changes.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
helpers.
This reworks the current fs_inst::is_copy_payload() method into a
number of classification helpers with well-defined semantics. This
will be useful later on in order to optimize LOAD_PAYLOAD instructions
more aggressively in cases where we can determine it's safe to do so.
The closest equivalent of the present fs_inst::is_copy_payload()
method is the is_coalescing_payload() helper introduced here.
No functional nor shader-db changes.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
No functional nor shader-db changes.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In cases where LOAD_PAYLOAD is provided a pair of contiguous registers
as header sources, try to use a single SIMD16 instruction in order to
initialize them. This is unlikely to affect the overall cycle count
of the shader, since the compressed instruction has twice the issue
time, except due to the reduced pressure on the instruction cache.
Main motivation is avoiding instruction-count regressions in
combination with the following copy propagation improvements, which
will allow the SIMD16 g0-1 header setup emitted for framebuffer writes
to be copy-propagated into its LOAD_PAYLOAD, leading to the emission
of two SIMD8 MOV instructions instead of a single SIMD16 MOV.
Reverting this commit on top of the copy propagation changes would
lead to the following shader-db regressions on SKL and other
platforms:
total instructions in shared programs: 14926738 -> 14935415 (0.06%)
instructions in affected programs: 1892445 -> 1901122 (0.46%)
helped: 0
HURT: 8676
Without the following copy propagation changes this doesn't have any
effect on shader-db on Gen7+, because we would typically set up the FB
write header with a separate SIMD16 MOV that isn't currently
copy-propagated into the LOAD_PAYLOAD, so the individual SIMD8 MOVs
result of LOAD_PAYLOAD lowering would get register-coalesced away
under normal circumstances. However that wasn't the case for MRF
LOAD_PAYLOAD destinations on Gen6 and earlier, because register
coalesce only kicks in for GRFs, leaving a number of redundant SIMD8
MOVs lying around. On SNB this leads to the following shader-db
improvements:
total instructions in shared programs: 10770538 -> 10734681 (-0.33%)
instructions in affected programs: 2700655 -> 2664798 (-1.33%)
helped: 17791
HURT: 0
Reviewed-by: Kenneth Graunke <[email protected]>
|
| |
|
| |
|
|
|
|
|
|
|
|
|
| |
Enable new tests in dEQP-VK.image.swapchain_mutable.*
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
|
|
|
|
|
|
|
|
|
|
| |
This is only the core WSI code for the extension. It adds the image
format list and the flags to vkCreateImage as well as handling things
properly in the modifier queries.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
|
|
|
|
|
|
|
|
|
|
| |
Just because a modifier is returned for the given format, that doesn't
mean it works with all usages and flags. We need to filter the list by
calling vkGetPhysicalDeviceImageFormatProperties2.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
|
|
|
|
|
|
|
|
|
|
|
| |
The anv implementation still isn't quite complete, but we can at least
start using the structs from the real extension.
v2: Fix circular pNext list (Lionel)
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
|
|
|
|
|
|
|
|
|
| |
Future changes will be easier if we can modify it based on whether or
not we're using modifiers.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Images with modifiers come with restrictions:
1. They have to be simple 2D images right now
2. They need to have a sensible format (not compressed, multi-plane, or
non-power-of-two)
3. If a CCS modifier is being requested, they have to actually support
CCS_E and be CCS-compatible with any other formats the client may
wish to use for image views.
Reviewed-by: Lionel Landwerlin <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
|
|
|
|
|
|
|
|
|
| |
The DRM format modifiers extension adds a TILING_DRM_FORMAT_MODIFIER
which will be used for modifiers so we can no longer use OPTIMAL to
indicate tiled inside the driver.
Reviewed-by: Lionel Landwerlin <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3434>
|
|
|
|
|
|
|
|
|
|
|
|
| |
LLVM only supports GFX8+. Using CLRXdisasm works most of the time,
so it's useful to add support for it.
Original patch by Daniel Schürmann.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3439>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3439>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ensure the hardware cursor is disabled when we set the mode for a
VkDisplayKHR object. The extension doesn't expose any mechanisms to
program the hardware cursor, so we need to ensure it is hidden.
Currently, it seems like X is responsible for disabling the cursor
before handing over the lease. But that seems a little frail, and we
should be disabling the cursor ourselves so it works correctly
independently of how the lease was prepared for us.
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1922>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1922>
|
|
|
|
|
|
|
|
|
| |
When swr driver is in use it print detected architecture
message to std::err. It can be harmfull when swr is using
in multinodes environments.
It can be enabled setting env var SWR_PRINT_INFO to 1.
Reviewed-by: Jan Zielinski <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Same as GFX10, only GFX8/GFX9 moved that bit near the opcode.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3437>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3437>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current logic considers that the nir_intrinsic_component(store_intr)
encodes the source components start, but it actually encodes the
destination one. Source component offset adjustment is taken care of in
install_registers_instr(), when offset_swizzle() is called.
This fixes dEQP-GLES2.functional.shaders.random.all_features.fragment.45
when PAN_MESA_DEBUG=deqp (looks like exposing GLES3 features has an
impact on the varyings layout).
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3429>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3429>
|
|
|
|
|
|
|
|
| |
Fixes: f8148d0cc17 "docs: remove mailing list as way of submitting patches"
Acked-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
|
|
|
|
|
|
|
| |
Fixes: bc17ac58661 "docs: add documentation for building with meson"
Acked-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The pre-tag right before is a block-level tag, which means it implicitly
terminates the paragraph. So there's no paragraph to close after this.
Instead, move the paragraph-closing before the pre-tag, to explicitly
close the paragraph.
Fixes: 41b3eb08d9f "docs: update meson docs for windows"
Acked-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
|
|
|
|
|
|
|
|
|
|
| |
Similar to the previous two commits, it seems more appropriate to use
code-tags here than pre-tag.
Fixes: 9af6c38deff "docs: Add use of Closes: tag for closing gitlab issues"
Acked-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
|
|
|
|
|
|
|
|
|
|
| |
Similar to the previous commit, code-tags seems more appropriate than
pre-tags here. So let's change it.
Fixes: ca0c1e69cab "docs: update releasing process to use new scripts and gitlab"
Acked-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
|
|
|
|
|
|
|
|
|
|
| |
It's unlikely the author meant to use <pre>-here, as that starts a whole
new block. Instead, the inline code-tag seems more appropriate here.
Fixes: 41b3eb08d9f "docs: update meson docs for windows"
Acked-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
|
|
|
|
|
|
|
| |
Fixes: 44c5e634a5c "docs: update meson docs for windows"
Acked-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
|
|
|
|
|
|
|
|
|
|
| |
Paragraphs are terminated by pre-tags, so the latter one closes a new,
empty one. Let's split the paragraph in two around the pre-tag instead.
Fixes: c0dfe8c6dfd "docs: do not use div for line-breaking"
Acked-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
|
|
|
|
|
|
|
| |
Fixes: 5d11a828e10 "docs: update install docs for meson"
Acked-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3431>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes the following valgrind error:
Invalid read of size 16
at 0x28F458A1: si_set_sampler_view_desc (in radeonsi_drv_video.so)
by 0x28F4657E: si_set_sampler_views (in radeonsi_drv_video.so)
by 0x28D62BF5: util_compute_blit (in radeonsi_drv_video.so)
by 0x28D3A944: vlVaHandleVAProcPipelineParameterBufferType (in radeonsi_drv_video.so)
by 0x28D34EE1: vlVaRenderPicture (in radeonsi_drv_video.so)
by 0x4B2582B: vaRenderPicture (in libva.so.2.500.0)
Address 0x18142a10 is 0 bytes inside a block of size 48 free'd
at 0x48369AB: free (vg_replace_malloc.c:540)
by 0x28D62D51: util_compute_blit (in radeonsi_drv_video.so)
by 0x28D3A944: vlVaHandleVAProcPipelineParameterBufferType (in radeonsi_drv_video.so)
by 0x28D34EE1: vlVaRenderPicture (in radeonsi_drv_video.so)
by 0x4B2582B: vaRenderPicture (in libva.so.2.500.0)
Block was alloc'd at
at 0x4837B65: calloc (vg_replace_malloc.c:762)
by 0x28EFB2EC: si_create_sampler_state (in radeonsi_drv_video.so)
by 0x28D62C30: util_compute_blit (in radeonsi_drv_video.so)
by 0x28D3A944: vlVaHandleVAProcPipelineParameterBufferType (in radeonsi_drv_video.so)
by 0x28D34EE1: vlVaRenderPicture (in radeonsi_drv_video.so)
by 0x4B2582B: vaRenderPicture (in libva.so.2.500.0)
Fixes: 69430d7e59e ("va: use a compute shader for the blit")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2321
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3428>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3428>
|
|
|
|
|
|
|
|
|
|
| |
I kept wondering what "429" meant in variable declarations, when it was
just a truncated ~0 snprintf.
Reviewed-by: Rob Clark <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3423>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3423>
|
|
|
|
|
|
|
| |
Cc: [email protected]
Signed-off-by: Eric Engestrom <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3391>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3391>
|
|
|
|
|
| |
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3417>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3417>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Introduce separate helper functions to set the blendfactor bits.
Lima uses bits 0-2 for the type, bit 3 sets the inverted function
and bit 4 is set if alpha is used.
alpha_src_factor and alpha_dst_factor don't need the alpha bit, so
they are masked with 0xf. There is only place for 4 bits anyway.
If alpha_src_factor is PIPE_BLENDFACTOR_SRC_ALPHA_SATURATE, we need
to change it to PIPE_BLENDFACTOR_ONE first.
This is exactly what the blob does and we pass all
dEQP-GLES2.functional.fragment_ops.blend.* tests now.
Better than the blob btw...
Reviewed-by: Vasily Khoruzhick <[email protected]>
Signed-off-by: Andreas Baierl <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3411>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3411>
|