| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for glsl 'invariant' modifier for output data declarations.
Gallium drivers that use TGSI serialization currently lose invariant
modifiers in glsl shaders.
v2: use boolean for invariant instead of unsigned.
Tested: chromiumos on qemu with virglrenderer.
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Co-authored-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
We can just emit the MOV in the two places where we use this.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
| |
There's no reason for us to emit it a pile of times and then have a
whole pass to clean it up. Just emit it once like we really want.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
This generalizes the unlit centroid workaround so it's less code and now
supports SIMD32.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
v2 (Jason Ekstrand):
- Disallow gl_SampleId in SIMD32 on gen7
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
And handle 32-wide payload register reads in fetch_payload_reg().
v2 (Jason Ekstrand):
- Fix some whitespace and brace placement
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
While we're here, we change to using horiz_offset() instead of abusing
half().
v2 (Jason Ekstrand):
- Use horiz_offset() instead of half()
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The original code manually handled splitting the MOVs to 8-wide to
handle various regioning restrictions. Now that we have a SIMD width
splitting pass that handles these things, we can just emit everything at
the full width and let the SIMD splitting pass handle it. We also now
have a useful "subscript" helper which is designed exactly for the case
where you want to take a W type and read it as a vector of Bs so we may
as well use that too.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
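The idea of reading a W (16-bit) value as a vector of Bs (bytes) can be
modelled on plain integers. This is only a rough sketch: subscriptBytes()
is a hypothetical name, and it works on host memory rather than on
register regions, so it does not reflect the real helper's signature.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model: pick byte `idx` (0 = low byte, 1 = high byte) out of
// every 16-bit word, i.e. view a vector of words as a strided vector of bytes.
std::vector<uint8_t> subscriptBytes(const std::vector<uint16_t> &words,
                                    unsigned idx)
{
    assert(idx < 2); // a 16-bit word only holds two bytes
    std::vector<uint8_t> out;
    out.reserve(words.size());
    for (uint16_t w : words)
        out.push_back(static_cast<uint8_t>(w >> (8 * idx)));
    return out;
}
```

Calling subscriptBytes(words, 0) yields the low byte of each word and
subscriptBytes(words, 1) the high byte, with one element per input word.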
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On g4x through Sandy Bridge, src1 (the coordinates) of the PLN
instruction is required to be an even register number. When it's odd
(which can happen with SIMD32), we have to emit a LINE+MAC combination
instead. Unfortunately, we can't just fall through to the gen4 case
because the input registers are still set up for PLN which lays out the
four src1 registers differently in SIMD16 than LINE.
v2 (Jason Ekstrand):
- Take advantage of both accumulators and emit LINE LINE MAC MAC
(Based on a patch from Francisco Jerez)
- Unify the gen4 and gen4x-6 cases using a loop
v3 (Jason Ekstrand):
- Don't unify gen4 with gen4x-6 as this turns out to be more fragile
than first thought without reworking the gen4 barycentric coordinate
layout.
Reviewed-by: Matt Turner <[email protected]>
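The choice described above can be sketched as a small predicate. The
names below (InterpLowering, chooseInterpLowering and its parameters) are
hypothetical and only model the decision; they say nothing about how the
registers are laid out or which instructions are actually emitted.

```cpp
#include <cassert>

// Hypothetical model of the two ways to lower an interpolation.
enum class InterpLowering { Pln, LineMac };

// has_pln:         the hardware generation has a PLN instruction at all
// needs_even_src1: PLN requires src1 to be an even register number
//                  (per the message above: g4x through Sandy Bridge)
// src1_reg:        register number holding the coordinates
InterpLowering chooseInterpLowering(bool has_pln, bool needs_even_src1,
                                    unsigned src1_reg)
{
    if (!has_pln)
        return InterpLowering::LineMac;
    if (needs_even_src1 && (src1_reg % 2) != 0)
        return InterpLowering::LineMac; // odd src1: fall back to LINE+MAC
    return InterpLowering::Pln;
}
```

With an odd src1 on a generation that has the alignment restriction the
sketch picks LINE+MAC; otherwise it keeps PLN whenever PLN exists.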
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we don't have PLN (gen4 and gen11+), we implement LINTERP as either
LINE+MAC or a pair of MADs. In both cases, the accumulator is written
by the first of the two instructions and read by the second. Even
though the accumulator value isn't actually ever used from a logical
instruction perspective, it is trashed so we need to make the scheduler
aware. Otherwise, the scheduler could end up re-ordering instructions
and putting a LINTERP between an instruction which writes the
accumulator and another which tries to use that result.
Cc: [email protected]
Reviewed-by: Matt Turner <[email protected]>
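The hazard can be illustrated with a toy dependency check. This is a
sketch under simplifying assumptions: Inst, makeLinterp() and
canPlaceBetween() are hypothetical and track only the accumulator, not
the scheduler's real dependency graph.

```cpp
#include <cassert>

// Hypothetical, minimal view of an instruction: only accumulator effects.
struct Inst {
    bool writes_acc; // clobbers the accumulator (explicitly or implicitly)
    bool reads_acc;  // consumes the accumulator value
};

// A LINTERP lowered without PLN writes the accumulator as a side effect.
Inst makeLinterp(bool has_pln)
{
    return Inst{!has_pln, false};
}

// May `moved` be scheduled between `producer` and `consumer`?
// Not if the accumulator is live across that gap and `moved` trashes it.
bool canPlaceBetween(const Inst &producer, const Inst &moved,
                     const Inst &consumer)
{
    const bool acc_live = producer.writes_acc && consumer.reads_acc;
    return !(acc_live && moved.writes_acc);
}
```

If the implicit write is not recorded (writes_acc left false for such a
LINTERP), the check wrongly allows the reordering, which is the failure
mode the message describes.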
|
|
|
|
|
|
|
|
|
|
| |
This reworks INTERPOLATE_AT_PER_SLOT_OFFSET to work more like an ALU
operation and less like a send. This is less code overall and, as a
side-effect, it now properly handles execution groups and lowering so
SIMD32 support just falls out.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We want consistent behavior in the meaning of the flag_subreg field
between SNB and IVB+.
v2 (Jason Ekstrand):
- Add some extra commentary
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
This prevents a crash in some arb_enhanced_layouts tests that would be
caused by the next commit.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Current discard handling requires dedicating the second flag register to
discard. However, control-flow in SIMD32 requires both flag registers
so it's incompatible with the current discard handling. Just don't
support SIMD32+discard for now.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The hardware's control flow logic is 16-wide so we're out of luck
here. We could, in theory, support SIMD32 if we know the control-flow
is uniform but we don't have that information at this point.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 0d905597f fixed an issue with the placement of the zip and unzip
instructions. However, as a side-effect, it reversed the order in which
we were emitting the split instructions so that they went from high
group to low instead of low to high. This is fine for most things like
texture instructions and the like but certain render target writes
really want to be emitted low to high. This commit just switches the
order back around to be low to high.
Reviewed-by: Matt Turner <[email protected]>
Fixes: 0d905597f "intel/fs: Be more explicit about our placement of [un]zip"
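The intended emission order can be sketched as a loop over channel
groups. splitGroups() is a hypothetical helper, not the lowering pass
itself; it only shows the low-to-high ordering of the split pieces.

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch: return the starting channel of each split
// instruction, in the order the pieces are emitted (low group first).
std::vector<unsigned> splitGroups(unsigned exec_size, unsigned lowered_width)
{
    assert(lowered_width != 0 && exec_size % lowered_width == 0);
    std::vector<unsigned> groups;
    for (unsigned group = 0; group < exec_size; group += lowered_width)
        groups.push_back(group);
    return groups;
}
```

Iterating in the opposite direction would produce the same set of groups
in high-to-low order, which is the ordering the message says certain
render target writes do not tolerate.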
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
The pixel shader dispatch table is kind of a confusing mess. This adds
some helpers for dealing with it and for easily extracting the correct
data from wm_prog_data.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Doing instruction header setup in the generator is awful for a number
of reasons. For one, we can't schedule the header setup at all. For
another, it means lots of implied writes which the instruction scheduler
and other passes can't properly reason about. The second isn't a huge
problem for FB writes since they always happen at the end. We made a
similar change to sampler handling in ff4726077d86.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
Now that we have the implied header in src[0] for tracking purposes, we
may as well use it in the generator. This makes things a tiny bit more
general.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
The FB write opcode on gen4-5 does implied copies from g0 and g1 to the
message payload. With this commit, we start tracking that as part of
the IR by having the FB write read from g0-1.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
| |
It doesn't matter since we don't ever run replicated write shaders
through the optimizer but it's good to be complete.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
Which I forgot to do when 18.1.2 came out.
Signed-off-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
|
|
|
| |
With this commit, things no longer break if NVC0_CB_AUX_TEX_INFO is
changed to anything other than 0x20.
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Karol Herbst <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, TargetNVC0::insnCanLoadOffset() returned whether the offset
could be set to a specific value. The IndirectPropagation pass expected
it to return whether the offset could be increased by a specific value,
which is what TargetNV50::insnCanLoadOffset() does.
Fixes: 37b67db6ae34fb6586d640a7a1b6232f091dd812
("nvc0/ir: be careful about propagating very large offsets into const load")
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Karol Herbst <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
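The difference between the two interpretations can be shown with a small
model. Everything here is an assumption made for illustration: ConstLoad,
kMaxOffset, canSetOffset() and canAddOffset() are hypothetical, and the
limit is not taken from the hardware documentation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a load that already carries a byte offset.
struct ConstLoad {
    int32_t offset;
};

// Hypothetical encoding limit, chosen only for this sketch.
constexpr int32_t kMaxOffset = 0xffff;

// "Set" semantics: could the offset field hold exactly `value`?
bool canSetOffset(const ConstLoad &, int32_t value)
{
    return value >= 0 && value <= kMaxOffset;
}

// "Increase" semantics: could the current offset grow by `delta`?
// This is the question a pass folding an extra offset into the load asks.
bool canAddOffset(const ConstLoad &ld, int32_t delta)
{
    const int64_t sum = static_cast<int64_t>(ld.offset) + delta;
    return sum >= 0 && sum <= kMaxOffset;
}
```

For a load whose offset is already close to the limit, a small delta
passes the "set" check but fails the "increase" check, so answering the
wrong question lets an out-of-range offset through.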
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix build error after llvm-7.0svn r332881 ("CodeGen: Add a dwo output
file argument to addPassesToEmitFile and hook it up to dwo output.").
CXX rasterizer/jitter/libmesaswr_la-JitManager.lo
rasterizer/jitter/JitManager.cpp:368:93: error: too few arguments to function call, expected at least 4, have 3
pTarget->addPassesToEmitFile(*pMPasses, filestream, TargetMachine::CGFT_AssemblyFile);
~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
- Functionality replaced with emulated intrinsics
- Fixes Bug 106558
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|