| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
The XML ends up noisier if you're only looking at one version, but from
the diffstat there's obvious wins in terms of deduplication. This will
get even more significant if we ever support 3.2 or 4.0.
|
|
|
|
|
|
|
| |
The XML zipper wants one XML per version for filling out its tables, but
we want to do more than one GPU version per XML now. Assume that the
"gen" field will be the same as min_ver and look up our XML text assuming
that they're listed in increasing min_ver.
|
|
|
|
|
| |
This will be used to merge together the V3D 3.3-4.1 XML with the variants
disabled based on the version.
|
|
|
|
|
|
|
| |
It turns out that most V3D versions change very few packets, so keeping
separate copies of the XML per version makes changing the XML a pain as
you have to replicate your changes to each one. This is the start of
changing it so that one XML can generate headers for multiple versions.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Running VK-CTS in batch execution mode was raising the
VK_ERROR_INITIALIZATION_FAILED error in multiple tests. But when the
same failing tests were run isolated they always passed.
createDevice and destroyDevice were called before and after every
tests. Because the binding_table_pool was never closed, we reached the
maximum number of open file descriptors (ulimit -n) and when that
happened every call to createDevice implied a
VK_ERROR_INITIALIZATION_FAILED error.
Fixes: c7db0ed4e94dce563d722e1b098684fbd7315d51
("anv: Use a separate pool for binding tables when soft pinning")
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This is mainly useful for when one needs to add new opcodes in a painless
and reliable way.
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Karol Herbst <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Having this if statement here prevented the next if statement from being
reached in the case of image stores, which is needed for instructions with
indirect bindless handles like "STORE TEMP[ADDR[2].x+1](1) ...".
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Karol Herbst <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is a parallel make build issue in src/egl/drivers/dri2/
for wayland builds. Can be reproduced with:
$ rm src/egl/drivers/dri2/*.h src/egl/drivers/dri2/platform_wayland.lo
$ make -C src/egl/ drivers/dri2/platform_wayland.lo
../../../mesa-18.1.2/src/egl/drivers/dri2/platform_wayland.c:50:10: fatal error: linux-dmabuf-unstable-v1-client-protocol.h: No such file or directory
This patch adds the missing dependency.
Fixes: 02cc359372773800de817 "egl/wayland: Use linux-dmabuf interface for buffers"
Reviewed-by: Eric Engestrom <[email protected]>
[Eric: fixed up the commit title]
Signed-off-by: Eric Engestrom <[email protected]>
|
| |
|
|
|
|
| |
for vertex color clamping.
|
| |
|
|
|
|
| |
I think we don't need it on other chips.
|
| |
|
| |
|
|
|
|
| |
Cc: 18.1 <[email protected]>
|
| |
|
|
|
|
| |
Compute shaders were not using the shader cache.
|
|
|
|
| |
This is kinda a hack, but it's enough for the shader cache.
|
| |
|
|
|
|
|
| |
This removes useless s_waitcnt before barriers.
Only radeonsi uses this function.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add support for glsl 'invariant' modifier for output data declarations.
Gallium drivers that use TGSI serialization currently loose invariant
modifiers in glsl shaders.
v2: use boolean for invariant instead of unsigned.
Tested: chromiumos on qemu with virglrenderer.
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Co-authored-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
We can just emit the MOV in the two places where we use this.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
| |
There's no reason for us to emit it a pile of times and then have a
whole pass to clean it up. Just emit it once like we really want.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
This generalizes the unlit centroid workaround so it's less code and now
supports SIMD32.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
v2 (Jason Ekstrand):
- Disallow gl_SampleId in SIMD32 on gen7
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
And handle 32-wide payload register reads in fetch_payload_reg().
v2 (Jason Ekstrand);
- Fix some whitespace and brace placement
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
While we're here, we change to using horiz_offset() instead of abusing
half().
v2 (Jason Ekstrand):
- Use horiz_offset() instead of half()
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The original code manually handled splitting the MOVs to 8-wide to
handle various regioning restrictions. Now that we have a SIMD width
splitting pass that handles these things, we can just emit everything at
the full width and let the SIMD splitting pass handle it. We also now
have a useful "subscript" helper which is designed exactly for the case
where you want to take a W type and read it as a vector of Bs so we may
as well use that too.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On g4x through Sandy Bridge, src1 (the coordinates) of the PLN
instruction is required to be an even register number. When it's odd
(which can happen with SIMD32), we have to emit a LINE+MAC combination
instead. Unfortunately, we can't just fall through to the gen4 case
because the input registers are still set up for PLN which lays out the
four src1 registers differently in SIMD16 than LINE.
v2 (Jason Ekstrand):
- Take advantage of both accumulators and emit LINE LINE MAC MAC
(Based on a patch from Francisco Jerez)
- Unify the gen4 and gen4x-6 cases using a loop
v3 (Jason Ekstrand):
- Don't unify gen4 with gen4x-6 as this turns out to be more fragile
than first thought without reworking the gen4 barycentric coordinate
layout.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we don't have PLN (gen4 and gen11+), we implement LINTERP as either
LINE+MAC or a pair of MADs. In both cases, the accumulator is written
by the first of the two instructions and read by the second. Even
though the accumulator value isn't actually ever used from a logical
instruction perspective, it is trashed so we need to make the scheduler
aware. Otherwise, the scheduler could end up re-ordering instructions
and putting a LINTERP between another an instruction which writes the
accumulator and another which tries to use that result.
Cc: [email protected]
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This reworks INTERPOLATE_AT_PER_SLOT_OFFSET to work more like an ALU
operation and less like a send. This is less code over-all and, as a
side-effect, it now properly handles execution groups and lowering so
SIMD32 support just falls out.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We want consistent behavior in the meaning of the flag_subreg field
between SNB and IVB+.
v2 (Jason Ekstrand):
- Add some extra commentary
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
This prevents a crash in some arb_enhanced_layouts tests that would be
caused by the next commit.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|