| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
PIPE_CAP_SM3 has always been an odd one out of all our caps. While most
other caps are fine-grained and single-purpose, this cap encode several
features in one. And since OpenGL cares more about single features, it'd
be nice to get rid of this one.
As it turns, this is now relatively simple. We only really care about
three features using this cap, and those already got their own caps. So
we can remove it, and make sure all current drivers just give the same
response to all of them.
The only place we *really* care about SM3 is in nine, and there we can
instead just re-construct the information based on the finer-grained
caps. This avoids DX9 semantics from needlessly leaking into all of the
drivers, most of who doesn't care a whole lot about DX9 specifically.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
| |
required by OpenCL
v2: fix setting globalAccess
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Pierre Moreau <[email protected]>
|
|
|
|
|
|
|
| |
required by OpenCL
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Pierre Moreau <[email protected]>
|
|
|
|
|
|
|
| |
required for OpenCL
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Pierre Moreau <[email protected]>
|
|
|
|
|
|
|
| |
required by OpenCL
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Pierre Moreau <[email protected]>
|
|
|
|
|
|
|
| |
right now that's dead code
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Sagar Ghuge <[email protected]>
Suggested-by: Matt Turner <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
Not all drivers support TGSI_OPCODE_DIV, so we should have a cap to be able
to check this.
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
There doesn't seem to be any reason to keep these opcodes around:
* fnot/fxor are not used at all.
* fand/for are only used in lower_alu_to_scalar, but easily replaced
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
That is: the five least significant bits provide the values of
'bits' and 'offset' which is the case for all hardware currently
supported by NIR and using the bfm/bfe instructions.
This patch also changes the lowering of bitfield_insert/extract
using shifts to not use bfm and removes the flag 'lower_bfm'.
Tested-by: Eric Anholt <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
| |
This is pointless in that we won't ever hit those paths in real life,
but coverity complains.
Fixes: f014ae3c7cce ("nouveau: add support for nir")
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main motivation for this change is API ergonomics: most operations
on dynarrays are really on elements, not on bytes, so it's weird to have
grow and resize as the odd operations out.
The secondary motivation is memory safety. Users of the old byte-oriented
functions would often multiply a number of elements with the element size,
which could overflow, and checking for overflow is tedious.
With this change, we only need to implement the overflow checks once.
The checks are cheap: since eltsize is a compile-time constant and the
functions should be inlined, they only add a single comparison and an
unlikely branch.
v2:
- ensure operations are no-op when allocation fails
- in util_dynarray_clone, call resize_bytes with a compile-time constant element size
v3:
- fix iris, lima, panfrost
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We originally had a single lower_fmod option. In commit 2ab2d2e5, Sam
split 32 and 64-bit lowering into separate flags, with the rationale
that some drivers might want different options there. This left 16-bit
unhandled, so Iago added a lower_fmod16 option in commit ca31df6f.
Now that lower_fmod64 is gone (in favor of nir_lower_doubles and
nir_lower_dmod), we re-combine lower_fmod16 and lower_fmod32 into a
single lower_fmod flag again. I'm not aware of any hardware which
need lowering for one bitsize and not the other.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We currently have two duplicate mechanisms for lowering fmod@64.
One is a nir_opt_algebraic rule keyed off of options->lower_fmod64,
and the other is nir_lower_doubles, which offers a full gamut of
fp64 lowering. The latter works slightly better in some corner cases,
so I'm trying to eliminate lower_fmod64 and drop the redundancy.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Neither freedreno nor nv50 expose PIPE_CAP_DOUBLES, so there's no
fmod64 to be lowered.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The difference between imov and fmov has been a constant source of
confusion in NIR for years. No one really knows why we have two or when
to use one vs. the other. The real reason is that they do different
things in the presence of source and destination modifiers. However,
without modifiers (which many back-ends don't have), they are identical.
Now that we've reworked nir_lower_to_source_mods to leave one abs/neg
instruction in place rather than replacing them with imov or fmov
instructions, we don't need two different instructions at all anymore.
Reviewed-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Vasily Khoruzhick <[email protected]>
Acked-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
TGSI's FBFETCH instruction currently only supports reading from a single
render target, but NIR intrinsics can support multiple render targets.
radeonsi can only support fetching from RT 0, but other drivers may be
able to support fetching from any render target.
To express this, this patch renames PIPE_CAP_TGSI_FS_FBFETCH to simply
PIPE_CAP_FBFETCH, and converts it from a boolean "is FBFETCH supported?"
to an integer number of render targets which can be fetched.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Pierre Moreau <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Karol Herbst <[email protected]>
Suggested-by: Ilia Mirkin <[email protected]>
Reviewed-by: Pierre Moreau <[email protected]>
|
|
|
|
|
|
|
|
| |
The _LEVELS assumes that the max is always power of two. For V3D 4.2, we
can support up to 7680 non-power-of-two MSAA textures, which will let X11
support dual 4k displays on newer hardware.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This can be used by both etnaviv and freedreno/a2xx as they are both vec4
architectures with some instructions being scalar-only.
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Driver which do not support native integers should use a lowering
pass to go from integers to floats.
Signed-off-by: Christian Gmeiner <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
One special case, `src/util/xmlpool/.gitignore` is not entirely deleted,
as `xmlpool.pot` still gets generated (eg. by `ninja xmlpool-pot`).
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
|
|
| |
It's legal for a buffer-object to have a NULL-resource, but let's just
skip over it, as there's nothing to do.
Signed-off-by: Erik Faye-Lund <[email protected]>
Acked-by: Karol Herbst <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Acked-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Acked-by: Marek Olšák <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
v3: rebase
v3: make use of u_pipe_screen_get_param_defaults
Signed-off-by: Rhys Perry <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2: remove & operator in a couple of memsets
add some memsets
v3: fixup lima
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]> (v2)
|
|
|
|
| |
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
| |
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
| |
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
| |
This required to calculate sizes correctly when we have bindless
samplers/images.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
to indicate write usage per buffer.
This is just a hint (it will be used by radeonsi).
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
v2: only emit offsets if those are !0
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
I added new barrier bits in 220c1dce1e3194ea867e6d948fc7ff5b9ef2d3a7
and made most drivers skip them. I thought nvc0 was already skipping
those but missed the else case here, which does something. So make it
explicitly skip like I did everywhere else.
Thanks to Ilia for catching this.
Fixes: 220c1dce1e3 gallium: Add PIPE_BARRIER_UPDATE_BUFFER and UPDATE_TEXTURE bits.
|
|
|
|
|
|
|
|
| |
Add the necessary build rules for android, to avoid building errors.
Fixes: f014ae3 ("nouveau: add support for nir")
Signed-off-by: Mauro Rossi <[email protected]>
Reviewed-by: Karol Herbst <[email protected]>
|
|
|
|
| |
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
| |
v9: convert to C++ style comments
handle for tess eval shaders as well
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
| |
v9: mark as fixed
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v4: use smarter getIndirect helper
use new getSlotAddress helper
v5: use loadFrom helper
v8: don't require C++11 features
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
| |
v5: add more barrier intrinsics
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v3: fix compiler warnings
v4: use loadFrom helper
v5: fix signed min/max
v6: set tex mask
add support for indirect image access
set cache mode
v7: make compatible with 884d27bcf688d36c3bbe01bceca525595add3b33
rework the whole deref thing to prepare for bindless
v8: port to deref instructions
don't require C++11 features
v9: implement MS images
rebase on master (image modifiers)
fix regressions due to variable src compnents
replace '(*it).' with 'it->'
convert to C++ style comments
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
| |
v4: use loadFrom helper
v5: support indirect buffer access
v8: don't require C++11 features
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
| |
v4: use loadFrom helper
v8: don't require C++11 features
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v4: use smarter getIndirect helper
use new getSlotAddress helper
use loadFrom helper
v8: don't require C++11 features
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We store those arrays in local memory and reserve some space for each of the
arrays. With NIR we could store those arrays packed, but we don't do that yet
as it causes MemoryOpt to generate unaligned memory accesses.
v3: use fixed size vec4 arrays until we fix MemoryOpt
v4: fix for 64 bit types
v5: use loadFrom helper
v8: don't require C++11 features
v9: convert to C++ style comments
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: add vote_eq support
use the new subop intrinsic helper
add ballot
v3: add read_(first_)invocation
v8: handle vectorized intrinsics
don't require C++11 features
v9: lower_subgroups to 32 bit (produces less instructions)
use getSSA and getScratch instead of new_LValue
Signed-off-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
| |
v7: don't assert in default case for getSubOp
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Pierre Moreau <[email protected]>
|