| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
vec4 scalarized ALUs typically have 4 equal instruction headers, so remove
the last 3.
There are no bits left in the ALU header for more flags, so future
extensions of NIR will have to use something like instr_type == 15
to describe more complex ALU instructions.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
It can be derived from src and var. This frees 10 bits in the header
that will be used later.
"mode" is moved in the structure, because those bits will be used for
something else later.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
| |
- type_cast: deduplicate types if the last one is the same
- derive the type from the parent for other derefs
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The majority of constants can be packed like this.
v2: - use enum for the packing encoding,
- trim packed_value to 20 bits add 1 bit to last_component,
which simplifies a later commit
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
| |
v2: use blob_write_uint8/16
Reviewed-by: Jason Ekstrand <[email protected]> (v1)
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
| |
We need to limit the object count to 1M to free 10 bits for the src
modifiers.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
All of the tables are static const, so they only need to be validated
once. As noted in the previous commit, the compiler should be able to
eliminate all of this code when the assertions would pass. Even with
the help of the previous commit, this does not always occur.
-Og: -95.688 +/- 3.91935 (-24.9562% +/- 1.0222%) N=5
-O1: No difference proven at 95.0% confidence. N=5
-O2: -1.962 +/- 0.85001 (-0.860013% +/- 0.372589%) N=5
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I was pretty liberal with these assertions when I wrote this code
because I had assumed that GCC would unroll the loops, inline the look ups
of static const arrays with now constant indices, and then elmininate
all the actuall assertions. It seems none of this happens even at -O3.
Adding the pragmas helps encourage loop unrolling at some optimization
levels. I tested by running shader-db with NIR_VALIDATE=false on a Core
i7 Haswell desktop system.
-Og: No difference proven at 95.0% confidence. N=5
-O1: -48.304 +/- 1.221 (-16.3343% +/- 0.412888%) N=5
-O2: -49.94 +/- 1.23521 (-17.9634% +/- 0.444303%) N=5
v2: Add a _Pragma to an inner loop that was accidentally dropped during
a rebase.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Varyings are similar to already handled cases. And "glsl_zero_init"
name of the workaround already looks like it should include varyings.
The issue was observed in GiMark subtest from GpuTest.
Signed-off-by: Danylo Piliaiev <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This loads in the <min_lod, max_lod, lod_bias> settings for a given
sampler, which is necessary for lowering clamps/biases on certain
Midgard chips.
Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Tomeu Vizoso <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
| |
Serializing stripped NIR is faster now.
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
| |
Use the correct bit size
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
| |
This is needed for OpenCL
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
This adds the option to lower 64-bit ufind_msb opcodes.
v2: use split_x/y removes component loops (Jason)
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
This adds support for some missing 24-bit and hi multiply
variants.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
This needs to be derived from the address format, not always 1/32.
Suggested by Jason
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
| |
cs is a union so storing this there is wrong.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was apparently missed in 67b32190f3c95, which added support
for ARB_shading_language_include to #line, including the 'path'
field for the location.
Fixes crashes in CTS with all drivers as they attempt to access
an uninitialized path string during parsing.
Fixes: 67b32190f3c95 ("glsl: add ARB_shading_language_include support to #line")
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2132
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Jose Maria Casanova <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This should fix android build issues while still allowing scons to
build the standalone compiler.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2129
Reviewed-by: Mark Janes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
also make 8 and 16 compoments invalid. We will enable that later again
when we actually support it.
v2: fix validation of nir_intrinsic_instr::num_components
correct validation of instr->num_components
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This will be useful as a deterministic identifier/index for the variable.
v2: fix comment style
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Connor Abbott <[email protected]> (v1)
|
|
|
|
|
|
|
| |
Doesn't shrink it (at least, on x86-64) and leaves space for more members.
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Rob Clark <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
Adds nir_type_bool8 as well as 8-bit versions of all the bool
opcodes.
Reviewed-by: Rob Clark <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
| |
Adds nir_type_bool16 as well as 16-bit versions of all the bool
opcodes.
Reviewed-by: Rob Clark <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Adds binop_reduce_all_sizes which generates both 1-bit and 32-bit
versions of the reduce operation. This reduces the code duplication a
bit and will make it easier to later add 16-bit versions as well.
Reviewed-by: Rob Clark <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Adds binop_compare_all_sizes which generates both 1-bit and 32-bit
versions of the comparison operation. This reduces the code
duplication a bit and will make it easier to later add 16-bit versions
as well.
Reviewed-by: Rob Clark <[email protected]>
Acked-by: Alyssa Rosenzweig <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This will allow us to continue searching the current path for
relative shader includes.
From the ARB_shading_language_include spec:
"If it is quoted with double quotes in a previously included
string, then the first search point will be the tree location
where the previously included string had been found."
Reviewed-by: Witold Baryluk <[email protected]>
|
|
|
|
|
|
|
|
| |
If the shader contains an include when need to first run the
preprocessor before deciding if we can skip compilation based
on the shader cache.
Reviewed-by: Witold Baryluk <[email protected]>
|
|
|
|
|
|
| |
We will reuse this in the following commit.
Reviewed-by: Witold Baryluk <[email protected]>
|
|
|
|
|
|
| |
In other words make sure the shader does this:
Reviewed-by: Witold Baryluk <[email protected]>
|
|
|
|
| |
Reviewed-by: Witold Baryluk <[email protected]>
|
|
|
|
|
|
| |
This is a small tidy up and will be useful in the following commit.
Reviewed-by: Witold Baryluk <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the ARB_shading_language_include spec:
"#line must have, after macro substitution, one of the following
forms:
#line <line>
#line <line> <source-string-number>
#line <line> "<path>"
where <line> and <source-string-number> are constant integer
expressions and <path> is a valid string for a path supplied in the
#include directive. After processing this directive (including its
new-line), the implementation will behave as if it is compiling at
line number <line> and source string number <source-string-number>
or <path> path. Subsequent source strings will be numbered
sequentially, until another #line directive overrides that
numbering."
Reviewed-by: Witold Baryluk <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
Reviewed-by: Witold Baryluk <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
| |
4 bytes down
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|