| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
Normal SFU writes couldn't have SF because they were marked as
multi_instruction, but tex_result and tlb_color_read weren't. This ended
up not being a problem according to anything in shader-db, but it seems
possible.
|
|
|
|
|
|
|
|
|
|
|
| |
There used to be multi-instruction operations that would use src[] twice,
which is why we couldn't do some optimizations on them. This is no longer
the case.
total instructions in shared programs: 77973 -> 77969 (-0.01%)
instructions in affected programs: 84 -> 80 (-4.76%)
total estimated cycles in shared programs: 234165 -> 234157 (-0.00%)
estimated cycles in affected programs: 92 -> 84 (-8.70%)
|
|
|
|
| |
This gets us better validation of our NIR transformations.
|
|
|
|
|
|
|
|
| |
I liked having all my NIR be scalar, but nir_validate() complains that the
intrinsic writes 4 components but the destination we set up was only 1
component. I could generate a new scalar variant, but it's a lot easier
to just leave it as a vec4. This doesn't hurt codegen since we GC unused
uniforms, and UCP dot products use all the components anyway.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We don't really suppor control flow yet, but it's a lot nicer to render
something and warn on stderr than to crash.
Fixes the following piglit tests:
- shaders/complex-loop-analysis-bug
- shaders/glsl-fs-discard-04
Converts the following piglit tests from crash to fail:
- shaders/glsl-fs-continue-inside-do-while
- shaders/glsl-fs-loop
- shaders/glsl-fs-loop-continue
- shaders/glsl-fs-loop-nested
- shaders/glsl-texcoord-array
- shaders/glsl-vs-continue-inside-do-while
- shaders/glsl-vs-loop
- shaders/glsl-vs-loop-continue
- shaders/glsl-vs-loop-nested
No piglit regressions.
v2 (Eric): Add stronger stderr warning.
Signed-off-by: Rhys Kidd <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
We shouldn't have any NIR functions present since all GLSL functions get
inlined, but this would be a more informative error if it does happen.
Signed-off-by: Rhys Kidd <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ensure NIR control flow graph nodes that are unhandled in QIR
are reported with sufficient verbosity to aid debugging.
This improves piglit outputs, amongst other tools.
There are no other remaining uses of assert(0) as a blunt tool
within vc4.
Signed-off-by: Rhys Kidd <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Found with grep and inspection. Test compiled on RPi hw.
Assists any future effort to remove TGSI as an intermediate stage.
Signed-off-by: Rhys Kidd <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: change check_explicit_uniform_locations() to return an
unsigned 0 (Timothy Arceri)
We were storing the int result of check_explicit_uniform_locations()
in num_explicit_uniform_locs as an unsigned int which caused it to
be 4294967295 when a -1 was returned.
This in turn would cause the following error during linking:
error: count of uniform locations > MAX_UNIFORM_LOCATIONS(4294967295 > 98304)
Results from running piglit tests/all with this patch
and when ARB_explicit_uniform_location disabled:
changes: 178
fixes: 176
regressions: 2
The two regressions are for the following tests:
glean@glsl1-matrix column check (1)
glean@glsl1-matrix column check (2)
which regress from FAIL to CRASH.
The regressions are acceptable because the tests are currently failing due to
the aforementioned linker error.
Signed-off-by: Lars Hamre <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Now that we can use the much simpler rgba8_copy function, we don't need to
hand different functions out based on direction.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This splits the two copy functions into three: One for unaligned copies,
one for aligned sources, and one for aligned destinations. Thanks to the
previous commit, we are now guaranteed that the aligned ones will *only*
operate on aligned memory so they should be safe.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93962
Cc: "11.1 11.2" <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Each of the [de]tiling functions has three mem_copy calls:
1) Left edge to tile boundary
2) Tile boundary to tile boundary in a loop
3) Tile boundary to right edge
Copies 2 and 3 start at a tile edge so the pointer to tiled memory is
guaranteed to be at least 16-byte aligned. Copy 1, on the other hand,
starts at some arbitrary place in the tile so it doesn't have any such
alignment guarantees.
Cc: "11.1 11.2" <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
| |
Now that the check is restricted to gen8+, we should always get back a non-zero
positive value for the EU and subslice counts.
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Older gen platforms do not actually return a value for sublice and eu total
(IMO, confusingly) they return -ENODEV. This patch defers the SSEU setup until
we have the actual GPU generation to avoid useless warnings when running on
older platforms with older kernels.
Reported-by: Mark Janes <[email protected]>
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the first call in a GL app is glReadPixels(GL_FRONT) we'd fail the
assert(st->ctx->FragmentProgram._Current) at st_atom_shader.c:114 in
update_fp().
This is because we were calling st_validate_state() without first
updating Mesa state with _mesa_update_state().
The regression came from commit 83b589301f4a150f4 "st/mesa: fix
frontbuffer glReadPixels regressions".
The new piglit gl-1.0-simple-readbuffer test exercises this.
Cc: "11.1 11.2" <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
| |
Cc: "11.1 11.2" <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Tested-by: Julien Isorce <[email protected]>
|
|
|
|
|
|
|
|
|
| |
In other words, vport scissors are derived from viewport states.
If the scissor test is enabled, the intersection of both is used.
The guard band will disable clipping, so we have to clip per-pixel.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This might result in an INVALID_OPCODE dmesg error in case a join is
attached to an atomic operation.
Spotted with arb_shader_image_load_store-host-mem-barrier on GK104.
Signed-off-by: Samuel Pitoiset <[email protected]>
Acked-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94835
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This is in preparation of raising the number of exposed sampler views to 32
bits, which will raise the total number of sampler views to 33 for the
polygon stipple texture. That texture should never be compressed (and it's
certainly not a depth texture), but this approach seems cleaner to me than
special-casing the last slot in all affected code paths.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous value of 18 was motivated by having drivers that want to expose
16 samplers but also use some additional samplers for internal use. Raising
the value even higher isn't going to hurt that case.
On the other hand, some drivers actually use PIPE_MAX_SAMPLERS as the number
of samplers they expose externally, so raising this number above 32 is fragile
(because several places in the code use bitfields, and tracking down and
widening all of them is prone to miss some case).
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is used as a bitfield, so it seems cleaner to keep it unsigned.
The literal 1 is a (signed) int, and shifting into the sign bit is undefined
in C, so change occurences of 1 to 1u.
v2: add an assert for bitfield size and use 1u << idx
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]> (v1)
Reviewed-by: Marek Olšák <[email protected]> (v1)
|
|
|
|
|
|
|
| |
The literal 1 is a (signed) int, and shifting into the sign bit is undefined
in C, so change occurences of 1 to 1u.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Line anti-aliasing will fail when there is no free sampler available. Make
the corresponding guard more robust in preparation of raising
PIPE_MAX_SAMPLERS to 32.
The literal 1 is a (signed) int, and shifting into the sign bit is undefined
in C, so change occurences of 1 to 1u.
v2: add an assert for bitfield size and use 1u << idx
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]> (v1)
Reviewed-by: Bas Nieuwenhuizen <[email protected]> (v1)
Reviewed-by: Marek Olšák <[email protected]> (v1)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When hasFixedUnit is false, polygon stippling will fail when there is no free
sampler available. Make the corresponding guard more robust in preparation
of raising PIPE_MAX_SAMPLERS to 32.
The literal 1 is a (signed) int, and shifting into the sign bit is undefined
in C, so change occurences of 1 to 1u.
v2: add an assert for bitfield size and use 1u << idx
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]> (v1)
Reviewed-by: Bas Nieuwenhuizen <[email protected]> (v1)
Reviewed-by: Marek Olšák <[email protected]> (v1)
|
|
|
|
|
|
| |
On by default.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
These small mallocs will probably never fail, but static analysis tools
may complain about the missing checks.
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
This reverts commit 0daab9878d2b96356cf667591a2c877d912be52d.
The corresponding clang change was reverted.
Trivial.
|
|
|
|
| |
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
This is just a cleanup of the code.
Acked-by: Tom Stellard <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This moves things around so that the global buffer handling
functions in evergreen_compute.c are static.
Acked-by: Tom Stellard <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
These aren't used outside evergreen_compute.c
Acked-by: Tom Stellard <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
No reason to pull the pieces apart here, also make
one of the functions static as it's unused outside this.
Acked-by: Tom Stellard <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
Acked-by: Tom Stellard <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
Another step towards cleaning this up.
Acked-by: Tom Stellard <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This aligns the code with the style of the rest of the driver.
Makes editing it a lot less painful.
Acked-by: Tom Stellard <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
Lets give the developer a little hand if we are going to assert
on a zero literal at the end of a branch.
Signed-off-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
For ARB_framebuffer_no_attachment; A is_format_supported() query
with 'PIPE_FORMAT_NONE' passed implies a query of the number of
samples supported from the framebuffer with no attachment.
Signed-off-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Carries across the number of samples and layers state in the
'softpipe_set_framebuffer_state()' callback. This state is
part of 'ARB_framebuffer_no_attachments' support.
Signed-off-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
Handle the case of ARB_framebuffer_no_attachment.
Also, kill off a dead debug printf() call while we are here.
Signed-off-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Here we store the number of samples and layers directly in the
pipe_framebuffer_state so that in the case of
ARB_framebuffer_no_attachment we may make use of them directly.
Further, we adjust various gallium/auxiliary helper functions
accordingly.
V2:
Convert branches in util_framebuffer_get_num_layers() and
util_framebuffer_get_num_samples() to their canonical form.
V3:
'git stash pop' the typo fix of 'cbufs' which should be
'nr_cbufs' that was missing in V2, woops! Thanks Marek for
pointing this out yet again.
V4:
Squash in the following patch:
'gallium/util: Ensure util_framebuffer_get_num_samples() is valid'
Upon context creation, internal driver structures are malloc()'ed
and memset() to zero them. This results in a invalid number of
samples 'by default'. Handle this in the simplest way to avoid
elaborate and probably equally sub-optimial solutions.
Signed-off-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|