| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Now that we're about to start generating control flow in our NIR, we want
this in place. It optimizes things frequently in the CS, when the GL VS
has control flow that doesn't affect the vertex position.
|
|
|
|
|
|
|
|
|
|
|
| |
Tiny change on shader-db currently, but it will be important when we start
emitting a lot of SFs from the same variable as part of control flow
support.
total instructions in shared programs: 89463 -> 89430 (-0.04%)
instructions in affected programs: 1522 -> 1489 (-2.17%)
total estimated cycles in shared programs: 250060 -> 250015 (-0.02%)
estimated cycles in affected programs: 8568 -> 8523 (-0.53%)
|
|
|
|
|
|
|
|
|
| |
The DCE pass is going to change significantly to handle control flow,
while we don't really need to change it for the SF handling. We also need
to add some more SF peephole optimization for SF updates generated by
control flow support.
No change on shader-db.
|
|
|
|
|
|
|
|
| |
I'm going to add an optimization for redundant SF update removal, which
will just remove the SF and leave us (in many cases) with an instruction
with a NULL destination and no side effects. Rather than teaching that
pass whether the whole instruction can be removed, leave that
responsibility to this pass.
|
|
|
|
|
|
|
| |
We need to not DCE them even though they don't have a destination in QIR.
We also shouldn't relocate them in vc4_opt_vpm. Neither of these things
happen, but I'm about to make DCE consider instructions with a NULL
destination.
|
|
|
|
|
|
|
| |
Noticed by code inspection. This hasn't been too big of a deal, because
our cond usages all start out as adder ops, either MOVs or the FTOI for Z
writes. MOVs *can* get converted to mul ops during scheduling, but
apparently we hadn't hit this.
|
|
|
|
|
| |
This isn't used since we switched to using the dst.pack field instead of
custom instructions.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
D3D9 has a different behaviour for depth bias.
For OGL/D3D1X, the depth bias unit is the
minimal resolvable value for the depth buffer,
which depends on the format (and has different
behaviour for float depth buffers).
For D3D9, the depth bias unit is 1.0f.
Signed-off-by: Axel Davy <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clean up misrepetitions ('if if', 'the the' etc) found throughout the
comments. This has been done manually, after grepping
case-insensitively for duplicate if, is, the, then, do, for, an,
plus a few other typos corrected in fly-by
v2:
* proper commit message and non-joke title;
* replace two 'as is' followed by 'is' to 'as-is'.
v3:
* 'a integer' => 'an integer' and similar (originally spotted by
Jason Ekstrand, I fixed a few other similar ones while at it)
Signed-off-by: Giuseppe Bilotta <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
This says how many window rectangles are supported by the
implementation, although it may not exceed PIPE_MAX_WINDOW_RECTANGLES.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The expected stride calculation is completely wrong. It should
ultimately be multiplying cpp and width rather than dividing. The width
also needs to be aligned to the tiling width first before converting to
stride bytes.
The whole stride check here is possibly pointless. Any buffers which
were allocated outside of vc4 may have strides with larger alignment
requirements.
Signed-off-by: Rob Herring <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix warnings like these due to HAVE_LIBDRM being inconsistently defined:
external/libdrm/include/drm/drm.h:839:30: warning: redefinition of typedef 'drm_clip_rect_t' is a C11 feature [-Wtypedef-redefinition]
typedef struct drm_clip_rect drm_clip_rect_t;
HAVE_LIBDRM needs to be set project wide to fix this. This change also
harmlessly links libdrm with everything, but simplifies the makefiles a
bit.
Signed-off-by: Rob Herring <[email protected]>
Acked-by: Emil Velikov <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
Introduced in 8e2d0843c02daf5280184f179ae8ed440ac90d7f.
Signed-off-by: Rhys Kidd <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Now that vc4 automated code documentation can be generated with
doxygen, fix the warnings issued by Doxygen 1.8.11.
Signed-off-by: Rhys Kidd <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Push offset down to drivers when importing dmabuf. This is needed
to more fully support EGL_EXT_image_dma_buf_import when a non-zero
offset is specified.
Tesing has been done for freedreno, and compile tested following
gallium drivers:
nouveau,svga,virgl,r600,r300,radeonsi,swrast,i915,ilo
Signed-off-by: Stanimir Varbanov <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some hardware supports primitive restart on patch primitives, and other
hardware does not. Modern GL and ES include a query for this feature;
adding a capability bit will allow us to answer it.
As far as I know, AMD hardware does not support this feature, while
NVIDIA and Intel hardware does. However, most Gallium drivers do not
appear to support tessellation shaders yet. So, I've enabled it for
nvc0 and disabled it everywhere else.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We don't really support reading/writing of 3D textures since the hardware
doesn't do 3D, but we do need to make sure that a pipe_transfer for them
has enough space to store the image. This was previously not a problem
because the state tracker only mapped a slice at a time until
fb9fe352ea41c7e3633ba2c483c59b73c529845b. Fixes glean glsl1 tests, which
all have setup of a 3D texture at the start.
|
|
|
|
|
| |
This gets us precompile of vertex shaders at the state tracker level as
well.
|
|
|
|
|
| |
Now we have an immutable nir shader in our shader's CSO that we can clone
and lower/optimize.
|
|
|
|
|
|
|
| |
We always clamp fragment colors, since they're always 8-bit unorm, so
there's no need to have us compile separate shaders based on
GL_ARB_color_buffer_float. This gives us precompilation of fragment
programs to the vc4_shader_state_create() level.
|
|
|
|
|
|
| |
This allows the same pipe_shader_state to be referenced from multiple
contexts. Since our pipe_shader_state is treated as immutable (other than
the variant number) within the driver, this is no problem.
|
|
|
|
|
|
|
|
| |
This will be generated by glsl_to_nir, and it turns out that this is a
more code-efficient path than the floating point math, anyway.
No change on shader-db, but drops an instruction in piglit's
glsl-fs-frontfacing.
|
|
|
|
| |
This came from deriving from freedreno.
|
|
|
|
|
| |
This is apparently enabled as an error in Android builds, and the compiler
can't tell that the return value is safe.
|
|
|
|
|
|
|
|
|
| |
This lets us safely enable or disable the extension as needed
Signed-off-by: Tobias Klausmann <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This will be used for resetting the uniform stream in the presence of
branching, but may also be useful as an optimization to reduce how many
uniforms we have to copy out per draw call (in exchange for increasing
icache pressure).
|
|
|
|
|
|
| |
Seeing the expansion of a QPU_GET_FIELD in an assert isn't very
informative, and it's hard find what's going wrong without getting a dump
of the instruction that failed.
|
|
|
|
|
| |
This has caught a couple of bugs during loop development so far, and I
should probably have written it long ago.
|
|
|
|
| |
Found by the upcoming QIR validate pass.
|
|
|
|
| |
In the process, this made me flatten out the "%s%s%s%s" fprintf arguments.
|
|
|
|
| |
Prevents a bug in the later control-flow support series.
|
|
|
|
|
|
|
|
| |
We should have already emitted a NOP due to the last instruction being a
TLB or VPM write. However, if you disable dead code elimination then you
might get dead code at the end, and that dead code might have the signal
bits set to something non-default, at which point you die in assertion
failure.
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
This should get us the same decode code generated, but with a lot less
custom code in the driver.
|
|
|
|
|
| |
This means doing Newton-Raphson on the RCP, but it's probably actually a
good thing to be accurate on.
|
|
|
|
|
| |
This makes fewer programs with loops assertion fail, replacing them with
the rendering failure warning.
|
|
|
|
|
| |
In particular it's been hard to find the point where we switch from
dumping pre-optimization QIR and post-optimization QIR.
|
|
|
|
|
|
| |
It's not doing anything according to shader-db now that we're using NIR.
It would have had to be reworked significantly anyway, to handle control
flow.
|
|
|
|
|
|
| |
We were generating piles of FRAG_W for interpolation, only to CSE them
away immediately. Since this is the only thing that CSE is doing for us
any more, just avoid making the CSE work necessary.
|
|
|
|
|
|
|
| |
This is one of two uses of the current QIR CSE pass according to
shader-db. The NIR pass means that we'll end up doing Newton-Raphson on
our RCP, which we weren't doing before, but that's probably actually a
good thing.
|
|
|
|
|
| |
Now that we're using NIR for our optimization, there's no need for this
tricky code.
|
|
|
|
|
|
|
|
|
| |
This matches the "foreach x in container" pattern found in many other
programming languages. Generated by the following regular expression:
s/nir_foreach_function(\([^,]*\),\s*\([^,]*\))/nir_foreach_function(\2, \1)/
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This matches the "foreach x in container" pattern found in many other
programming languages. Generated by the following regular expression:
s/nir_foreach_instr(\([^,]*\),\s*\([^,]*\))/nir_foreach_instr(\2, \1)/
and similar expressions for nir_foreach_instr_safe etc.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
A later patch will add lower_flrp64 option to NIR.
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
Part of fixing piglit EXT_framebuffer_multisample/sample-coverage inverted
(there is also a bug with RCL tiled blits)
Cc: "11.1 11.2" <[email protected]>
|
|
|
|
|
|
| |
There's no reason we couldn't do non-MSAA full resolution tile buffer
load/stores, but we would have claimed buffer overflow was being
attempted. Nothing does this currently.
|
|
|
|
|
| |
I noticed this as a problem with ET:QW traces emitting coverage code when
the framebuffer was supposed to be single sampled.
|
|
|
|
|
|
|
|
|
|
| |
This was a bug from the MSAA enabling. Tests for surfaces with
nr_samples==1 instead of 0 (generally GL renderbuffers) would incorrectly
fail out.
Fixes the ARB_framebuffer_sRGB piglit tests other than srgb_conformance.
Cc: "11.1 11.2" <[email protected]>
|