| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
Are we missing 64-bit support?
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
This introduces random GPU hangs on Vega, at least.
This reverts commit 02a43edf186cb9998741ba765cb948bb238a122d.
|
|
|
|
|
|
|
|
|
|
|
| |
The size has to be multiplied by the number of sets.
This gets rid of the OUT_OF_POOL_KHR error and fixes
a crash with the Tangrams demo.
CC: 18.1 18.2 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
Instead of passing around BOs and offsets, use addresses which are anv's
GPU equivalent of pointers.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
Each query slot is a uint64_t and we were only zeroing half of it.
Fixes: 7ec6e4e68980 "anv/query: implement multiview interactions"
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
Instead of computing an index at the end which we hope maps to the
number of things written, just count the number of things as we go.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
No shader-db changes on any Intel platform... which probably explains
why no bugs have been bisected to this problem since it landed in Mesa
18.1. :( The commit mentioned below is in 18.2, so 18.1 would need a
slightly different fix (due to code refactoring).
Signed-off-by: Ian Romanick <[email protected]>
Fixes: 77f269bb560 "i965/fs: Refactor propagation of conditional modifiers from compares to adds"
Reviewed-by: Alejandro Piñeiro <[email protected]> (reviewed the original patch)
Cc: Matt Turner <[email protected]> (reviewed the original patch)
|
|
|
|
|
|
|
|
|
|
| |
Apparently for compute there are only 16 instead of the 32 for the
graphics path.
Fixes dEQP-VK.binding_model.descriptorset_random.sets16.noarray.ubolimitlow.sbolimitlow.imglimitlow.noiub.comp.0
CC: <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
| |
Otherwise using 32 user SGPRs would be broken.
CC: <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
This makes it cheaper to just change the dynamic offsets with
the same descriptor sets.
Suggested-by: Philip Rebohle <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Fixes:
dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_3
dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_3
dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_3
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes the following building error in vc4 build:
In file included from external/mesa/src/gallium/drivers/vc4/kernel/vc4_render_cl.c:34:
In file included from external/mesa/src/gallium/drivers/vc4/kernel/vc4_drv.h:27:
In file included from external/mesa/src/gallium/drivers/vc4/vc4_simulator_validate.h:34:
In file included from external/mesa/src/gallium/drivers/vc4/vc4_context.h:39:
In file included from external/mesa/src/gallium/drivers/vc4/vc4_cl.h:56:
gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h:12:10:
fatal error: 'cle/v3d_packet_helpers.h' file not found
^~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
Fixes: 5b102160ae ("broadcom/genxml: Introduce a V3D packet/struct decoder.")
Cc: "18.2" <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Signed-off-by: Mauro Rossi <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes the following building error:
In file included from external/mesa/src/broadcom/cle/v3d_decoder.c:38:
In file included from external/mesa/src/broadcom/cle/v3d_packet_helpers.h:29:
external/mesa/src/gallium/auxiliary/util/u_math.h:42:10:
fatal error: 'pipe/p_compiler.h' file not found
^~~~~~~~~~~~~~~~~~~
1 error generated.
Fixes: 5b102160ae ("broadcom/genxml: Introduce a V3D packet/struct decoder.")
Cc: "18.2" <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Signed-off-by: Mauro Rossi <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes the following building error, happening when building both intel and broadcom:
Gen Header: libmesa_broadcom_genxml_32 <= v3d_packet_v21_pack.h
FAILED: gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h
/bin/bash -c "python external/mesa/src/broadcom/cle/gen_pack_header.py \
external/mesa/src/broadcom/cle/v3d_packet_v21.xml \
> gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h"
Traceback (most recent call last):
File "external/mesa/src/broadcom/cle/gen_pack_header.py", line 626, in <module>
p = Parser(sys.argv[2])
IndexError: list index out of range
header-gen macro is already defined by Intel genxml building rules
and the existing header-gen does not have the $(PRIVATE_VER) argument,
infact the bash command line logged in the building error is missing
exactly $(PRIVATE_VER) argument
Renaming the macro as pack-header-gen in src/broadcom/Android.genxml.mk
solves the building error, another possible way is to keep the gen rules
commands expanded and not use the macros.
Fixes: 7f80a9ff13 ("vc4: Introduce XML-based packet header generation like Intel's.")
Cc: "18.2" <[email protected]>
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Signed-off-by: Mauro Rossi <[email protected]>
|
|
|
|
|
|
|
|
| |
The offsets now come from the anv_address, these references were not
updated and using the old variable.
Fixes: e1ab8345574 "anv/memcpy: Use addresses instead of bo+offset"
Tested-by: Clayton Craft <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
| |
This shouldn't matter as we'll never write OOB anyway but we may as well
get it right. It's supposed to be in dwords - 1.
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Because this was setting image to true we would end up calling
si_load_image_desc() when we sould be calling
si_load_sampler_desc().
This fixes an assert() in Deus Ex: MD
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
v2: corrected the comment
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When using Freecad, I was getting intermittent segfaults inside of
mesa. I traced it down to this path in st_cb_drawpixels.c where the
result of pipe_transfer_map wasn't being checked. In my case, it was
returning NULL because nouveau_bo_new returned ENOENT. I'm by no
means a mesa developer, but this patch solves the problem for me and
seems reasonable enough.
v2: Marek - also unmap the PBO and release the texture, and call
the make_texture function sooner for less cleanup
Cc: 18.1 18.2 <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It shouldn't be needed to emit the initial graphics or compute
state when beginning a new command buffer. Emitting them in
the preamble should be enough and this will reduce IB sizes.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Indirect descriptors only need one entry, we don't have to
emit a location for every descriptors.
Fixes GPU hangs with new CTS:
dEQP-VK.binding_model.descriptorset_random.*
CC: 18.2 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Let say, we first bind a graphics pipeline that needs indirect
descriptors sets. The userdata pointers will be emitted at draw
time. Then if we bind a compute pipeline that doesn't need any
indirect descriptors, the driver will re-emit them for all
grpahics stages.
To avoid this to happen, just check the bind point type.
CC: 18.2 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
LLVM 6 isn't affected.
Fixes GPU hangs with new CTS:
dEQP-VK.binding_model.descriptorset_random.*
CC: 18.2 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was wrong for descriptor #0 when all of them are indirect.
This is because indirect_offset was 0 and we emitted a
"normal" descriptor pointer for nothing.
While we are at it remove
radv_userdata_info::indirect_offset which is useless.
CC: 18.2 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Bumping to 64 should be safe enough.
Fixes some crashes with new CTS:
dEQP-VK.binding_model.descriptorset_random.*
CC: 18.2 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
According to RadeonSI, it's unnecessary to multiply by
the stride. That field seems to always be 64.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
That removes two special cases for clip/cull distances.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Instead of having holes. The other ring parameters like
offset and stride can be updated later.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
This is just for consistency because LLVM can detect and
remove unused loads.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
It's actually just the opposite.
This fixes the new Sascha conditionalrender demo.
CC: 18.2 <[email protected]>
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
Same code is generated because LLVM ends up by using bfe, but
that seems cleaner to me.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In GLSL IR we cheat with switch statements and simply convert them
into loops with a single iteration. This allowed us to make use of
the existing jump instruction handling provided by the loop handing
code, it also allows dead code to be cleaned up once we have
wrapped the code in a loop.
However using loops in this way created previously unrollable loops
which limits further optimisations. Here we provide a way to unroll
loops that end in a break and have multiple other exits.
All shader-db changes are from the dolphin uber shaders. There is a
small amount of HURT shaders but in general the improvements far
exceed the HURT.
shader-db results IVB:
total instructions in shared programs: 10018187 -> 10016468 (-0.02%)
instructions in affected programs: 104080 -> 102361 (-1.65%)
helped: 36
HURT: 15
total cycles in shared programs: 220065064 -> 154529655 (-29.78%)
cycles in affected programs: 126063017 -> 60527608 (-51.99%)
helped: 51
HURT: 0
total loops in shared programs: 2515 -> 2308 (-8.23%)
loops in affected programs: 903 -> 696 (-22.92%)
helped: 51
HURT: 0
total spills in shared programs: 4370 -> 4124 (-5.63%)
spills in affected programs: 1397 -> 1151 (-17.61%)
helped: 9
HURT: 12
total fills in shared programs: 4581 -> 4419 (-3.54%)
fills in affected programs: 2201 -> 2039 (-7.36%)
helped: 9
HURT: 15
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2:
- only allow nir_op_inot or nir_op_b2i when alu input is 1.
- use some helpers as suggested by Jason.
v3:
- evaluate alu op for single input alu ops
- add helper function to decide if to propagate through alu
- make use of nir_before_src in another spot
shader-db IVB results:
total instructions in shared programs: 9993483 -> 9993472 (-0.00%)
instructions in affected programs: 1300 -> 1289 (-0.85%)
helped: 11
HURT: 0
total cycles in shared programs: 219476091 -> 219476059 (-0.00%)
cycles in affected programs: 7675 -> 7643 (-0.42%)
helped: 10
HURT: 1
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since we know what side of the branch we ended up on we can just
replace the use with a constant.
All the spill changes in shader-db are from Dolphin uber shaders,
despite some small regressions the change is clearly positive.
V2: insert new constant after any phis in the
use->parent_instr->type == nir_instr_type_phi path.
v3:
- use nir_after_block_before_jump() for inserting const
- check dominance of phi uses correctly
v4:
- create some helpers as suggested by Jason.
v5 (Jason Ekstrand):
- Use LIST_ENTRY to get the phi src
shader-db results IVB:
total instructions in shared programs: 9999201 -> 9993483 (-0.06%)
instructions in affected programs: 163235 -> 157517 (-3.50%)
helped: 132
HURT: 2
total cycles in shared programs: 231670754 -> 219476091 (-5.26%)
cycles in affected programs: 143424120 -> 131229457 (-8.50%)
helped: 115
HURT: 24
total spills in shared programs: 4383 -> 4370 (-0.30%)
spills in affected programs: 1656 -> 1643 (-0.79%)
helped: 9
HURT: 18
total fills in shared programs: 4610 -> 4581 (-0.63%)
fills in affected programs: 374 -> 345 (-7.75%)
helped: 6
HURT: 0
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we're mapping temp-resources, we clip the resource to the
transfer-box, which means the stride might not be correct any more.
So let's update the stride from the temp-resource, and recompute the
layer-stride.
This fixes crashes when running dEQP with --deqp-gl-config-name=rgba8888d24s8ms4
Signed-off-by: Erik Faye-Lund <[email protected]>
Fixes: a8987b88ff1 "virgl: add driver for virtio-gpu 3D (v2)"
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Those operations do not map to actual hardware instructions, therefore
those should always be lowered to 32-bit instructions.
Fixes: 009c54aa7af "nv50/ir: Split 64-bit integer MAD/MUL operations"
Signed-off-by: Pierre Moreau <[email protected]>
Reviewed-by: Karol Herbst <[email protected]>
Signed-off-by: Karol Herbst <[email protected]>
|