Commit message | Author | Age | Files | Lines
* ac: add 16-bit support to ac_build_bit_count()Samuel Pitoiset2018-09-171-0/+5
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add 16-bit support to ac_find_lsb()Samuel Pitoiset2018-09-171-2/+13
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add 16-bit support to ac_build_umsb()Samuel Pitoiset2018-09-171-2/+16
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add 16-bit support to ac_build_isign()Samuel Pitoiset2018-09-171-5/+16
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add 16-bit constant values for zero and oneSamuel Pitoiset2018-09-172-0/+4
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add ac_build_bitfield_reverse() helperSamuel Pitoiset2018-09-173-1/+26
| | | | | | | Are we missing 64-bit support? Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac: add ac_build_bit_count() helperSamuel Pitoiset2018-09-173-6/+31
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix use of unreachable() in the meta blit pathSamuel Pitoiset2018-09-171-4/+4
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* Revert "radv: Optimize rebinding the same descriptor set."Samuel Pitoiset2018-09-171-7/+1
| | | | | | This introduces random GPU hangs on Vega, at least. This reverts commit 02a43edf186cb9998741ba765cb948bb238a122d.
* radv: fix descriptor pool allocation sizeSamuel Pitoiset2018-09-171-1/+2
| | | | | | | | | | | The size has to be multiplied by the number of sets. This gets rid of the OUT_OF_POOL_KHR error and fixes a crash with the Tangrams demo. CC: 18.1 18.2 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* anv/query: Add an emit_srm helperJason Ekstrand2018-09-171-32/+21
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Add a mi_memset and use it for zeroing queriesJason Ekstrand2018-09-173-12/+23
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/query: Use anv_address everywhereJason Ekstrand2018-09-171-57/+64
| | | | | | | Instead of passing around BOs and offsets, use addresses which are anv's GPU equivalent of pointers. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/query: Write both dwords in emit_zero_queriesJason Ekstrand2018-09-171-0/+5
| | | | | | | Each query slot is a uint64_t and we were only zeroing half of it. Fixes: 7ec6e4e68980 "anv/query: implement multiview interactions" Reviewed-by: Lionel Landwerlin <[email protected]>
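To illustrate the bug described above: if a query slot is a 64-bit value stored as two consecutive 32-bit dwords, zeroing it has to touch both. The following is a minimal CPU-side sketch under that assumption, not the anv code (the driver emits GPU commands for these writes); `zero_query_slot` and the flat dword array are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout: each query slot is one 64-bit value, stored as
 * two consecutive 32-bit dwords. */
#define DWORDS_PER_SLOT 2

/* Zero one slot by writing both of its dwords. Writing only the first
 * dword would leave the upper half of the 64-bit value stale. */
void zero_query_slot(uint32_t *dwords, size_t num_slots, size_t slot)
{
    assert(slot < num_slots);
    dwords[slot * DWORDS_PER_SLOT + 0] = 0; /* low dword */
    dwords[slot * DWORDS_PER_SLOT + 1] = 0; /* high dword */
}
```

Zeroing slot 1 of a three-slot array should clear dwords 2 and 3 and leave the neighbouring slots untouched.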
* anv/query: Increment an index while writing resultsJason Ekstrand2018-09-171-36/+31
| | | | | | | Instead of computing an index at the end which we hope maps to the number of things written, just count the number of things as we go. Reviewed-by: Lionel Landwerlin <[email protected]>
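The idea of counting as you go can be sketched as follows. This is an illustrative stand-in rather than the driver's code: `struct result_flags` and `write_results` are hypothetical names, and the real function writes query results with different inputs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flags selecting which values a query writes out. */
struct result_flags {
    bool want_value;
    bool want_availability;
};

/* Write the requested results into dst and return how many were written.
 * The index advances at each write, so it always matches the number of
 * values actually stored instead of being recomputed afterwards. */
size_t write_results(uint64_t *dst, size_t capacity,
                     struct result_flags flags,
                     uint64_t value, bool available)
{
    size_t idx = 0;

    if (flags.want_value) {
        assert(idx < capacity);
        dst[idx++] = value;
    }
    if (flags.want_availability) {
        assert(idx < capacity);
        dst[idx++] = available ? 1 : 0;
    }
    return idx;
}
```

Because the returned count comes from the same variable used for indexing, it cannot drift from the number of writes when a new optional value is added later.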
* i965/fs: Don't propagate conditional modifiers from integer compares to addsIan Romanick2018-09-171-1/+9
| | | | | | | | | | | | No shader-db changes on any Intel platform... which probably explains why no bugs have been bisected to this problem since it landed in Mesa 18.1. :( The commit mentioned below is in 18.2, so 18.1 would need a slightly different fix (due to code refactoring). Signed-off-by: Ian Romanick <[email protected]> Fixes: 77f269bb560 "i965/fs: Refactor propagation of conditional modifiers from compares to adds" Reviewed-by: Alejandro Piñeiro <[email protected]> (reviewed the original patch) Cc: Matt Turner <[email protected]> (reviewed the original patch)
* radv: Only allow 16 user SGPRs for compute on GFX9+.Bas Nieuwenhuizen2018-09-161-1/+1
| | | | | | | | | | Apparently for compute there are only 16 instead of the 32 for the graphics path. Fixes dEQP-VK.binding_model.descriptorset_random.sets16.noarray.ubolimitlow.sbolimitlow.imglimitlow.noiub.comp.0 CC: <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Set the user SGPR MSB for Vega.Bas Nieuwenhuizen2018-09-161-0/+1
| | | | | | | Otherwise using 32 user SGPRs would be broken. CC: <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Optimize rebinding the same descriptor set.Bas Nieuwenhuizen2018-09-161-1/+7
| | | | | | | | This makes it cheaper to just change the dynamic offsets with the same descriptor sets. Suggested-by: Philip Rebohle <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* r600/sb: use safe math optimizations when TGSI contains precise operationsGert Wollny2018-09-153-1/+5
| | | | | | | | | | Fixes: dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_3 dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_3 dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_3 Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* android: broadcom/cle: export the broadcom top level path headersMauro Rossi2018-09-151-0/+2
| Fixes the following build error in the vc4 build:
|
| In file included from external/mesa/src/gallium/drivers/vc4/kernel/vc4_render_cl.c:34:
| In file included from external/mesa/src/gallium/drivers/vc4/kernel/vc4_drv.h:27:
| In file included from external/mesa/src/gallium/drivers/vc4/vc4_simulator_validate.h:34:
| In file included from external/mesa/src/gallium/drivers/vc4/vc4_context.h:39:
| In file included from external/mesa/src/gallium/drivers/vc4/vc4_cl.h:56:
| gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h:12:10: fatal error: 'cle/v3d_packet_helpers.h' file not found
|          ^~~~~~~~~~~~~~~~~~~~~~~~~~
| 1 error generated.
|
| Fixes: 5b102160ae ("broadcom/genxml: Introduce a V3D packet/struct decoder.")
| Cc: "18.2" <[email protected]>
| Acked-by: Eric Anholt <[email protected]>
| Reviewed-by: Emil Velikov <[email protected]>
| Signed-off-by: Mauro Rossi <[email protected]>
* android: broadcom/cle: add gallium include pathMauro Rossi2018-09-151-0/+2
| Fixes the following build error:
|
| In file included from external/mesa/src/broadcom/cle/v3d_decoder.c:38:
| In file included from external/mesa/src/broadcom/cle/v3d_packet_helpers.h:29:
| external/mesa/src/gallium/auxiliary/util/u_math.h:42:10: fatal error: 'pipe/p_compiler.h' file not found
|          ^~~~~~~~~~~~~~~~~~~
| 1 error generated.
|
| Fixes: 5b102160ae ("broadcom/genxml: Introduce a V3D packet/struct decoder.")
| Cc: "18.2" <[email protected]>
| Acked-by: Eric Anholt <[email protected]>
| Reviewed-by: Emil Velikov <[email protected]>
| Signed-off-by: Mauro Rossi <[email protected]>
* android: broadcom/genxml: fix collision with intel/genxml header-gen macroMauro Rossi2018-09-151-5/+5
| Fixes the following build error, which happens when building both intel and broadcom:
|
| Gen Header: libmesa_broadcom_genxml_32 <= v3d_packet_v21_pack.h
| FAILED: gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h
| /bin/bash -c "python external/mesa/src/broadcom/cle/gen_pack_header.py \
|     external/mesa/src/broadcom/cle/v3d_packet_v21.xml \
|     > gen/STATIC_LIBRARIES/libmesa_broadcom_genxml_intermediates/broadcom/cle/v3d_packet_v21_pack.h"
| Traceback (most recent call last):
|   File "external/mesa/src/broadcom/cle/gen_pack_header.py", line 626, in <module>
|     p = Parser(sys.argv[2])
| IndexError: list index out of range
|
| The header-gen macro is already defined by the Intel genxml build rules, and the existing header-gen does not take the $(PRIVATE_VER) argument. In fact, the bash command line logged in the build error is missing exactly the $(PRIVATE_VER) argument.
|
| Renaming the macro to pack-header-gen in src/broadcom/Android.genxml.mk solves the build error. Another possible way is to keep the gen rule commands expanded and not use the macros.
|
| Fixes: 7f80a9ff13 ("vc4: Introduce XML-based packet header generation like Intel's.")
| Cc: "18.2" <[email protected]>
| Acked-by: Eric Anholt <[email protected]>
| Reviewed-by: Emil Velikov <[email protected]>
| Signed-off-by: Mauro Rossi <[email protected]>
* anv/memcpy: fix build after starting to use addressesCaio Marcelo de Oliveira Filho2018-09-141-2/+2
| | | | | | The offsets now come from the anv_address, but these references were not updated and were still using the old variable. Fixes: e1ab8345574 "anv/memcpy: Use addresses instead of bo+offset" Tested-by: Clayton Craft <[email protected]>
* anv/cmd_buffer: Take an address in emit_lrmJason Ekstrand2018-09-141-17/+16
| | | | Reviewed-by: Eric Engestrom <[email protected]>
* anv/memcpy: Use addresses instead of bo+offsetJason Ekstrand2018-09-143-35/+34
| | | | Reviewed-by: Eric Engestrom <[email protected]>
* anv/so_memcpy: Use the correct SO_BUFFER size on gen8+Jason Ekstrand2018-09-141-1/+1
| | | | | | | This shouldn't matter as we'll never write OOB anyway but we may as well get it right. It's supposed to be in dwords - 1. Reviewed-by: Nanley Chery <[email protected]>
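The "dwords - 1" encoding mentioned above can be written out as a small helper. This is only a sketch of the arithmetic; `size_in_dwords_minus_one` is a hypothetical name, and it assumes the size is given in bytes, is non-zero, and is a multiple of 4.

```c
#include <assert.h>
#include <stdint.h>

/* Encode a buffer size given in bytes as "number of dwords minus one".
 * Hypothetical helper: assumes a non-zero size that is a multiple of 4. */
uint32_t size_in_dwords_minus_one(uint32_t size_bytes)
{
    assert(size_bytes >= 4 && size_bytes % 4 == 0);
    return size_bytes / 4 - 1;
}
```

For example, a 16-byte buffer holds four dwords, so the encoded value is 3.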
* ac: fix get_image_coords() for radeonsiTimothy Arceri2018-09-151-1/+2
| | | | | | | | Because this was setting image to true we would end up calling si_load_image_desc() when we should be calling si_load_sampler_desc(). This fixes an assert() in Deus Ex: MD Reviewed-by: Marek Olšák <[email protected]>
* gallium/util: don't let child processes inherit our thread affinityMarek Olšák2018-09-141-4/+32
| | | | v2: corrected the comment
* gallium/util: start with a random L3 cache index for AMD ZenMarek Olšák2018-09-141-4/+16
|
* st/mesa: Validate the result of pipe_transfer_map in make_texture (v2)Josh Pieper2018-09-141-8/+12
| | | | | | | | | | | | | | When using Freecad, I was getting intermittent segfaults inside of mesa. I traced it down to this path in st_cb_drawpixels.c where the result of pipe_transfer_map wasn't being checked. In my case, it was returning NULL because nouveau_bo_new returned ENOENT. I'm by no means a mesa developer, but this patch solves the problem for me and seems reasonable enough. v2: Marek - also unmap the PBO and release the texture, and call the make_texture function sooner for less cleanup Cc: 18.1 18.2 <[email protected]>
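The pattern behind this fix, checking a mapping result before writing through it, can be sketched generically. Nothing below is Mesa API: `map_fn`, `upload_pixels`, and the two stand-in mappers are hypothetical names used only to show the control flow.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping callback: returns NULL when the mapping fails. */
typedef void *(*map_fn)(void *resource);

/* Stand-in mappers for illustration: one succeeds, one always fails. */
void *map_always_ok(void *resource) { return resource; }
void *map_always_fails(void *resource) { (void)resource; return NULL; }

/* Copy pixels into a mapped resource. Returns false instead of
 * dereferencing NULL when the map call fails. */
bool upload_pixels(void *resource, map_fn map, const void *src, size_t size)
{
    assert(map != NULL && src != NULL);

    void *dst = map(resource);
    if (dst == NULL)
        return false; /* the caller releases whatever it already acquired */

    memcpy(dst, src, size);
    return true;
}
```

Returning a failure status leaves cleanup to the caller, which matches the v2 note about unmapping the PBO and releasing the texture on the error path.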
* radv: emit the initial config only once in the preamblesSamuel Pitoiset2018-09-144-50/+48
| | | | | | | | | It shouldn't be needed to emit the initial graphics or compute state when beginning a new command buffer. Emitting them in the preamble should be enough and this will reduce IB sizes. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix setting global locations for indirect descriptorsSamuel Pitoiset2018-09-141-1/+0
| | | | | | | | | | Indirect descriptors only need one entry, we don't have to emit a location for every descriptor. Fixes GPU hangs with new CTS: dEQP-VK.binding_model.descriptorset_random.* CC: 18.2 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix flushing indirect descriptorsSamuel Pitoiset2018-09-141-3/+9
| | | | | | | | | | | | Let's say we first bind a graphics pipeline that needs indirect descriptor sets. The userdata pointers will be emitted at draw time. Then if we bind a compute pipeline that doesn't need any indirect descriptors, the driver will re-emit them for all graphics stages. To avoid this, just check the bind point type. CC: 18.2 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix GPU hangs with 32-bit indirect descriptorsSamuel Pitoiset2018-09-141-3/+5
| | | | | | | | | | | LLVM 6 isn't affected. Fixes GPU hangs with new CTS: dEQP-VK.binding_model.descriptorset_random.* CC: 18.2 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: handle loc->indirect correctly for the first descriptorSamuel Pitoiset2018-09-142-11/+10
| | | | | | | | | | | | | This was wrong for descriptor #0 when all of them are indirect. This is because indirect_offset was 0 and we emitted a "normal" descriptor pointer for nothing. While we are at it remove radv_userdata_info::indirect_offset which is useless. CC: 18.2 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: bump the maximum number of arguments to 64Samuel Pitoiset2018-09-141-1/+1
| | | | | | | | | | | Bumping to 64 should be safe enough. Fixes some crashes with new CTS: dEQP-VK.binding_model.descriptorset_random.* CC: 18.2 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: tidy up ac_setup_rings() for the GSVS ringsSamuel Pitoiset2018-09-141-13/+34
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix setting the number of entries for GSVS on VI+Samuel Pitoiset2018-09-141-3/+0
| | | | | | | | According to RadeonSI, it's unnecessary to multiply by the stride. That field seems to always be 64. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: always compute the number of components from the output maskSamuel Pitoiset2018-09-141-12/+2
| | | | | | | That removes two special cases for clip/cull distances. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: emit data contiguously in the GS->VS ring bufferSamuel Pitoiset2018-09-141-16/+12
| | | | | | | | Instead of having holes. The other ring parameters like offset and stride can be updated later. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: make use of the output usage mask in GS copy shaderSamuel Pitoiset2018-09-141-0/+3
| | | | | | | | This is just for consistency because LLVM can detect and remove unused loads. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: improve a comment in si_emit_set_predication_state()Samuel Pitoiset2018-09-141-8/+6
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix VK_EXT_conditional_rendering visibilitySamuel Pitoiset2018-09-141-4/+12
| | | | | | | | | | It's actually just the opposite. This fixes the new Sascha conditionalrender demo. CC: 18.2 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: make use of ac_unpack_param() instead of ac_build_bfe()Samuel Pitoiset2018-09-141-15/+6
| | | | | | The same code is generated because LLVM ends up using bfe, but that seems cleaner to me. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: add loop unroll support for complex wrapper loopsTimothy Arceri2018-09-141-37/+76
| In GLSL IR we cheat with switch statements and simply convert them into loops with a single iteration. This allowed us to make use of the existing jump instruction handling provided by the loop handling code; it also allows dead code to be cleaned up once we have wrapped the code in a loop.
|
| However, using loops in this way created previously unrollable loops, which limits further optimisations. Here we provide a way to unroll loops that end in a break and have multiple other exits.
|
| All shader-db changes are from the dolphin uber shaders. A small number of shaders are HURT, but in general the improvements far exceed the HURT.
|
| shader-db results IVB:
|
| total instructions in shared programs: 10018187 -> 10016468 (-0.02%)
| instructions in affected programs: 104080 -> 102361 (-1.65%)
| helped: 36
| HURT: 15
|
| total cycles in shared programs: 220065064 -> 154529655 (-29.78%)
| cycles in affected programs: 126063017 -> 60527608 (-51.99%)
| helped: 51
| HURT: 0
|
| total loops in shared programs: 2515 -> 2308 (-8.23%)
| loops in affected programs: 903 -> 696 (-22.92%)
| helped: 51
| HURT: 0
|
| total spills in shared programs: 4370 -> 4124 (-5.63%)
| spills in affected programs: 1397 -> 1151 (-17.61%)
| helped: 9
| HURT: 12
|
| total fills in shared programs: 4581 -> 4419 (-3.54%)
| fills in affected programs: 2201 -> 2039 (-7.36%)
| helped: 9
| HURT: 15
|
| Reviewed-by: Jason Ekstrand <[email protected]>
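The shape of loop this commit targets can be shown in plain C. This is a source-level analogy only, assuming a three-way switch; the actual pass works on NIR control flow, and the function names here are hypothetical.

```c
#include <assert.h>

/* A switch lowered to a loop that runs exactly once: every case leaves
 * through a break, and the loop body itself ends in a break. */
int select_as_loop(int sel)
{
    int result = 0;
    for (;;) {
        if (sel == 0) { result = 10; break; }
        if (sel == 1) { result = 20; break; }
        result = 30; /* default case */
        break;       /* the loop always ends in a break */
    }
    return result;
}

/* The same logic after unrolling the single iteration: the breaks turn
 * into nested if/else and no loop remains. */
int select_unrolled(int sel)
{
    int result = 0;
    if (sel == 0) {
        result = 10;
    } else if (sel == 1) {
        result = 20;
    } else {
        result = 30;
    }
    return result;
}

/* Sanity check that both forms agree for a given selector. */
void check_select_equivalent(int sel)
{
    assert(select_as_loop(sel) == select_unrolled(sel));
}
```

Once the loop is gone, later passes see ordinary if/else control flow, which is consistent with the drop in loop counts reported in the shader-db results.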
* nir: propagates if condition evaluation down some alu chainsTimothy Arceri2018-09-141-0/+128
| v2:
| - only allow nir_op_inot or nir_op_b2i when alu input is 1.
| - use some helpers as suggested by Jason.
|
| v3:
| - evaluate alu op for single input alu ops
| - add helper function to decide if to propagate through alu
| - make use of nir_before_src in another spot
|
| shader-db IVB results:
|
| total instructions in shared programs: 9993483 -> 9993472 (-0.00%)
| instructions in affected programs: 1300 -> 1289 (-0.85%)
| helped: 11
| HURT: 0
|
| total cycles in shared programs: 219476091 -> 219476059 (-0.00%)
| cycles in affected programs: 7675 -> 7643 (-0.42%)
| helped: 10
| HURT: 1
|
| Reviewed-by: Jason Ekstrand <[email protected]>
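A source-level analogy may help show what propagating a known condition through a short ALU chain means. This is a sketch in C, not NIR: the negation and the bool-to-int conversion stand in for the `nir_op_inot` and `nir_op_b2i` cases named in the v2 note, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Before: values derived from cond are used inside the branches. */
int alu_chain_before(bool cond, int x)
{
    bool not_cond = !cond;        /* analogous to an inot of the condition */
    int  as_int   = cond ? 1 : 0; /* analogous to a bool-to-int conversion */

    if (cond) {
        return not_cond ? 0 : x + as_int;
    } else {
        return not_cond ? x - as_int : 0;
    }
}

/* After: in the then-branch cond is true, so !cond is false and the
 * integer form is 1; in the else-branch !cond is true and the integer
 * form is 0. The derived values fold to constants. */
int alu_chain_after(bool cond, int x)
{
    if (cond)
        return x + 1;
    else
        return x;
}

/* Sanity check that both forms agree. */
void check_alu_chain(bool cond, int x)
{
    assert(alu_chain_before(cond, x) == alu_chain_after(cond, x));
}
```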
* nir: evaluate if condition uses inside the if branchesTimothy Arceri2018-09-142-0/+138
| Since we know what side of the branch we ended up on we can just replace the use with a constant.
|
| All the spill changes in shader-db are from Dolphin uber shaders; despite some small regressions the change is clearly positive.
|
| V2: insert new constant after any phis in the use->parent_instr->type == nir_instr_type_phi path.
|
| v3:
| - use nir_after_block_before_jump() for inserting const
| - check dominance of phi uses correctly
|
| v4:
| - create some helpers as suggested by Jason.
|
| v5 (Jason Ekstrand):
| - Use LIST_ENTRY to get the phi src
|
| shader-db results IVB:
|
| total instructions in shared programs: 9999201 -> 9993483 (-0.06%)
| instructions in affected programs: 163235 -> 157517 (-3.50%)
| helped: 132
| HURT: 2
|
| total cycles in shared programs: 231670754 -> 219476091 (-5.26%)
| cycles in affected programs: 143424120 -> 131229457 (-8.50%)
| helped: 115
| HURT: 24
|
| total spills in shared programs: 4383 -> 4370 (-0.30%)
| spills in affected programs: 1656 -> 1643 (-0.79%)
| helped: 9
| HURT: 18
|
| total fills in shared programs: 4610 -> 4581 (-0.63%)
| fills in affected programs: 374 -> 345 (-7.75%)
| helped: 6
| HURT: 0
|
| Reviewed-by: Jason Ekstrand <[email protected]>
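The observation that the branch taken fixes the value of the condition can be illustrated in C. This is an analogy at source level with hypothetical function names; the real optimisation rewrites uses of the condition in NIR.

```c
#include <assert.h>
#include <stdbool.h>

/* Before: cond is read again inside both branches of the if. */
int cond_use_before(bool cond, int a, int b)
{
    if (cond) {
        return cond ? a : b; /* cond is known to be true here */
    } else {
        return cond ? a : b; /* cond is known to be false here */
    }
}

/* After: each use of cond is replaced by the constant implied by the
 * branch it sits in, so both selects fold away. */
int cond_use_after(bool cond, int a, int b)
{
    if (cond)
        return a;
    else
        return b;
}

/* Sanity check that both forms agree. */
void check_cond_use(bool cond, int a, int b)
{
    assert(cond_use_before(cond, a, b) == cond_use_after(cond, a, b));
}
```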
* virgl: adjust strides when mapping temp-resourcesErik Faye-Lund2018-09-141-0/+2
| | | | | | | | | | | | | | When we're mapping temp-resources, we clip the resource to the transfer-box, which means the stride might not be correct any more. So let's update the stride from the temp-resource, and recompute the layer-stride. This fixes crashes when running dEQP with --deqp-gl-config-name=rgba8888d24s8ms4 Signed-off-by: Erik Faye-Lund <[email protected]> Fixes: a8987b88ff1 "virgl: add driver for virtio-gpu 3D (v2)" Reviewed-by: Dave Airlie <[email protected]>
* nvir: Always split 64-bit IMAD/IMUL operationsPierre Moreau2018-09-131-1/+1
| | | | | | | | | | | Those operations do not map to actual hardware instructions, therefore those should always be lowered to 32-bit instructions. Fixes: 009c54aa7af "nv50/ir: Split 64-bit integer MAD/MUL operations" Signed-off-by: Pierre Moreau <[email protected]> Reviewed-by: Karol Herbst <[email protected]> Signed-off-by: Karol Herbst <[email protected]>
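As a rough picture of what lowering a 64-bit multiply to 32-bit operations involves, here is a C sketch of the arithmetic. It is not the nouveau lowering code; `mul64_from_32` is a hypothetical name, and the sketch assumes the hardware offers 32-bit multiplies that can produce both halves of a 32x32 product.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply two 64-bit values using 32-bit halves as inputs.
 * Only the low 64 bits of the product are kept, as with uint64_t. */
uint64_t mul64_from_32(uint64_t a, uint64_t b)
{
    uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
    uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

    /* Full 32x32 -> 64 product of the low halves (a low-half multiply
     * plus a high-half multiply on 32-bit hardware). */
    uint64_t lo_prod = (uint64_t)a_lo * b_lo;

    /* The cross terms only affect the upper 32 bits of the result;
     * 32-bit wrap-around is intended. a_hi * b_hi is dropped because it
     * only contributes above bit 63. */
    uint32_t hi = (uint32_t)(lo_prod >> 32) + a_lo * b_hi + a_hi * b_lo;

    return ((uint64_t)hi << 32) | (uint32_t)lo_prod;
}

/* Sanity check against the native 64-bit multiply. */
void check_mul64(uint64_t a, uint64_t b)
{
    assert(mul64_from_32(a, b) == a * b);
}
```

A multiply-add follows the same pattern with a 64-bit addition, itself split into a low add with carry and a high add, applied to the result.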