Commit message | Author | Age | Files | Lines
* gitlab-ci: Sort ARM docker image packages in alphabetical orderMichel Dänzer2019-10-221-20/+20
| | | | | | No functional change. Reviewed-by: Eric Engestrom <[email protected]>
* radv: fix updating bound fast ds clear values with different aspectsSamuel Pitoiset2019-10-221-3/+13
| | | | | | | | | | | | | | | On GFX9, the driver is able to do an optimized fast depth/stencil clear with only one aspect (ie. clear the stencil part of a depth/stencil image). When this happens, the driver should only update the clear values of the given aspect. Note that it's currently only supported on GFX9 but I have some local patches that extend this optimized path for other gens. Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1967 Cc: 19.2 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* intel/compiler: Refactor disassembly of sources in 3src instructionSagar Ghuge2019-10-211-19/+10
| | | | | Signed-off-by: Sagar Ghuge <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel/compiler: Don't move immediate in registerSagar Ghuge2019-10-211-0/+38
| | | | | | | | | On Gen12, we support mixed-mode HF/F operands, and 3-source instructions also support immediate values, so keep the immediate as it is if it fits in a 16-bit field. Signed-off-by: Sagar Ghuge <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel/compiler: Set bits according to source fileSagar Ghuge2019-10-211-2/+12
| | | | | | | | On Gen >= 12, if src0 or src2 holds an immediate value, we need to set the src[0/2]_is_imm bits instead of the register file. Signed-off-by: Sagar Ghuge <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel/compiler: Add Immediate support for 3 source instructionSagar Ghuge2019-10-211-21/+32
| | | | | | | | On Gen >= 10, either src0 or src2 can use a 16-bit immediate value, but not both. Signed-off-by: Sagar Ghuge <[email protected]> Reviewed-by: Matt Turner <[email protected]>
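The operand rule from the three commits above can be sketched as a small validity check. This is only an illustration: `struct src3_operand`, `imm_fits_16` and `src3_operands_valid` are hypothetical names, not part of the Intel compiler, and treating src1 as register-only is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operand description; not Mesa's real data structure. */
struct src3_operand {
    bool is_imm;   /* true if the operand is an immediate */
    uint32_t imm;  /* immediate payload bits, meaningful only when is_imm is set */
};

/* True when the immediate payload fits in a 16-bit field. */
bool imm_fits_16(uint32_t bits)
{
    return bits <= UINT16_MAX;
}

/* Checks the rule quoted in the commit message: src0 or src2 may carry a
 * 16-bit immediate, but not both. Rejecting an immediate in src1 is an
 * assumption of this sketch. */
bool src3_operands_valid(const struct src3_operand src[3])
{
    assert(src != NULL);
    if (src[1].is_imm)
        return false;
    if (src[0].is_imm && src[2].is_imm)
        return false;
    if (src[0].is_imm && !imm_fits_16(src[0].imm))
        return false;
    if (src[2].is_imm && !imm_fits_16(src[2].imm))
        return false;
    return true;
}
```

A caller would run such a check before deciding whether an immediate has to be moved into a register first.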
* ci: Disable lima until its farm can get fixed.Eric Anholt2019-10-211-2/+2
| | | | | | | | | | It's been throwing the following error today: "<Fault -32603: 'Internal Server Error (contact server administrator for details): could not extend file "base/17952/18226": No space left on device\nHINT: Check free disk space.\n'>" Reviewed-by: Daniel Stone <[email protected]>
* intel: Add missing entry for brw_nir_lower_alpha_to_coverage in MakefileSagar Ghuge2019-10-211-0/+1
| | | | | | | Fixes: 7ecfbd4f6d4 ("nir: Add alpha_to_coverage lowering pass") Signed-off-by: Sagar Ghuge <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* llvmpipe: handle compute shader launch with 0 threadsDave Airlie2019-10-211-0/+9
| | | | | | | | | If you set LP_NUM_THREADS=0, compute shaders would hang; just execute the workloads in sequence if we have no threads in the pool. Fixes: 1b24e3ba75 ("llvmpipe: add compute threadpool + mutex") Reviewed-by: Roland Scheidegger <[email protected]>
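A minimal sketch of the fallback described above, assuming a pool that only records its thread count. `struct task_pool`, `pool_run` and `sum_indices` are hypothetical names for illustration, not llvmpipe's API.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*work_fn)(void *data, unsigned index);

/* Hypothetical pool; only the thread count matters for this sketch. */
struct task_pool {
    unsigned num_threads;
    /* Hypothetical hook that hands the work to worker threads. */
    void (*submit)(struct task_pool *pool, work_fn fn, void *data, unsigned count);
};

/* Runs `count` work items. With zero worker threads nothing would ever pick
 * the work up, so the items are executed inline, in order. */
void pool_run(struct task_pool *pool, work_fn fn, void *data, unsigned count)
{
    assert(pool != NULL && fn != NULL);
    if (pool->num_threads == 0) {
        for (unsigned i = 0; i < count; i++)
            fn(data, i);
        return;
    }
    assert(pool->submit != NULL);
    pool->submit(pool, fn, data, count);
}

/* Example work item: adds its index to an accumulator. */
void sum_indices(void *data, unsigned index)
{
    *(unsigned *)data += index;
}
```

With `num_threads` set to 0, `pool_run(&pool, sum_indices, &sum, 4)` finishes on the calling thread instead of waiting for workers that do not exist.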
* freedreno/ir3: Add missing ir3_nir_lower_tex_prefetch.c to Android.mkMarijn Suijten2019-10-211-0/+1
| | | | | | | | | This file is created in 2a0d45ae6cf09d60c048d7854e3d082bf15e374f but addition to android makefiles was omitted. It breaks the build with missing references which are defined in this file. List the file in ir3_SOURCES to make the build succeed. Signed-off-by: Marijn Suijten <[email protected]>
* ac/llvm: fix ac_to_integer_type() for 32-bit const addr space pointersSamuel Pitoiset2019-10-211-0/+1
| | | | | | | | | | | | This fixes some crashes with dEQP-VK.descriptor_indexing.* when read_first_invocation has its source from a descriptor. Most of these tests still fail because of an LLVM bug (they work with ACO). Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* aco: run opt_algebraic in a loopRhys Perry2019-10-211-3/+8
| | | | | | | | | | | | | | | | | Totals from affected shaders: SGPRS: 13920 -> 13656 (-1.90 %) VGPRS: 12972 -> 12960 (-0.09 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 1005680 -> 1000648 (-0.50 %) bytes LDS: 91 -> 91 (0.00 %) blocks Max Waves: 688 -> 688 (0.00 %) Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
* aco: use nir_lower_idiv_preciseRhys Perry2019-10-211-1/+1
| | | | | | | v7: rename _nv50/_llvm to _fast/_precise Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
* nir/lower_idiv: add new llvm-based pathRhys Perry2019-10-218-17/+136
| | | | | | | | | | | | | | | | | v2: make variable names snake_case v2: minor cleanups in emit_udiv() v2: fix Panfrost build failure v3: use an enum instead of a boolean flag in nir_lower_idiv()'s signature v4: remove nir_op_urcp v5: drop nv50 path v5: rebase v6: add back nv50 path v6: add comment for nir_lower_idiv_path enum v7: rename _nv50/_llvm to _fast/_precise v8: fix etnaviv build failure Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
* intel/compiler: Remove emit_alpha_to_coverage workaround from backendSagar Ghuge2019-10-212-84/+13
| | | | | | | | | | | | | Remove emit_alpha_to_coverage workaround from backend compiler and start using ported workaround from NIR. v2: Copy comment from brw_fs_visitor (Caio Marcelo de Oliveira Filho) Fixes piglit test on HSW: - arb_sample_shading-builtin-gl-sample-mask-mrt-alpha-to-coverage-combinations Signed-off-by: Sagar Ghuge <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: Add alpha_to_coverage lowering passSagar Ghuge2019-10-213-0/+171
| | | | | | | | | | | | | | | | | | | | | Importing this pass from fs_visitor::emit_alpha_to_coverage_workaround() in intel/compiler. v2 (Caio Marcelo de Oliveira Filho): - Track store output and sample mask instruction - Nest math instruction for more readability - Bail out early if no gl_SampleMask v3: (Caio Marcelo de Oliveira Filho): - Do math instructions after instruction block - Restructure code - Move pass under src/intel/compiler v4: (Caio Marcelo de Oliveira Filho): - Organize dither mask calculation Signed-off-by: Sagar Ghuge <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* aco: ensure that uniform booleans are computed in WQM if their uses happen in WQMDaniel Schürmann2019-10-211-1/+2
| | | | | | | | This fixes graphical corruption in SC2. Reviewed-by: Rhys Perry <[email protected]>
* meson: Require meson >= 0.49.1 when using icc or iclDylan Baker2019-10-211-6/+2
| | | | | | | | | | | | | 0.49.0 can compile most of mesa with ICC or ICL, but not SWR without additional workarounds in our meson.build files. Bumping patch version is easier and shouldn't be a big burden anyway, especially to cover a niche compiler. The check originally only covered ICC, but now covers ICL as well. Fixes: 3740ffb59c89d8d879b1e0c1aed32c389dd82a35 ("meson: add switches for SWR with MSVC") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1937 Acked-by: Eric Engestrom <[email protected]>
* docs: update calendar, add news item and link release notes for 19.1.8Juan A. Suarez Romero2019-10-213-9/+14
| | | | Signed-off-by: Juan A. Suarez Romero <[email protected]>
* docs: add release notes for 19.1.8Juan A. Suarez Romero2019-10-211-1/+1
| | | | | Signed-off-by: Juan A. Suarez Romero <[email protected]> (cherry picked from commit cc88eeb6ffc4e86d76dfdbfc601d519bc35b6c41)
* docs: add release notes for 19.1.8Juan A. Suarez Romero2019-10-211-0/+267
| | | | | Signed-off-by: Juan A. Suarez Romero <[email protected]> (cherry picked from commit 5c6d266c591208b1c27e06f61b814210fc6e095f)
* aco/gfx10: Update constant addresses in fix_branches_gfx10.Timur Kristóf2019-10-211-1/+12
| | | | | | | | | Due to a bug in GFX10 hardware, s_nop instructions must be added if a branch is at 0x3f. We already do this, but forgot to also update the constant addresses that come after this instruction. Signed-off-by: Timur Kristóf <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
* aco/gfx10: Fix PS exports for SPI_SHADER_32_AR.Timur Kristóf2019-10-211-1/+7
| | | | | Signed-off-by: Timur Kristóf <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
* aco/gfx10: Wait for pending SMEM stores before loadsTimur Kristóf2019-10-212-1/+33
| | | | | | | | | | | | | | | | Currently if you have an SMEM store followed by an SMEM load that loads the same location as was written, it won't work because the store isn't finished before the load is executed. This is NOT mitigated by an s_nop instruction on GFX10. Since we currently don't have proper alias analysis, this commit adds a workaround which will insert an s_waitcnt lgkmcnt(0) before each SSBO load if they follow a store. We should further refine this in the future when we can make sure to only add the wait when we load the same thing as has been stored. Signed-off-by: Timur Kristóf <[email protected]> Reviewed-by: Daniel Schürmann <[email protected]>
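The workaround described above, a wait before any load that follows a store, can be sketched as a single pass over an instruction stream. The `op_kind` enum and `insert_smem_waits` are hypothetical and far simpler than ACO's real IR. Letting an existing wait clear the pending state is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified instruction kinds. */
enum op_kind { OP_OTHER, OP_SMEM_STORE, OP_SMEM_LOAD, OP_WAIT_LGKM0 };

/* Copies `in` to `out`, adding a wait before any load that follows a store
 * which has not been waited for yet. No alias analysis is attempted, so the
 * wait is inserted even when the load reads a different location.
 * Returns the number of entries written, or 0 if `out` is too small. */
size_t insert_smem_waits(const enum op_kind *in, size_t n,
                         enum op_kind *out, size_t cap)
{
    assert(in != NULL && out != NULL);
    bool pending_store = false;
    size_t w = 0;

    for (size_t i = 0; i < n; i++) {
        if (in[i] == OP_SMEM_LOAD && pending_store) {
            if (w == cap)
                return 0;
            out[w++] = OP_WAIT_LGKM0;
            pending_store = false;
        }
        if (in[i] == OP_SMEM_STORE)
            pending_store = true;
        else if (in[i] == OP_WAIT_LGKM0)
            pending_store = false; /* assumption: an existing wait covers the store */
        if (w == cap)
            return 0;
        out[w++] = in[i];
    }
    return w;
}
```

Only the first load after a store gets a wait; later loads proceed until another store appears.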
* panfrost: Fix the DISCARD_WHOLE_RES case in transfer_map()Boris Brezillon2019-10-213-2/+63
| | | | | | | | | | | | | | | | | The current implementation does not synchronize on BO readiness when DISCARD_WHOLE_RES flag is set, which can lead to misbehaviours when the resource being updated is being used by one of the pending or already flushed batches. Adding unconditional BO synchronization would do the trick, but we can sometimes optimize this path by re-allocating a new BO instead of waiting for the existing one to be ready. Reported-by: Daniel Stone <[email protected]> Reported-by: Heinrich Fink <[email protected]> Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Daniel Stone <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* st/mesa: only require ESSL 3.1 for geometry shadersIago Toral Quiroga2019-10-211-1/+1
| | | | | | | | | According to the OES_geometry_shader spec, section Dependencies: "OpenGL ES 3.1 and OpenGL ES Shading Language 3.10 are required." Reviewed-by: Kristian H. Kristensen <[email protected]>
* egl/android: Remove our own reference to buffers.Lepton Wu2019-10-211-3/+1
| | | | | | | | | | | | | | | | | | | We currently don't maintain it correctly and the buffer gets leaked if the surface is destroyed before swapping buffers. From Android frameworks/native/libs/nativewindow/include/system/window.h: The window holds a reference to the buffer between dequeueBuffer and either queueBuffer or cancelBuffer, so clients only need their own reference if they might use the buffer after queueing or canceling it. v2: Remove our own reference. Fixes: 0212db35040 ("egl/android: Cancel any outstanding ANativeBuffer in surface destructor") Reviewed-by: Chia-I Wu <[email protected]> (v1) Reviewed-By: Tapani Pälli <[email protected]> Signed-off-by: Lepton Wu <[email protected]>
* radv: advertise VK_KHR_spirv_1_4Samuel Pitoiset2019-10-212-1/+2
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not dump descriptors twice in hang reportsSamuel Pitoiset2019-10-211-10/+15
| | | | | | | | If a pipeline has both graphics and compute, the descriptors are the same. While we are at it, use queue->device for simplicity. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: dump trace files earlier if a GPU hang is detectedSamuel Pitoiset2019-10-211-1/+2
| | | | | | | | To make sure a trace file is generated in case the driver crashes during the hang report generation (which happens sometimes). Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: print which ring is dumped in hang reportsSamuel Pitoiset2019-10-211-0/+2
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not print useless descriptors info in hang reportsSamuel Pitoiset2019-10-211-74/+1
| | | | | | | | This information has never been useful. All descriptors are already dumped with colors etc, and it's more useful. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: enable VK_KHR_shader_float_controls on GFX6-GFX7Samuel Pitoiset2019-10-212-4/+4
| | | | | | | Disable 16-bit features because fp16 isn't exposed on these chips. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* panfrost/ci: Update expectations listAlyssa Rosenzweig2019-10-202-215/+2
| | | | | | | | A bunch of blend tests fixed on T760. A single blend test regressed on both T760/T860 but I am unable to reproduce locally so am just documenting the regression and moving on. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Implement SIMD-aware dead code eliminationAlyssa Rosenzweig2019-10-201-8/+57
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We would like to eliminate not just entire dead instructions, but also dead components, which increases scheduler flexibility (since some vector instructions can become scalar after eliminating dead components). This also will allow better RA in the future. Results are meh. total instructions in shared programs: 3453 -> 3451 (-0.06%) instructions in affected programs: 60 -> 58 (-3.33%) helped: 2 HURT: 0 total bundles in shared programs: 1826 -> 1824 (-0.11%) bundles in affected programs: 33 -> 31 (-6.06%) helped: 2 HURT: 0 total quadwords in shared programs: 3144 -> 3144 (0.00%) quadwords in affected programs: 0 -> 0 helped: 0 HURT: 0 total registers in shared programs: 321 -> 321 (0.00%) registers in affected programs: 45 -> 45 (0.00%) helped: 11 HURT: 11 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 16.67% max: 50.00% x̄: 39.70% x̃: 50.00% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% 95% mean confidence interval for registers value: -0.45 0.45 95% mean confidence interval for registers %-change: -1.87% 62.18% Inconclusive result (value mean confidence interval includes 0). total threads in shared programs: 445 -> 447 (0.45%) threads in affected programs: 2 -> 4 (100.00%) helped: 1 HURT: 0 Signed-off-by: Alyssa Rosenzweig <[email protected]>
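The idea of dropping dead components rather than whole instructions can be sketched with write masks. `struct vec_instr` and `eliminate_dead_components` are hypothetical names. Removing an instruction once no written component is live assumes it has no side effects.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical vector instruction: only the destination write mask is modelled. */
struct vec_instr {
    uint8_t write_mask; /* bit i set: component i of the destination is written */
    bool removed;       /* set once no written component is live */
};

/* `live_after` has bit i set when component i of the destination is read
 * later. Components that are written but never read are dropped from the
 * mask. If nothing remains, the instruction is marked as removed (assuming
 * it has no side effects). */
void eliminate_dead_components(struct vec_instr *ins, uint8_t live_after)
{
    assert(ins != NULL);
    ins->write_mask &= live_after;
    if (ins->write_mask == 0)
        ins->removed = true;
}
```

An instruction writing xyzw whose result is only read in x ends up with a single-component mask, which is what lets a scheduler treat it as scalar.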
* pan/midgard: Create dependency graph bytewiseAlyssa Rosenzweig2019-10-201-12/+12
| | | | | | | This allows for vec16 dependencies in the scheduler, not that we have any yet (thankfully). Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Handle nontrivial masks in texture RAAlyssa Rosenzweig2019-10-201-1/+1
| | | | | | The texture instruction has a mask we need to take into account. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Implement per-byte liveness trackingAlyssa Rosenzweig2019-10-201-3/+2
| | | | | | | Now that we have a notion of byte masks, liveness tracking can be updated to reflect this extra granularity without loss of correctness. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Simplify mir_bytemask_of_read_componentsAlyssa Rosenzweig2019-10-201-18/+4
| | | | | | There are easy ways to iterate sources! Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Report byte masks for read componentsAlyssa Rosenzweig2019-10-206-31/+31
| | | | | | | | | | | Read component masks don't have a particular type associated, since the type of the ALU operation may not match the type of the operands in question. So let's generate byte masks instead, and update the rest of the compiler to use byte masks when analyzing reads. Preparation for mixed types. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add helpers for manipulating byte masksAlyssa Rosenzweig2019-10-202-0/+177
| | | | | | | | | | | | | | | | | | | There are essentially two formats of masks in play beginning with this commit: masks per-channel and masks per-byte. The former make sense within a given fixed-size instruction; the latter are typesize-independent. It turns out you need the latter to meaningfully manipulate instructions containing multiple sizes (which is quite possible with ALU operations). Similarly, we have mir_srcsize. We calculate the size of the source by analyzing the size of the instruction itself and stepping down if there is a half-modifier. Finally, we have mir_round_bytemask_down, for when we want to take a byte mask and "round it down" to a given component size, so that we can use it as a component mask. Signed-off-by: Alyssa Rosenzweig <[email protected]>
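The two mask formats and the "round down" step described above can be sketched as follows. This is not the Midgard compiler's code: `bytemask_of_components` and `round_bytemask_down` are hypothetical stand-ins, and the 16-byte vector width is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define VEC_BYTES 16u /* assumed 128-bit vector register */

/* Expands a per-channel mask into a per-byte mask for channels that are
 * `bytes_per_comp` bytes wide. */
uint16_t bytemask_of_components(unsigned comp_mask, unsigned bytes_per_comp)
{
    assert(bytes_per_comp == 1 || bytes_per_comp == 2 ||
           bytes_per_comp == 4 || bytes_per_comp == 8);
    unsigned channels = VEC_BYTES / bytes_per_comp;
    unsigned full = (1u << bytes_per_comp) - 1u;
    unsigned out = 0;

    for (unsigned c = 0; c < channels; c++) {
        if (comp_mask & (1u << c))
            out |= full << (c * bytes_per_comp);
    }
    return (uint16_t)out;
}

/* Clears every channel that is not fully covered by `byte_mask`, so the
 * result maps cleanly back onto a component mask of the given size. */
uint16_t round_bytemask_down(uint16_t byte_mask, unsigned bytes_per_comp)
{
    assert(bytes_per_comp == 1 || bytes_per_comp == 2 ||
           bytes_per_comp == 4 || bytes_per_comp == 8);
    unsigned channels = VEC_BYTES / bytes_per_comp;
    unsigned full = (1u << bytes_per_comp) - 1u;
    unsigned mask = byte_mask;

    for (unsigned c = 0; c < channels; c++) {
        unsigned lane = full << (c * bytes_per_comp);
        if ((mask & lane) != lane)
            mask &= ~lane;
    }
    return (uint16_t)mask;
}
```

The byte mask is independent of the type size, so the same value can be rounded down to 32-bit channels for one instruction and to 16-bit channels for another.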
* pan/midgard: Implement OP_IS_STORE with tableAlyssa Rosenzweig2019-10-202-13/+2
| | | | | | ..rather than open-coding. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Tableize load/store opsAlyssa Rosenzweig2019-10-205-70/+102
| | | | | | | | | | This will allow us to encode properties about the load/store ops like we do for ALU ops. We include now properties about whether we have a store, and if there are special cases on the load/store op. We also tag each instruction by its natural size... this is probably not totally right, but it's a start. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Factor out mir_get_alu_srcAlyssa Rosenzweig2019-10-201-6/+8
| | | | | | | This helper is used in a bunch of places ... might as well make that common. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard/disasm: Fix printing 8-bit/16-bit masksAlyssa Rosenzweig2019-10-201-49/+30
| | | | | | | | | The trick is realizing even with a destination override, the masks are encoded in the same mode as the instruction itself, rather than stepping down. The override means that the smaller type is used, but the mask is parsed as if it were the higher type. Overriding down is printed by blindly doing this. Overriding up can be thought of as printing in the upper size, but shifting the alphabet to use the upper half, i.e. shifting xyzw to become abcd. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Identify 64-bit atomic opcodesAlyssa Rosenzweig2019-10-202-0/+20
| | | | | | They are symmetric to their 32-bit counterparts, just shifted. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Debug mir_insert_instruction_after_scheduledAlyssa Rosenzweig2019-10-201-2/+6
| | | | | | | | Add some comments explaining what's going on in a more natural flow in order to solve the actual bug. Signed-off-by: Alyssa Rosenzweig <[email protected]> Fixes: 2d914ebe818 ("pan/midgard: Fix memory corruption in register spilling")
* etnaviv: keep track of buffer valid ranges for PIPE_BUFFERChristian Gmeiner2019-10-203-2/+35
| | | | | | | | | | | | | | | | | | This allows a write to proceed to an uninitialized part of a buffer even when the GPU is using the previously-initialized portions. Such a situation can be triggered with the following API usage example:
    glBufferSubData(..., offset, size, data1);
    glDrawArrays(...);
    // append new vertex data
    glBufferSubData(..., offset+size, size, data2);
    glDrawArrays(...);
Same is done for freedreno, nouveau and radeon. Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Lucas Stach <[email protected]>
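The valid-range idea can be sketched with a single half-open interval per buffer. `struct valid_range` and its helpers are hypothetical names for illustration and are not the driver's actual code. One merged interval is a conservative over-approximation of the bytes that have been written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open range [start, end) of bytes with defined contents.
 * The range is empty while start >= end. */
struct valid_range {
    size_t start;
    size_t end;
};

void range_init(struct valid_range *r)
{
    assert(r != NULL);
    r->start = SIZE_MAX;
    r->end = 0;
}

/* True if [start, end) overlaps the valid range. */
bool range_intersects(const struct valid_range *r, size_t start, size_t end)
{
    assert(r != NULL);
    return start < r->end && r->start < end;
}

/* Grows the valid range to cover [start, end). */
void range_add(struct valid_range *r, size_t start, size_t end)
{
    assert(r != NULL && start <= end);
    if (start == end)
        return; /* an empty write leaves the range unchanged */
    if (start < r->start)
        r->start = start;
    if (end > r->end)
        r->end = end;
}

/* A write that touches no previously valid bytes cannot race with GPU reads
 * of older data, so it may skip synchronization. */
bool write_needs_sync(const struct valid_range *r, size_t offset, size_t size)
{
    return range_intersects(r, offset, offset + size);
}
```

In the `glBufferSubData` sequence above, the second upload starts at `offset+size`, outside the range recorded for the first one, so under this sketch it would not need to wait for the first draw.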
* etnaviv: store updated usage in pipe_transfer objectChristian Gmeiner2019-10-201-8/+8
| | | | | | | Store the changed usage in the newly created transfer object. Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Lucas Stach <[email protected]>
* etnaviv: fix code styleChristian Gmeiner2019-10-201-1/+2
| | | | | | Fixes: 1194afdfe35 ("etnaviv: rework the stream flush to always go through the context flush") Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Jonathan Marek <[email protected]>