aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/lima
Commit message (Collapse)AuthorAgeFilesLines
* lima/ppir: Add gl_PointCoord handlingAndreas Baierl2019-07-186-5/+34
| | | | | | | | | Treat gl_PointCoord as a system value and add the necessary bits for correct codegen. Signed-off-by: Andreas Baierl <[email protected]> Reviewed-by: Qiang Yu <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* lima/gp: Fix problem with complex movesConnor Abbott2019-07-183-9/+125
| | | | | | | | | | | | | | | | | | | When writing the scheduler, we forgot that you can't read the complex unit in certain sources because it gets overwritten to 0 or 1. Fixing this turned out to be possible without giving up and reducing GPIR_VALUE_REG_NUM to 10, although it was difficult in a way I didn't expect. There can be at most 4 next-max nodes that can't have moves scheduled in the complex slot, so it actually isn't a problem for getting the number of next-max nodes at 5 or lower. However, it is a problem for stores. If a given node is a next-max node whose move cannot go in the complex slot *and* is used by a store that we decide to schedule, we have to reserve one of the non-complex slots for a move instead of all the slots, or we can wind up in a situation where only the complex slot is free and we fail the move. This means that we have to add another term to the reservation logic, for stores whose children cannot be in the complex slot. Acked-by: Qiang Yu <[email protected]>
* lima/gpir: Rework the schedulerConnor Abbott2019-07-189-560/+1187
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now, we do scheduling at the same time as value register allocation. The ready list now acts similarly to the array of registers in value_regalloc, keeping us from running out of slots. Before this, the value register allocator wasn't aware of the scheduling constraints of the actual machine, which meant that it sometimes chose the wrong false dependencies to insert. Now, we assign value registers at the same time as we actually schedule instructions, making its choices reflect reality much better. It was also conservative in some cases where the new scheme doesn't have to be. For example, in something like: 1 = ld_att 2 = ld_uni 3 = add 1, 2 It's possible that one of 1 and 2 can't be scheduled in the same instruction as 3, meaning that a move needs to be inserted, so the value register allocator needs to assume that this sequence requires two registers. But when actually scheduling, we could discover that 1, 2, and 3 can all be scheduled together, so that they only require one register. The new scheduler speculatively inserts the instruction under consideration, as well as all of its child load instructions, and then counts the number of live value registers after all is said and done. This lets us be more aggressive with scheduling when we're close to the limit. With the new scheduler, the kmscube vertex shader is now scheduled in 40 instructions, versus 66 before. Acked-by: Qiang Yu <[email protected]>
* lima/gp: Mark more add-only nodes as maybe-two-slotConnor Abbott2019-07-181-0/+8
| | | | Reviewed-by: Qiang Yu <[email protected]>
* lima/gpir: Fix some bugs in instruction handlingConnor Abbott2019-07-181-0/+12
| | | | Reviewed-by: Qiang Yu <[email protected]>
* lima: Reintroduce the standalone compilerConnor Abbott2019-07-186-2/+351
| | | | | | I used this to test things without needing to have a device handy. Acked-by: Qiang Yu <[email protected]>
* lima/ppir: Fix assert condition in ppir_codegen_encode_branch.Vinson Lee2019-07-151-1/+1
| | | | | | | Fixes: af0de6b91c0b ("lima/ppir: implement discard and discard_if") Reported-by: Coverity Scan Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]>
* lima/ppir: Fix branch codegenVasily Khoruzhick2019-07-143-3/+5
| | | | | | | | | | | | "unknown_2" field is actually a size of instruction that branch points to. If it's set to a smaller size than actual instruction branch behavior is not defined (and it usually wedges the GPU). Fix it by setting this field correctly. Fixes: af0de6b91c0b ("lima/ppir: implement discard and discard_if") Reviewed-by: Qiang Yu <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima/ppir: Fix assert condition in ppir_codegen_encode_discardVasily Khoruzhick2019-07-141-1/+1
| | | | | | Fixes: af0de6b91c0b ("lima/ppir: implement discard and discard_if") Reviewed-by: Qiang Yu <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima: Fix compiler warnings for unused functions.Andreas Baierl2019-07-132-1/+3
| | | | | Signed-off-by: Andreas Baierl <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* nir: Add lower_rotate flag and set to true in all driversSagar Ghuge2019-07-011-0/+2
| | | | | | Signed-off-by: Sagar Ghuge <[email protected]> Suggested-by: Matt Turner <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nir: remove fnot/fxor/fand/for opcodesJonathan Marek2019-06-262-7/+0
| | | | | | | | | | There doesn't seem to be any reason to keep these opcodes around: * fnot/fxor are not used at all. * fand/for are only used in lower_alu_to_scalar, but easily replaced Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* lima/ppir: Add fsat opAndreas Baierl2019-06-244-0/+20
| | | | | Signed-off-by: Andreas Baierl <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* lima/ppir: Add fneg opAndreas Baierl2019-06-244-0/+19
| | | | | Signed-off-by: Andreas Baierl <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* lima/ppir: Add fabs opAndreas Baierl2019-06-244-0/+20
| | | | | Signed-off-by: Andreas Baierl <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* lima/ppir: lower ffma in ppirAndreas Baierl2019-06-241-0/+1
| | | | | | | Since we cannot handle ffma in ppir, lower it on nir level already. Signed-off-by: Andreas Baierl <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* lima,panfrost: Move lima_tiling.c/h to /src/panfrostAlyssa Rosenzweig2019-06-205-236/+5
| | | | | | | | | | | This will allow both drivers to share this code. Both drivers build-tested with meson. Android build not tested. v2: Change naming from tiling->shared, in case Lima and Panfrost can share more in the future. Fix Android build system. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-and-tested-by: Qiang Yu <[email protected]>
* lima: lower fmod in ppir and gpirErico Nunes2019-06-161-0/+2
| | | | | | | | | | | | Since commit 4f3c82c72c5 fmod is no longer being lowered in nir, and ends up crashing lima programs with "unsupported nir_op: fmod" in both ppir and gpir. There seems to be no mod operation in hardware in utgard and there is an optimization in nir to lower fmod to instructions that lima already implements, so let's use that. Signed-off-by: Erico Nunes <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* lima: fix dynarray usage in lima_submit_add_boErico Nunes2019-06-141-1/+1
| | | | | | | | | | Commit de8a919702a refactored dynarray usage and changed the size of the allocation in lima_submit_add_bo. That causes a segfault in programs running with lima. This commit restores the allocation size back to the previous size. Signed-off-by: Erico Nunes <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]>
* lima/ppir: change offset type to intMateusz Krzak2019-06-132-2/+2
| | | | | | | | | | | | Offset doesn't need to be 64-bit. This fixes compilation error with 64-bit off_t. Fixes: af0de6b9 lima/ppir: implement discard and discard_if Suggested-by: Qiang Yu <[email protected]> Signed-off-by: Mateusz Krzak <[email protected]> Reviewed-by: Qiang Yu <[email protected]> Tested-by: Andreas Baierl <[email protected]>
* u_dynarray: turn util_dynarray_{grow, resize} into element-oriented macrosNicolai Hähnle2019-06-122-5/+5
| | | | | | | | | | | | | | | | | | | | | | | | The main motivation for this change is API ergonomics: most operations on dynarrays are really on elements, not on bytes, so it's weird to have grow and resize as the odd operations out. The secondary motivation is memory safety. Users of the old byte-oriented functions would often multiply a number of elements with the element size, which could overflow, and checking for overflow is tedious. With this change, we only need to implement the overflow checks once. The checks are cheap: since eltsize is a compile-time constant and the functions should be inlined, they only add a single comparison and an unlikely branch. v2: - ensure operations are no-op when allocation fails - in util_dynarray_clone, call resize_bytes with a compile-time constant element size v3: - fix iris, lima, panfrost Reviewed-by: Marek Olšák <[email protected]>
* lima/ppir: add missing handling of min/max ops for vec4 add slotVasily Khoruzhick2019-06-061-0/+6
| | | | | Signed-off-by: Vasily Khoruzhick <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* lima/ppir: fix crash when program uses no registers at allVasily Khoruzhick2019-06-061-0/+4
| | | | | | | | Program may need no regalloc at all, e.g. in case when program consists of single discard op. Signed-off-by: Vasily Khoruzhick <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* nir: copy intrinsic type when lowering load input/uniform and store outputJonathan Marek2019-06-031-0/+1
| | | | | | | | | Fixes: c1275052 "nir: add type information to load uniform/input and store output intrinsics" Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Erico Nunes <[email protected]> Tested-by: Erico Nunes <[email protected]> Tested-by: Andreas Baierl <[email protected]>
* nir: remove bool lowering from lower_int_to_floatJonathan Marek2019-05-311-0/+2
| | | | | | | | | | | | | | Removes the bool_to_float logic from the int_to_float pass, so that both can be used separately. By having separate passes we have better validation and it makes it possible to use with the lower_ftrunc option (int lowering generates ftrunc, but lower_ftrunc generates bools, ftrunc lowering should probably be reworked). For now we always expect lower_bool to come after lower_int. Also fixes f2i32 to become ftrunc and adds u2f/f2u cases. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: add lower_bitshift optionJonathan Marek2019-05-311-0/+1
| | | | | | | | | Add a "lower_bitshift" option, which disables optimizations introducing bitshifts and lowers ishl by constant to a multiply, so that we don't have to deal with bitshifts in int_to_float lowering. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* lima/ppir: implement discard and discard_ifVasily Khoruzhick2019-05-277-10/+253
| | | | | | | | This commit also adds codegen for branch since we need it for discard_if. Reviewed-by: Qiang Yu <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima: fix lima_blit with non-zero level source resourceQiang Yu2019-05-251-25/+12
| | | | | | | | | | | lima_blit will do blit between resources with different levels. When blit from a level!=0 source, it will sample from that level of resource as texture. Current texture setup won't respect level when not mipmap filter. Reviewed-by: Vasily Khoruzhick <[email protected]> Signed-off-by: Qiang Yu <[email protected]>
* lima: fix render to non-zero level textureQiang Yu2019-05-251-4/+6
| | | | | | | Current implementation won't respect level of surface to render. Reviewed-by: Vasily Khoruzhick <[email protected]> Signed-off-by: Qiang Yu <[email protected]>
* nir: Drop imov/fmov in favor of one mov instructionJason Ekstrand2019-05-242-3/+2
| | | | | | | | | | | | | | | | The difference between imov and fmov has been a constant source of confusion in NIR for years. No one really knows why we have two or when to use one vs. the other. The real reason is that they do different things in the presence of source and destination modifiers. However, without modifiers (which many back-ends don't have), they are identical. Now that we've reworked nir_lower_to_source_mods to leave one abs/neg instruction in place rather than replacing them with imov or fmov instructions, we don't need two different instructions at all anymore. Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]> Acked-by: Rob Clark <[email protected]>
* lima/gpir: switch to use nir_lower_viewport_transformQiang Yu2019-05-204-101/+11
| | | | | | Reviewed-by: Vasily Khoruzhick <[email protected]> Reviewed-by: Erico Nunes <[email protected]> Signed-off-by: Qiang Yu <[email protected]>
* lima/gpir: support vector ssa loadQiang Yu2019-05-202-5/+46
| | | | | | | | | Some vector sysval can't be lowered to scaler, so need to break it to scaler in nir to gpir convertion. Reviewed-by: Vasily Khoruzhick <[email protected]> Reviewed-by: Erico Nunes <[email protected]> Signed-off-by: Qiang Yu <[email protected]>
* lima/gpir: add helper function for emit load nodeQiang Yu2019-05-201-20/+19
| | | | | | Reviewed-by: Vasily Khoruzhick <[email protected]> Reviewed-by: Erico Nunes <[email protected]> Signed-off-by: Qiang Yu <[email protected]>
* gallium: Redefine the max texture 2d cap from _LEVELS to _SIZE.Eric Anholt2019-05-131-1/+2
| | | | | | | | The _LEVELS assumes that the max is always power of two. For V3D 4.2, we can support up to 7680 non-power-of-two MSAA textures, which will let X11 support dual 4k displays on newer hardware. Reviewed-by: Marek Olšák <[email protected]>
* lima: add Allwinner H5 supportPatrick Lerda2019-05-131-2/+20
| | | | | | | | The H5 hardware variant requires a specific plb_max_blk number. This value can't be probed at the hardware level. Signed-off-by: Patrick Lerda <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* lima: refactor plb_max_blkPatrick Lerda2019-05-135-11/+34
| | | | | | | | Move plb_max_blk to lima_screen, and add a new debug option: LIMA_PLB_MAX_BLK Signed-off-by: Patrick Lerda <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* nir: allow specifying a set of opcodes in lower_alu_to_scalarJonathan Marek2019-05-101-2/+2
| | | | | | | | | This can be used by both etnaviv and freedreno/a2xx as they are both vec4 architectures with some instructions being scalar-only. Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* lima: fix width 4096 resolution GP failQiang Yu2019-05-101-1/+1
| | | | | | | | When width=4096 and shift_w=0, block_w=0x100 which overflow the PLBU_CMD 8 bits for it. Reviewed-by: Vasily Khoruzhick <[email protected]> Signed-off-by: Qiang Yu <[email protected]>
* lima: fix tile buffer reloadingVasily Khoruzhick2019-05-092-2/+4
| | | | | | | | | | Buffer needs to be reloaded every time unless explicit clear() was called. Fixes rendering issues with wayland compositors. Reviewed-by: Qiang Yu <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima: enable sin and cos lowering for GPVasily Khoruzhick2019-05-071-0/+1
| | | | | | | | GP doesn't support sin/cos natively, so we have to lower them. Reviewed-by: Qiang Yu <[email protected]> Tested-by: Qiang Yu <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
* nir: nir_shader_compiler_options: drop native_integersChristian Gmeiner2019-05-071-2/+0
| | | | | | | | Driver which do not support native integers should use a lowering pass to go from integers to floats. Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* lima/gpir: enable lowering for ftruncVasily Khoruzhick2019-05-071-0/+1
| | | | | Reviewed-by: Qiang Yu <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima/gpir: implement nir_op_fmovVasily Khoruzhick2019-05-071-0/+1
| | | | | Reviewed-by: Qiang Yu <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima: use int_to_float lowering passVasily Khoruzhick2019-05-071-2/+6
| | | | | | | | Neither GP nor PP in Mali4x0 support integers, so utilize new pass and set native_integers to true for now until this flag is dropped. Reviewed-by: Qiang Yu <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima/gpir: fix float uniform alignment issueVasily Khoruzhick2019-05-061-2/+1
| | | | | | | | | If PIPE_CAP_PACKED_UNIFORMS is not set uniforms are vec4 aligned, so lima_nir_lower_uniform_to_scalar should use first channel of vec4 for float uniforms. Reviewed-by: Qiang Yu <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima/ppir: abort compilation in case of unsupported intrinsicErico Nunes2019-05-061-2/+4
| | | | | | | | | | | | | Currently ppir continues compilation when there is an unsupported intrinsic, resulting in a shader that will surely not work as intended. This is a problem during piglit runs as some tests don't compile properly due to this but actually still get submitted to the gpu and leave the system in an unstable state after executing, causing further tests to fail. Signed-off-by: Erico Nunes <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* lima/ir: print names of unsupported intrinsicsErico Nunes2019-05-062-2/+4
| | | | | | | | | While lima still doesn't support some kinds of intrinsics, it is more helpful to display the name of the unsupported instr->intrinsic to make debugging easier. Signed-off-by: Erico Nunes <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* lima/ppir: support nir_op_ftruncErico Nunes2019-05-023-0/+14
| | | | | | | | Support nir_op_ftrunc by turning it into a mov with a round to integer output modifier. Signed-off-by: Erico Nunes <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* lima/gpir: add limit of max 512 instructionsErico Nunes2019-05-021-0/+6
| | | | | | | | | | | It has been noted that the lima GP has a limit of 512 instructions, after which the shaders don't work and fail silently. This commit adds a check to make the shader compilation abort when the shader exceeds this limit, so that we get a clear reason for why the program will not work. Signed-off-by: Erico Nunes <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* lima/ppir: fix pointer referenced after a freePatrick Lerda2019-04-291-1/+2
| | | | | | | | | Issue detected by valgrind. Fixes: 92d7ca4b1cd ("gallium: add lima driver") Signed-off-by: Patrick Lerda <[email protected]> Reviewed-by: Qiang Yu <[email protected]>