mesa.git

	Commit message	Author	Age	Files	Lines
*	lima: add missing fallthrough comments	Timothy Arceri	2020-07-10	1	-0/+4
	Reviewed-by: Erico Nunes <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5799>
*	lima/gpir: fix crash in schedule_insert_ready_list()	Vasily Khoruzhick	2020-03-16	1	-0/+2
	Fix a crash when the node is already at the position we want. Otherwise we remove it from the list (so list->prev becomes NULL) and then dereference list->prev in list_addtail(). Reviewed-by: Andreas Baierl <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4126> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4126>
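The failure mode described above can be reproduced on any intrusive doubly linked list whose delete helper clears the removed node's pointers. The following is a minimal sketch under that assumption; the `demo_*` names are hypothetical stand-ins and are not Mesa's `util/list.h` or the actual gpir scheduler code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an intrusive doubly linked list. */
struct demo_list {
   struct demo_list *prev, *next;
};

static void demo_list_init(struct demo_list *head)
{
   head->prev = head->next = head;
}

/* Unlink item and clear its pointers so stale uses fail loudly. */
static void demo_list_del(struct demo_list *item)
{
   item->prev->next = item->next;
   item->next->prev = item->prev;
   item->prev = item->next = NULL;
}

/* Insert item immediately before pos. Dereferences pos->prev. */
static void demo_list_addtail(struct demo_list *item, struct demo_list *pos)
{
   item->next = pos;
   item->prev = pos->prev;
   pos->prev->next = item;
   pos->prev = item;
}

/* Move item so that it sits right before pos. */
static void demo_move_before(struct demo_list *item, struct demo_list *pos)
{
   /* If item is pos itself, deleting it would leave pos->prev NULL and
    * demo_list_addtail() would dereference that NULL pointer. */
   if (item == pos)
      return;
   demo_list_del(item);
   demo_list_addtail(item, pos);
}
```

With the early return in place, moving a node "before itself" is a no-op instead of a NULL dereference.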
*	lima/gpir: kill dead writes to regs in DCE	Vasily Khoruzhick	2020-03-16	1	-0/+28
	Writes to regs that are never read will confuse regalloc since they are never live and don't conflict with any regs. Kill them to prevent overwriting another live reg. Reviewed-by: Andreas Baierl <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4125> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4125>
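As a rough illustration of the idea, a pass of this kind can first collect every register that is read anywhere and then flag stores to registers outside that set. This is a simplified sketch on a flat instruction array; the `demo_*` types and the 64-register limit of the bitmask are assumptions for the example, not the gpir data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum demo_op { DEMO_OP_OTHER, DEMO_OP_LOAD_REG, DEMO_OP_STORE_REG };

struct demo_instr {
   enum demo_op op;
   unsigned reg; /* register index below 64; only used by load/store */
   bool dead;    /* set once the instruction has been killed */
};

/* Mark stores to never-read registers as dead; returns how many were killed. */
static unsigned demo_kill_dead_reg_writes(struct demo_instr *instrs, size_t count)
{
   uint64_t read_mask = 0;

   /* Pass 1: record every register that is loaded somewhere. */
   for (size_t i = 0; i < count; i++) {
      if (instrs[i].op == DEMO_OP_LOAD_REG)
         read_mask |= UINT64_C(1) << instrs[i].reg;
   }

   /* Pass 2: a store whose register never appears in read_mask is dead. */
   unsigned killed = 0;
   for (size_t i = 0; i < count; i++) {
      if (instrs[i].op == DEMO_OP_STORE_REG && !instrs[i].dead &&
          !(read_mask & (UINT64_C(1) << instrs[i].reg))) {
         instrs[i].dead = true;
         killed++;
      }
   }
   return killed;
}
```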
*	lima/gpir: Optimize nots created from branch lowering	Connor Abbott	2020-03-16	1	-0/+67
	We also add a DCE pass to clean up the result of this pass, which turns out to also be necessary to clean up the result of nir->gpir in some cases that we didn't hit until the next commit. Reviewed-by: Vasily Khoruzhick <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4125>
*	lima/gpir: Optimize conditional break/continue	Connor Abbott	2020-03-16	3	-0/+125
	Optimize the result of a conditional break/continue. In NIR something like:
	   loop {
	      ...
	      if (cond)
	         continue;
	would get lowered to:
	   block_0:
	      ...
	   block_1:
	      branch_cond !cond block_3
	   block_2:
	      branch_uncond block_0
	   block_3:
	      ...
	We recognize the conditional branch skipping over the unconditional branch, and turn it into:
	   block_0:
	      ...
	   block_1:
	      branch_cond cond block_0
	   block_2:
	   block_3:
	Reviewed-by: Vasily Khoruzhick <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4125>
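The rewrite described above can be sketched on a toy block array. Everything here (`demo_block`, the `negate` flag, the index-based targets) is a hypothetical model chosen for the example; the real pass works on gpir blocks and nodes. The sketch also checks that nothing else jumps to the skipped block, which is an assumption about what makes the rewrite safe rather than something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_branch { DEMO_NONE, DEMO_COND, DEMO_UNCOND };

struct demo_block {
   enum demo_branch branch;   /* kind of terminator, if any */
   size_t target;             /* index of the branch target block */
   bool negate;               /* true if the branch tests !cond */
   unsigned num_other_instrs; /* non-branch instructions in the block */
};

static bool demo_is_branch_target(const struct demo_block *blocks,
                                  size_t count, size_t idx)
{
   for (size_t i = 0; i < count; i++) {
      if (blocks[i].branch != DEMO_NONE && blocks[i].target == idx)
         return true;
   }
   return false;
}

/* Fold "cond branch over an unconditional branch" into one inverted
 * conditional branch. Returns the number of rewrites performed. */
static unsigned demo_opt_cond_branches(struct demo_block *blocks, size_t count)
{
   unsigned progress = 0;

   for (size_t i = 0; i + 2 < count; i++) {
      struct demo_block *cur = &blocks[i];
      struct demo_block *next = &blocks[i + 1];

      /* cur must conditionally skip exactly one block... */
      if (cur->branch != DEMO_COND || cur->target != i + 2)
         continue;
      /* ...which contains nothing but an unconditional branch... */
      if (next->branch != DEMO_UNCOND || next->num_other_instrs != 0)
         continue;
      /* ...and which nobody else jumps into. */
      if (demo_is_branch_target(blocks, count, i + 1))
         continue;

      cur->target = next->target;
      cur->negate = !cur->negate;
      next->branch = DEMO_NONE; /* now an empty fallthrough block */
      progress++;
   }
   return progress;
}
```

Run on the four-block example from the message, block_1 ends up with a non-negated branch to block_0 and block_2 is left empty.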
*	lima/gpir: Make lima_gpir_node_insert_child() useful	Connor Abbott	2020-03-16	1	-1/+1
	We weren't using this function before. The name is confusing, but it changes the child while also fixing up the dependence link, if you don't have access to it already. Or at least, I think that's what the intention is, and what we'll need to change the branch condition in the next commit. Adding a dependency between the new and old source doesn't make any sense for this, and we also need to change the actual source. Reviewed-by: Vasily Khoruzhick <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4125>
*	lima/gpir: print acc ops even if we have only one source	Vasily Khoruzhick	2020-03-11	1	-4/+2
	floor and sign have only one source, so we need to print acc ops even if src1 is unused. Reviewed-by: Andreas Baierl <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4110> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4110>
*	lima/gpir: improve disassembler output	Vasily Khoruzhick	2020-03-11	1	-45/+78
	Print each op on a new line and add a unit name suffix for each op. It improves readability a bit and gives us a hint about which unit was used for a particular op. Reviewed-by: Andreas Baierl <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4110>
*	lima: gpir: enforce instruction limit earlier	Vasily Khoruzhick	2020-03-06	2	-9/+8
	Enforce the instruction limit of 512 instructions earlier. This is a workaround for infinite loops in the gpir compiler and allows us to pinpoint the tests that are affected. Reviewed-by: Andreas Baierl <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4055> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4055>
*	util: Remove tmp argument from BITSET_FOREACH_SET macro	Matt Turner	2020-01-23	1	-8/+5
	Reviewed-by: Jason Ekstrand <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3499> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3499>
*	lima: add support for gl_PointSize	Vasily Khoruzhick	2019-11-05	1	-3/+16
	GP handles gl_PointSize similarly to gl_Position, i.e. it needs a separate buffer and it has a special type in varying descriptors. For indexed draws we also need to emit a special PLBU command to pass the address of the gl_PointSize buffer. Blob also clamps gl_PointSize to 1 .. 100 (as well as line width), so let's do the same. Reviewed-by: Andreas Baierl <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
*	util: rename list_empty() to list_is_empty()	Timothy Arceri	2019-10-28	3	-4/+4
	This makes it clear that it's a boolean test and not an action (eg. "empty the list"). Reviewed-by: Eric Engestrom <[email protected]>
*	lima/gpir: Fix 64-bit shift in scheduler spilling	Connor Abbott	2019-09-24	1	-2/+2
	There are 64 physical registers so the shift must be 64 bits. Reviewed-by: Vasily Khoruzhick <[email protected]>
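The underlying C pitfall is that a plain `1 << reg` is evaluated as an `int` shift, which is not valid for the upper registers when `int` is 32 bits wide. A minimal sketch of a 64-entry live mask, with hypothetical `demo_*` helper names that are not taken from the gpir scheduler:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PHYSREG_COUNT 64

/* Build the mask bit for one register using a 64-bit constant, so the
 * shift is performed in uint64_t rather than int. */
static uint64_t demo_physreg_bit(unsigned reg)
{
   assert(reg < DEMO_PHYSREG_COUNT);
   return UINT64_C(1) << reg;
}

static void demo_mark_live(uint64_t *live_mask, unsigned reg)
{
   *live_mask |= demo_physreg_bit(reg);
}

static bool demo_is_live(uint64_t live_mask, unsigned reg)
{
   return (live_mask & demo_physreg_bit(reg)) != 0;
}
```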
*	lima/gpir: Don't emit movs when translating from NIR	Connor Abbott	2019-09-24	1	-36/+50
	The scheduler doesn't expect them. To do this, I had to refactor the registration part of gpir_node_create_dest() to be separate from creating and inserting the node, since the last two now aren't done when handling moves. This adds more code but creates the possibility of automatically inserting input dependencies when inserting nodes, similar to what's done in NIR with the use-def lists (this isn't done yet). Reviewed-by: Vasily Khoruzhick <[email protected]>
*	lima/gpir: Fix postlog2 fixup handling	Connor Abbott	2019-09-24	1	-11/+12
	We guarantee that a complex1 op is always used by postlog2 directly by rewriting the postlog2 op to be a move when there would be a move inserted between them. But we weren't doing this in all circumstances where there might be a move. Move the logic to place_move() so that it always happens. Fixes a few log tests that happened to start failing due to changes in the register allocator leading to a different scheduling order. Reviewed-by: Vasily Khoruzhick <[email protected]>
*	lima/gpir: Use registers for values live in multiple blocks	Connor Abbott	2019-09-24	7	-156/+648
	This commit adds the framework for cross-basic-block register allocation. Like ARM's compiler, we assume that the value registers aren't usable across branches, which means we have to use physical registers to store any value that crosses a basic block. There are three parts to this:
	1. When translating from NIR, we rely on the NIR out-of-ssa pass to coalesce values into registers. We insert store_reg instructions for values used in more than one basic block, and load_reg instructions for values not defined in the same basic block (or defined after their use, for loops). So by the time we've translated out of NIR we've already split things into values (which are only used in the same basic block) and registers (which are only used in different basic blocks than where they're defined).
	2. We allocate the registers at the same time that we allocate the values, before the final scheduler. Unlike the values, where the assigned color is fake, we assign the actual physical index & component to physregs at this stage. load_reg and store_reg are treated as moves in the allocator and when creating write-after-read dependencies.
	3. Finally, in the main scheduler we have to avoid overwriting existing live physregs when spilling. First, we have to tell the scheduler which physical registers are live at the end of each block, to avoid overwriting those. If a register is only live at the beginning, we can reuse it for spilling after the last original use in the final program happens, i.e. before any original use is scheduled, but we have to be careful to add the proper dependencies so that the spill write is scheduled before the original reads. To handle this we repurpose reg_link for uses to be used by the scheduler.
	A few register-related things copied over from NIR or from other drivers can be dropped.
	Reviewed-by: Vasily Khoruzhick <[email protected]>
*	lima/gpir: Support branch instructions	Connor Abbott	2019-09-24	6	-78/+102
	Because branch conditions have to be in the pass slot, there is no unconditional branch, and realistically the pass slot has to contain a move when branching (there's nothing it does that would be useful for operating on booleans, so we can't use it for anything when computing the branch condition), we put the branch instruction in the pass slot and at codegen time turn it into a move of the branch condition. This means that it doesn't have to be special-cased like store instructions are in the scheduler. Because of this decision we can remove the half-implemented BRANCH codegen slot. Finally, we (ab)use the existing schedule_first mechanism to make sure that branches are always last in the basic block. Reviewed-by: Vasily Khoruzhick <[email protected]>
*	lima/gpir: Only try to place actual children	Connor Abbott	2019-09-24	1	-1/+1
	When picking a node to be scheduled, we try to schedule its children as well. But we shouldn't try to schedule nodes which only have a fake dependency on the original node, since this isn't the point of scheduling children at the same time and can break some expectations of the rest of the code. Reviewed-by: Vasily Khoruzhick <[email protected]>
*	lima/gpir: Fix compiler warning	Connor Abbott	2019-09-24	1	-1/+1
	Reviewed-by: Vasily Khoruzhick <[email protected]>
*	lima/gpir: fix warning in gpir disassembler	Vasily Khoruzhick	2019-09-09	1	-1/+1
	Fixes following warning:
	../src/gallium/drivers/lima/ir/gp/disasm.c: In function ‘print_src’:
	../src/gallium/drivers/lima/ir/gp/disasm.c:241:20: warning: array subscript 28 is above array bounds of ‘char[5]’ [-Warray-bounds]
	241 | "xyzw"[src - gpir_codegen_src_attrib_x]);
	Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Erico Nunes <[email protected]> Reviewed-by: Andreas Baierl <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
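The warning arises because the compiler can see values of `src` for which the offset into the 5-byte string literal `"xyzw"` is out of range. One generic way to make such a lookup safe is to range-check the offset before indexing. The sketch below shows that pattern only; it is not necessarily how the commit resolved the warning, and `demo_component_name` and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Return the component letter for src relative to base, or '?' when the
 * offset does not fall inside "xyzw". */
static char demo_component_name(unsigned src, unsigned base)
{
   static const char names[] = "xyzw";
   const size_t num_components = sizeof(names) - 1; /* exclude the NUL */

   if (src < base || (size_t)(src - base) >= num_components)
      return '?';
   return names[src - base];
}
```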
*	lima/gpir: Disallow moves for schedule_first nodes	Connor Abbott	2019-09-09	1	-1/+5
	The entire point of schedule_first is that the node has to be scheduled as soon as possible without any moves because it doesn't produce a proper floating-point value, or its value changes depending on where you read it. We were still introducing a move for preexp2 in some cases though, even if it got scheduled as soon as possible, which broke some exp() tests. Fix that. Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]>
*	lima/gpir: Fix fake dep handling for schedule_first nodes	Connor Abbott	2019-09-09	2	-10/+30
	The whole point of schedule_first nodes is that they need to be scheduled as soon as possible, so if a schedule_first node is the successor in a fake dependency that prevents it from being scheduled after its parent, that can cause problems. We need to add these fake dependencies to the parent as well, and we need to guarantee that the pre-RA scheduler puts schedule_first nodes right before their parents in order to prevent this from adding cycles to the dependency graph. Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]>
*	lima/gpir: Fix schedule_first insertion logic	Connor Abbott	2019-09-09	1	-2/+3
	The idea was to make sure schedule_first nodes were always first in the ready list. I made sure they were inserted first, but not that other nodes wouldn't later be scheduled ahead of them. Fixes [email protected]@execution@built-in-functions@vs-exp-float and probably others. Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]>
*	lima/gpir: Ignore unscheduled successors in can_use_complex()	Connor Abbott	2019-09-09	1	-1/+2
	The point of the function is to avoid creating a complex move which is used by certain slots in the next instruction, but unscheduled successors will never be in the next instruction. Found while debugging a crash that the previous commit fixed. Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]>
*	lima/gpir: Do all lowerings before rsched	Connor Abbott	2019-09-09	3	-23/+2
	The scheduler assumes that load nodes are always duplicated so that they can always be scheduled eventually and therefore they never need to be spilled. But some lowerings were running after the pre-RA scheduler, whereas duplication has to happen before then since it's needed for the scheduler to do a better job reducing register pressure. This meant that lowerings were introducing multiple uses of a load instruction, which broke the scheduler's expectation and resulted in infinite loops in situations where the only nodes available to spill were load nodes. Spilling load nodes would be silly, so we want to fix the lowerings rather than the scheduler. Just do all lowerings before the pre-RA scheduler, which also helps with reducing pressure since the scheduler can more accurately compute the pressure. Fixes lima/mesa#104. Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]>
*	lima: fix pipe_debug_callback warnings	Erico Nunes	2019-08-06	1	-1/+1
	Signed-off-by: Erico Nunes <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]>
*	lima: add summary report for shader-db	Erico Nunes	2019-08-06	3	-1/+34
	Very basic summary, loops and gpir spills:fills are not updated yet and are only there to comply with the strings to shader-db report.py regex. For now it can be used to analyze the impact of changes in instruction count in both gpir and ppir. The LIMA_DEBUG=shaderdb setting can be useful to output stats on applications other than shader-db. Signed-off-by: Erico Nunes <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
*	tree-wide: replace MAYBE_UNUSED with ASSERTED	Eric Engestrom	2019-07-31	1	-3/+3
	Suggested-by: Jason Ekstrand <[email protected]> Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Matt Turner <[email protected]>
*	lima/gp: Support exp2 and log2	Connor Abbott	2019-07-30	5	-30/+147
	log2 is tricky because there cannot be a move between complex1 and postlog2. We can't guarantee that scheduling complex1 will succeed when we schedule postlog2, so we try to schedule complex1 and if it fails we back out by rewriting the postlog2 as a move and introducing a new postlog2 so that we can try again later. Signed-off-by: Connor Abbott <[email protected]> Acked-by: Qiang Yu <[email protected]>
*	lima/gpir: Always schedule complex2 and *_impl right after complex1	Connor Abbott	2019-07-30	4	-15/+47
	See https://gitlab.freedesktop.org/lima/mesa/issues/94 for the gory details of why this is needed. For *_impl this is easy, since it never increases register pressure and it goes in the complex slot hence it never counts against max nodes. It's a bit more challenging for complex2, since it does count against max nodes, so we need to change the reservation logic to reserve an extra slot for complex2 when scheduling complex1. This second part isn't strictly necessary yet, but it will be for exp2. Signed-off-by: Connor Abbott <[email protected]> Acked-by: Qiang Yu <[email protected]>
*	lima/gpir/sched: Handle more special ops in can_use_complex()	Connor Abbott	2019-07-28	1	-5/+24
	We were missing handling for a few other ops that rearrange their sources somehow in codegen, namely complex2 and select. This should fix [email protected]@execution@built-in-functions@vs-asin-vec3 and possibly other random regressions from the new scheduler which were supposed to be fixed in the commit right after. Fixes: 54434fe6706 ("lima/gpir: Rework the scheduler") Signed-off-by: Connor Abbott <[email protected]> Acked-by: Qiang Yu <[email protected]>
*	lima/gpir/sched: Don't try to spill when something else has succeeded	Connor Abbott	2019-07-28	1	-7/+4
	In try_node(), we assume that the node we pick can still be scheduled successfully after speculatively trying all the other nodes. Normally we always undo every node after speculating it, so that when we finally schedule best_node the scheduler state is exactly the same and it succeeds. However, we also try to spill nodes, which can change the state and in a corner case that can make scheduling best_node fail.
	In particular, the following sequence of events happened with piglit shaders@glsl-vs-if-nested: a partially-ready node N was spilled and a register store node S, which is a use of N, was created and then later the other uses of N were scheduled, so that S is now ready and N is partially ready. First we try to schedule S and succeed, then we try to schedule another node M, which fails, so we try to spill the remaining uses of N. This succeeds, but scheduling M still fails so that best_node is still S. However since one of the uses of N is one cycle ago, and therefore we inserted a read dependent on S one cycle ago when spilling N, S can no longer be scheduled as read-after-write latency is three cycles.
	While we could ad-hoc try to catch cases like this, or (the best option but very complicated) treat the spill as speculative and roll it back if we decide not to schedule the node, a simpler solution is to just give up on spilling if we've already successfully speculatively scheduled another node. We'd give up a few cases where we discover that by spilling even harder we could schedule a more desirable node, but that seems like it would be pretty rare in practice. With this we guarantee that nothing has been touched after best_node was successfully scheduled. We also cut down on pointless spilling, since if we already scheduled a node it's unlikely that spilling harder will let us schedule an even better node, and hence any spilling at this point is probably useless.
	While we're here, clean up the code around spilling by flattening the two if's and getting rid of the second unnecessary check for INT_MIN.
	Fixes: 54434fe6706 ("lima/gpir: Rework the scheduler") Acked-by: Qiang Yu <[email protected]> Signed-off-by: Connor Abbott <[email protected]>
*	lima/gp: Fix problem with complex moves	Connor Abbott	2019-07-18	3	-9/+125
	When writing the scheduler, we forgot that you can't read the complex unit in certain sources because it gets overwritten to 0 or 1. Fixing this turned out to be possible without giving up and reducing GPIR_VALUE_REG_NUM to 10, although it was difficult in a way I didn't expect.
	There can be at most 4 next-max nodes that can't have moves scheduled in the complex slot, so it actually isn't a problem for getting the number of next-max nodes at 5 or lower. However, it is a problem for stores. If a given node is a next-max node whose move cannot go in the complex slot and is used by a store that we decide to schedule, we have to reserve one of the non-complex slots for a move instead of all the slots, or we can wind up in a situation where only the complex slot is free and we fail the move. This means that we have to add another term to the reservation logic, for stores whose children cannot be in the complex slot.
	Acked-by: Qiang Yu <[email protected]>
*	lima/gpir: Rework the scheduler	Connor Abbott	2019-07-18	8	-558/+1186
	Now, we do scheduling at the same time as value register allocation. The ready list now acts similarly to the array of registers in value_regalloc, keeping us from running out of slots.
	Before this, the value register allocator wasn't aware of the scheduling constraints of the actual machine, which meant that it sometimes chose the wrong false dependencies to insert. Now, we assign value registers at the same time as we actually schedule instructions, making its choices reflect reality much better. It was also conservative in some cases where the new scheme doesn't have to be. For example, in something like:
	   1 = ld_att
	   2 = ld_uni
	   3 = add 1, 2
	It's possible that one of 1 and 2 can't be scheduled in the same instruction as 3, meaning that a move needs to be inserted, so the value register allocator needs to assume that this sequence requires two registers. But when actually scheduling, we could discover that 1, 2, and 3 can all be scheduled together, so that they only require one register. The new scheduler speculatively inserts the instruction under consideration, as well as all of its child load instructions, and then counts the number of live value registers after all is said and done. This lets us be more aggressive with scheduling when we're close to the limit.
	With the new scheduler, the kmscube vertex shader is now scheduled in 40 instructions, versus 66 before.
	Acked-by: Qiang Yu <[email protected]>
*	lima/gp: Mark more add-only nodes as maybe-two-slot	Connor Abbott	2019-07-18	1	-0/+8
	Reviewed-by: Qiang Yu <[email protected]>
*	lima/gpir: Fix some bugs in instruction handling	Connor Abbott	2019-07-18	1	-0/+12
	Reviewed-by: Qiang Yu <[email protected]>
*	nir: remove fnot/fxor/fand/for opcodes	Jonathan Marek	2019-06-26	1	-3/+0
	There doesn't seem to be any reason to keep these opcodes around:
	 * fnot/fxor are not used at all.
	 * fand/for are only used in lower_alu_to_scalar, but easily replaced
	Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
*	nir: Drop imov/fmov in favor of one mov instruction	Jason Ekstrand	2019-05-24	1	-1/+1
	The difference between imov and fmov has been a constant source of confusion in NIR for years. No one really knows why we have two or when to use one vs. the other. The real reason is that they do different things in the presence of source and destination modifiers. However, without modifiers (which many back-ends don't have), they are identical. Now that we've reworked nir_lower_to_source_mods to leave one abs/neg instruction in place rather than replacing them with imov or fmov instructions, we don't need two different instructions at all anymore. Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]> Acked-by: Rob Clark <[email protected]>
*	lima/gpir: switch to use nir_lower_viewport_transform	Qiang Yu	2019-05-20	3	-101/+10
	Reviewed-by: Vasily Khoruzhick <[email protected]> Reviewed-by: Erico Nunes <[email protected]> Signed-off-by: Qiang Yu <[email protected]>
*	lima/gpir: support vector ssa load	Qiang Yu	2019-05-20	2	-5/+46
	Some vector sysvals can't be lowered to scalar, so they need to be broken into scalars in the NIR to gpir conversion. Reviewed-by: Vasily Khoruzhick <[email protected]> Reviewed-by: Erico Nunes <[email protected]> Signed-off-by: Qiang Yu <[email protected]>
*	lima/gpir: add helper function for emit load node	Qiang Yu	2019-05-20	1	-20/+19
	Reviewed-by: Vasily Khoruzhick <[email protected]> Reviewed-by: Erico Nunes <[email protected]> Signed-off-by: Qiang Yu <[email protected]>
*	lima/gpir: implement nir_op_fmov	Vasily Khoruzhick	2019-05-07	1	-0/+1
	Reviewed-by: Qiang Yu <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
*	lima/ir: print names of unsupported intrinsics	Erico Nunes	2019-05-06	1	-1/+2
	While lima still doesn't support some kinds of intrinsics, it is more helpful to display the name of the unsupported instr->intrinsic to make debugging easier. Signed-off-by: Erico Nunes <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
*	lima/gpir: add limit of max 512 instructions	Erico Nunes	2019-05-02	1	-0/+6
	It has been noted that the lima GP has a limit of 512 instructions, after which the shaders don't work and fail silently. This commit adds a check to make the shader compilation abort when the shader exceeds this limit, so that we get a clear reason for why the program will not work. Signed-off-by: Erico Nunes <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
*	nir: make nir_const_value scalar	Karol Herbst	2019-04-14	1	-1/+1
	v2: remove & operator in a couple of memsets
	    add some memsets
	v3: fixup lima
	Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> (v2)
*	lima: use nir_src_as_float	Karol Herbst	2019-04-14	1	-4/+1
	Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
*	lima/gpir: fix alu check miss last store slot	Qiang Yu	2019-04-14	1	-2/+2
	Fixes: 92d7ca4b1cd "gallium: add lima driver" Signed-off-by: Qiang Yu <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]>
*	lima/gpir: fix compile fail when two slot node	Qiang Yu	2019-04-14	3	-3/+25
	Comes from the glmark2-es2 jellyfish test. Fixes: 92d7ca4b1cd "gallium: add lima driver" Signed-off-by: Qiang Yu <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]>
*	lima: lower bool to float when building shaders	Icenowy Zheng	2019-04-12	1	-1/+1
	Both processors of Mali Utgard are float-only, so bool is not an acceptable data type for them. Fortunately the NIR compiler infrastructure has a pass to lower bool to float. Call this pass for both GP and PP. With this, Glamor on Xorg server 1.20.3 at least doesn't hang when starting gtk3-demo. The old mapping of the NIR op bcsel is changed to fcsel, and the mapping of b2f32 in PP is dropped because it's no longer needed (it was originally only mapped to ppir_op_mov). Signed-off-by: Icenowy Zheng <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
*	gallium: add lima driver	Qiang Yu	2019-04-11	12	-0/+5010
	v2:
	- use renamed util_dynarray_grow_cap
	- use DEBUG_GET_ONCE_FLAGS_OPTION for debug flags
	- remove DRM_FORMAT_MOD_ARM_AGTB_MODE0 usage
	- compute min/max index in driver
	v3:
	- fix plbu framebuffer state calculation
	- fix color_16pc assemble
	- use nir_lower_all_source_mods for lowering neg/abs/sat
	- use float array for static GPU data
	- add disassemble comment for static shader code
	- use drm_find_modifier
	v4:
	- use lima_nir_lower_uniform_to_scalar
	v5:
	- remove nir_opt_global_to_local when rebase
	Cc: Rob Clark <[email protected]>
	Cc: Alyssa Rosenzweig <[email protected]>
	Acked-by: Eric Anholt <[email protected]>
	Signed-off-by: Andreas Baierl <[email protected]>
	Signed-off-by: Arno Messiaen <[email protected]>
	Signed-off-by: Connor Abbott <[email protected]>
	Signed-off-by: Erico Nunes <[email protected]>
	Signed-off-by: Heiko Stuebner <[email protected]>
	Signed-off-by: Koen Kooi <[email protected]>
	Signed-off-by: Marek Vasut <[email protected]>
	Signed-off-by: marmeladema <[email protected]>
	Signed-off-by: Paweł Chmiel <[email protected]>
	Signed-off-by: Rob Herring <[email protected]>
	Signed-off-by: Rohan Garg <[email protected]>
	Signed-off-by: Vasily Khoruzhick <[email protected]>
	Signed-off-by: Qiang Yu <[email protected]>