Commit message | Author | Age | Files | Lines
* intel/eu/gen12: Add tracking of default SWSB state to the current brw_codegen instruction. | Francisco Jerez | 2019-10-11 | 3 | -0/+18
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/eu/gen12: Add auxiliary type to represent SWSB information during codegen. | Francisco Jerez | 2019-10-11 | 1 | -0/+148
    v2: Introduce extra tgl_swsb_sbid() constructor (Caio).

    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/fs/gen12: Add codegen support for the SYNC instruction. | Francisco Jerez | 2019-10-11 | 4 | -3/+19
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/ir/gen12: Add SYNC hardware instruction. | Francisco Jerez | 2019-10-11 | 3 | -0/+3
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu/gen12: Don't set thread control, it's gone. | Francisco Jerez | 2019-10-11 | 1 | -2/+4
    An effect similar to the one formerly provided by setting thread
    control to "switch" can be achieved now by setting a RegDist of 1 on
    the SWSB field.

    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu/gen12: Don't set DD control, it's gone. | Francisco Jerez | 2019-10-11 | 2 | -6/+12
    A future lowering pass will simulate the same behavior originally
    provided by NoDDChk/NoDDClr at the IR level by using appropriate SWSB
    annotations.

    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu/gen12: Use SEND instruction for split sends. | Francisco Jerez | 2019-10-11 | 2 | -2/+3
    The new SEND instruction behaves like the former SENDS instruction.
    The original single-payload SEND instruction is gone.

    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu/gen12: Codegen SEND descriptor regions correctly. | Francisco Jerez | 2019-10-11 | 2 | -6/+14
    The SEND instruction is now four-source. The descriptor is no longer
    part of source 1, so leave source 1 untouched while initializing the
    descriptor to avoid corrupting it.

    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu/gen12: Codegen pathological SEND source and destination regions. | Francisco Jerez | 2019-10-11 | 1 | -7/+39
    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu/gen12: Codegen control flow instructions correctly. | Francisco Jerez | 2019-10-11 | 1 | -6/+9
    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu/gen12: Codegen three-source instruction source and destination regions. | Francisco Jerez | 2019-10-11 | 2 | -24/+42
    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu/gen12: Fix codegen of immediate source regions. | Francisco Jerez | 2019-10-11 | 1 | -1/+1
    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu/gen12: Add Gen12 opcode descriptions to the table. | Francisco Jerez | 2019-10-11 | 1 | -24/+47
    Quite a lot of churn because, unfortunately, the encoding of most
    hardware opcodes has changed.

    v2: Split dot-product description fixes to separate patch (Caio).

    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu/gen11+: Mark dot product opcodes as unsupported on opcode_descs table. | Francisco Jerez | 2019-10-11 | 1 | -4/+4
    These instructions have been removed from the hardware.

    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/eu/gen12: Implement datatype binary encoding. | Francisco Jerez | 2019-10-11 | 1 | -7/+55
    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu/gen12: Implement immediate 64 bit constant encoding. | Sagar Ghuge | 2019-10-11 | 1 | -2/+13
    On Gen12, 64-bit immediate constants are loaded in reverse order: the
    lower 32 bits are loaded from bits 96-127 and the higher 32 bits from
    bits 64-95 of the instruction encoding.

    Signed-off-by: Sagar Ghuge <[email protected]>
    Co-authored-by: Matt Turner <[email protected]>
    Reviewed-by: Francisco Jerez <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
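The dword swap described in this message can be sketched roughly as follows. This is only an illustration of the stated bit layout, not the actual Mesa code: `Inst128`, `set_imm64_gen12()` and `get_imm64_gen12()` are hypothetical names, and the instruction is assumed to be stored as two 64-bit words with `words[1]` holding bits 64-127.

```cpp
#include <cstdint>

// Hypothetical stand-in for a 128-bit EU instruction (an assumption for
// this sketch): words[0] holds bits 0-63, words[1] holds bits 64-127.
struct Inst128 {
    uint64_t words[2];
};

// Store a 64-bit immediate with the layout described in the commit
// message: lower 32 bits in bits 96-127, higher 32 bits in bits 64-95.
void set_imm64_gen12(Inst128 &inst, uint64_t imm)
{
    const uint64_t lo = imm & 0xffffffffull;
    const uint64_t hi = imm >> 32;
    inst.words[1] = (lo << 32) | hi;
}

// Read the immediate back by undoing the same swap.
uint64_t get_imm64_gen12(const Inst128 &inst)
{
    const uint64_t hi = inst.words[1] & 0xffffffffull; // bits 64-95
    const uint64_t lo = inst.words[1] >> 32;           // bits 96-127
    return (hi << 32) | lo;
}
```

In this sketch the setter and getter are exact inverses, so a value written with `set_imm64_gen12()` reads back unchanged while the upper instruction word holds the two halves in swapped order.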
* intel/eu/gen12: Implement compact instruction binary encoding. | Francisco Jerez | 2019-10-11 | 1 | -39/+49
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
* intel/eu/gen12: Implement indirect region binary encoding. | Francisco Jerez | 2019-10-11 | 1 | -8/+15
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
* intel/eu/gen12: Implement SEND instruction binary encoding. | Francisco Jerez | 2019-10-11 | 1 | -69/+135
    v2: Fix off-by-one upper GET_BITS() bound, combine 25-29 and 30-31
        descriptor fields (Ken). Shorten name of GEN12_MD() macro, drop
        some removed TS message descriptor fields (Jordan).

    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu/gen12: Implement control flow instruction binary encoding. | Francisco Jerez | 2019-10-11 | 1 | -0/+6
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu/gen12: Implement three-source instruction binary encoding. | Francisco Jerez | 2019-10-11 | 1 | -67/+85
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu/gen12: Implement basic instruction binary encoding. | Francisco Jerez | 2019-10-11 | 1 | -47/+51
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu/gen12: Add sanity-check asserts to brw_inst_bits() and brw_inst_set_bits(). | Francisco Jerez | 2019-10-11 | 1 | -0/+2
    These caught a few bugs during the development of this series.

    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu/gen12: Extend brw_inst.h macros for Gen12 support. | Francisco Jerez | 2019-10-11 | 1 | -202/+346
    The encoding of almost every instruction field has changed in Gen12,
    so this involves adding a Gen12+ bitfield spec to every brw_inst
    macro. In addition, some new macros are required to handle certain
    discontiguous and variable-length fields. This commit doesn't
    actually include the updated Gen12 bitfield specs; only the macros
    are extended here, for reviewability.

    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>

    v2: Rename FDC() to FFDC() and FDC1() to FDC() for consistency with
        the existing F() and FF() macros.
* intel/ir: Represent physical edge of unconditional CONTINUE instruction. | Francisco Jerez | 2019-10-11 | 1 | -0/+2
    This edge doesn't exist in the original scalar program, but it
    represents a potential control flow path the EU will take in cases
    where control flow isn't uniform across channels of the same SIMD
    thread.

    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/ir: Represent physical edge of ELSE instruction. | Francisco Jerez | 2019-10-11 | 1 | -0/+1
    This edge doesn't exist in the original scalar program, but it
    represents a potential control flow path the EU will take in cases
    where the condition isn't uniform across channels of the same SIMD
    thread.

    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/ir: Represent logical edge of BREAK instruction. | Francisco Jerez | 2019-10-11 | 1 | -0/+1
    Currently only the physical back-edge is represented, which
    incidentally also leads to the exit block of the loop, but we need
    the direct logical edge in addition for our logical CFG
    representation to be complete.

    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/ir: Add helper function to push block onto CFG analysis stack. | Francisco Jerez | 2019-10-11 | 1 | -4/+13
    Requested-by: Caio Marcelo de Oliveira Filho <[email protected]>
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/ir: Represent physical and logical subsets of the CFG. | Francisco Jerez | 2019-10-11 | 3 | -40/+81
    This represents two control flow graphs in the same cfg_t data
    structure: the physical CFG, which includes all possible control flow
    paths the EU can physically take, and the logical CFG, restricted to
    the control flow paths that exist in the original scalar program. The
    latter is a subset of the former because in case of divergence the
    SIMD vectorized program will take control flow paths that aren't part
    of the original scalar program.

    The bblock_link constructor and bblock_t::add_successor() now take a
    "kind" parameter that specifies whether the edge is purely physical
    or whether it's part of both the logical and physical CFGs (a logical
    edge is of course always guaranteed to be in the physical CFG as
    well). bblock_t::is_predecessor_of() and ::is_successor_of() also
    take a kind parameter specifying which CFG is being queried.

    The '~>' notation will now be used to represent purely physical edges
    in IR dumps.

    This commit doesn't actually add or remove any edges from the CFG
    (the only edges marked as purely physical here are the two WHILE loop
    ones that already existed). Optimization passes should continue using
    the same (incomplete) physical CFG they were using before until
    they're fixed to do something smarter in a later commit, so this
    shouldn't lead to any functional changes.

    v2: Remove tabs from lines changed in this file (Caio).

    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
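The edge-kind idea from this message could look something like the sketch below. It is a simplified model under stated assumptions, not the cfg_t code itself: `EdgeKind`, `Link` and `Block` are hypothetical stand-ins for the real link kind, bblock_link and bblock_t, and only the query semantics described above are modeled.

```cpp
#include <vector>

// Hypothetical edge classification for this sketch.
enum class EdgeKind {
    Logical,  // edge belongs to both the logical and the physical CFG
    Physical, // edge belongs to the physical CFG only
};

struct Block;

// Hypothetical stand-in for a CFG link: target block plus edge kind.
struct Link {
    Block *block;
    EdgeKind kind;
};

// Hypothetical stand-in for a basic block.
struct Block {
    std::vector<Link> parents;
    std::vector<Link> children;

    // Record the edge on both endpoints with the same kind.
    void add_successor(Block *succ, EdgeKind kind)
    {
        children.push_back({succ, kind});
        succ->parents.push_back({this, kind});
    }

    // A query against the physical CFG matches every edge; a query
    // against the logical CFG matches logical edges only.
    static bool matches(EdgeKind edge, EdgeKind query)
    {
        return query == EdgeKind::Physical || edge == EdgeKind::Logical;
    }

    bool is_predecessor_of(const Block *other, EdgeKind query) const
    {
        for (const Link &l : children)
            if (l.block == other && matches(l.kind, query))
                return true;
        return false;
    }

    bool is_successor_of(const Block *other, EdgeKind query) const
    {
        for (const Link &l : parents)
            if (l.block == other && matches(l.kind, query))
                return true;
        return false;
    }
};
```

Storing a single kind per edge, rather than two separate graphs, keeps the subset relationship true by construction in this model: any edge visible to a logical query is also visible to a physical one.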
* intel/ir: Drop hard-coded correspondence between IR and HW opcodes. | Francisco Jerez | 2019-10-11 | 2 | -95/+85
    Having the IR opcodes locked to their hardware representation is
    risky because it causes opcodes as different as BRC and IFF to
    compare equal at the IR level (luckily the back-end only ever uses
    one opcode from each group, right now), and it prevents us from
    supporting instructions that change their hardware representation
    across generations, which will become a problem on Gen12+ platforms.

    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Encode and decode native instruction opcodes from/to IR opcodes. | Francisco Jerez | 2019-10-11 | 7 | -15/+41
    Change brw_inst_set_opcode() and brw_inst_opcode() to call
    brw_opcode_encode/decode() transparently in order to translate
    between hardware and IR opcodes, and update the EU compaction code in
    order to do the same as needed, so we can eventually drop the
    one-to-one correspondence between hardware and IR opcodes.

    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Rework opcode description tables to allow efficient look-up by either HW or IR opcode. | Francisco Jerez | 2019-10-11 | 5 | -304/+166
    This rewrites the current opcode description tables as a more compact
    flat data structure. The purpose is to allow efficient constant-time
    look-up by either HW or IR opcode, which will allow us to drop the
    hard-coded correspondence between HW and IR opcodes -- see the next
    commits for the rationale.

    brw_eu.c is now built as C++ source so we can take advantage of
    pointers to members in order to make the look-up function work
    regardless of the opcode_desc member used as look-up key.

    v2: Optimize devinfo struct comparison (Caio)

    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
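One way a pointer to member can make a single look-up routine serve both keys is sketched below. This is an assumption-laden illustration, not the brw_eu code: `OpcodeDesc`, `descs`, `Index`, `lookup_by_ir()` and `lookup_by_hw()` are hypothetical, and the table values are made up rather than real encodings.

```cpp
#include <cstddef>

// Hypothetical flat opcode description entry.
struct OpcodeDesc {
    unsigned ir;      // IR-level opcode number (made-up values below)
    unsigned hw;      // hardware encoding (made-up values below)
    const char *name; // placeholder mnemonic
};

// Illustrative flat table; none of these numbers are real encodings.
static const OpcodeDesc descs[] = {
    {1, 0x21, "op_a"},
    {2, 0x22, "op_b"},
    {3, 0x10, "op_c"},
};

// Index over the flat table, parameterized on the member used as key.
// Built once, after which each look-up is a single array access.
template <unsigned OpcodeDesc::*Key, std::size_t N>
struct Index {
    const OpcodeDesc *slots[N] = {};

    Index()
    {
        for (const OpcodeDesc &d : descs)
            if (d.*Key < N)
                slots[d.*Key] = &d;
    }

    const OpcodeDesc *lookup(unsigned key) const
    {
        return key < N ? slots[key] : nullptr;
    }
};

// Look-up keyed on the IR opcode member.
const OpcodeDesc *lookup_by_ir(unsigned ir)
{
    static const Index<&OpcodeDesc::ir, 16> table;
    return table.lookup(ir);
}

// Same code path, keyed on the hardware opcode member instead.
const OpcodeDesc *lookup_by_hw(unsigned hw)
{
    static const Index<&OpcodeDesc::hw, 128> table;
    return table.lookup(hw);
}
```

Both functions share the `Index` template and differ only in the member pointer and table size, which is the property the message attributes to building the file as C++. The index sizes 16 and 128 are arbitrary bounds chosen for the sketch.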
* intel/eu: Fix up various type conversions in brw_eu.c that are illegal C++. | Francisco Jerez | 2019-10-11 | 4 | -13/+13
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
* intel/eu: Split brw_inst ex_desc accessors for SEND(C) vs. SENDS(C). | Francisco Jerez | 2019-10-11 | 5 | -32/+46
    The brw_inst opcode accessors are going away in one of the following
    commits. We could potentially replace them with the new helpers that
    do opcode remapping, but that would lead to a circular dependency
    between brw_inst.h and brw_eu.h. This way we also avoid ordering
    issues that can cause the semantics of the ex_desc accessors to
    change depending on whether the ex_desc field is set after or before
    the opcode instruction field.

    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/fs: Fix constness of implied_mrf_writes() argument. | Francisco Jerez | 2019-10-11 | 2 | -2/+2
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/fs: Define is_send() convenience IR helper. | Francisco Jerez | 2019-10-11 | 1 | -1/+7
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/fs: Define is_payload() method of the IR instruction class. | Francisco Jerez | 2019-10-11 | 2 | -0/+39
    This is required because SEND message payload sources are fetched
    asynchronously by the hardware, which can lead to WaR data corruption
    on Gen12+ platforms if not handled specially by the compiler to
    guarantee proper synchronization.

    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* intel/fs: Teach fs_inst::is_send_from_grf() about some missing send-like instructions. | Francisco Jerez | 2019-10-11 | 1 | -0/+3
    Reviewed-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
* nir/dead_cf: Remove dead control flow after infinite loops. | Bas Nieuwenhuizen | 2019-10-11 | 1 | -0/+7
    And after discard-only loops. Otherwise we end up with dead code,
    which confuses nir_repair_ssa into adding a whole bunch of uses of
    undefined. However, for derefs, in some places we always expect to
    get a variable instead of undefined.

    Fixes dEQP-VK.graphicsfuzz.write-red-in-loop-nest on radv.

    Fixes: c832820ce95 "nir/dead_cf: Repair SSA if the pass makes progress"
    Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1928
    Reviewed-by: Connor Abbott <[email protected]>
* aco: don't use p_as_uniform for vgpr sampler/image indices | Rhys Perry | 2019-10-11 | 1 | -1/+3
    p_as_uniform can get CSE'd, which can be incorrect and break some
    dEQP-VK.descriptor_indexing.* tests.

    Signed-off-by: Rhys Perry <[email protected]>
    Reviewed-by: Daniel Schürmann <[email protected]>
* aco: implement divergent vulkan_resource_index | Rhys Perry | 2019-10-11 | 2 | -4/+14
    Fixes the UBO/SSBO dEQP-VK.descriptor_indexing.* tests

    v2: remove bld.copy() usage

    Signed-off-by: Rhys Perry <[email protected]>
    Reviewed-by: Daniel Schürmann <[email protected]>
* aco: readfirstlane vgpr pointers in convert_pointer_to_64_bit() | Rhys Perry | 2019-10-11 | 1 | -0/+2
    This can happen when bcsel is used between the results of two
    vulkan_resource_index. It's also probably needed for non-uniform
    descriptor indexing.

    Fixes dEQP-VK.spirv_assembly.instruction.compute.variable_pointers.compute.reads_opselect_two_buffers

    Signed-off-by: Rhys Perry <[email protected]>
    Reviewed-by: Daniel Schürmann <[email protected]>
* aco: use can_accept_constant in valu_can_accept_literal | Rhys Perry | 2019-10-11 | 1 | -7/+8
    Signed-off-by: Rhys Perry <[email protected]>
    Reviewed-by: Daniel Schürmann <[email protected]>
* aco: don't apply sgprs/constants to read/write lane instructions | Rhys Perry | 2019-10-11 | 1 | -1/+11
    Signed-off-by: Rhys Perry <[email protected]>
    Reviewed-by: Daniel Schürmann <[email protected]>
* nir/lower_input_attachments: pass on non-uniform access flag | Rhys Perry | 2019-10-11 | 1 | -0/+2
    Signed-off-by: Rhys Perry <[email protected]>
    Reviewed-by: Ian Romanick <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* nir/lower_non_uniform: lower image/texture instructions taking derefs | Rhys Perry | 2019-10-11 | 1 | -10/+88
    v2: always assert on the texture/sampler handle's num_components
    v3: replicate the deref inside the loop
    v4: remove a case of useless line wrapping

    Signed-off-by: Rhys Perry <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* etnaviv: rework etna_resource_create tiling choice | Jonathan Marek | 2019-10-11 | 1 | -40/+26
    Now that the base resource is allowed to be incompatible with PE, we
    can make a smarter choice of tiling mode to avoid allocating a
    PE-compatible base that is never used for regular textures.

    This affects GPUs like GC2000 where there is no tiling compatible
    with both PE and TE.

    Signed-off-by: Jonathan Marek <[email protected]>
    Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: rework compatible render base | Jonathan Marek | 2019-10-11 | 7 | -58/+64
    For PE-incompatible layouts, use a mechanism similar to what texture
    does to create a compatible base resource.

    Signed-off-by: Jonathan Marek <[email protected]>
    Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: get addressing mode from tiling layout | Jonathan Marek | 2019-10-11 | 5 | -24/+8
    Remove the "addressing_mode" state, which is currently set
    incorrectly, and instead deduce the addressing mode from the tiling
    layout.

    Signed-off-by: Jonathan Marek <[email protected]>
    Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: clear texture cache and flush ts when texture is modified | Jonathan Marek | 2019-10-11 | 3 | -29/+53
    Signed-off-by: Jonathan Marek <[email protected]>
    Reviewed-by: Christian Gmeiner <[email protected]>