path: root/src/intel/compiler/brw_eu_emit.c
Commit message | Author | Age | Files | Lines
* intel/eu: Encode and decode native instruction opcodes from/to IR opcodes. | Francisco Jerez | 2019-10-11 | 1 | -0/+5
  Change brw_inst_set_opcode() and brw_inst_opcode() to call brw_opcode_encode/decode() transparently in order to translate between hardware and IR opcodes, and update the EU compaction code in order to do the same as needed, so we can eventually drop the one-to-one correspondence between hardware and IR opcodes.

  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
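A minimal sketch of the kind of translation this commit describes, assuming a small lookup table. The enum, the table contents, and the sketch_* names are hypothetical and are not the actual brw_opcode_encode()/brw_opcode_decode() implementation; the hardware values are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical IR opcodes; a real IR enum would be much larger. */
enum sketch_ir_opcode {
   SKETCH_IR_MOV,
   SKETCH_IR_ADD,
   SKETCH_IR_ROR,
   SKETCH_IR_INVALID,
};

/* One row per instruction: IR opcode and its (illustrative) hardware value. */
struct sketch_opcode_desc {
   enum sketch_ir_opcode ir;
   unsigned hw;
};

static const struct sketch_opcode_desc sketch_table[] = {
   { SKETCH_IR_MOV, 0x01 },
   { SKETCH_IR_ADD, 0x40 },
   { SKETCH_IR_ROR, 0x0e },
};

#define SKETCH_TABLE_LEN (sizeof(sketch_table) / sizeof(sketch_table[0]))

/* IR -> hardware. Returns -1 when the IR opcode has no hardware encoding. */
static int sketch_opcode_encode(enum sketch_ir_opcode ir)
{
   for (size_t i = 0; i < SKETCH_TABLE_LEN; i++) {
      if (sketch_table[i].ir == ir)
         return (int)sketch_table[i].hw;
   }
   return -1;
}

/* Hardware -> IR. Returns SKETCH_IR_INVALID for unknown encodings. */
static enum sketch_ir_opcode sketch_opcode_decode(unsigned hw)
{
   for (size_t i = 0; i < SKETCH_TABLE_LEN; i++) {
      if (sketch_table[i].hw == hw)
         return sketch_table[i].ir;
   }
   return SKETCH_IR_INVALID;
}

/* Every table entry must survive an encode/decode round trip. */
static void sketch_check_round_trip(void)
{
   for (size_t i = 0; i < SKETCH_TABLE_LEN; i++) {
      const int hw = sketch_opcode_encode(sketch_table[i].ir);
      assert(hw >= 0);
      assert(sketch_opcode_decode((unsigned)hw) == sketch_table[i].ir);
   }
}
```

With accessors routed through a pair of functions like these, the IR enum values no longer need to equal the hardware encodings, which is the decoupling the commit message aims for.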
* intel/eu: Split brw_inst ex_desc accessors for SEND(C) vs. SENDS(C). | Francisco Jerez | 2019-10-11 | 1 | -1/+1
  The brw_inst opcode accessors are going away in one of the following commits. We could potentially replace them with the new helpers that do opcode remapping, but that would lead to a circular dependency between brw_inst.h and brw_eu.h. This way we also avoid ordering issues that can cause the semantics of the ex_desc accessors to change depending on whether the ex_desc field is set after or before the opcode instruction field.

  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs/generator: refactor rounding mode helper in preparation for float controls | Samuel Iglesias Gonsálvez | 2019-09-17 | 1 | -31/+21
  v2:
  - Fix bug in defining BRW_CR0_FP_MODE_MASK.
  v3:
  - Update comment (Caio).
  v4:
  - Split the patch into the helper (this one) and the new opcode (Caio).

  Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/compiler: Handle bits 15:12 in brw_send_indirect_split_message() | Kenneth Graunke | 2019-08-27 | 1 | -2/+12
  Annoyingly, these bits exist in some extended message descriptors (in particular render target writes), but they don't have any corresponding bits in the ISA encoding. So we can't use an immediate and have to fall back to an indirect extended descriptor.

  Thanks to Jason Ekstrand for reminding me that you can still set these bits via an indirect descriptor, even if they don't exist in the ISA.

  Reviewed-by: Jason Ekstrand <[email protected]>
* intel/compiler: Fix src0/desc setter ordering | Kenneth Graunke | 2019-08-27 | 1 | -2/+2
  src0 vstride and type overlap with bits of the extended descriptor. brw_set_desc() also sets the extended descriptor to 0. So by setting the descriptor, then setting src0, we were accidentally setting a bunch of extended descriptor bits unintentionally.

  When using this infrastructure for framebuffer writes (in a future patch), this ended up setting the extended descriptor bit 20, which is "Null Render Target" on Icelake, causing nothing to be written to the framebuffer.

  Reviewed-by: Jason Ekstrand <[email protected]>
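The ordering hazard described in this commit can be illustrated with a toy instruction word. This is only a sketch: the 32-bit layout, the field positions, and the sketch_* names below are assumptions made for the example and do not reflect the real instruction encoding or the real brw_set_desc()/src0 setters.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout of one 32-bit instruction word (not the real encoding):
 *   bits 15:0   message descriptor
 *   bits 31:16  extended descriptor
 *   bits 23:20  src0 vertical stride, overlapping the extended descriptor
 */
#define SKETCH_EX_DESC_MASK      UINT32_C(0xffff0000)
#define SKETCH_SRC0_VSTRIDE_LOW  20
#define SKETCH_SRC0_VSTRIDE_MASK (UINT32_C(0xf) << SKETCH_SRC0_VSTRIDE_LOW)

/* Writes only the src0 vstride bits, leaving the rest of the word alone. */
static void sketch_set_src0_vstride(uint32_t *word, unsigned vstride)
{
   assert(vstride <= 0xf);
   *word = (*word & ~SKETCH_SRC0_VSTRIDE_MASK) |
           ((uint32_t)vstride << SKETCH_SRC0_VSTRIDE_LOW);
}

/* Writes the descriptor and zeroes the whole extended descriptor. */
static void sketch_set_desc(uint32_t *word, uint16_t desc)
{
   *word = desc;
}

/* Problematic order: src0 is written last, so its bits leak into ex_desc. */
static uint32_t sketch_emit_desc_then_src0(uint16_t desc, unsigned vstride)
{
   uint32_t word = 0;
   sketch_set_desc(&word, desc);
   sketch_set_src0_vstride(&word, vstride);
   return word;
}

/* Safe order: the descriptor is written last and clears the overlap. */
static uint32_t sketch_emit_src0_then_desc(uint16_t desc, unsigned vstride)
{
   uint32_t word = 0;
   sketch_set_src0_vstride(&word, vstride);
   sketch_set_desc(&word, desc);
   return word;
}
```

In this toy layout, a vstride of 1 written after the descriptor leaves bit 20 of the extended descriptor set, which mirrors the stray bit the commit message mentions; writing the descriptor last leaves the extended descriptor at zero.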
* intel/fs: Add support for SLM fence in Gen11 | Caio Marcelo de Oliveira Filho | 2019-07-11 | 1 | -4/+9
  Gen11 SLM is not on L3 anymore, so now the hardware has two separate fences. Add a way to control which fence types to use.

  At this time, we don't have enough information in NIR to control the visibility of the memory being fenced, so for now be conservative and assume that fences will need a stall. With more information later we'll be able to reduce those.

  Fixes Vulkan CTS tests in ICL:
    dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.payload_nonlocal.workgroup.guard_local.buffer.comp
    dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.payload_local.buffer.guard_nonlocal.workgroup.comp
    dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.device.payload_local.image.guard_nonlocal.workgroup.comp
    dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.workgroup.payload_local.buffer.guard_nonlocal.workgroup.comp
    dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.workgroup.payload_local.image.guard_nonlocal.workgroup.comp

  The whole set of supported tests in dEQP-VK.memory_model.* group should be passing in ICL now.

  v2: Pass BTI around instead of having an enum. (Jason)
      Emit two SHADER_OPCODE_MEMORY_FENCE instead of one that gets transformed into two. (Jason)
      List tests fixed. (Lionel)

  v3: For clarity, split the decision of which fences to emit from the emission code. (Jason)

  Reviewed-by: Jason Ekstrand <[email protected]>
  Acked-by: Lionel Landwerlin <[email protected]>
* intel/compiler: Enable the emission of ROR/ROL instructions | Sagar Ghuge | 2019-07-01 | 1 | -0/+2
  v2: 1) Drop changes for vec4 backend as on Gen11+ we don't support align16 mode (Matt Turner)

  Signed-off-by: Sagar Ghuge <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/compiler: Fix assertions in brw_alu3 | Sagar Ghuge | 2019-06-03 | 1 | -3/+3
  v2: Fix assertion for src1 (Ian Romanick)

  Fixes: 3b967e17 (intel/compiler: Avoid false positive assertions)
  Signed-off-by: Sagar Ghuge <[email protected]>
  Suggested-by: Matt Turner <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Do a stalling MFENCE in endInvocationInterlock() | Jason Ekstrand | 2019-05-30 | 1 | -2/+6
  Fixes: 939312702e "i965: Add ARB_fragment_shader_interlock support"
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/fs,vec4: Use g0 as the header for MFENCE | Jason Ekstrand | 2019-05-30 | 1 | -4/+5
  We set header_present but then pass it some random garbage. Give it g0 instead. I'm not actually sure this does anything but g0 is the usual header data and this is what the windows driver does so it seems like a good idea.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: force stride of 2 on NULL register for Byte instructions | Iago Toral Quiroga | 2019-04-18 | 1 | -0/+11
  The hardware only allows a stride of 1 on a Byte destination for raw byte MOV instructions. This is required even when the destination is the NULL register.

  Rather than making sure that we emit a proper NULL:B destination every time we need one, just fix it at emission time.

  Reviewed-by: Jason Ekstrand <[email protected]>
* intel/compiler: set correct precision fields for 3-source float instructions | Iago Toral Quiroga | 2019-04-18 | 1 | -0/+16
  Source0 and Destination extract the floating-point precision automatically from the SrcType and DstType instruction fields respectively when they are set to types :F or :HF. For Source1 and Source2 operands, we use the new 1-bit fields Src1Type and Src2Type, where 0 means normal precision and 1 means half-precision. Since we always use the type of the destination for all operands when we emit 3-source instructions, we only need set Src1Type and Src2Type to 1 when we are emitting a half-precision instruction.

  v2:
  - Set the bit separately for each source based on its type so we can do mixed floating-point mode in the future (Topi).

  v3:
  - Use regular citation style for the comment referencing the PRM (Matt).
  - Decided not to add asserts in the emission code to check that only mixed HF/F types are used since such checks would break negative tests for brw_eu_validate.c (Matt)

  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/compiler: allow half-float on 3-source instructions since gen8 | Iago Toral Quiroga | 2019-04-18 | 1 | -1/+2
  Reviewed-by: Topi Pohjolainen <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/compiler: handle extended math restrictions for half-float | Iago Toral Quiroga | 2019-04-18 | 1 | -2/+4
  Extended math with half-float operands is only supported since gen9, but it is limited to SIMD8. In gen8 we lower it to 32-bit.

  v2: quashed together the following patches (Jason):
  - intel/compiler: allow extended math functions with HF operands
  - intel/compiler: lower 16-bit extended math to 32-bit prior to gen9
  - intel/compiler: extended Math is limited to SIMD8 on half-float

  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Topi Pohjolainen <[email protected]> (allow extended math functions with HF operands, extended Math is limited to SIMD8 on half-float)
* intel/vec4: Drop dead code for handling typed surface messages | Jason Ekstrand | 2019-02-28 | 1 | -89/+0
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/eu: Add an EOT parameter to send_indirect_[split]_message | Jason Ekstrand | 2019-02-25 | 1 | -10/+15
  For split indirect sends we have to put the EOT parameter in the extended descriptor as well as the instruction itself, so just calling brw_inst_set_eot is insufficient. Moving the EOT handling into the send_indirect_[split]_message helper lets us handle it properly.
* intel/eu: Add support for the SENDS[C] messages | Jason Ekstrand | 2019-01-29 | 1 | -5/+136
  Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/inst: Indent some code | Jason Ekstrand | 2019-01-29 | 1 | -177/+183
  We're about to add some more if cases so let's have the giant re-indent in its own patch to make review easier.

  Acked-by: Iago Toral Quiroga <[email protected]>
* intel/fs: Use SHADER_OPCODE_SEND for surface messages | Jason Ekstrand | 2019-01-29 | 1 | -72/+0
  Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/eu: Rework surface descriptor helpers | Jason Ekstrand | 2019-01-29 | 1 | -234/+21
  This commit pulls the surface descriptor helpers out into brw_eu.h and makes them no longer depend on the codegen infrastructure. This should allow us to use them directly from the IR code instead of the generator. This change is unfortunately less mechanical than perhaps one would like but it should be fairly straightforward.

  Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/eu: Add has_simd4x2 bools to surface_write functions | Jason Ekstrand | 2019-01-29 | 1 | -6/+8
  Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/fs: Take an explicit exec size in brw_surface_payload_size() | Jason Ekstrand | 2019-01-29 | 1 | -20/+39
  Instead of magically falling back to SIMD8 for atomics and typed messages on Ivy Bridge, explicitly figure out the exec size and pass that into brw_surface_payload_size.

  Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/compiler: Reset default flag register in brw_find_live_channel() | Matt Turner | 2019-01-23 | 1 | -2/+11
  emit_uniformize() emits SHADER_OPCODE_FIND_LIVE_CHANNEL with its flag_subreg set, so that the IR knows which flag is accessed. However the flag is only used on Gen7 in Align1 mode.

  To avoid setting unnecessary bits in the instruction words, get the information we need and reset the default flag register. This allows round-tripping through the assembler/disassembler.

  Reviewed-by: Francisco Jerez <[email protected]>
* intel/eu: Stop overriding exec sizes in send_indirect_message | Jason Ekstrand | 2019-01-18 | 1 | -3/+0
  For a long time, we based exec sizes on destination register widths. We've not been doing that since 1ca3a9442760b6f7 but a few remnants accidentally remained.

  Reviewed-by: Anuj Phogat <[email protected]>
* intel/compiler: Avoid false positive assertions | Matt Turner | 2019-01-09 | 1 | -6/+6
  A follow on patch will move the 'nr' field to the union containing the immediate field, so prepare by checking that we're only testing these assertions if the .file is correct.

  The assertions with != ARF were kind of silly to begin with because the <128 check is specifically only for things in the GRF.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu/gen7: Fix brw_MOV() with DF destination and strided source. | Francisco Jerez | 2019-01-09 | 1 | -7/+4
  I triggered this bug while prototyping code for a future platform on IVB. Could be a problem today though if a strided move is copy-propagated into a type-converting move with DF destination.

  Cc: [email protected]
  Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/compiler: Set swizzle to BRW_SWIZZLE_XXXX for scalar region | Sagar Ghuge | 2018-12-10 | 1 | -1/+18
  When RepCtrl is set, the swizzle field is ignored by the hardware. In order to ensure a 1-to-1 correspondence between the human-readable disassembly and the binary instruction encoding, always set the swizzle to XXXX (all zeros) when it is unused due to RepCtrl.

  Signed-off-by: Sagar Ghuge <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
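A small sketch of the canonicalization this commit describes, assuming a swizzle packed as four 2-bit channel selectors. The sketch_* names and constants are hypothetical stand-ins, not the driver's actual emission code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Four 2-bit channel selectors packed low-to-high (X=0, Y=1, Z=2, W=3). */
#define SKETCH_SWIZZLE4(a, b, c, d) \
   ((uint8_t)(((a) << 0) | ((b) << 2) | ((c) << 4) | ((d) << 6)))

#define SKETCH_SWIZZLE_XXXX SKETCH_SWIZZLE4(0, 0, 0, 0)  /* 0x00 */
#define SKETCH_SWIZZLE_XYZW SKETCH_SWIZZLE4(0, 1, 2, 3)  /* 0xe4 */

/* When replicate control is on, the field is ignored by the hardware, so
 * emit the all-zero value regardless of what the caller asked for. That way
 * two instructions that disassemble identically also encode identically.
 */
static uint8_t sketch_encode_swizzle(bool rep_ctrl, uint8_t requested_swizzle)
{
   return rep_ctrl ? SKETCH_SWIZZLE_XXXX : requested_swizzle;
}
```

The point of the design is round-tripping: if ignored bits were left at arbitrary values, assembling the disassembled text would not reproduce the original binary.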
* intel/compiler: Change src1 reg type to unsigned doubleword | Sagar Ghuge | 2018-10-23 | 1 | -1/+1
  To have uniform behavior while disassembling send(c) instructions, use a register type of unsigned doubleword for src1 when the message descriptor is an immediate value.

  Bspec does not specify anything for the src1 immediate default type.

  Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Signed-off-by: Sagar Ghuge <[email protected]>
* intel/compiler: Implement untyped atomic float min, max, and compare-swap dataport messages | Ian Romanick | 2018-08-22 | 1 | -0/+47
  v2: Split changes to the message type field to another patch. Suggested by Caio.

  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/eu: Assert that the instruction is send-like in brw_set_desc_ex(). | Francisco Jerez | 2018-07-09 | 1 | -2/+3
  Constructing a descriptor in-place as part of the immediate of an ALU instruction is no longer supported.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Get rid of the return value of brw_send_indirect_message(). | Francisco Jerez | 2018-07-09 | 1 | -14/+3
  The return value is not used anymore. This allows simplifying the code slightly, and in addition it should frustrate anybody's attempts to continue using the obsolete piecemeal approach to construct a message descriptor in combination with brw_send_indirect_message().

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Get rid of the return value of brw_send_indirect_surface_message(). | Francisco Jerez | 2018-07-09 | 1 | -10/+6
  All users of brw_send_indirect_surface_message() should be providing a full descriptor immediate up front by now, so this isn't necessary anymore.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Use descriptor constructors for dataport typed surface messages. | Francisco Jerez | 2018-07-09 | 1 | -47/+35
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Use descriptor constructors for dataport scattered byte surface messages. | Francisco Jerez | 2018-07-09 | 1 | -33/+27
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Use descriptor constructors for dataport untyped surface messages. | Francisco Jerez | 2018-07-09 | 1 | -50/+38
  v2: Use SET_BITS macro instead of left shift (Ken).

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Provide single descriptor argument to brw_send_indirect_surface_message(). | Francisco Jerez | 2018-07-09 | 1 | -29/+36
  Instead of the current message_len, response_len and header_present arguments.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Use descriptor constructors for pixel interpolator messages. | Francisco Jerez | 2018-07-09 | 1 | -14/+12
  v2: Use SET_BITS macro instead of left shift (Ken).

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Use descriptor constructors for dataport write messages. | Francisco Jerez | 2018-07-09 | 1 | -74/+26
  v2: Use SET_BITS macro instead of left shift (Ken).

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Use descriptor constructors for dataport read messages. | Francisco Jerez | 2018-07-09 | 1 | -52/+25
  v2: Use SET_BITS macro instead of left shift (Ken).

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Use descriptor constructors for sampler messages. | Francisco Jerez | 2018-07-09 | 1 | -39/+6
  v2: Use SET_BITS macro instead of left shift (Ken).

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Provide desc immediate argument up front to brw_send_indirect_message(). | Francisco Jerez | 2018-07-09 | 1 | -5/+6
  The current approach of returning a setup instruction where additional descriptor fields can be specified is still supported in order to keep things working, but it will be removed later in this series.

  Reviewed-by: Kenneth Graunke <[email protected]>
* TRIVIAL: intel/eu: Use a local devinfo variable in brw_shader_time_add(). | Francisco Jerez | 2018-07-09 | 1 | -5/+6
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Use brw_set_desc() along with a helper to set common descriptor controls. | Francisco Jerez | 2018-07-09 | 1 | -69/+39
  This replaces brw_set_message_descriptor() with the composition of brw_set_desc() and a new inline helper function that packs the common message descriptor controls into an integer. The goal is to represent all message descriptors as a 32-bit integer which is written at once into the instruction, which is more flexible (SENDS anyone?), robust (see d2eecf0b0b24d203d0f171807681dffd830d54de fixing an issue ultimately caused by some bits of the extended message descriptor being left undefined) and future-proof than the current approach of specifying the individual descriptor fields directly into the instruction.

  This approach also seems more self-documenting, since it will allow removing calls to functions with way too many arguments like brw_set_*_message() and brw_send_indirect_message(), and instead provide a single descriptor argument constructed from an appropriate combination of brw_*_desc() helpers.

  Note that because brw_set_message_descriptor() was (conditionally?) overriding fields of the instruction which strictly speaking weren't part of the message descriptor, this involves calling brw_inst_set_sfid() and brw_inst_set_eot() in some cases in addition to brw_set_desc().

  v2: Use SET_BITS macro instead of left shift (Ken).

  Reviewed-by: Kenneth Graunke <[email protected]>
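A rough sketch of what "packing the common message descriptor controls into an integer" can look like. The sketch_* helpers are hypothetical stand-ins for the SET_BITS macro and the brw_*_desc() helpers named above, and the bit positions used here (message length in 28:25, response length in 24:20, header present in bit 19) are an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Place 'value' into bits high:low of a 32-bit word, checking that it fits. */
static inline uint32_t sketch_set_bits(uint32_t value, unsigned high,
                                       unsigned low)
{
   assert(low <= high && high < 32);
   const unsigned width = high - low + 1;
   const uint32_t max =
      (width == 32) ? UINT32_MAX : ((UINT32_C(1) << width) - 1);
   assert(value <= max);
   return value << low;
}

/* Build the common part of a message descriptor as a single integer.
 * Field positions are assumed for this example (see the lead-in).
 */
static inline uint32_t sketch_message_desc(unsigned msg_length,
                                           unsigned response_length,
                                           bool header_present)
{
   return sketch_set_bits(msg_length, 28, 25) |
          sketch_set_bits(response_length, 24, 20) |
          sketch_set_bits(header_present, 19, 19);
}
```

Because the result is a plain integer, message-specific helpers can OR in their own fields and the caller can write the complete descriptor into the instruction in one step, rather than setting individual fields on the instruction one at a time.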
* intel/eu: Define helper to specify the descriptor immediates of a SEND instruction. | Francisco Jerez | 2018-07-09 | 1 | -0/+17
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Fix pixel interpolator queries for SIMD32. | Francisco Jerez | 2018-06-28 | 1 | -1/+2
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/eu: Return new instruction to caller from brw_fb_WRITE(). | Francisco Jerez | 2018-06-28 | 1 | -11/+13
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/eu: Switch to a logical state stack | Jason Ekstrand | 2018-06-04 | 1 | -71/+3
  Instead of the state stack that's based on copying a dummy instruction around, we start using a logical stack of brw_insn_states. This uses a bit less memory and is way less conceptually bogus.

  Reviewed-by: Kenneth Graunke <[email protected]>
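One way to picture a "logical" state stack is a fixed array of small state structs with a pointer to the current top. This is a minimal sketch under that assumption; the struct fields, the stack depth, and the sketch_* names are hypothetical and are not the contents of the real brw_insn_state.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_STACK_DEPTH 4

/* Hypothetical subset of per-instruction default state. */
struct sketch_insn_state {
   unsigned exec_size;
   unsigned flag_subreg;
   bool predicated;
};

struct sketch_codegen {
   struct sketch_insn_state stack[SKETCH_STACK_DEPTH];
   struct sketch_insn_state *current;
};

/* Start with a zeroed stack and the bottom entry as the current state. */
static void sketch_init(struct sketch_codegen *p)
{
   memset(p->stack, 0, sizeof(p->stack));
   p->current = &p->stack[0];
}

/* Push: duplicate the current state so the caller can modify the copy. */
static void sketch_push_state(struct sketch_codegen *p)
{
   assert(p->current != &p->stack[SKETCH_STACK_DEPTH - 1]);
   p->current[1] = p->current[0];
   p->current++;
}

/* Pop: discard the modified copy and return to the saved state. */
static void sketch_pop_state(struct sketch_codegen *p)
{
   assert(p->current != &p->stack[0]);
   p->current--;
}
```

Each stack slot holds only the logical defaults rather than a full encoded instruction, which is where the memory saving mentioned in the commit message would come from in a design like this.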
* intel/eu: Set flag [sub]register number differently for 3src | Jason Ekstrand | 2018-06-04 | 1 | -3/+10
  Prior to gen8, the flag [sub]register number is in a different spot on 3src instructions than on other instructions. Starting with Broadwell, they made it consistent. This commit fixes bugs that occur when a conditional modifier gets propagated into a 3src instruction such as a MAD.

  Cc: [email protected]
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Copy fields manually in brw_next_insn | Jason Ekstrand | 2018-06-04 | 1 | -1/+94
  Instead of doing a memcpy, this moves us to start with a blank instruction (memset to zero) and copy the fields over one at a time.

  Cc: [email protected]
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Add some brw_get_default_ helpers | Jason Ekstrand | 2018-06-04 | 1 | -53/+45
  This is much cleaner than everything that wants a default value poking at the bits of p->current directly.

  Cc: [email protected]
  Reviewed-by: Kenneth Graunke <[email protected]>