aboutsummaryrefslogtreecommitdiffstats
path: root/src/intel/compiler/brw_eu_emit.c
Commit message (Collapse)AuthorAgeFilesLines
* intel/vec4: Drop dead code for handling typed surface messagesJason Ekstrand2019-02-281-89/+0
| | | | Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/eu: Add an EOT parameter to send_indirect_[split]_messageJason Ekstrand2019-02-251-10/+15
| | | | | | | For split indirect sends we have to put the EOT parameter in the extended descriptor as well as the instruction itself so just calling brw_inst_set_eot is insufficient. Moving the EOT handling handling into the send_indirect_[split]_message helper lets us handle it properly.
* intel/eu: Add support for the SENDS[C] messagesJason Ekstrand2019-01-291-5/+136
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/inst: Indent some codeJason Ekstrand2019-01-291-177/+183
| | | | | | | We're about to add some more if cases so let's have the giant re-indent in it's own patch to make review easier. Acked-by: Iago Toral Quiroga <[email protected]>
* intel/fs: Use SHADER_OPCODE_SEND for surface messagesJason Ekstrand2019-01-291-72/+0
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/eu: Rework surface descriptor helpersJason Ekstrand2019-01-291-234/+21
| | | | | | | | | | This commit pulls the surface descriptor helpers out into brw_eu.h and makes them no longer depend on the codegen infrastructure. This should allow us to use them directly from the IR code instead of the generator. This change is unfortunately less mechanical than perhaps one would like but it should be fairly straightforward. Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/eu: Add has_simd4x2 bools to surface_write functionsJason Ekstrand2019-01-291-6/+8
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/fs: Take an explicit exec size in brw_surface_payload_size()Jason Ekstrand2019-01-291-20/+39
| | | | | | | | Instead of magically falling back to SIMD8 for atomics and typed messages on Ivy Bridge, explicitly figure out the exec size and pass that into brw_surface_payload_size. Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/compiler: Reset default flag register in brw_find_live_channel()Matt Turner2019-01-231-2/+11
| | | | | | | | | | | | emit_uniformize() emits SHADER_OPCODE_FIND_LIVE_CHANNEL with its flag_subreg set, so that the IR knows which flag is accessed. However the flag is only used on Gen7 in Align1 mode. To avoid setting unnecessary bits in the instruction words, get the information we need and reset the default flag register. This allows round-tripping through the assembler/disassembler. Reviewed-by: Francisco Jerez <[email protected]>
* intel/eu: Stop overriding exec sizes in send_indirect_messageJason Ekstrand2019-01-181-3/+0
| | | | | | | | For a long time, we based exec sizes on destination register widths. We've not been doing that since 1ca3a9442760b6f7 but a few remnants accidentally remained. Reviewed-by: Anuj Phogat <[email protected]>
* intel/compiler: Avoid false positive assertionsMatt Turner2019-01-091-6/+6
| | | | | | | | | | | A follow on patch will move the 'nr' field to the union containing the immediate field, so prepare by checking that we're only testing these assertions if the .file is correct. The assertions with != ARF were kind of silly to begin with because the <128 check is specifically only for things in the GRF. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu/gen7: Fix brw_MOV() with DF destination and strided source.Francisco Jerez2019-01-091-7/+4
| | | | | | | | | I triggered this bug while prototyping code for a future platform on IVB. Could be a problem today though if a strided move is copy-propagated into a type-converting move with DF destination. Cc: [email protected] Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/compiler: Set swizzle to BRW_SWIZZLE_XXXX for scalar regionSagar Ghuge2018-12-101-1/+18
| | | | | | | | | | When RepCtrl is set, the swizzle field is ignored by the hardware. In order to ensure a 1-to-1 correspondence between the human-readable disassembly and the binary instruction encoding always set the swizzle to XXXX (all zeros) when it is unused due to RepCtrl Signed-off-by: Sagar Ghuge <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel/compiler: Change src1 reg type to unsigned doublewordSagar Ghuge2018-10-231-1/+1
| | | | | | | | | | | To have uniform behavior while disassembling send(c) instruction use register type of unsigned doubleword for src1 when message descriptor is immediate value. Bspec does not specifiy anything for src1 immediate default type. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Sagar Ghuge <[email protected]>
* intel/compiler: Implement untyped atomic float min, max, and compare-swap ↵Ian Romanick2018-08-221-0/+47
| | | | | | | | | | dataport messages v2: Split changes to the message type field to another patch. Suggested by Caio. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/eu: Assert that the instruction is send-like in brw_set_desc_ex().Francisco Jerez2018-07-091-2/+3
| | | | | | | Constructing a descriptor in-place as part of the immediate of an ALU instruction is no longer supported. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Get rid of the return value of brw_send_indirect_message().Francisco Jerez2018-07-091-14/+3
| | | | | | | | | The return value is not used anymore. This allows simplifying the code slightly, and in addition it should frustrate anybody's attempts to continue using the obsolete piecemeal approach to construct a message descriptor in combination with brw_send_indirect_message(). Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Get rid of the return value of brw_send_indirect_surface_message().Francisco Jerez2018-07-091-10/+6
| | | | | | | | All users of brw_send_indirect_surface_message() should be providing a full descriptor immediate up front by now, this isn't necessary anymore. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Use descriptor constructors for dataport typed surface messages.Francisco Jerez2018-07-091-47/+35
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Use descriptor constructors for dataport scattered byte surface ↵Francisco Jerez2018-07-091-33/+27
| | | | | | messages. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Use descriptor constructors for dataport untyped surface messages.Francisco Jerez2018-07-091-50/+38
| | | | | | v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Provide single descriptor argument to ↵Francisco Jerez2018-07-091-29/+36
| | | | | | | | | brw_send_indirect_surface_message(). Instead of the current message_len, response_len and header_present arguments. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Use descriptor constructors for pixel interpolator messages.Francisco Jerez2018-07-091-14/+12
| | | | | | v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Use descriptor constructors for dataport write messages.Francisco Jerez2018-07-091-74/+26
| | | | | | v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Use descriptor constructors for dataport read messages.Francisco Jerez2018-07-091-52/+25
| | | | | | v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Use descriptor constructors for sampler messages.Francisco Jerez2018-07-091-39/+6
| | | | | | v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Provide desc immediate argument up front to ↵Francisco Jerez2018-07-091-5/+6
| | | | | | | | | | brw_send_indirect_message(). The current approach of returning a setup instruction where additional descriptor fields can be specified is still supported in order to keep things working, but it will be removed later in this series. Reviewed-by: Kenneth Graunke <[email protected]>
* TRIVIAL: intel/eu: Use a local devinfo variable in brw_shader_time_add().Francisco Jerez2018-07-091-5/+6
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Use brw_set_desc() along with a helper to set common descriptor ↵Francisco Jerez2018-07-091-69/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | controls. This replaces brw_set_message_descriptor() with the composition of brw_set_desc() and a new inline helper function that packs the common message descriptor controls into an integer. The goal is to represent all message descriptors as a 32-bit integer which is written at once into the instruction, which is more flexible (SENDS anyone?), robust (see d2eecf0b0b24d203d0f171807681dffd830d54de fixing an issue ultimately caused by some bits of the extended message descriptor being left undefined) and future-proof than the current approach of specifying the individual descriptor fields directly into the instruction. This approach also seems more self-documenting, since it will allow removing calls to functions with way too many arguments like brw_set_*_message() and brw_send_indirect_message(), and instead provide a single descriptor argument constructed from an appropriate combination of brw_*_desc() helpers. Note that because brw_set_message_descriptor() was (conditionally?) overriding fields of the instruction which strictly speaking weren't part of the message descriptor, this involves calling brw_inst_set_sfid() and brw_inst_set_eot() in some cases in addition to brw_set_desc(). v2: Use SET_BITS macro instead of left shift (Ken). Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Define helper to specify the descriptor immediates of a SEND ↵Francisco Jerez2018-07-091-0/+17
| | | | | | instruction. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Fix pixel interpolator queries for SIMD32.Francisco Jerez2018-06-281-1/+2
| | | | | Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel/eu: Return new instruction to caller from brw_fb_WRITE().Francisco Jerez2018-06-281-11/+13
| | | | | Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel/eu: Switch to a logical state stackJason Ekstrand2018-06-041-71/+3
| | | | | | | | Instead of the state stack that's based on copying a dummy instruction around, we start using a logical stack of brw_insn_states. This uses a bit less memory and is way less conceptually bogus. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Set flag [sub]register number differently for 3srcJason Ekstrand2018-06-041-3/+10
| | | | | | | | | | | Prior to gen8, the flag [sub]register number is in a different spot on 3src instructions than on other instructions. Starting with Broadwell, they made it consistent. This commit fixes bugs that occur when a conditional modifier gets propagated into a 3src instruction such as a MAD. Cc: [email protected] Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Copy fields manually in brw_next_insnJason Ekstrand2018-06-041-1/+94
| | | | | | | | Instead of doing a memcpy, this moves us to start with a blank instruction (memset to zero) and copy the fields over one at a time. Cc: [email protected] Reviewed-by: Kenneth Graunke <[email protected]>
* intel/eu: Add some brw_get_default_ helpersJason Ekstrand2018-06-041-53/+45
| | | | | | | | This is much cleaner than everything that wants a default value poking at the bits of p->current directly. Cc: [email protected] Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add ARB_fragment_shader_interlock support.Plamena Manolova2018-06-011-3/+4
| | | | | | | | | Adds suppport for ARB_fragment_shader_interlock. We achieve the interlock and fragment ordering by issuing a memory fence via sendc. Signed-off-by: Plamena Manolova <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* intel/eu: Set EXECUTE_1 when setting the rounding mode in cr0Jason Ekstrand2018-05-221-0/+2
| | | | | Fixes: d6cd14f2131a5b "i965/fs: Define new shader opcode to..." Reviewed-by: Jose Maria Casanova Crespo <[email protected]>
* intel/compiler/icl: Clear "null render target" bit in extended message ↵Jason Ekstrand2018-03-221-0/+3
| | | | | | | | | descriptor Otherwise all our render target writes go no where. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Add infrastructure for generating CSEL instructions.Kenneth Graunke2018-03-081-0/+1
| | | | | | | | | | | | | | | | v2 (idr): Don't allow CSEL with a non-float src2. v3 (idr): Add CSEL to fs_inst::flags_written. Suggested by Matt. v4 (idr): Only set BRW_ALIGN_16 on Gen < 10 (suggested by Matt). Don't reset the access mode afterwards (suggested by Samuel and Matt). Add support for CSEL not modifying the flags to more places (requested by Matt). Signed-off-by: Kenneth Graunke <[email protected]> Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> [v3] Reviewed-by: Matt Turner <[email protected]>
* intel/compiler: Memory fence commit must always be enabled for gen10+Anuj Phogat2018-03-021-1/+3
| | | | | | | | | | | | Commit bit in the message descriptor (Bit 13) must be always set to true in CNL+ for memory fence messages. It also fixes a piglit GPU hang on cnl+ in simulation environment. Piglit test: arb_shader_image_load_store-shader-mem-barrier See HSD ES # 1404612949 Signed-off-by: Anuj Phogat <[email protected]> Cc: [email protected] Reviewed-by: Francisco Jerez <[email protected]>
* intel/eu: Plumb header present bit to codegen helpers for HDC messages.Francisco Jerez2018-03-021-12/+18
| | | | | | | | | | This makes sure that the header-present bit of the message descriptor is in sync with the IR instruction fields, which gives the optimizer more control to avoid the overhead of setting up a message header when it's possible to do so. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/ir: Allow arbitrary scratch flag registers for ↵Francisco Jerez2018-03-021-2/+3
| | | | | | | | | | | | | | SHADER_OPCODE_FIND_LIVE_CHANNEL. This shouldn't cause any functional change at this point, it changes SHADER_OPCODE_FIND_LIVE_CHANNEL to use the flag register specified at the IR level instead of the hard-coded f1.0, now that it can be represented in backend_instruction::flag_subreg. This will be necessary for scheduling to behave correctly once more things start making use of f1.0. Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler/fs: Implement FS_OPCODE_LINTERP with MADs on Gen11+Matt Turner2018-02-281-1/+1
| | | | | | | | | | | | | | | | | | | The PLN instruction is no more. Its functionality is now implemented using two MAD instructions with the new native-float type. Instead of pln(16) r20.0<1>:F r10.4<0;1,0>:F r4.0<8;8,1>:F we now have mad(8) acc0<1>:NF r10.7<0;1,0>:F r4.0<8;8,1>:F r10.4<0;1,0>:F mad(8) r20.0<1>:F acc0<8;8,1>:NF r5.0<8;8,1>:F r10.5<0;1,0>:F mad(8) acc0<1>:NF r10.7<0;1,0>:F r6.0<8;8,1>:F r10.4<0;1,0>:F mad(8) r21.0<1>:F acc0<8;8,1>:NF r7.0<8;8,1>:F r10.5<0;1,0>:F ... and in the case of SIMD8 only the first pair of MAD instructions is used. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler: Add Gen11+ native float typeMatt Turner2018-02-281-2/+8
| | | | | | | | | | | | This new type exposes the additional precision offered by the accumulator register and will be used in the next patch to implement the functionality of the PLN instruction using a pair of MAD instructions. One weird thing to note: align1 ternary instructions may only have an accumulator in the dst or src1 normally, but when src0's type is :NF the accumulator is read. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Add/use functions to convert to 3src_align1 vstride/hstrideMatt Turner2018-01-111-28/+41
| | | | | | | | | | Some cases weren't handled, such as stride 4 which is needed for 64-bit operations. Presumably fixes the assertion failure mentioned in commit 2d0457203871 (Revert "i965/fs: Use align1 mode on ternary instructions on Gen10+") but who can really say since the commit neglected to list any of them! Reviewed-by: Scott D Phillips <[email protected]>
* i965/fs: Add byte scattered read message and fs supportJose Maria Casanova Crespo2017-12-061-0/+32
| | | | | | | | | | | | | | | | | | | | v2: Fix alignment style (Topi Pohjolainen) (Jason Ekstrand) - Enable bit_size parameter to scattered messages to enable different bitsizes byte/word/dword. - Remove use of brw_send_indirect_scattered_message in favor of brw_send_indirect_surface_message. - Move scattered messages to surface messages namespace. - Assert align1 for scattered messages and assume Gen8+. - Inline brw_set_dp_byte_scattered_read. v3: (Jason Ekstrand) - Use renamed brw_byte_scattered_data_element_from_bit_size method - Assert scattered read for Gen8+ and Haswell. - Use conditional expresion at components_read. - Include comment about params for scattered opcodes. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Add byte scattered write message and fs supportJose Maria Casanova Crespo2017-12-061-0/+44
| | | | | | | | | | | | | | | | | | | v2: (Jason Ekstrand) - Enable bit_size parameter to scattered messages to enable different bitsizes byte/word/dword. - Remove use of brw_send_indirect_scattered_message in favor of brw_send_indirect_surface_message. - Move scattered messages to surface messages namespace. - Assert align1 for scattered messages and assume Gen8+. - Inline brw_set_dp_byte_scattered_write. v3: - Remove leftover newline (Topi Pohjolainen) - Rename brw_data_size to brw_scattered_data_element and use defines instead of an enum (Jason Ekstrand) - Assert scattered write for Gen8+ and Haswell (Jason Ekstrand) Signed-off-by: Jose Maria Casanova Crespo <[email protected]> Signed-off-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Define new shader opcode to set rounding modesAlejandro Piñeiro2017-12-061-0/+33
| | | | | | | | | | | | | | | | | | | | | | | Although it is possible to emit them directly as AND/OR on brw_fs_nir, having a specific opcode makes it easier to remove duplicate settings later. v2: (Curro) - Set thread control to 'switch' when using the control register - Use a single SHADER_OPCODE_RND_MODE opcode taking an immediate with the rounding mode. - Avoid magic numbers setting rounding mode field at control register. v3: (Curro) - Remove redundant and add missing whitespace lines. - Match printing instruction to IR opcode "rnd_mode" v4: (Topi Pohjolainen) - Fix code style. Signed-off-by: Alejandro Piñeiro <[email protected]> Signed-off-by: Jose Maria Casanova Crespo <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/eu: Explicitly set EXECUTE_1 where neededJason Ekstrand2017-11-071-0/+9
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>