path: root/src/mesa/drivers/dri/i965/brw_eu.h
Commit message | Author | Age | Files | Lines
* i965: Introduce the FIND_LIVE_CHANNEL pseudo-opcode. | Francisco Jerez | 2015-05-04 | 1 | -0/+4
    This instruction calculates the index of an arbitrary channel enabled
    in the current execution mask. It's expected to be used as input for
    the BROADCAST opcode, but it's implemented as a separate instruction
    rather than being baked into BROADCAST because FIND_LIVE_CHANNEL has
    no dependencies so it can always be CSE'ed with other instances of
    the same instruction within a basic block.

    v2: Whitespace fixes.

    Reviewed-by: Matt Turner <[email protected]>
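As a rough illustration of what FIND_LIVE_CHANNEL computes, here is a software model in C. It is only a sketch: the name `model_find_live_channel` and the 32-bit mask representation are assumptions made for this example, not part of the driver, and picking the lowest enabled channel is just one valid choice of "an arbitrary channel".

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software model: return the index of one channel whose bit
 * is set in the execution mask. This model picks the lowest set bit; the
 * commit message only promises "an arbitrary" enabled channel. */
static inline unsigned
model_find_live_channel(uint32_t exec_mask)
{
   unsigned i = 0;

   assert(exec_mask != 0);   /* at least one channel must be enabled */
   while (!(exec_mask & (1u << i)))
      i++;
   return i;
}
```

Because the result depends only on the execution mask and not on any source operand, two calls within the same basic block give the same answer, which is why such an instruction can be CSE'ed.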
* i965: Introduce the BROADCAST pseudo-opcode. | Francisco Jerez | 2015-05-04 | 1 | -0/+6
    The BROADCAST instruction picks the channel from its first source
    given by an index passed in as second source. This will be used in
    situations where all channels from the same SIMD thread have to agree
    on the value of something, e.g. a surface binding table index. This
    is in particular the case for UBO, sampler and image arrays, which
    can be indexed dynamically with the restriction that all active SIMD
    channels access the same index, provided to the shared unit as part
    of a single scalar field of the message descriptor.

    Simply taking the index value from the first channel as we were doing
    until now is incorrect, because it might contain an uninitialized
    value if the channel had previously been disabled by non-uniform
    control flow.

    v2: Minor style fixes. Improve commit message.

    Reviewed-by: Matt Turner <[email protected]>
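The problem described here can be modeled in plain C. This is a sketch under stated assumptions: `MODEL_SIMD_WIDTH`, `model_broadcast` and `model_uniform_surface_index` are hypothetical names, and a SIMD8 thread is represented as an 8-element array plus an 8-bit execution mask.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_SIMD_WIDTH 8   /* hypothetical: one SIMD8 thread */

/* Hypothetical model of BROADCAST: return the value that channel `index`
 * holds in `src`, so that every channel can use the same scalar. */
static inline uint32_t
model_broadcast(const uint32_t src[MODEL_SIMD_WIDTH], unsigned index)
{
   assert(index < MODEL_SIMD_WIDTH);
   return src[index];
}

/* Pick the value from an enabled channel instead of blindly reading
 * channel 0, which may hold garbage if it was disabled earlier. */
static inline uint32_t
model_uniform_surface_index(const uint32_t src[MODEL_SIMD_WIDTH],
                            uint8_t exec_mask)
{
   unsigned live = 0;

   assert(exec_mask != 0);
   while (!(exec_mask & (1u << live)))
      live++;
   return model_broadcast(src, live);
}
```

With channels 0 and 1 disabled and holding stale data, reading channel 0 directly returns the stale value, while the helper returns the index that the active channels agree on.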
* i965: Add memory fence opcode. | Francisco Jerez | 2015-05-04 | 1 | -0/+4
    Acked-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Add typed surface access opcodes. | Francisco Jerez | 2015-05-04 | 1 | -0/+24
    Acked-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Add untyped surface write opcode. | Francisco Jerez | 2015-05-04 | 1 | -0/+7
    Reviewed-by: Topi Pohjolainen <[email protected]>
    Acked-by: Kenneth Graunke <[email protected]>
* i965: Fix the untyped surface opcodes to deal with indirect surface access. | Francisco Jerez | 2015-05-04 | 1 | -5/+5
    Change brw_untyped_atomic() and brw_untyped_surface_read() to take
    the surface index as a register instead of a constant and to use
    brw_send_indirect_message() to emit the indirect variant of send with
    a dynamically calculated message descriptor. This will be required to
    support variable indexing of image arrays for
    ARB_shader_image_load_store.

    Acked-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Rename brw_compile to brw_codegen | Jason Ekstrand | 2015-04-22 | 1 | -64/+64
    This name better matches what it's actually used for. The patch was
    generated with the following command:

    for file in *; do
        sed -i -e s/brw_compile/brw_codegen/g $file
    done

    Signed-off-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Remove the context field from brw_compiler | Jason Ekstrand | 2015-04-22 | 1 | -3/+2
    Reviewed-by: Matt Turner <[email protected]>
* i965: Make the disassembler take a device_info instead of a context | Jason Ekstrand | 2015-04-22 | 1 | -2/+2
    Reviewed-by: Matt Turner <[email protected]>
* i965: Make instruction compaction take a device_info instead of a context | Jason Ekstrand | 2015-04-22 | 1 | -6/+6
    Reviewed-by: Matt Turner <[email protected]>
* i965: Make the brw_inst helpers take a device_info instead of a context | Jason Ekstrand | 2015-04-22 | 1 | -5/+5
    Reviewed-by: Matt Turner <[email protected]>
* i965/eu: Add a devinfo parameter to brw_compile | Jason Ekstrand | 2015-04-22 | 1 | -0/+1
    Reviewed-by: Matt Turner <[email protected]>
* i965: Replace guess_execution_size with something simpler. | Matt Turner | 2015-04-21 | 1 | -0/+1
    guess_execution_size() does two things:

    1. Cope with small destination registers.
    2. Cope with SIMD8 vs SIMD16 mode.

    This patch replaces the first with a simple if block in brw_set_dest:
    if the destination register width is less than 8, you probably want
    the execution size to match. (I didn't put this in the 3src block
    because it doesn't seem to matter.)

    Since only the FS compiler cares about SIMD16 mode, it's easy to just
    set the default execution size there.

    This pattern has already been proven in the Gen8+ generator, but we
    didn't port it back to the existing generator when we combined the
    two.

    This is based on a patch from Ken from about a year ago. I've rebased
    it and fixed a few bugs.

    Reviewed-by: Jason Ekstrand <[email protected]>
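The "simple if block" can be pictured with a small stand-alone model. This is a sketch only: the `model_*` types and the enum encoding are hypothetical and chosen so that widths compare in increasing order; the real brw_set_dest operates on the driver's own instruction and register types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's width/execution-size encoding,
 * ordered so that narrower widths compare as smaller. */
enum model_width {
   MODEL_WIDTH_1,
   MODEL_WIDTH_2,
   MODEL_WIDTH_4,
   MODEL_WIDTH_8,
   MODEL_WIDTH_16
};

struct model_inst {
   enum model_width exec_size;
};

/* Start from the default execution size (set by the caller, e.g. SIMD16
 * in the FS compiler), then narrow it when the destination is narrower
 * than 8 channels. */
static inline void
model_set_dest(struct model_inst *inst,
               enum model_width default_exec_size,
               enum model_width dest_width)
{
   assert(inst != NULL);
   inst->exec_size = default_exec_size;
   if (dest_width < MODEL_WIDTH_8)
      inst->exec_size = dest_width;
}
```

The appeal of this design is that the destination-driven narrowing is a local rule, while the SIMD8 vs SIMD16 choice becomes explicit default state owned by the one compiler that cares about it.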
* i965: Pass number of components explicitly to brw_untyped_atomic and _surface_read. | Francisco Jerez | 2015-03-20 | 1 | -2/+2
    And calculate the message response size based on the number of
    components rather than the other way around. This simplifies their
    interface somewhat and allows the caller to request a writeback
    message with more than one vector component in SIMD4x2 mode.

    Reviewed-by: Topi Pohjolainen <[email protected]>
    Acked-by: Kenneth Graunke <[email protected]>
* i965: Factor out logic to build a send message instruction with indirect descriptor. | Francisco Jerez | 2015-03-20 | 1 | -5/+16
    This is going to be useful because the Gen7+ uniform and varying pull
    constant, texturing, typed and untyped surface read, write, and
    atomic generation code on the vec4 and fs back-end all require the
    same logic to handle conditionally indirect surface indices. In
    pseudocode:

     | if (surface.file == BRW_IMMEDIATE_VALUE) {
     |    inst = brw_SEND(p, dst, payload);
     |    set_descriptor_control_bits(inst, surface, ...);
     | } else {
     |    inst = brw_OR(p, addr, surface, 0);
     |    set_descriptor_control_bits(inst, ...);
     |    inst = brw_SEND(p, dst, payload);
     |    set_indirect_send_descriptor(inst, addr);
     | }

    This patch abstracts out this frequently recurring pattern so we can
    now write:

     | inst = brw_send_indirect_message(p, sfid, dst, payload, surface)
     | set_descriptor_control_bits(inst, ...);

    without worrying about handling the immediate and indirect surface
    index cases explicitly.

    v2: Rebase. Improve documentation and commit message. (Topi)
        Preserve UW destination type cargo-cult. (Topi, Ken, Matt)

    Reviewed-by: Topi Pohjolainen <[email protected]>
    Acked-by: Kenneth Graunke <[email protected]>
* i965/fs: Implement SIMD16 dual source blending. | Iago Toral Quiroga | 2015-03-09 | 1 | -0/+1
    From the SNB PRM, volume 4, part 1, page 193:

    "The dual source render target messages only have SIMD8 forms due to
     maximum message length limitations. SIMD16 pixel shaders must send
     two of these messages to cover all of the pixels. Each message
     contains two colors (4 channels each) for each pixel in the message
     payload."

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82831
    Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Introduce brw_negate_cmod(). | Kenneth Graunke | 2015-02-27 | 1 | -0/+1
    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/emit: Do the sampler index adjustment directly in header.0.3 | Jason Ekstrand | 2015-01-22 | 1 | -2/+1
    Prior to this commit, the adjust_sampler_state_pointer function took
    an extra register that it could use as scratch space. The usual
    candidate was the destination of the sampler instruction. However, if
    that register ever aliased anything important such as the sampler
    index, this would scratch all over important data. Fortunately, the
    calculation is such that we can just do it in place and we don't need
    the scratch space at all.

    Reviewed-by: Chris Forbes <[email protected]>
* i965/fs: Add an optional source to the FS_OPCODE_FB_WRITE instruction | Jason Ekstrand | 2014-09-30 | 1 | -2/+2
    Previously, we were using the base_mrf parameter of fs_inst to store
    the MRF location. In preparation for doing FB writes from the GRF, we
    now also allow you to set inst->base_mrf to -1 and provide a source
    register.

    Signed-off-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Use the GRF for UNTYPED_ATOMIC instructions | Jason Ekstrand | 2014-09-30 | 1 | -1/+1
    Signed-off-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Use BRW_MATH_DATA_SCALAR when source regioning is scalar. | Matt Turner | 2014-09-29 | 1 | -1/+0
    Notice the mistaken (but harmless) argument swapping in
    brw_math_invert().

    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Extract helper function for surface state pointer adjustment | Chris Forbes | 2014-08-15 | 1 | -0/+5
    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add low-level support for indirect sends | Chris Forbes | 2014-08-15 | 1 | -0/+5
    This provides a reasonable place to enforce the hardware restriction
    that indirect descriptors must be in a0.0.

    Signed-off-by: Chris Forbes <[email protected]>
* i965/eu: Update jump distance scaling for Broadwell. | Kenneth Graunke | 2014-08-10 | 1 | -0/+4
    Broadwell measures jump distances in bytes, so we need to scale by 16.

    v2: Update the function in brw_eu.h, not in brw_eu_emit.c.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/eu: Refactor jump distance scaling to use a helper function. | Kenneth Graunke | 2014-08-10 | 1 | -0/+20
    Different generations of hardware measure jump distances in different
    units. Previously, every function that needed to set a jump target
    open coded this scaling, or made a hardcoded assumption (i.e. just
    used 2).

    Most functions start with the number of instructions to jump, and
    scale up to the hardware-specific value. So, I made the function
    match that.

    Others start with a byte offset, and divide by a constant (8) to
    obtain the jump distance. This is actually 16 / 2 (the jump scale for
    Gen5-7).

    v2: Make the helper a static inline defined in brw_eu.h, instead of
        an actual function in brw_eu_emit.c (as suggested by Matt).

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Chris Forbes <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
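A helper of the kind described could look roughly like the sketch below. The names `model_jump_scale`, `model_jump_distance` and `model_jump_distance_from_bytes` are hypothetical, and the generation is passed as a plain integer for simplicity. The scale of 2 for Gen5-7 and 16 for Broadwell come from the commit messages above; the value of 1 for Gen4 is an assumption of this sketch.

```c
#include <assert.h>

/* Hypothetical helper: how many hardware jump units correspond to one
 * full-size (16-byte) instruction on a given generation. */
static inline int
model_jump_scale(int gen)
{
   assert(gen >= 4);
   if (gen >= 8)
      return 16;   /* Broadwell: distances are measured in bytes */
   if (gen >= 5)
      return 2;    /* Gen5-7: the scale named in the commit message */
   return 1;       /* Gen4: assumed to count whole instructions */
}

/* Callers that start from a number of instructions scale up. */
static inline int
model_jump_distance(int gen, int num_insns)
{
   return num_insns * model_jump_scale(gen);
}

/* Callers that start from a byte offset first convert to instructions;
 * on Gen5-7 this equals the old "divide by 8" (16 / 2). */
static inline int
model_jump_distance_from_bytes(int gen, int byte_offset)
{
   return byte_offset / 16 * model_jump_scale(gen);
}
```

For example, a byte offset of 64 on Gen6 gives 64 / 16 * 2 = 8, the same result as the previously hardcoded 64 / 8.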
* i965/eu: Merge brw_CONT and gen6_CONT. | Kenneth Graunke | 2014-08-08 | 1 | -1/+0
    The only difference is setting PopCount on Gen4-5.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: add low-level support for send to pixel interpolator | Chris Forbes | 2014-07-13 | 1 | -0/+10
    Signed-off-by: Chris Forbes <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Rename intel_asm_printer -> intel_asm_annotation. | Matt Turner | 2014-07-05 | 1 | -1/+1
    The #ifndef include guards already said the right thing :)

    Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Make a brw_conditional_mod enum. | Matt Turner | 2014-07-05 | 1 | -2/+2
    Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Pass brw to brw_try_compact_instruction(). | Matt Turner | 2014-06-26 | 1 | -1/+1
    Signed-off-by: Matt Turner <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Replace struct brw_compact_instruction with brw_compact_inst. | Matt Turner | 2014-06-26 | 1 | -3/+2
    Signed-off-by: Matt Turner <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Replace 'struct brw_instruction' with 'brw_inst'. | Matt Turner | 2014-06-26 | 1 | -47/+38
    Use this as an opportunity to clean up the formatting of some old
    code (brw_ADD, for instance).

    Signed-off-by: Matt Turner <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Convert brw_eu.[ch] to use the new brw_inst API. | Kenneth Graunke | 2014-06-26 | 1 | -1/+2
    v2: Don't set flag_reg_nr prior to Gen7 (as it doesn't exist).

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Pass brw into next_offset(). | Kenneth Graunke | 2014-06-26 | 1 | -1/+1
    The new brw_inst API is going to require a brw pointer in order to
    access fields (so it can do generation checks). Plumb it in now.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* Revert "i965: Add 'wait' instruction support" | Matt Turner | 2014-06-17 | 1 | -2/+0
    This reverts commit 20be3ff57670529a410b30a1008a71e768d08428.

    No evidence of ever being used.
* i965: Rename brw_math to gen4_math. | Kenneth Graunke | 2014-06-10 | 1 | -1/+1
    Usually, I try to use "brw" for functions that apply to all
    generations, and "gen4" for dead end/legacy code that is only used on
    Gen4-5.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
* i965: Split Gen4-5 and Gen6+ MATH instruction emitters. | Kenneth Graunke | 2014-06-10 | 1 | -1/+1
    Our existing functions, brw_math and brw_math2, had unclear roles:

    Gen4-5 used brw_math for both unary and binary math functions; it
    never used brw_math2. Since operands are already in message
    registers, this is reasonable.

    Gen6+ used brw_math for unary math functions, and brw_math2 for
    binary math functions, duplicating a lot of code. The only real
    difference was that brw_math used brw_null_reg() for src1.

    This patch improves brw_math2's assertions to allow both unary and
    binary operations, renames it to gen6_math(), and drops the Gen6+
    code out of brw_math().

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
    Reviewed-by: Jordan Justen <[email protected]>
* Revert "i965: Move brw_land_fwd_jump() to compilation unit of its use." | Iago Toral Quiroga | 2014-06-07 | 1 | -0/+4
    This reverts commit f3cb2e6ed7059b22752a6b7d7a98c07ba6b5552e.

    brw_land_fwd_jump() is convenient wherever we produce JMPI
    instructions and we will use JMPI to implement framebuffer writes
    that involve line antialiasing in gen < 6.

    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Put '_default_' in the name of functions that set default state. | Kenneth Graunke | 2014-06-02 | 1 | -8/+8
    Eventually we're going to use functions to set bits on an
    instruction. Putting 'default' in the name of functions that alter
    default state will help distinguish them.

    This patch was generated entirely mechanically, by the following:

    for file in brw*.{cpp,c,h}; do
       sed -i \
       -e 's/brw_set_mask_control/brw_set_default_mask_control/g' \
       -e 's/brw_set_saturate/brw_set_default_saturate/g' \
       -e 's/brw_set_access_mode/brw_set_default_access_mode/g' \
       -e 's/brw_set_compression_control/brw_set_default_compression_control/g' \
       -e 's/brw_set_predicate_control/brw_set_default_predicate_control/g' \
       -e 's/brw_set_predicate_inverse/brw_set_default_predicate_inverse/g' \
       -e 's/brw_set_flag_reg/brw_set_default_flag_reg/g' \
       -e 's/brw_set_acc_write_control/brw_set_default_acc_write_control/g' \
       $file;
    done

    No manual changes were done after running that command.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Delete brw_set_conditionalmod. | Kenneth Graunke | 2014-06-02 | 1 | -1/+0
    This removes the ability to set the default conditional modifier on
    all future instructions. Nothing uses it, and it's not really a
    sensible thing to do anyway.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Create a "brw_last_inst" convenience macro. | Kenneth Graunke | 2014-06-02 | 1 | -0/+6
    Often times, we want to emit an instruction, then set one field on
    it, such as predication or a conditional modifier. Normally, we'd
    have to declare "struct brw_instruction *inst;" and then use
    "inst = brw_FOO(...)" to emit the instruction, which can hurt
    readability.

    The new "brw_last_inst" macro refers to the most recently emitted
    instruction, so you can just do:

       brw_ADD(...)
       brw_last_inst->header.predicate_control = BRW_PREDICATE_NORMAL;

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
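One way such a macro can work is sketched below on a toy code generator. Everything prefixed `model_` is hypothetical, including the assumptions that emitted instructions live in an array called `store`, that `nr_insn` counts them, and that the macro relies on a variable named `p` being in scope at the point of use.

```c
#include <assert.h>

/* Hypothetical, trimmed-down stand-ins for the driver's types. */
struct model_inst {
   int predicate_control;
};

struct model_codegen {
   struct model_inst store[16];   /* emitted instructions, in order */
   unsigned nr_insn;              /* how many have been emitted */
};

/* Refers to the most recently emitted instruction. Assumes a variable
 * named `p` (the code generator) is in scope where the macro is used. */
#define model_last_inst (&p->store[p->nr_insn - 1])

/* Toy emitter: append one instruction with default (no) predication. */
static inline struct model_inst *
model_ADD(struct model_codegen *p)
{
   struct model_inst *inst;

   assert(p->nr_insn < 16);
   inst = &p->store[p->nr_insn++];
   inst->predicate_control = 0;
   return inst;
}

/* Emit, then tweak one field, without a named temporary. */
static inline void
model_emit_predicated_add(struct model_codegen *p)
{
   model_ADD(p);
   model_last_inst->predicate_control = 1;   /* stand-in for "normal" */
}
```

The trade-off is the implicit dependency on the surrounding variable name, in exchange for not having to declare and assign a temporary for a single field update.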
* i965: Make brw_JMPI set predicate_control based on a parameter. | Kenneth Graunke | 2014-06-02 | 1 | -1/+2
    We use both predicated and unconditional JMPI instructions. But in
    each case, it's clear which we want. It's simpler to just specify it
    as a parameter, rather than relying on default state.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Remove the dst and src0 parameters from brw_JMPI. | Kenneth Graunke | 2014-06-02 | 1 | -2/+1
    In all cases, we set both dst and src0 to brw_ip_reg(). This is no
    accident: according to the ISA reference, both are required to be the
    IP register. So, we may as well drop the parameters.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/sf: Move brw_compile::flag_value to brw_sf_compile. | Kenneth Graunke | 2014-05-27 | 1 | -1/+0
    This field is only used to track the current value of the flag
    register during the SF compile. It has no place in the common
    compiler code.

    While we're changing every call, drop the 'brw' prefix from the
    function since it's static.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965/sf: Move brw_set_predicate_control_flag_value to brw_sf_emit.c. | Kenneth Graunke | 2014-05-27 | 1 | -1/+0
    Only the Gen4-5 SF program compiler actually uses this function; move
    it there. Soon the fields will be moved out of brw_compile.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
* i965: Move brw_land_fwd_jump() to compilation unit of its use. | Matt Turner | 2014-05-24 | 1 | -3/+0
    Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move next_offset() to brw_eu.h for use elsewhere. | Matt Turner | 2014-05-24 | 1 | -0/+12
    Also perform arithmetic on char* rather than void* since the latter
    is a GNU C extension not available in C++.

    Reviewed-by: Eric Anholt <[email protected]>
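The pointer arithmetic point can be shown with a small stand-alone sketch. `model_next_offset` and `MODEL_COMPACTED_FLAG` are hypothetical: the sketch pretends that a flag bit in the first byte marks a compacted instruction, which is not the real encoding. The sizes used are 8 bytes for a compacted instruction and 16 bytes for a full one.

```c
#include <assert.h>

/* Hypothetical marker: in this toy encoding, this bit in the first byte
 * of an instruction means it is compacted. */
#define MODEL_COMPACTED_FLAG 0x80

/* Return the byte offset of the instruction following the one at
 * `offset` inside the instruction store. */
static inline int
model_next_offset(const void *store, int offset)
{
   /* Cast to a character pointer before adding: arithmetic directly on
    * void * is a GNU C extension and is rejected by C++ compilers. */
   const unsigned char *insn = (const unsigned char *)store + offset;

   assert(offset >= 0);
   return offset + ((insn[0] & MODEL_COMPACTED_FLAG) ? 8 : 16);
}
```

Walking a buffer is then a loop of the form `offset = model_next_offset(store, offset)` until the end offset is reached.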
* i965: Add annotation data structure and support code. | Matt Turner | 2014-05-24 | 1 | -1/+3
    Will be used to print disassembly after jump targets are set and
    instructions are compacted, while still retaining higher-level IR
    annotations and basic block information.

    An array of 'struct annotation' will live alongside the generated
    assembly. The generators will populate the array with their IR
    annotations, and with basic block pointers if an instruction began or
    ended a basic block.

    We'll then update the instruction offset when we compact instructions
    and then use the annotations to print the disassembly.

    Reviewed-by: Eric Anholt <[email protected]>
* i965: Pass in start_offset to brw_compact_instructions(). | Matt Turner | 2014-05-24 | 1 | -1/+1
    Lets us avoid recompacting the SIMD8 instructions when we compact the
    SIMD16 program.

    Reviewed-by: Eric Anholt <[email protected]>
* i965: Rename brw/gen8_dump_compile to brw/gen8_disassemble. | Kenneth Graunke | 2014-05-18 | 1 | -2/+2
    "Disassemble" is an accurate description of what this function does.

    Signed-off-by: Kenneth Graunke <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>