aboutsummaryrefslogtreecommitdiffstats
path: root/src/intel/compiler/brw_shader.cpp
Commit message (Collapse)AuthorAgeFilesLines
...
* intel/compiler: fix 16-bit int brw_negate_immediate and brw_abs_immediateJose Maria Casanova Crespo2018-05-031-4/+8
| | | | | | | | | | | | | | | | | | | | | From Intel Skylake PRM, vol 07, "Immediate" section (page 768): "For a word, unsigned word, or half-float immediate data, software must replicate the same 16-bit immediate value to both the lower word and the high word of the 32-bit immediate field in a GEN instruction." This fixes the int16/uint16 negate and abs immediates that weren't taking into account the replication in lower and upper words. v2: Integer cases are different to Float cases. (Jason Ekstrand) Included reference to PRM (Jose Maria Casanova) v3: Make explicit uint32_t casting for left shift (Jason Ekstrand) Split half float implementation. (Jason Ekstrand) Fix brw_abs_immediate (Jose Maria Casanova) Cc: "18.0 18.1" <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Add negative_equals methodsIan Romanick2018-03-261-0/+6
| | | | | | | | | | | | | | | | | | | | | | This method is similar to the existing ::equals methods. Instead of testing that two src_regs are equal to each other, it tests that one is the negation of the other. v2: Simplify various checks based on suggestions from Matt. Use src_reg::type instead of fixed_hw_reg.type in a check. Also suggested by Matt. v3: Rebase on 3 years. Fix some problems with negative_equals with VF constants. Add fs_reg::negative_equals. v4: Replace the existing default case with BRW_REGISTER_TYPE_UB, BRW_REGISTER_TYPE_B, and BRW_REGISTER_TYPE_NF. Suggested by Matt. Expand the FINISHME comment to better explain why it isn't already finished. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]> [v3] Reviewed-by: Matt Turner <[email protected]>
* compiler: int8/uint8 supportKarol Herbst2018-03-141-0/+4
| | | | | | | | | | OpenCL kernels also have int8/uint8. v2: remove changes in nir_search as Jason posted a patch for that Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Rob Clark <[email protected]> Signed-off-by: Karol Herbst <[email protected]>
* intel/fs: Add support for subgroup quad operationsJason Ekstrand2018-03-071-0/+2
| | | | | | | | NIR has code to lower these away for us but we can do significantly better in many cases with register regioning and SIMD4x2. Acked-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/fs: Add a couple of simple helper opcodesJason Ekstrand2018-03-071-0/+5
| | | | | Acked-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/fs: Add support for nir_intrinsic_shuffleJason Ekstrand2018-03-071-0/+2
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel: Drop program size pointer from vec4/fs assembly getters.Kenneth Graunke2018-03-021-3/+2
| | | | | | | | | These days, we're just passing a pointer to a prog_data field, which we already have access to. We can just use it directly. (In the past, it was a pointer to a separate value.) Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/compiler: Add Gen11+ native float typeMatt Turner2018-02-281-0/+6
| | | | | | | | | | | | This new type exposes the additional precision offered by the accumulator register and will be used in the next patch to implement the functionality of the PLN instruction using a pair of MAD instructions. One weird thing to note: align1 ternary instructions may only have an accumulator in the dst or src1 normally, but when src0's type is :NF the accumulator is read. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Combine {VS,FS}_OPCODE_GET_BUFFER_SIZE opcodes.Kenneth Graunke2017-12-301-6/+3
| | | | | | These are the same, we don't need a separate opcode enum per backend. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Add byte scattered read message and fs supportJose Maria Casanova Crespo2017-12-061-0/+6
| | | | | | | | | | | | | | | | | | | | v2: Fix alignment style (Topi Pohjolainen) (Jason Ekstrand) - Enable bit_size parameter to scattered messages to enable different bitsizes byte/word/dword. - Remove use of brw_send_indirect_scattered_message in favor of brw_send_indirect_surface_message. - Move scattered messages to surface messages namespace. - Assert align1 for scattered messages and assume Gen8+. - Inline brw_set_dp_byte_scattered_read. v3: (Jason Ekstrand) - Use renamed brw_byte_scattered_data_element_from_bit_size method - Assert scattered read for Gen8+ and Haswell. - Use conditional expresion at components_read. - Include comment about params for scattered opcodes. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Add byte scattered write message and fs supportJose Maria Casanova Crespo2017-12-061-0/+7
| | | | | | | | | | | | | | | | | | | v2: (Jason Ekstrand) - Enable bit_size parameter to scattered messages to enable different bitsizes byte/word/dword. - Remove use of brw_send_indirect_scattered_message in favor of brw_send_indirect_surface_message. - Move scattered messages to surface messages namespace. - Assert align1 for scattered messages and assume Gen8+. - Inline brw_set_dp_byte_scattered_write. v3: - Remove leftover newline (Topi Pohjolainen) - Rename brw_data_size to brw_scattered_data_element and use defines instead of an enum (Jason Ekstrand) - Assert scattered write for Gen8+ and Haswell (Jason Ekstrand) Signed-off-by: Jose Maria Casanova Crespo <[email protected]> Signed-off-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Define new shader opcode to set rounding modesAlejandro Piñeiro2017-12-061-0/+4
| | | | | | | | | | | | | | | | | | | | | | | Although it is possible to emit them directly as AND/OR on brw_fs_nir, having a specific opcode makes it easier to remove duplicate settings later. v2: (Curro) - Set thread control to 'switch' when using the control register - Use a single SHADER_OPCODE_RND_MODE opcode taking an immediate with the rounding mode. - Avoid magic numbers setting rounding mode field at control register. v3: (Curro) - Remove redundant and add missing whitespace lines. - Match printing instruction to IR opcode "rnd_mode" v4: (Topi Pohjolainen) - Fix code style. Signed-off-by: Alejandro Piñeiro <[email protected]> Signed-off-by: Jose Maria Casanova Crespo <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Support for 16-bit base types in helper functionsJose Maria Casanova Crespo2017-12-061-0/+6
| | | | | | | | | v2: Fixed calculation of scalar size for 16-bit types. (Jason Ekstrand) v3: Fix coding style (Topi Pohjolainen) Signed-off-by: Jose Maria Casanova Crespo <[email protected]> Signed-off-by: Eduardo Lima <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Use nir_lower_atomics_to_ssbos and delete ABO compiler code.Kenneth Graunke2017-11-151-32/+0
| | | | | | | | | | | | We use the same hardware mechanism for both atomic counters and SSBO atomics, so there's really no benefit to maintaining separate code to handle each case. Instead, we can just use Rob's shiny new NIR pass to convert atomic_uints to SSBOs, and delete piles of code. The ssbo_start section of the binding table becomes a combined ABO and SSBO section, with ABOs first, then SSBOs. Reviewed-by: Jason Ekstrand <[email protected]>
* intel/compiler: Move the destructor from vec4_visitor to backend_shaderJason Ekstrand2017-11-071-0/+4
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler: Add some restrictions to MOV_INDIRECT and BROADCASTJason Ekstrand2017-11-071-0/+2
| | | | | | | These restrictions effectively already existed due to the way we use indirect sources but weren't being directly enforced. Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/compiler: Remove final_program_size from brw_compile_*Jordan Justen2017-10-311-4/+2
| | | | | | | | | The caller can now use brw_stage_prog_data::program_size which is set by the brw_compile_* functions. Cc: Jason Ekstrand <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/compiler: add new field for storing program sizeCarl Worth2017-10-311-4/+8
| | | | | | | | | | | | This will be used by the on disk shader cache. v2: * Set in brw_compile_* rather than brw_codegen_*. (Jason) Signed-off-by: Timothy Arceri <[email protected]> [[email protected]: Only add to brw_stage_prog_data] Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* glsl: Remove ir_binop_greater and ir_binop_lequal expressionsIan Romanick2017-10-301-4/+0
| | | | | | | | | | | | | | | | | | | NIR does not have these instructions. TGSI and Mesa IR both implement them using < and >=, repsectively. Removing them deletes a bunch of code and means I don't have to add code to the SPIR-V generator for them. v2: Rebase on 2+ years of change... and fix a major bug added in the rebase. text data bss dec hex filename 8255291 268856 294072 8818219 868e2b 32-bit i965_dri.so before 8254235 268856 294072 8817163 868a0b 32-bit i965_dri.so after 7815339 345592 420592 8581523 82f193 64-bit i965_dri.so before 7813995 345560 420592 8580147 82ec33 64-bit i965_dri.so after Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* nir: Get rid of nir_shader::stageJason Ekstrand2017-10-201-1/+1
| | | | | | | | It's redundant with nir_shader::info::stage. Acked-by: Timothy Arceri <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Move fs_inst::has_side_effects()'s eot check to the parent class.Kenneth Graunke2017-10-191-1/+1
| | | | | | | | | This eliminates a layer of wrapping, and makes a backend_instruction sufficient. The downside is that it exposes 'eot' to the vec4 backend, which it doesn't need, but can basically happily ignore. Reviewed-by: Matt Turner <[email protected]> Tested-by: Pallavi G <[email protected]>
* i965/cnl: Make URB {VS, GS, HS, DS} sizes non multiple of 3Anuj Phogat2017-06-091-0/+8
| | | | | | | | | | v1: By Ben Widawsky <[email protected]> v2: v1 had an assert only for VS. Add the restriction for GS, HS and DS as well and make sure the allocated sizes are not multiple of 3. v3: Move the entry_size checks in to compiler code (Ken) Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Embed the shader_info in the nir_shader againJason Ekstrand2017-05-091-14/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit e1af20f18a86f52a9640faf2d4ff8a71b0a4fa9b changed the shader_info from being embedded into being just a pointer. The idea was that sharing the shader_info between NIR and GLSL would be easier if it were a pointer pointing to the same shader_info struct. This, however, has caused a few problems: 1) There are many things which generate NIR without GLSL. This means we have to support both NIR shaders which come from GLSL and ones that don't and need to have an info elsewhere. 2) The solution to (1) raises all sorts of ownership issues which have to be resolved with ralloc_parent checks. 3) Ever since 00620782c92100d77c660f9783504c6d80fa1d58, we've been using nir_gather_info to fill out the final shader_info. Thanks to cloning and the above ownership issues, the nir_shader::info may not point back to the gl_shader anymore and so we have to do a copy of the shader_info from NIR back to GLSL anyway. All of these issues go away if we just embed the shader_info in the nir_shader. There's a little downside of having to copy it back after calling nir_gather_info but, as explained above, we have to do that anyway. Acked-by: Timothy Arceri <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4: split VEC4_OPCODE_FROM_DOUBLE into one opcode per destination's typeSamuel Iglesias Gonsálvez2017-04-141-2/+6
| | | | | | | | | | This way we can set the destination type as double to all these new opcodes, avoiding any optimizer's confusion that was happening before. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> [ Francisco Jerez: Drop no_spill workaround originally needed due to the bogus destination type of VEC4_OPCODE_FROM_DOUBLE. ] Reviewed-by: Francisco Jerez <[email protected]>
* i965: expose BRW_OPCODE_[F32TO16/F16TO32] name on gen8+Alejandro Piñeiro2017-03-291-0/+9
| | | | | | | | | | | | | | | | | | | | | | Technically those hw operations are only available on gen7, as gen8+ support the conversion on the MOV. But, when using the builder to implement nir operations (example: nir_op_fquantize2f16), it is not needed to do the gen check. This check is done later, on the final emission at brw_F32TO16 (brw_eu_emit), choosing between the MOV or the specific operation accordingly. So in the middle, during optimization phases those hw operations can be around for gen8+ too. Without this patch, several (at least 95) vulkan-cts quantize tests crashes when using INTEL_DEBUG=optimizer. For example: dEQP-VK.spirv_assembly.instruction.graphics.opquantize.too_small_vert v2: simplify the code using GEN_GE (Ilia Mirkin) v3: tweak brw_instruction_name instead of changing opcode_descs table, that is used for validation (Matt Turner) Reviewed-by: Matt Turner <[email protected]>
* i965: Move the back-end compiler to src/intel/compilerJason Ekstrand2017-03-131-0/+1273
Mostly a dummy git mv with a couple of noticable parts: - With the earlier header cleanups, nothing in src/intel depends files from src/mesa/drivers/dri/i965/ - Both Autoconf and Android builds are addressed. Thanks to Mauro and Tapani for the fixups in the latter - brw_util.[ch] is not really compiler specific, so it's moved to i965. v2: - move brw_eu_defines.h instead of brw_defines.h - remove no-longer applicable includes - add missing vulkan/ prefix in the Android build (thanks Tapani) v3: - don't list brw_defines.h in src/intel/Makefile.sources (Jason) - rebase on top of the oa patches [Emil Velikov: commit message, various small fixes througout] Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>