aboutsummaryrefslogtreecommitdiffstats
path: root/src/intel
Commit message (Collapse)AuthorAgeFilesLines
* intel/aux-map: Factor out some useful helpersJason Ekstrand2020-01-252-28/+91
| | | | | | | | | | | | | | | | | | | | This breaks add_mapping() into three pieces: 1. get_aux_entry() adds AUX-TT pages as needed and returns the L1 entry index, L1 entry address, and L1 entry map. 2. gen_aux_map_format_bits_for_isl_surf() computes the format- specific information that goes in the AUX-TT entry. 3. add_mapping() is a lot dumber function that now just adds the requested mapping with the requested format bits. This lets us break out some additional helpers in the API which we want to use for more direct AUX-TT management in ANV. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
* intel/aux-map: Add some #definesJason Ekstrand2020-01-252-14/+25
| | | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3519>
* intel/compiler: Add names for SHADER_OPCODE_[IU]SUB_SATCaio Marcelo de Oliveira Filho2020-01-241-0/+4
| | | | | | | | Fixes: 58907568ec5 ("intel/fs: Add SHADER_OPCODE_[IU]SUB_SAT pseudo-ops") Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3558> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3558>
* anv: Always initialize target_stencil_layoutCaio Marcelo de Oliveira Filho2020-01-241-2/+4
| | | | | | | | | | | | | | | | | Pass down stencil data from the subpass attachment like we do elsewhere. Only stencil attachments will make use of it. Fixes warnings like ../src/intel/vulkan/genX_cmd_buffer.c: In function ‘cmd_buffer_begin_subpass’: ../src/intel/vulkan/genX_cmd_buffer.c:4656:41: warning: ‘target_stencil_layout’ may be used uninitialized in this function [-Wmaybe-uninitialized] 4656 | att_state->current_stencil_layout = target_stencil_layout; | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~ Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3557> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3557>
* anv: Replace aux_surface.isl.size_B checks with aux_usage checksJason Ekstrand2020-01-243-13/+11
| | | | | | | | | | | Now that aux_usage has a unified meaning, aux_usage == NONE if and only if aux_surface.isl.size_B > 0. In most of these cases, the question we're asking is "does have compression?" and not "have we allocated an aux surface for compression?". Reviewed-by: Lionel Landwerlin <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3556> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3556>
* anv: Rework the meaning of anv_image::planes[]::aux_usageJason Ekstrand2020-01-244-27/+13
| | | | | | | | | | | | Previously, we set aux_usage=ISL_AUX_USAGE_NONE when we really meant CCS_D. This sort-of made sense before we had anv_layout_to_aux_usage but now that we have that helper. However, in our more modern aux tracking model, all aux usage goes through anv_layout_to_* and we're better off making the meaning of anv_image::planes[]::aux_usage be AUX_USAGE_NONE if and only if there is no compression. Reviewed-by: Lionel Landwerlin <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3556>
* intel/isl: Add a hack for the Gen12 A0 texture buffer bugJason Ekstrand2020-01-241-0/+19
| | | | | | Reviewed-by: Kenneth Graunke <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
* intel/isl: Plumb devinfo into isl_genX(buffer_fill_state_s)Jason Ekstrand2020-01-243-3/+3
| | | | | Reviewed-by: Kenneth Graunke <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
* intel/disasm: Properly disassemble indirect SENDsJason Ekstrand2020-01-241-3/+16
| | | | | | | | Instead of emitting g[a0]UD for the indirect descriptor, emit a0<0>UD. This is more correct because there is no GRF involved. Reviewed-by: Kenneth Graunke <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
* intel/fs: Don't unnecessarily fall back to indirect sends on Gen12Jason Ekstrand2020-01-241-3/+4
| | | | | | | | | | The instruction encoding for SENDS changed on Gen12 and it now supports embedding the entire extended message descriptor in the instruction if it's an immediate. Stop falling back to doing an indirect SEND just because we had something in [15:12] of ex_desc.ud. Reviewed-by: Kenneth Graunke <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
* anv: Improve BTI change cache flushingJason Ekstrand2020-01-242-7/+17
| | | | | | | | | | | | | | | This commit makes two changes: 1. We set pending_pipe_bits instead of emitting PIPE_CONTROL directly for the flush at the end of cmd_buffer_begin_subpass. 2. Because BLORP ops such as vkCmdClearAttachments may come in the middle of a render pass, we have to also flag the need for a cache flush after the blorp op. Fixes: 185630c6bc97 "anv/blorp: Do the gen11 BTI flush" Reviewed-by: Kenneth Graunke <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3547>
* anv: Allow HiZ in read-only depth layoutsJason Ekstrand2020-01-241-0/+60
| | | | | | | | This improves the performance of Aztec Ruins by 5% on ICL. Reviewed-by: Lionel Landwerlin <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>
* anv: Add a usage parameter to anv_layout_to_aux_usageJason Ekstrand2020-01-244-18/+53
| | | | | | | | | | | | | | | | | Most places we actually know the usage and can provide it. There are two exceptions to this: 1. We pass 0 into get_blorp_surf_for_anv_image when we use ANV_IMAGE_LAYOUT_EXPLICIT_AUX because anv_layout_to_aux_usage is never actually called so it doesn't matter. 2. We pass 0 into anv_layout_to_aux_usage in transition_color_buffer. However, the coming commits which will begin using the usage parameter only care about depth. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>
* anv: Use isl_aux_state for HiZ resolvesJason Ekstrand2020-01-242-19/+50
| | | | | | | | | | | | Rather than looking at the aux usage, we look at the isl_aux_state which provides us with more detailed information. This commit adds a couple helpers to isl which let us quickly determine if we have valid depth/hiz on the initial layout and if we need valid depth/hiz for the final layout. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>
* anv: Add a layout_to_aux_state helperJason Ekstrand2020-01-242-82/+134
| | | | | | | | | | | This new helper maps VkImageLayout enums to isl_aux_state enums which are the hardware's concept of image layouts. We can then use the aux state to get the fast clear type and the aux usage. This should yield no functional change in driver behavior. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>
* anv: Use TRANSFER_SRC_OPTIMAL for depth/stencil MSAA resolvesJason Ekstrand2020-01-241-4/+4
| | | | | | | | | | | | As of 52ad1712ed62, TRANSFER_SRC_OPTIMAL and SHADER_READ_ONLY_OPTIMAL are now identical for depth buffers so there's no reason why we need to use the "wrong" layout. Technically, according to Vulkan, blits and MSAA resolves are transfer ops so we should use the transfer layout now that we can. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Nanley Chery <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2605>
* intel/blorp: resize src and dst surfaces separatelyJason Ekstrand2020-01-241-45/+53
| | | | | | | | | | | | | | | When copying to an RGB surface, we treat it as an R only one of three times the width, which may end up being larger than the maximum size supported by the hardware and so it hits the shrink path. This forced both source and destination surfaces to be shrunk, even though it's not necessary for the former, and may even hit some assertions in some cases, such as the surface being compressed. Fixes several tests under dEQP-VK.api.copy_and_blit.core.image_to_image.dimensions.* Reviewed-by: Jordan Justen <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3422> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3422>
* anv/apply_pipeline_layout: Initialize the nir_builder before useJason Ekstrand2020-01-231-1/+2
| | | | | | | Fixes: #2410 Fixes: 3c754900b5f "nir: don't emit ishl in _nir_mul_imm() if backend doesn't support bitops" Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3548> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3548>
* util/hash_table: added hash functions for integer typesAnthony Pesch2020-01-231-1/+1
| | | | | | | | | | | | | | | | | A few hash_table users roll their own integer hash functions which call _mesa_hash_data to perform the hashing which ultimately calls into XXH32 with a dynamic key length. When using small keys with a constant size the hash rate can be greatly improved by inlining XXH32 and providing it a constant key length, see: https://fastcompression.blogspot.com/2018/03/xxhash-for-small-keys-impressive-power.html Additionally, this patch removes calls to _mesa_key_hash_string and makes them instead call _mesa_has_string directly, matching the new integer hash functions. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>
* anv/iris: warn gen12 3DSTATE_HS restrictionLionel Landwerlin2020-01-231-0/+11
| | | | | | | | | | This should never happen but better off documenting it in case someone plays with max threads numbers. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3489> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3489>
* intel: Fix aux map alignments on 32-bit builds.Kenneth Graunke2020-01-231-4/+4
| | | | | | | | | | | | | | | ALIGN() brilliantly uses uintptr_t, making it unsafe for use with 64-bit GPU addresses in 32-bit builds of the driver. Use align64() instead, which uses uint64_t. Fixes assertion failures when running any 32-bit program on Tigerlake. Fixes: 2e6a7ced4db ("iris/gen12: Write GFX_AUX_TABLE base address register") Fixes: 0d0290bb3f7 ("intel/common: Add surface to aux map translation table support") Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3507> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3507>
* util: Remove tmp argument from BITSET_FOREACH_SET macroMatt Turner2020-01-231-6/+3
| | | | | | Reviewed-by: Jason Ekstrand <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3499> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3499>
* anv: Enable SPV_INTEL_shader_integer_functions2 and ↵Ian Romanick2020-01-232-0/+2
| | | | | | | | | | | VK_INTEL_shader_integer_functions2 Currently only implemented in the scalar backend, so only enable for Gen8+. If support for the other opcodes is added to the vec4 backend, Gen7 could be supported. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
* i965: Enable INTEL_shader_integer_functions2 on Gen8+Ian Romanick2020-01-231-0/+2
| | | | | | | | | | | v2: Use new lower_hadd64 and lower_usub_sat64 flags. v3: Enable SPIR-V capability. v4: Move lowering options to COMMON_SCALAR_OPTIONS. Suggested by Caio. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
* intel/fs: Implement support for NIR opcodes for INTEL_shader_integer_functions2Ian Romanick2020-01-231-0/+67
| | | | | | | | | | | | | | | | | | | | | v2: Remove smashing type to D for nir_op_irhadd. Caio noticed it was odd, and removing it fixes an assertion failure in the crucible func.shader.averageRounded.int64_t test (because the source should be W). v3: Emit BRW_OPCODE_MUL directly for nir_op_umul_32x16 and nir_op_imul_32x16. Suggested by Curro. v4: Smash types of MUL instruction generated for nir_op_umul_32x16 and nir_op_imul_32x16. With this change, I get the same assembly now as I did with v2. v5: Remove support for pre-Gen7. The integer multiply path was incorrect, and, since the extension isn't enabled pre-Gen7, there's no way to test it. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
* intel/fs: Add SHADER_OPCODE_[IU]SUB_SAT pseudo-opsIan Romanick2020-01-233-0/+101
| | | | | | | | | | | | | v2: Add a big comment explaining the [IU]SUB_SAT lowering. Suggested by Caio. v3: Use get_fpu_lowered_simd_width in get_lowered_simd_width. Suggested by Ken on IRC. v4: Fix a typo in a comment. Noticed by Caio. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
* intel/fs: Don't lower integer multiplies that don't need loweringIan Romanick2020-01-231-0/+11
| | | | | | | | | v2: Move the check to fs_visitor::lower_integer_multiplication. Previously the cases where lowering was skipped, the original instruction was removed by fs_visitor::lower_integer_multiplication. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
* intel/compiler: Move Gen4/5 rounding to visitorMatt Turner2020-01-224-37/+28
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Gen4/5's rounding instructions operate differently than later Gens'. They all return the floor of the input and the "Round-increment" conditional modifier answers whether the result should be incremented by 1.0 to get the appropriate result for the operation (and thus its behavior is determined by the round opcode; e.g., RNDZ vs RNDE). Since this requires a second instruciton (a predicated ADD) that consumes the result of the round instruction, the round instruction cannot write its result directly to the (write-only) message registers. By emitting the ADD in the generator, the backend thinks it's safe to store the round's result directly to the message register file. To avoid this, we move the emission of the ADD instruction to the NIR translator so that the backend has the information it needs. I suspect this also fixes code generated for RNDZ.SAT but since Gen4/5 don't support GLSL 1.30 which adds the trunc() function, I couldn't write a piglit test to confirm. My thinking is that if x=-0.5: sat(trunc(-0.5)) = 0.0 But on Gen4/5 where sat(trunc(x)) is implemented as rndz.r.f0 result, x // result = floor(x) // set f0 if increment needed (+f0) add result, result, 1.0 // fixup so result = trunc(x) then putting saturate on both instructions will give the wrong result. floor(-0.5) = -1.0 sat(floor(-0.5)) = 0.0 // +1 increment would be needed since floor(-0.5) != trunc(-0.5) sat(sat(floor(-0.5)) + 1.0) = 1.0 Fixes: 6f394343b1f ("nir/algebraic: i2f(f2i()) -> trunc()") Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2355 Reviewed-by: Ian Romanick <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3459> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3459>
* intel/compiler: Return early if read() failedEmmanuel Gil Peyrot2020-01-221-1/+4
| | | | | | | | This was the only warning I could see while compiling Iris. Reviewed-by: Matt Turner <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2821> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2821>
* intel/perf: adapt to platforms like Solaris without d_type in struct direntAlan Coopersmith2020-01-221-5/+20
| | | | | | | | | | | | Signed-off-by: Alan Coopersmith <[email protected]> [Eric: factor out the is_dir_or_link() check and fix a bug in v1] Signed-off-by: Eric Engestrom <[email protected]> v3: include directory path when lstat'ing files v4: fix inverted check in enumerate_sysfs_metrics() Reviewed-by: Eric Engestrom <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2258> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2258>
* anv: ensure prog params are initialized with 0sLionel Landwerlin2020-01-221-1/+1
| | | | | | | | | | | | | | | | | | As a result of 9baa33cef01f our backend compiler leaves params pretty much untouched. So in order to avoid storing uninitialized values in the shader cache blobs, just 0 out this array. I've considered not even allocating this array which works on gen8+ but the vec4 backend still makes a copy of this array and so it crashes on memcpy on HSW. Signed-off-by: Lionel Landwerlin <[email protected]> Fixes: 9baa33cef01f ("anv: Rework push constant handling") Reported-by: Tapani Pälli <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Acked-by: Tapani Pälli <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3516> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3516>
* intel/compiler: Fix array bounds warning on GCC 10.Timur Kristóf2020-01-221-0/+2
| | | | | Signed-off-by: Timur Kristóf <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/compiler: Test compaction on Gen <= 12Matt Turner2020-01-221-1/+1
| | | | | | | | With the previous commits we can now enable the unit test on Gen <= 12. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
* intel/compiler: Validate fuzzed instructionsMatt Turner2020-01-223-1/+7
| | | | | | | ... before giving them to the instruction compactor. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
* intel/compiler: Add unit tests for new EU validation checksMatt Turner2020-01-221-0/+396
| | | | | Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
* intel/compiler: Validate some instruction word encodingsMatt Turner2020-01-221-10/+84
| | | | | | | | | | | Specifically, execution size, register file, and register type. I did not add validation for vertical stride and width because I don't believe it's possible to have an otherwise valid instruction with an invalid vertical stride or width, due to all of the other regioning restrictions. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
* intel/compiler: Factor out brw_validate_instruction()Matt Turner2020-01-221-26/+35
| | | | | | | | | In order to fuzz test instructions, we first need to do some sanity checking first. Factoring out this function allows us an easy way to validate a single instruction. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
* intel/compiler: Handle invalid compacted immediatesMatt Turner2020-01-221-1/+15
| | | | | | | | | | | 16-bit immediates need to be replicated through the 32-bit immediate field, so we should never see one that isn't. This does happen however in the fuzzer unit test, so returning false allows the fuzzer to reject this case. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
* intel/compiler: Handle invalid inputs to brw_reg_type_to_*()Matt Turner2020-01-221-0/+6
| | | | | | | Necessary to handle these cases when we test fuzzed instructions. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
* intel/compiler: Split hw_type tablesMatt Turner2020-01-221-23/+116
| | | | | | | | | | | | Previously we were sharing tables between generations that were nearly identical (i.e., Gen8 3-src adds HF support) and used a small bit of code to handle the differences. This is kind of a mess if you want to reject 64-bit types on platforms that don't support 64-bit types, so split the tables, allowing each generation's table to list exactly what it supports. Acked-by: Caio Marcelo de Oliveira Filho <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
* intel/compiler: Add a INVALID_{,HW_}REG_TYPE macrosMatt Turner2020-01-223-5/+8
| | | | | | | | Since the enum brw_reg_type is packed, comparisons with -1 don't work directly, necessitating the cast. Add a macro to avoid this confusion. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
* intel/compiler: Add NF some more placesMatt Turner2020-01-222-0/+5
| | | | | | | Necessary to handle these cases when we test fuzzed instructions. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
* intel/compiler: Limit compaction unit tests to specific gensMatt Turner2020-01-221-9/+18
| | | | | | | | | Two of the tests emit instructions with MRF destinations, and MRFs aren't present on Gen7+. I think we were just lucky that this didn't cause a problem earlier since we were running the tests on Gen7-9. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
* intel/compiler: Don't disassemble align1 3-src operands on Gen < 10Matt Turner2020-01-221-0/+12
| | | | | | | | | Since the platforms don't support align1 3-src instructions, the contents of these operands are not going to be meaningful. Just don't print them to avoid hitting some assertions in brw_inst functions. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
* intel/compiler: Split has_64bit_types into float/intMatt Turner2020-01-228-27/+80
| | | | | | | Gen7 has 64-bit floats but not 64-bit ints. Acked-by: Caio Marcelo de Oliveira Filho <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
* intel/compiler: Extract GEN_* macros into separate fileMatt Turner2020-01-222-37/+63
| | | | | | | Will be used by the instruction compaction unit test. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
* intel/compiler: Use ARRAY_SIZE()Matt Turner2020-01-221-22/+22
| | | | | Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2635>
* intel/fs: Don't emit control barrier if only one thread is usedCaio Marcelo de Oliveira Filho2020-01-211-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When there's only one hardware thread (i.e. the dispatch width greater or equal to the workgroup size), there's no need to use a barrier to ensure all the invocations reach the same point in the shader, because they are already running lock-step. Results for SKL running Iris for shader-db tests with compute shaders total sends in shared programs: 18361 -> 18339 (-0.12%) sends in affected programs: 904 -> 882 (-2.43%) helped: 9 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 2.44 x̃: 2 helped stats (rel) min: 0.84% max: 21.43% x̄: 7.82% x̃: 2.67% 95% mean confidence interval for sends value: -3.31 -1.58 95% mean confidence interval for sends %-change: -14.67% -0.97% Sends are helped. Shaders from Aztec Ruins, Car Chase, Manhattan and DeusEx are helped. Results for ICL and TGL are similar to SKL. Results for BDW are similar to SKL except for DeusEx shader that has a workgroup size 16 but in BDW picks the SIMD8. Reviewed-by: Francisco Jerez <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3226> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3226>
* intel/fs: Don't emit fence for shared memory if only one thread is usedCaio Marcelo de Oliveira Filho2020-01-211-13/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When there's only one hardware thread (i.e. the dispatch width greater or equal to the workgroup size), there's no need to synchronize shared memory access (SLM) since all the requests from a single thread are already synchronized. In such case, we just add a scheduling fence. To be able to identify that case for all platforms, move the handling of platforms prior to Gen11 (which don't have a separate SLM fence) after the optimization. Results for SKL running Iris for shader-db tests with compute shaders total sends in shared programs: 18395 -> 18361 (-0.18%) sends in affected programs: 938 -> 904 (-3.62%) helped: 9 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 3.78 x̃: 4 helped stats (rel) min: 1.56% max: 26.32% x̄: 10.33% x̃: 2.60% 95% mean confidence interval for sends value: -4.85 -2.71 95% mean confidence interval for sends %-change: -19.12% -1.54% Sends are helped. Shaders from Aztec Ruins, Car Chase, Manhattan and DeusEx are helped. Results for ICL and TGL are similar to SKL. Results for BDW are similar to SKL except for DeusEx shader that has a workgroup size 16 but in BDW picks the SIMD8. Reviewed-by: Francisco Jerez <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3226>
* intel/fs: Add workgroup_size() helperCaio Marcelo de Oliveira Filho2020-01-212-0/+10
| | | | | Reviewed-by: Francisco Jerez <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3226>