path: root/src/intel
Commit message (Collapse)AuthorAgeFilesLines
* intel/aub_write: factorize context image/pphwsp/ring creationLionel Landwerlin2019-03-075-178/+161
| | | | | | | | | | | We allocate GGTT entries and physical addresses as we create engines rather than having a fixed layout. Context images now receive a parameter argument which is used to set up pml4 & ring buffer addresses. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/aub_write: turn context images arrays into functionsLionel Landwerlin2019-03-074-242/+306
| | | | | | | | | | | We'll make them more parameterized in a later commit. As this is just a transitional commit, we allow ourselves to leak the context images allocated in get_context_init(). We'll fix this in the next commit. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/aub_write: store the physical page allocator in structLionel Landwerlin2019-03-072-15/+33
| | | | | | | We want to use this allocator in the next commit for GGTT pages. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/aub_write: log mmio writesLionel Landwerlin2019-03-071-0/+5
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/aub_write: switch to use i915_drm engine classesLionel Landwerlin2019-03-074-44/+59
| | | | | | | | | Prepare aub write to deal with multiple engine instances. We don't pass the instance number yet; this could be done in the future by having a two-dimensional array of struct engine. Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Rafael Antognolli <[email protected]>
* intel/aub_write: break execlist write in 2Lionel Landwerlin2019-03-071-41/+67
| | | | | | | | We want to reuse the execlist submission, but won't need the ring buffer update. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/aub_write: write header in initLionel Landwerlin2019-03-074-82/+84
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/aub_write: split comment section from HW setupLionel Landwerlin2019-03-074-26/+52
| | | | | | | | In the future we'll want error2aub to reuse the context image saved by i915 instead of the default one we write in intel_dump_gpu. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/aub_read: reuse defines from gen_contextLionel Landwerlin2019-03-071-12/+13
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/decoders: limit number of decoded batchbuffersLionel Landwerlin2019-03-074-2/+27
| | | | | | | | | | | IGT has a test to hang the GPU that works by having a batch buffer jump back into itself, triggering an infinite loop on the command stream. As our implementation of the decoding is "perfectly" mimicking the hardware, our decoder also "hangs". This change limits the number of batch buffers we'll decode before we bail to 100. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
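The cap described in this commit can be sketched as below. This is a minimal illustration in C, not the Mesa decoder: `toy_decode`, `MAX_DECODED_BATCHES`, and the toy "next batch" encoding are hypothetical, and only the limit of 100 comes from the commit message.

```c
#include <stddef.h>

/* Limit taken from the commit message; every name here is hypothetical. */
#define MAX_DECODED_BATCHES 100

/*
 * Toy command stream: next_batch[i] is the index of the batch that batch i
 * jumps to, or -1 if batch i ends the chain. A batch that names itself
 * models the self-referencing batch buffer used by the IGT hang test.
 *
 * Returns the number of batches decoded before the chain ended or the
 * limit was reached.
 */
static int toy_decode(const int *next_batch, size_t n_batches, int start)
{
   int decoded = 0;
   int cur = start;

   while (cur >= 0 && (size_t)cur < n_batches) {
      if (decoded >= MAX_DECODED_BATCHES)
         break; /* bail out instead of "hanging" like the hardware */
      decoded++;
      cur = next_batch[cur];
   }
   return decoded;
}
```

With a single batch that jumps to itself, the loop stops after 100 iterations; a finite chain is decoded in full as before.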
* intel/decoders: handle decoding MI_BBS from ringLionel Landwerlin2019-03-0710-16/+18
| | | | | | | | | An MI_BATCH_BUFFER_START in the ring buffer acts as a second level batchbuffer (aka jump back to ring buffer when running into a MI_BATCH_BUFFER_END). Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/decoders: add address space indicator to get BOsLionel Landwerlin2019-03-077-41/+64
| | | | | | | Some commands like MI_BATCH_BUFFER_START have this indicator. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* anv: call blob_finish when done with itTapani Pälli2019-03-071-0/+3
| | | | | | | | | | | | | | | | | Fixes leaks from anv_device_upload_nir: ==7345== 8,192 bytes in 2 blocks are definitely lost in loss record 24 of 24 ==7345== at 0x4C2ED78: malloc (vg_replace_malloc.c:308) ==7345== by 0x4C31393: realloc (vg_replace_malloc.c:836) ==7345== by 0x54E0848: grow_to_fit (blob.c:67) ==7345== by 0x54E0BE5: blob_reserve_bytes (blob.c:166) ==7345== by 0x54E0C7C: blob_reserve_intptr (blob.c:186) ==7345== by 0x54704A7: nir_serialize (nir_serialize.c:1091) ==7345== by 0x512F97D: anv_device_upload_nir (anv_pipeline_cache.c:756) Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* anv: use anv_gem_munmap in block pool cleanupTapani Pälli2019-03-071-1/+5
| | | | | | | | | | | | | | | | Use anv_gem_munmap for unmap when softpin in use, this corresponds to anv_gem_mmap used in anv_block_pool_expand_range. This fixes valgrind errors seen for each pool when softpin is in use: ==25581== 262,144 bytes in 1 blocks are definitely lost in loss record 31 of 31 ==25581== at 0x50E77E8: anv_gem_mmap (anv_gem.c:96) ==25581== by 0x50EEE2B: anv_block_pool_expand_range (anv_allocator.c:543) ==25581== by 0x50EEB51: anv_block_pool_init (anv_allocator.c:477) ==25581== by 0x50EF7EF: anv_state_pool_init (anv_allocator.c:920) ==25581== by 0x510B8EB: anv_CreateDevice (anv_device.c:2031) Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/nir: Move 64-bit lowering laterJason Ekstrand2019-03-061-21/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | Now that we have a loop unrolling cost function and loop unrolling isn't going to kill us the moment we have a 64-bit op in a loop, we can go ahead and move 64-bit lowering later. This gives us the opportunity to do more optimizations and actually let the full optimizer run even on 64-bit ops rather than hoping one round of opt_algebraic will fix everything. This substantially reduces both fp64 shader compile times and the resulting code size. On the vs-isnan-dvec test from piglit: Before this commit: 1684.63s user 17.29s system 99% cpu 28:28.24 total 101479 instructions. 0 loops. 802452 cycles. 79:369 spills:fills. Peak memory usage (according to massif): 1.435 GB After this commit: 179.64s user 7.75s system 99% cpu 3:07.92 total 57316 instructions. 0 loops. 459287 cycles. 0:0 spills:fills. Peak memory usage (according to massif): 531.0 MB Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir/lower_doubles: Inline functions directly in lower_doublesJason Ekstrand2019-03-064-22/+9
| | | | | | | | | | | | Instead of trusting the caller to already have created a softfp64 function shader and added all its functions to our shader, we simply take the softfp64 shader as an argument and do the function inlining ourselves. This means that there are no more nasty functions lying around that the caller needs to worry about cleaning up. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/nir: Drop an unneeded lower_constant_initializers callJason Ekstrand2019-03-061-2/+0
| | | | | | | | | | | Even though this is technically a step in the function inlining process as laid out in nir_inline_functions.c, it's not really needed. We already have constant initializers lowered here and no new ones are added by appending the softfp64 functions. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/debug: Add a debug flag to force software fp64Jason Ekstrand2019-03-063-2/+4
| | | | | | Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/fs: Fix extract_u8 of an odd byte from a 64-bit integerIan Romanick2019-03-061-0/+7
| | | | | | | | | | | | | | | In the old code, we would generate the exact same instruction for extract_u8(some_u64, 0) and extract_u8(some_u64, 1). The mask-a-word trick only works for even numbered bytes. This fixes the (new) piglit test tests/spec/arb_gpu_shader_int64/execution/fs-ushr-and-mask.shader_test. v2: Use a SHR instead of an AND. This saves an instruction compared to using two moves. Suggested by Jason. Fixes: 6ac2d169019 ("i965/fs: Fix extract_i8/u8 to a 64-bit destination") Reviewed-by: Jason Ekstrand <[email protected]>
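The even/odd byte problem can be illustrated with plain integer arithmetic. The sketch below runs on the CPU and is not the code the backend generates; it assumes the "mask-a-word trick" means reading the 16-bit word that contains the byte and masking it with 0xff. All function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: the 16-bit word that contains byte `byte` (0..7). */
static uint16_t containing_word(uint64_t v, unsigned byte)
{
   return (uint16_t)(v >> (16 * (byte / 2)));
}

/* Mask only: always yields the low byte of the word, so it is correct
 * for even byte indices and wrong for odd ones. */
static uint8_t extract_u8_mask_only(uint64_t v, unsigned byte)
{
   return (uint8_t)(containing_word(v, byte) & 0xff);
}

/* Odd bytes need a shift right by 8 instead of a mask. */
static uint8_t extract_u8_fixed(uint64_t v, unsigned byte)
{
   uint16_t w = containing_word(v, byte);
   return (byte & 1) ? (uint8_t)(w >> 8) : (uint8_t)(w & 0xff);
}
```

For `v = 0x8877665544332211`, the mask-only version returns 0x11 for both byte 0 and byte 1, which mirrors the "exact same instruction" symptom described above; the shifted version returns 0x22 for byte 1.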
* intel/fs: nir_op_extract_i8 extracts a byte, not a wordIan Romanick2019-03-061-2/+4
| | | | | Fixes: 6ac2d169019 ("i965/fs: Fix extract_i8/u8 to a 64-bit destination") Reviewed-by: Jason Ekstrand <[email protected]>
* intel/compiler: Silence unused parameter warning in brw_interpolation_map.cIan Romanick2019-03-063-7/+4
| | | | | | | | | | | | The parameter is never used, and it's not part of a common interface idiom. Remove it. src/intel/compiler/brw_interpolation_map.c: In function ‘brw_setup_vue_interpolation’: src/intel/compiler/brw_interpolation_map.c:62:59: warning: unused parameter ‘devinfo’ [-Wunused-parameter] const struct gen_device_info *devinfo) ^~~~~~~ Reviewed-by: Jason Ekstrand <[email protected]>
* intel/compiler: Silence many unused parameter warnings in brw_eu.hIan Romanick2019-03-061-9/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In file included from src/intel/compiler/brw_eu_util.c:34:0: src/intel/compiler/brw_eu.h: In function ‘brw_message_desc_header_present’: src/intel/compiler/brw_eu.h:288:63: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_message_desc_header_present(const struct gen_device_info *devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_message_ex_desc’: src/intel/compiler/brw_eu.h:296:51: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_message_ex_desc(const struct gen_device_info *devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_message_ex_desc_ex_mlen’: src/intel/compiler/brw_eu.h:303:59: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_message_ex_desc_ex_mlen(const struct gen_device_info *devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_sampler_desc_binding_table_index’: src/intel/compiler/brw_eu.h:337:68: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_sampler_desc_binding_table_index(const struct gen_device_info *devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_sampler_desc_sampler’: src/intel/compiler/brw_eu.h:344:56: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_sampler_desc_sampler(const struct gen_device_info *devinfo, uint32_t desc) ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_sampler_desc_return_format’: src/intel/compiler/brw_eu.h:371:62: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_sampler_desc_return_format(const struct gen_device_info *devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_dp_desc_binding_table_index’: src/intel/compiler/brw_eu.h:405:63: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_dp_desc_binding_table_index(const struct gen_device_info *devinfo, ^~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_dp_a64_untyped_atomic_desc’: src/intel/compiler/brw_eu.h:754:41: warning: unused parameter 
‘exec_size’ [-Wunused-parameter] unsigned exec_size, /**< 0 for SIMD4x2 */ ^~~~~~~~~ src/intel/compiler/brw_eu.h: In function ‘brw_dp_a64_untyped_atomic_float_desc’: src/intel/compiler/brw_eu.h:775:47: warning: unused parameter ‘exec_size’ [-Wunused-parameter] unsigned exec_size, ^~~~~~~~~ Reviewed-by: Jason Ekstrand <[email protected]>
* glsl: rename is_record() -> is_struct()Timothy Arceri2019-03-061-2/+2
| | | | | | | | | | Replace was done using: find ./src -type f -exec sed -i -- \ 's/is_record(/is_struct(/g' {} \; Acked-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* anv: Implement VK_EXT_external_memory_hostCaio Marcelo de Oliveira Filho2019-03-054-1/+133
| | | | | | | v2: Ignore the import if handleType == 0. (Jason) Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Implement VK_EXT_inline_uniform_blockJason Ekstrand2019-03-056-16/+163
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* spirv: Use the same types for resource indices as pointersJason Ekstrand2019-03-053-6/+37
| | | | | | | | We need more space than just a 32-bit scalar and we have to burn all that space anyway so we may as well expose it to the driver. This also fixes a subtle bug when UBOs and SSBOs have different pointer types. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Add a concept of a descriptor bufferJason Ekstrand2019-03-055-0/+281
| | | | | | | | | This buffer goes along side the CPU data structure and may contain pointers, bindless handles, or any other descriptor information. Currently, all descriptors are size zero and nothing goes in the buffer but this commit sets up the framework we will need later. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Take references to push descriptor set layoutsJason Ekstrand2019-03-051-6/+16
| | | | | | | | Technically, descriptor set layouts aren't required to survive past the function they're passed into so we need to reference them. Cc: "19.0" <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
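The lifetime issue can be sketched with a small reference-counting example. This is an illustration under assumed names (`toy_layout`, `toy_push_set`, and their helpers are hypothetical), not the anv implementation: the push descriptor set takes its own reference, so the layout stays valid after the caller drops its handle.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a descriptor set layout. */
struct toy_layout {
   unsigned ref_cnt;
   int binding_count;
};

/* Number of layouts currently allocated, for illustration only. */
static int toy_live_layouts = 0;

static struct toy_layout *toy_layout_create(int binding_count)
{
   struct toy_layout *l = malloc(sizeof(*l));
   if (l == NULL)
      return NULL;
   l->ref_cnt = 1; /* reference owned by the creator */
   l->binding_count = binding_count;
   toy_live_layouts++;
   return l;
}

static struct toy_layout *toy_layout_ref(struct toy_layout *l)
{
   l->ref_cnt++;
   return l;
}

static void toy_layout_unref(struct toy_layout *l)
{
   if (--l->ref_cnt == 0) {
      toy_live_layouts--;
      free(l);
   }
}

/* Hypothetical push descriptor set that keeps its layout alive. */
struct toy_push_set {
   struct toy_layout *layout;
};

static void toy_push_set_bind(struct toy_push_set *set,
                              struct toy_layout *layout)
{
   if (set->layout == layout)
      return;
   toy_layout_ref(layout);          /* take the new reference first */
   if (set->layout != NULL)
      toy_layout_unref(set->layout); /* then release the old one */
   set->layout = layout;
}
```

After `toy_push_set_bind()`, the creator can call `toy_layout_unref()` on its own handle and the set can still read `set.layout->binding_count`; the memory is released only when the set drops its reference too.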
* anv: Refactor descriptor pushing a bitJason Ekstrand2019-03-051-28/+22
| | | | | | | | | Pull the common code out of the two entrypoints into the helper which fetches the push descriptor set for us. Now that it does more than just get a thing, call it anv_cmd_buffer_push_descriptor_set. Cc: "19.0" <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: drop add_var_binding from anv_nir_apply_pipeline_layout.cJason Ekstrand2019-03-051-7/+2
| | | | | | It has exactly one caller. Just inline it. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Clean up descriptor set layoutsJason Ekstrand2019-03-053-83/+85
| | | | | | | | | | | | | | | | | The descriptor set layout code in our driver has undergone many changes over the years. Some of the fields which were once essential are now useless or nearly so. The has_dynamic_offsets field was completely unused except for the code to set and hash it. The per-stage indices were only being used to determine if a particular binding had images, samplers, etc. The fact that it's per-stage also doesn't matter because that binding should never be accessed by a shader of the wrong stage. This commit deletes a pile of cruft and replaces it all with a descriptive bitfield which states what a particular descriptor contains. This merely describes the data available and doesn't necessarily dictate how it will be lowered in anv_nir_apply_pipeline_layout. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Count image param entries rather than imagesJason Ekstrand2019-03-055-23/+29
| | | | | | | | | This is what we're actually storing in the descriptor set and consuming when we bind surface states. This commit renames image_count to image_param_count in a few places and moves the decision to not count image params on gen9+ into anv_descriptor_set.c when we build the layout. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Stop allocating buffer views for dynamic buffersJason Ekstrand2019-03-053-24/+22
| | | | | | | We emit the surface states for those on-the-fly so we don't need the buffer view. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Rework arguments to anv_descriptor_set_write_*Jason Ekstrand2019-03-053-29/+27
| | | | | | | Make them all take a device followed by a set. This is consistent with how the actual Vulkan entrypoint parameters are laid out. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/descriptor_set: Refactor alloc/free of descriptor setsJason Ekstrand2019-03-051-59/+84
| | | | | | | This commit just puts the free list code together as part of the pool instead of having it inlined into the descriptor set create code. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: use the platform defines in vk.xml instead of hard-coding themEric Engestrom2019-03-051-4/+7
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: update supported patch versionLionel Landwerlin2019-03-051-1/+1
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* anv: toggle on support for VK_EXT_ycbcr_image_arraysTapani Pälli2019-03-052-0/+8
| | | | | | | | We already propagate coord_components correctly and did not have layer restrictions for ycbcr formats. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: retain the is_array state in create_plane_tex_instr_implicitTapani Pälli2019-03-051-0/+1
| | | | | | | | This does not seem to fix anything ATM but is the right thing to do. Signed-off-by: Tapani Pälli <[email protected]> Fixes: f3e91e78a33775 ("anv: add nir lowering pass for ycbcr textures") Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/pipeline: Drop anv_fill_binding_tableJason Ekstrand2019-03-041-26/+0
| | | | | | | We zero out the prog data anyway and, now that bias is always zero, this function is accomplishing nothing. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: Use an actual binding for gl_NumWorkgroupsJason Ekstrand2019-03-043-31/+33
| | | | | | | | | | | This commit moves our handling of gl_NumWorkgroups over to work like our handling of other special bindings in the Vulkan driver. We give it a magic descriptor set number and teach emit_binding_tables to handle it. This is better than the bias mechanism we were using because it allows us to do proper accounting through the bind map mechanism. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* intel,nir: Lower TXD with min_lod when the sampler index is not < 16Jason Ekstrand2019-03-041-1/+3
| | | | | | | | | | | When we have a larger sampler index, we get into the "high sampler" scenario and need an instruction header. Even in SIMD8, this pushes the instruction over the sampler message size maximum of 11 registers. Instead, we have to lower TXD to TXL. Fixes: cb98e0755f8d "intel/fs: Support min_lod parameters on texture..." Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* anv: Count surfaces for non-YCbCr images in GetDescriptorSetLayoutSupportJason Ekstrand2019-03-041-0/+3
| | | | | | | | We were accidentally not counting those surfaces Fixes: ddc4069122 "anv: Implement VK_KHR_maintenance3" Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* nir/glsl: Add another way of doing lower_imul64 for gen8+Sagar Ghuge2019-03-042-0/+12
| | | | | | | | | | | | | | | | | On Gen 8 and 9, the "mul" instruction supports a 64-bit destination type. We can reduce our 64x64 int multiplication from 4 instructions to 3. Also, instead of emitting two mul instructions, we can emit a single mul instruction and extract the low/high 32 bits from the 64-bit result for [i/u]mulExtended. v2: 1) Allow lower_mul_high64 to use new opcode (Jason Ekstrand) 2) Add lower_mul_2x32_64 flag (Matt Turner) 3) Remove associative property as bit size is different (Connor Abbott) v3: Fix indentation and variable naming convention (Jason Ekstrand) Signed-off-by: Sagar Ghuge <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
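One way to see why three multiplies are enough when a 32x32 multiply with a 64-bit result is available is the arithmetic sketch below. It is plain C, not the NIR lowering code; `mul_2x32_64`, `imul64_lowered`, and `umul_extended` are hypothetical names chosen for this illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 32x32 multiply with a 64-bit result. */
static uint64_t mul_2x32_64(uint32_t a, uint32_t b)
{
   return (uint64_t)a * b;
}

/* Low 64 bits of a 64x64 multiply using three multiplies. The hi*hi
 * term is dropped because it only affects bits 64 and above. */
static uint64_t imul64_lowered(uint64_t a, uint64_t b)
{
   uint32_t a_lo = (uint32_t)a, a_hi = (uint32_t)(a >> 32);
   uint32_t b_lo = (uint32_t)b, b_hi = (uint32_t)(b >> 32);

   /* Multiply 1: full 64-bit product of the low halves. */
   uint64_t lo_lo = mul_2x32_64(a_lo, b_lo);

   /* Multiplies 2 and 3: only the low 32 bits of the cross terms
    * matter, because the sum is shifted left by 32. */
   uint32_t cross = (uint32_t)(a_lo * b_hi) + (uint32_t)(a_hi * b_lo);

   return lo_lo + ((uint64_t)cross << 32);
}

/* umulExtended-style helper: one multiply, then split the result. */
static void umul_extended(uint32_t a, uint32_t b,
                          uint32_t *hi, uint32_t *lo)
{
   uint64_t full = mul_2x32_64(a, b);
   *lo = (uint32_t)full;
   *hi = (uint32_t)(full >> 32);
}
```

The result of `imul64_lowered(a, b)` matches `a * b` truncated to 64 bits for any inputs, since all arithmetic is unsigned and wraps modulo a power of two.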
* android: anv: fix libexpat shared dependencyMauro Rossi2019-03-041-1/+1
| | | | | | | | Fixes undefined reference building errors for XML_* functions Signed-off-by: Mauro Rossi <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Cc: "19.0" <[email protected]>
* android: anv: fix generated files dependencies (v2)Mauro Rossi2019-03-041-15/+25
| | | | | | | | | | | | | | | | | Fix anv_entrypoints.{c,h} and anv_extensions.{c,h} missing dependencies Rename the variable labels according to targets and python scripts Align the building rules as per Automake for simplification Fixes building errors during rebuilds due to missing dependencies (v2) Fixed a missing $(VULKAN_API_XML) reference Fixes: 9a508b7 ("android: anv/extensions: fix generated sources build") Fixes: dd088d4bec7 ("anv/extensions: Generate a header file with extension tables") Signed-off-by: Mauro Rossi <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Cc: "19.0" <[email protected]>
* intel/compiler: Move int64/doubles lowering optionsJordan Justen2019-03-022-34/+39
| | | | | | | | | | Instead of calculating the int64 and doubles lowering options each time a shader is preprocessed, save and use the values in nir_shader_compiler_options. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/fs: Don't assert on b2f with a saturate modifierIan Romanick2019-03-021-1/+3
| | | | | | | | This ran afoul of Iris's use of nir_lower_clamp_color_outputs which applies fsat() before writes to vertex shader color outputs. Reviewed-by: Kenneth Graunke <[email protected]> Fixes: 7725d609387 ("intel/fs: Emit better code for b2f(inot(a)) and b2i(inot(a))")
* anv: add support for INTEL_DEBUG=batLionel Landwerlin2019-03-024-2/+93
| | | | | | | | | | | | | | | | | As requested by Ken ;) v2: Also decode simple batches (Caio) Fix u_vector usage issues (Lionel) v3: Make binding/instruction/state/surface available (Lionel) v4: Going through device pools for simple batches (Lionel) Centralize search BO callbacks into anv_device.c (Lionel) v5: Clear decoded batch buffer var after use (Caio) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/compiler: Add commas on final values of compaction table arraysMatt Turner2019-03-011-15/+15
| | | | Reviewed-by: Jordan Justen <[email protected]>