path: root/src/intel
Commit message | Author | Age | Files | Lines
* genxml: Make "Reorder Mode" fields consistent.Kenneth Graunke2017-05-033-6/+8
| | | | | | | | Both GS and SOL have these fields. Some were ReorderEnable = true, some were ReorderMode = REORDER_TRAILING, and some were just TRAILING. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* genxml: Add alias for MOCS.Rafael Antognolli2017-05-034-0/+4
| | | | | | | | | | | Use an alias, so we can set the same value as the #define's. v3: - Call it "SO Buffer MOCS" to follow the most common naming scheme. - Add alias for gen7 and gen75 too (Ken). Signed-off-by: Rafael Antognolli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* genxml: Add missing field values to 3DSTATE_SBE.Rafael Antognolli2017-05-031-1/+6
| | | | | | | Fill out "Attribute Active Component Format" possible values. Signed-off-by: Rafael Antognolli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* genxml: Update xml for 3DSTATE_SF.Rafael Antognolli2017-05-032-6/+16
| | | | | | | | | | - Normalize "Anti-Aliasing Enable" - Add "Multisample Rasterization Mode" constants - Rename "Use Point Width on Vertex" to "Vertex" - Rename "Use Point Width from State" to "State" Signed-off-by: Rafael Antognolli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* genxml: Rename clip enable property.Rafael Antognolli2017-05-034-4/+4
| | | | | | | | | | | There are two variants: - Clip Enable - CLIP Enable (on gen6) Rename everything to Clip Enable. Signed-off-by: Rafael Antognolli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* genxml: Fill out Gen4, Gen45 and Gen5 XMLLouis-Francis Ratté-Boulianne2017-05-033-1065/+2517
| | | | | | | | | | | | | | Add some more details to Gen4 and Gen45 and add what is needed in Gen5 XML. This commit overwrites the previous work done on Gen4 and Gen45 as it contains more instructions and fixes some mistakes. However, comments (dword boundaries) are lost in the process. v3: - Set the type of some fields, instead of prefix. Also fix the SAMPLER_BORDER_COLOR_STATE fields of gen5.xml. Signed-off-by: Louis-Francis Ratté-Boulianne <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* anv: Implement VK_KHX_external_semaphore_fdJason Ekstrand2017-05-035-16/+199
| | | | | | | | This implementation allocates a 4k BO for each semaphore that can be exported using OPAQUE_FD and uses the kernel's already-existing synchronization mechanism on BOs. Reviewed-by: Chad Versace <[email protected]>
* anv: Pull the guts of cmd_buffer_execbuf into a helperJason Ekstrand2017-05-031-24/+35
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* anv: Implement VK_KHX_external_semaphoreJason Ekstrand2017-05-033-0/+13
| | | | Reviewed-by: Chad Versace <[email protected]>
* anv: Implement VK_KHX_external_semaphore_capabilitiesJason Ekstrand2017-05-033-0/+18
| | | | | | | This just stubs things out. Real external semaphore support will come with VK_KHX_external_semaphore_fd. Reviewed-by: Chad Versace <[email protected]>
* anv: Add a real semaphore structJason Ekstrand2017-05-032-6/+54
| | | | | | | It's just a dummy for now, but we'll flesh it out as needed for external semaphores. Reviewed-by: Chad Versace <[email protected]>
* anv: Trivially implement multiDrawIndirectJason Ekstrand2017-05-032-24/+34
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv: Enable VK_KHX_multiview and SPV_KHR_multiviewJason Ekstrand2017-05-032-0/+5
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv/cmd_buffer: Emit instanced draws for multiple viewsJason Ekstrand2017-05-033-5/+135
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv/cmd_buffer: Pull indirect draw parameter loading into a helperJason Ekstrand2017-05-031-10/+24
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv/pipeline: Add shader lowering for multiviewJason Ekstrand2017-05-034-0/+244
| | | | | | | | v2 (Jason Ekstrand): - Take a view_mask rather than a whole subpass - Build the view mask into the VS shader key Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv/pipeline: Add a subpass field to anv_pipelineJason Ekstrand2017-05-032-5/+8
| | | | | | This simplifies the code in a variety of places. Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv/pipeline: Call nir_gather_info laterJason Ekstrand2017-05-031-2/+2
| | | | | | | We want to insert more lowering code that may insert system values and we need to gather info after that lowering. Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv: Move shader hashing to anv_pipelineJason Ekstrand2017-05-033-45/+46
| | | | | | | | | | Shader hashing is very closely related to shader compilation. Putting them right next to each other in anv_pipeline makes it easier to verify that we're actually hashing everything we need to be hashing. The only real change (other than the order of hashing) is that we now hash in the shader stage. Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv/pass: Store the per-subpass view maskJason Ekstrand2017-05-032-0/+21
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv: Add the KHX_multiview boilerplateJason Ekstrand2017-05-032-0/+18
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv/nir: Delete the apply_dynamic_offsets prototypeJason Ekstrand2017-05-031-3/+0
| | | | | | | | That pass hasn't existed since dd4db84640bbb694f180dd50850c3388f67228be but the prototype stuck around for no reason. Reviewed-by: Elie Tournier <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* i965/vec4: don't modify regioning parameters to the sources of DF align1 instructionsSamuel Iglesias Gonsálvez2017-05-031-8/+1
| | | | | | | | | | | | | The regioning parameters are now properly set by convert_to_hw_regs() and we don't need to fix them in the generator. That latter fix previously done in the generator was strictly speaking wrong for any non-identity regions. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Cc: "17.1" <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* i965/vec4: fix register width for DF VGRF and UNIFORMSamuel Iglesias Gonsálvez2017-05-031-5/+7
| | | | | | | | | | | | | | | | | | | | | | | On gen7, the swizzles used in DF align16 instructions work for an element size of 32 bits, so we can address only 2 consecutive DFs. As we assumed that in the rest of the code and prepared the instructions for this (scalarize_df()), we need to set it to two again. However, for DF align1 instructions, a width of 2 is wrong as we are not reading the data we want. For example, a uniform would have a region of <0, 2, 1> so it would repeat the first 2 DFs, when we wanted to access the first 4. This patch sets the default one to 4 and then modifies the width of align16 instruction's DF sources when we translate the logical swizzle to the physical one. v2: - Remove conditional (Curro). Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Cc: "17.1" <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
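The effect of the region width described above can be illustrated with a small model of how a `<vstride, width, hstride>` region maps execution channels to element offsets. This is only a sketch under the usual reading of the regioning description (rows of `width` elements, `hstride` between elements in a row, `vstride` between rows); the function name `region_offsets` is hypothetical and does not come from the driver.

```python
def region_offsets(exec_size, vstride, width, hstride):
    """Element offset read by each channel for region <vstride, width, hstride>.

    Hypothetical helper for illustration only: offsets are in units of
    elements (DFs here), not bytes or hardware encodings.
    """
    return [(ch // width) * vstride + (ch % width) * hstride
            for ch in range(exec_size)]


# A uniform read with width 2 repeats the first two DFs ...
print(region_offsets(4, 0, 2, 1))
# ... while width 4 reaches the first four, which is what align1 needs.
print(region_offsets(4, 0, 4, 1))
```

With an execution size of 4, the `<0, 2, 1>` region wraps back to element 0 after two channels, while `<0, 4, 1>` walks elements 0 through 3.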
* i965/vec4: fix vertical stride to avoid breaking region parameter ruleSamuel Iglesias Gonsálvez2017-05-031-18/+32
| | | | | | | | | | | | | | | | | | | | | | | | | | | From IVB PRM, vol4, part3, "General Restrictions on Regioning Parameters": "If ExecSize = Width and HorzStride ≠ 0, VertStride must be set to Width * HorzStride." In the next patch, we are going to modify the region parameters for uniforms and VGRFs. Uniforms that are the source of DF align1 instructions will have <0, 4, 1> regioning and the execsize for those instructions will be 4, so they will break the regioning rule. The same applies to VGRF sources where we use the vstride == 0 exploit. As we know we are not going to cross the GRF boundary with that execsize and parameters (not even with the exploit), we just fix the vstride here. v2: - Move is_align1_df() (Curro) - Refactor exec_size == width calculation (Curro) Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Cc: "17.1" <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
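The PRM rule quoted above can be written as a one-line check. The following is a minimal sketch on plain integers (element units), not the generator code from the patch; `fix_vstride` is a hypothetical name.

```python
def fix_vstride(exec_size, width, hstride, vstride):
    """Return a vertical stride that satisfies the quoted regioning rule.

    Illustration only: when ExecSize == Width and HorzStride != 0 the rule
    requires VertStride == Width * HorzStride; otherwise the given vstride
    is returned unchanged.
    """
    if exec_size == width and hstride != 0:
        return width * hstride
    return vstride


# A DF align1 uniform source <0, 4, 1> with execsize 4 violates the rule,
# so its vstride is rewritten to 4 * 1.
print(fix_vstride(exec_size=4, width=4, hstride=1, vstride=0))
```

Because the whole region is consumed in a single row when the execution size equals the width, changing the vertical stride does not change which elements are read.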
* anv/tests: Create a dummy instance as well as deviceJason Ekstrand2017-05-014-4/+16
| | | | | | | | | This fixes crashes caused by 35e626bd0e59e7ce9fd97ccef66b2468c09206a4 which made us start referencing the instance in the allocators. With this commit, the tests now happily pass again. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100877 Tested-by: Vinson Lee <[email protected]>
* anv: Drop 'x11' prefix from non-X11 WSI funcsChad Versace2017-04-281-16/+16
| | | | | | | Drop it from x11_anv_wsi_image_create and x11_anv_wsi_image_free. The functions are used by Wayland WSI too. Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Alphabetize KHR extensionsJason Ekstrand2017-04-281-18/+18
| | | | Reviewed-by: Alejandro Piñeiro <[email protected]>
* anv: Move queues, events, and semaphores to their own fileJason Ekstrand2017-04-273-484/+516
| | | | | | | Things are about to get more complicated, especially as far as semaphores are concerned. Reviewed-by: Chad Versace <[email protected]>
* anv: Implement VK_KHX_external_memory_fdJason Ekstrand2017-04-273-18/+113
| | | | | | | | | | | | | | | | | | This commit just exposes the memory handle type. There's nothing interesting we need to do here for images. So long as the user doesn't set any crazy environment variables such as INTEL_DEBUG=nohiz, all of the compression formats etc. should "just work" at least for opaque handle types. v2 (chadv): - Rebase. - Fix vkGetPhysicalDeviceImageFormatProperties2KHR when handleType == 0. - Move handleType-independency comments out of handleType-switch, in vkGetPhysicalDeviceExternalBufferPropertiesKHX. Reduces diff in future dma_buf patches. Co-authored-with: Chad Versace <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* anv: Use the BO cache for DeviceMemory allocationsJason Ekstrand2017-04-275-26/+30
| | | | Reviewed-by: Chad Versace <[email protected]>
* anv/allocator: Add a BO cacheJason Ekstrand2017-04-272-0/+278
| | | | | | | | | | | | This cache allows us to easily ensure that we have a unique anv_bo for each gem handle. We'll need this in order to support multiple-import of memory objects and semaphores. v2 (Jason Ekstrand): - Reject BO imports if the size doesn't match the prime fd size as reported by lseek(). Reviewed-by: Chad Versace <[email protected]>
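The idea of keeping exactly one object per GEM handle can be sketched with a dictionary keyed by handle. This is a toy model and not the anv allocator code: the class `BoCache`, its methods, the reference counting, and the `fd_size` parameter (standing in for the size reported by lseek() on the prime fd) are all assumptions made for illustration.

```python
class BoCache:
    """Toy model of a BO cache: at most one BO record per GEM handle."""

    def __init__(self):
        self._by_handle = {}

    def import_bo(self, gem_handle, size, fd_size):
        # Reject imports whose requested size does not match the fd size.
        if size != fd_size:
            raise ValueError("import size does not match prime fd size")
        bo = self._by_handle.get(gem_handle)
        if bo is None:
            bo = {"gem_handle": gem_handle, "size": size, "refcount": 0}
            self._by_handle[gem_handle] = bo
        # Importing the same handle again returns the same record.
        bo["refcount"] += 1
        return bo

    def release(self, bo):
        bo["refcount"] -= 1
        if bo["refcount"] == 0:
            del self._by_handle[bo["gem_handle"]]


cache = BoCache()
first = cache.import_bo(gem_handle=5, size=4096, fd_size=4096)
second = cache.import_bo(gem_handle=5, size=4096, fd_size=4096)
print(first is second)
```

Returning the same record for a repeated import is what makes multiple-import of one memory object or semaphore safe to track in a single place.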
* anv: Implement VK_KHX_external_memoryJason Ekstrand2017-04-272-0/+5
| | | | | | | This is the trivial implementation that just exposes the extension string but exposes zero external handle types. Reviewed-by: Chad Versace <[email protected]>
* anv: Implement VK_KHX_external_memory_capabilitiesChad Versace2017-04-274-14/+116
| | | | | | | | | | | | | | | | | | This is a complete but trivial implementation. It's trivial because we support no external memory capabilities yet. Most of the real work in this commit is in reworking the UUIDs advertised by the driver. v2 (chadv): - Fix chain traversal in vkGetPhysicalDeviceImageFormatProperties2KHR. Extract VkPhysicalDeviceExternalImageFormatInfoKHX from the chain of input structs, not the chain of output structs. - In vkGetPhysicalDeviceImageFormatProperties2KHR, iterate over the input chain and the output chain separately. Reduces diff in future dma_buf patches. Co-authored-with: Jason Ekstrand <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/physical_device: Rename uuid to pipeline_cache_uuidJason Ekstrand2017-04-273-5/+6
| | | | | | | We're about to have more UUIDs for different things so this one really needs to be properly labeled. Reviewed-by: Chad Versace <[email protected]>
* anv: Refactor device_get_cache_uuid into physical_device_init_uuidsJason Ekstrand2017-04-271-13/+17
| | | | Reviewed-by: Chad Versace <[email protected]>
* anv: Set EXEC_OBJECT_ASYNC when availableJason Ekstrand2017-04-274-0/+10
| | | | Reviewed-by: Chad Versace <[email protected]>
* anv/cmd_buffer: Use the device allocator for QueueSubmitJason Ekstrand2017-04-271-3/+3
| | | | | | | | The command is really operating on a Queue not a command buffer and the nearest object to that with an allocator is VkDevice. Reviewed-by: Chad Versace <[email protected]> Cc: "17.0 17.1" <[email protected]>
* anv: Don't place scratch buffers above the 32-bit boundaryJason Ekstrand2017-04-271-0/+19
| | | | | | | | | | | | This fixes rendering corruptions in DOOM. Hopefully, it will also make Jenkins a bit more stable as we've been seeing some random failures and GPU hangs ever since turning on 48bit. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100620 Fixes: 651ec926fc1 "anv: Add support for 48-bit addresses" Tested-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: "17.1" <[email protected]>
* genxml: Fix gen_pack_header.py crash when field type is invalid.Rafael Antognolli2017-04-241-2/+2
| | | | | | | | Just return earlier in that case. Also set prefix to an empty string, so we don't get to use it undefined. Signed-off-by: Rafael Antognolli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* genxml: Make BLEND_STATE command support variable length array.Rafael Antognolli2017-04-247-48/+74
| | | | | | | | | | | | | | | | | | | | | | | | We need to emit BLEND_STATE, whose size is 1 + 2 * nr_draw_buffers dwords (on gen8+), but the BLEND_STATE struct length is always 17. By marking it size 1, which is actually the size of the struct minus the BLEND_STATE_ENTRY's, we can emit a BLEND_STATE with a variable number of entries. For gen6 and gen7 we set length to 0, since it only contains BLEND_STATE_ENTRY's, and no other data. With this change, we also change the code for blorp and anv to emit only the needed BLEND_STATE_ENTRY's, instead of always emitting 16 dwords on gen6-7 and 17 dwords on gen8+. v2: - Use designated initializers on blorp and remove 0 from initialization (Jason) - Default entries to disabled on Vulkan (Jason) - Rebase code. Signed-off-by: Rafael Antognolli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
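The size arithmetic in the commit above can be checked with a short sketch. It assumes each BLEND_STATE_ENTRY is 2 dwords and that 8 entries is the maximum, which is what the quoted 16 and 17 dword totals imply; `blend_state_dwords` and the integer `gen` argument are hypothetical names, not part of genxml.

```python
BLEND_STATE_ENTRY_DWORDS = 2  # assumed: 16 dwords / 8 entries on gen6-7


def blend_state_dwords(gen, nr_draw_buffers):
    """Dwords needed for a BLEND_STATE with nr_draw_buffers entries.

    Illustration only: gen8+ has one leading dword before the entries,
    gen6 and gen7 consist of entries alone.
    """
    header = 1 if gen >= 8 else 0
    return header + BLEND_STATE_ENTRY_DWORDS * nr_draw_buffers


# Full-size states match the fixed lengths that were emitted before.
print(blend_state_dwords(8, 8), blend_state_dwords(7, 8))
# A single draw buffer on gen8+ needs only 3 dwords.
print(blend_state_dwords(8, 1))
```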
* genxml: Fix python crash when no dwords are found.Rafael Antognolli2017-04-241-5/+12
| | | | | | | | | | | | | | | If the 'dwords' dict is empty, max(dwords.keys()) throws an exception. This case could happen when we have an instruction that is only an array of other structs, with variable length. v2: - Add another clause for empty dwords and make it work with python 3 (Dylan) - Set the length to 0 if dwords is empty, and do not declare dw Signed-off-by: Rafael Antognolli <[email protected]> Reviewed-by: Dylan Baker <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
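The failure mode described above is standard Python behaviour: `max()` on an empty sequence raises `ValueError`. A minimal sketch of the guard follows; it is not the code from gen_pack_header.py, and the helper name `group_length` and the dict layout (dword index mapped to field data) are assumptions.

```python
def group_length(dwords):
    """Number of dwords covered by a {dword_index: fields} dict.

    Hypothetical helper: an empty dict (an instruction that is only a
    variable-length array of other structs) gives length 0 instead of
    letting max() raise ValueError.
    """
    if not dwords:
        return 0
    return max(dwords.keys()) + 1


print(group_length({}))
print(group_length({0: "header", 3: "payload"}))
```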
* genxml: Remove unused parameter.Rafael Antognolli2017-04-241-2/+2
| | | | | | | 'start' parameter from Group.emit_pack_function() is useless. Signed-off-by: Rafael Antognolli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/aubinator: Correctly read variable length structs.Rafael Antognolli2017-04-243-6/+54
| | | | | | | | | | | | | | | | Before this commit, when a group with count="0" is found, only one field is added to the struct representing the instruction. This causes only one entry to be printed by aubinator, for variable length groups. With this commit we "detect" that there's a variable length group (count="0") and store the offset of the last entry added to the struct when reading the xml. When finally reading the aubdump file, we check the size of the group and whether we have variable number of elements, and in that case, reuse the last field to add the remaining elements. Signed-off-by: Rafael Antognolli <[email protected]> Tested-by: Jason Ekstrand <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* isl/format: Update the R16G16B16X16_FLOAT entryNanley Chery2017-04-241-1/+1
| | | | | | | | | | | The section of the PRM mentioned in the code comment above this table says that this format supports the render target write message. Internal documentation says that this format also supports alpha blending. As a side effect, this allows CCS_D buffers to be created for images with this format. Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
* anv/pass: Delete anv_pass::subpass_attachmentsNanley Chery2017-04-241-1/+0
| | | | | | | This field has no users. Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
* intel/fs: Take into account amount of data read in spilling cost heuristic.Francisco Jerez2017-04-241-1/+1
| | | | | | | | | | | | | | | | | | | Until now the spilling cost calculation neglected the amount of data read from the register. This caused it to make suboptimal decisions in some cases leading to higher memory bandwidth usage than necessary. Improves Unigine Heaven performance by ~4% on BDW, reversing an unintended FPS regression from my previous commit 147e71242ce539ff28e282f009c332818c35f5ac with n=12 and statistical significance 5%. In addition SynMark2 OglCSDof performance is improved by an additional ~5% on SKL, and a Kerbal Space Program apitrace around the Moho planet I can provide on request improves by ~20%. Cc: <[email protected]> Reviewed-by: Plamena Manolova <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/fs: Use regs_written() in spilling cost heuristic for improved accuracy.Francisco Jerez2017-04-241-2/+1
| | | | | | | | | | This is what we use later on to compute the number of registers that will actually get spilled to memory, so it's more likely to match reality than the current open-coded approximation. Cc: <[email protected]> Reviewed-by: Plamena Manolova <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/vec4: Use reads_accumulator_implicitly(), not MACH checks.Kenneth Graunke2017-04-241-4/+4
| | | | | | | | | Curro pointed out that I should not just check for MACH, but use the reads_accumulator_implicitly() helper, which would also prevent the same bug with MAC and SADA2 (if we ever decide to use them). Cc: [email protected] Reviewed-by: Francisco Jerez <[email protected]>
* nir/i965: add before ffma algebraic optsTimothy Arceri2017-04-241-0/+6
| | | | | | | | | | | | | | | | | | | | | | | This shuffles constants down in the reverse of what the previous patch does and applies some simplifications that may be made possible from doing so. Shader-db results BDW: total instructions in shared programs: 12980814 -> 12977822 (-0.02%) instructions in affected programs: 281889 -> 278897 (-1.06%) helped: 1231 HURT: 128 total cycles in shared programs: 246562852 -> 246567288 (0.00%) cycles in affected programs: 11271524 -> 11275960 (0.04%) helped: 1630 HURT: 1378 V2: mark float opts as inexact Reviewed-by: Elie Tournier <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>