summaryrefslogtreecommitdiffstats
path: root/src/intel
Commit message (Collapse)AuthorAgeFilesLines
* i965: use nir_lower_indirect_derefs() for GLSLTimothy Arceri2016-12-231-10/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This moves the nir_lower_indirect_derefs() call into brw_preprocess_nir() so thats is called by both OpenGL and Vulkan and removes that call to the old GLSL IR pass lower_variable_index_to_cond_assign() We want to do this pass in nir to be able to move loop unrolling to nir. There is a increase of 1-3 instructions in a small number of shaders, and 2 Kerbal Space program shaders that increase by 32 instructions. The changes seem to be caused be the difference in the GLSL IR vs NIR variable index lowering passes. The GLSL IR pass creates a simple if ladder for arrays of size 4 or less, while the NIR pass implements a binary search for all arrays regardless of size. Shader-db results BDW: total instructions in shared programs: 13021176 -> 13021819 (0.00%) instructions in affected programs: 57693 -> 58336 (1.11%) helped: 20 HURT: 190 total cycles in shared programs: 299805580 -> 299750826 (-0.02%) cycles in affected programs: 2290024 -> 2235270 (-2.39%) helped: 337 HURT: 442 total fills in shared programs: 19984 -> 19984 (0.00%) fills in affected programs: 0 -> 0 helped: 0 HURT: 0 LOST: 4 GAINED: 0 V2: remove the do_copy_propagation() call from the i965 GLSL IR linking code. This call was added in f7741c52111 but since we are moving the variable index lowering to NIR we no longer need it and can just rely on the nir copy propagation pass. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Fix uniform and storage buffer offset alignment limits.Francisco Jerez2016-12-161-2/+2
| | | | | | | | | | | | | | | | | | | | | This fixes a regression in a bunch of image store vulkan CTS tests from commit ad38ba113491869ab0dffed937f7b3dd50e8a735, which started using OWORD block read messages to implement UBO loads. The reason for the failure is that we were giving bogus buffer alignment limits to the application (1B), so the CTS would happily come back with descriptor sets pointing at not even word-aligned uniform buffer addresses. Surprisingly the sampler messages used to fetch pull constants before that commit were able to cope with the non-texel aligned addresses, but the dataport messages used to fetch pull constants after that commit and the ones used to access storage buffers (before and after the same commit) aren't as permissive with unaligned addresses. Cc: <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99097 Reported-by: Mark Janes <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* genxml: Make Gen8 3DSTATE_DS SIMD8 enable work like Gen9+.Kenneth Graunke2016-12-141-1/+4
| | | | | | | This will let us avoid ifdefs. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* genxml: Rename "DS Function Enable" to "Function Enable".Kenneth Graunke2016-12-142-2/+2
| | | | | | | This makes Gen7/7.5 match Gen8-9. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Reject VkMemoryAllocateInfo::allocationSize == 0Chad Versace2016-12-141-5/+2
| | | | | | The Vulkan 1.0.33 spec says "allocationSize must be greater than 0". Reviewed-by: Nanley Chery <[email protected]>
* intel/aubinator: fix 32bit shift overflow warningGrazvydas Ignotas2016-12-111-1/+1
| | | | | | | | Doesn't look like this can work on 32bit, just rids of annoying warning. Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
* anv: fix release build unused variable warningsGrazvydas Ignotas2016-12-112-2/+3
| | | | | Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
* anv: Clean up some unused variablesEdward O'Callaghan2016-12-101-15/+0
| | | | | | | Following on from the spirit of commit 011e5570f. Signed-off-by: Edward O'Callaghan <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/blorp_blit: Add split_blorp_blit_debug switchJordan Justen2016-12-071-3/+9
| | | | | | | | | Enabling this debug switch causes surface shrinking to happen by default, and lowers the surface size limit which causes blorp blits to be split. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/blorp_blit: Enable splitting large blorp blitsJordan Justen2016-12-071-1/+40
| | | | | | | | | | | | | | | | Detect when the surface sizes are too large for a blorp blit. When it is too large, the blorp blit will be split into a smaller operation and attempted again. For gen7, this fixes the cts test: ES3-CTS.gtf.GL3Tests.framebuffer_blit.framebuffer_blit_functionality_multisampled_to_singlesampled_blit It will also enable us to increase our renderable size from 8k x 8k to 16k x 16k. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/blorp_blit: Move RGB=>R conversion to follow blit splittingJordan Justen2016-12-071-48/+65
| | | | | | | | | | | | | | In blorp_copy, when RGB surfaces are copied, we convert the destination surface to a Red only surface, but 3 times as wide. This introduces an implicit restriction of "mod 3" for the destination width. It is easier to handle the blorp split buffer offsetting with the original RGB surface, and do the RGB=>R after this. Suggested-by: Jason Ekstrand <[email protected]> Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/blorp_blit: Adjust blorp surface parameters for split blitsJordan Justen2016-12-071-3/+94
| | | | | | | | | | | | If try_blorp_blit() previously returned that a blit was too large, shrink_surface_params() will be used to update the surface parameters for the smaller blit so the blit operation can proceed. v2: * Use double instead of float. (Jason) Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/blorp_blit: Split blorp blits if they are too largeJordan Justen2016-12-071-6/+96
| | | | | | | | | | | | | | | | | | | | | We rename do_blorp_blit() to try_blorp_blit(), and add a return error if the surface size for the blit is too large. Now, do_blorp_blit() is rewritten to try to split the blit into smaller operations if try_blorp_blit() fails. Note: In this commit, try_blorp_blit() will always attempt to blit and never return an error, which matches the previous behavior. We will enable the size checking and splitting in a future commit. The motivation for this splitting is that in some cases when we flatten an image, it's dimensions grow, and this can then exceed the programmable hardware limits. An example is w-tiled+MSAA blits. v2: * Use double instead of float. (Jason) Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/blorp_blit: Create structure for src & dst coordinatesJordan Justen2016-12-071-19/+56
| | | | | | | | | | | | | | This will be useful for splitting blits into smaller sizes. We also make the coordinates of type double rather than float. Since we will be splitting and scaling the coordinates, we might require extra precision in the calculations. v2: * Use double instead of float. (Jason) Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/TODO: Document sampling from HiZNanley Chery2016-12-061-0/+1
| | | | Acked-by: Jason Ekstrand <[email protected]>
* genxml/gen9: Change the default of MI_SEMAPHORE_WAIT::RegisterPoleModeJason Ekstrand2016-12-061-1/+1
| | | | | | | | | | We would really like it to be false as that's what you get on hardware that doesn't have RegisterPoleMode (Sky Lake for example). While we're at it, we change it to a boolean. This fixes dEQP-VK.synchronization.smoke.events on Broxton. Reviewed-by: Kenneth Graunke <[email protected]> Cc: "13.0" <[email protected]>
* anv/pipeline: Call nir_lower_constant_initializersJason Ekstrand2016-12-051-0/+13
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* Revert "i965: use nir_lower_indirect_derefs() for GLSL"Jason Ekstrand2016-12-051-0/+10
| | | | | This reverts commit 9404439a754e5640ccd98df40fa694835c0d8759. I didn't intend to push it and it breaks clip and cull distance.
* i965: use nir_lower_indirect_derefs() for GLSLTimothy Arceri2016-12-051-10/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This moves the nir_lower_indirect_derefs() call into brw_preprocess_nir() so thats is called by both OpenGL and Vulkan and removes that call to the old GLSL IR pass lower_variable_index_to_cond_assign() We want to do this pass in nir to be able to move loop unrolling to nir. There is a increase of 1-3 instructions in a small number of shaders, and 2 Kerbal Space program shaders that increase by 32 instructions. Shader-db results BDW: total instructions in shared programs: 8705873 -> 8706194 (0.00%) instructions in affected programs: 32515 -> 32836 (0.99%) helped: 3 HURT: 79 total cycles in shared programs: 74618120 -> 74583476 (-0.05%) cycles in affected programs: 528104 -> 493460 (-6.56%) helped: 47 HURT: 37 LOST: 2 GAINED: 0
* anv: expose support for VK_KHR_sampler_mirror_clamp_to_edgeIlia Mirkin2016-11-301-0/+4
| | | | | | | This is already supported in genX_state.c, expose the extension string. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/cmd_buffer: Actually use the stencil dimensionJason Ekstrand2016-11-301-1/+1
| | | | | | | | In an attempt to fix 3DSTATE_DEPTH_BUFFER for stencil-only cases, I accidentally kept setting the SurfaceType to 2D in the stencil-only case thanks to a copy+paste error. Reviewed-by: Nanley Chery <[email protected]>
* anv: Prefer in-tree headers to out-of-tree headersVille Syrjälä2016-11-301-5/+11
| | | | | | | | | | | | | | | | | | Set the include paths to consider in-tree headers before out-of-tree headers. Avoids the build failing due to stale headers being present in $prefix. Previosuly 'make -ki install' or something similar was required to update the out-of-tree headers to allow the build to succeed. Also avoids having to rebuild the entire thing after every 'make install'. Cc: Rob Clark <[email protected]> Cc: Jason Ekstrand <[email protected]> Signed-off-by: Ville Syrjälä <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* aubinator: Add support for enum typesKristian H. Kristensen2016-11-292-40/+93
| | | | | Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Fix ksp for INTERFACE_DESCRIPTOR_DATAKristian H. Kristensen2016-11-292-4/+2
| | | | | | | | | | This one was split across two dwords as "Kernel Start Pointer" and "Kernel Start Pointer High", which looks like it works when the driver only accesses "Kernel Start Pointer". This breaks, of course, with BO offsets > 4G. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Use enum 3D_Logic_Op_Function where applicableKristian H. Kristensen2016-11-295-56/+62
| | | | | Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Use blend function and factor enums where applicableKristian H. Kristensen2016-11-295-130/+124
| | | | | Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Use enum 3D_Vertex_Component_Control where applicableKristian H. Kristensen2016-11-295-20/+20
| | | | | Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Use enum 3D_Stencil_Operation where applicableKristian H. Kristensen2016-11-295-84/+63
| | | | | Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Use enum SURFACE_FORMAT where applicableKristian H. Kristensen2016-11-295-10/+10
| | | | | Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Use enum 3D_Prim_Topo_Type where applicableKristian H. Kristensen2016-11-295-15/+15
| | | | | Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Use 3D_Compare_Function for gen8+ test functionsKristian H. Kristensen2016-11-292-8/+8
| | | | | | | | | When the state fields where shuffled around for gen8, the compare function enums were downgraded to just uints. Change them to enum 3D_Compare_Function. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Emit genxml enums as C enumsKristian H. Kristensen2016-11-291-4/+4
| | | | | | | | | The previous commits got rid of any clashes between #defines and enum values and we can now emit the genxml enums as debugger friendly C enums. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Remove duplicate COMPAREFUNCTION valuesKristian H. Kristensen2016-11-293-120/+12
| | | | | | | | These values were defined both as an enum and as inline values. Remove the inline values and reference the 3D_Compare_Function enum instead. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Allow referencing enums in type attributesKristian H. Kristensen2016-11-291-0/+7
| | | | | | | This lets us reference enums in the type attribute of a field. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Emit cherryview SF state without including gen9_pack.hKristian H. Kristensen2016-11-291-13/+23
| | | | | | | | | Cleaner this way and we avoid including gen9_pack.h when we compile with gen8_pack.h. We also avoid the if (cherryview) condition for non-gen8 gens that don't need it. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Don't include two different pack headersKristian H. Kristensen2016-11-291-3/+5
| | | | | | | | | | The batch chain logic only needs the pre-gen8 size of MI_BATCH_BUFFER_START, which seems like something we can make a special case for. The other two gen7 references, MI_BATCH_BUFFER_END and MI_NOOP, are the same on all gens. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Move enums above structsKristian H. Kristensen2016-11-295-1726/+1726
| | | | | | | | | We'll need to define them before we can reference them in structs and instructions. Enums have no dependencies, so move them first in the file. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* genxml: Add values for Barycentric Interpolation ModeKristian H. Kristensen2016-11-295-5/+40
| | | | | Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: remove per-sample shading from TODOIlia Mirkin2016-11-301-1/+0
| | | | | | | This was done some time ago. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: clean up VkPhysicalDeviceFeatures listIlia Mirkin2016-11-301-3/+3
| | | | | | | | | | Remove duplicate .alphaToOne, add missing .shaderResourceMinLod, and reorder a few entries to match their vulkan.h order. All the sparse features are still left out entirely. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: bump the texture gather offset limitsIlia Mirkin2016-11-291-2/+2
| | | | | | | | This matches what NVIDIA and AMD hardware expose, as well as what Intel hardware supports. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/cmd_buffer: Remove the 1-D case from the HiZ QPitch calculationJason Ekstrand2016-11-281-3/+6
| | | | | | | | | | | The 1-D special case doesn't actually apply to depth or HiZ. I discovered this while converting BLORP over to genxml and ISL. The reason is that the 1-D special case only applies to the new Sky Lake 1-D layout which is only used for LINEAR 1-D images. For tiled 1-D images, such as depth buffers, the old gen4 2-D layout is used and the QPitch should be in rows. Reviewed-by: Nanley Chery <[email protected]> Cc: "13.0" <[email protected]>
* anv/cmd_buffer: Set the correct surface type for depth/stencilJason Ekstrand2016-11-281-2/+53
| | | | Reviewed-by: Nanley Chery <[email protected]>
* anv: enable drawIndirectFirstInstanceIlia Mirkin2016-11-281-1/+1
| | | | | | | | This was already piped through in the CmdDraw(Indexed)Indirect handling. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: expose depthBiasClamp, it is already setIlia Mirkin2016-11-281-1/+1
| | | | | | | | The gen7/8_cmd_buffer logic already sets the clamp, and it's piped through via the dynamic state. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* anv: bump maxFramebufferLayers to 2048Ilia Mirkin2016-11-281-1/+1
| | | | | | | | | This matches maxImageArrayLayers, as well as the same setting in the GL frontend. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* anv: enable storage image extended formatsIlia Mirkin2016-11-281-1/+1
| | | | | | | | These are all regularly available in desktop GL, so the backend fully supports them. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: expose imageCubeArray functionalityIlia Mirkin2016-11-281-1/+1
| | | | | | | | This appears to be fully supported already. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* radv: set maxFragmentDualSrcAttachments to 1Dave Airlie2016-11-291-1/+1
| | | | | | | Reported-by: Ilia Mirkin <[email protected]> Cc: "13.0" <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* intel/aubinator: Pull useful information from the AUB headerJason Ekstrand2016-11-281-2/+32
| | | | | | | | | | | This commit does two things. One is to pull useful and/or interesting information from the AUB file header and display it as a header above your decoded batches. Second, it is now capable of pulling the PCI ID from the AUB file comment left by intel_aubdump. This removes the need to use the --gen flag all the time. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jordan Justen <[email protected]>