path: root/src/intel
Commit message | Author | Age | Files | Lines
* anv: Disable dual source blending when shader doesn't support it on gen8+ | Danylo Piliaiev | 2018-10-30 | 1 | -10/+36
  Dual source blending behaviour is undefined when the shader doesn't have a second color output.

    "If SRC1 is included in a src/dst blend factor and a DualSource RT Write message is not used, results are UNDEFINED. (This reflects the same restriction in DX APIs, where undefined results are produced if “o1” is not written by a PS – there are no default values defined)."

  Dismissing the fragment in such a situation leads to a hang on gen8+ if the depth test is enabled. Since blending cannot be gracefully fixed in such a case and the result is undefined, blending is simply disabled.

  v2 (Jason Ekstrand):
   - Apply the workaround to each individual entry
   - Emit a warning through debug_report

  Signed-off-by: Danylo Piliaiev <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
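The workaround described above can be sketched roughly as follows. This is a minimal illustration, not the driver's code: the `blend_factor` enum, `blend_entry` struct, and `disable_unsupported_dual_src()` are hypothetical stand-ins, and the real change also reports a warning through debug_report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the Vulkan blend factors, not the driver's types. */
enum blend_factor {
   FACTOR_ZERO,
   FACTOR_ONE,
   FACTOR_SRC_ALPHA,
   FACTOR_SRC1_COLOR,
   FACTOR_ONE_MINUS_SRC1_COLOR,
   FACTOR_SRC1_ALPHA,
   FACTOR_ONE_MINUS_SRC1_ALPHA,
};

/* Hypothetical per-render-target blend state entry. */
struct blend_entry {
   bool blend_enable;
   enum blend_factor src_color, dst_color, src_alpha, dst_alpha;
};

static bool
is_dual_src_factor(enum blend_factor f)
{
   return f == FACTOR_SRC1_COLOR || f == FACTOR_ONE_MINUS_SRC1_COLOR ||
          f == FACTOR_SRC1_ALPHA || f == FACTOR_ONE_MINUS_SRC1_ALPHA;
}

/* Disable blending on each individual entry that references SRC1 when the
 * shader has no second color output. Returns the number of entries changed,
 * so a caller could emit a warning when it is non-zero. */
static size_t
disable_unsupported_dual_src(struct blend_entry *entries, size_t count,
                             bool shader_writes_src1)
{
   assert(entries != NULL || count == 0);

   if (shader_writes_src1)
      return 0;

   size_t disabled = 0;
   for (size_t i = 0; i < count; i++) {
      struct blend_entry *e = &entries[i];
      if (!e->blend_enable)
         continue;
      if (is_dual_src_factor(e->src_color) || is_dual_src_factor(e->dst_color) ||
          is_dual_src_factor(e->src_alpha) || is_dual_src_factor(e->dst_alpha)) {
         e->blend_enable = false;
         disabled++;
      }
   }
   return disabled;
}
```

Checking each entry separately mirrors the v2 note: only the entries that actually use a SRC1 factor lose blending.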
* aub_viewer: show vertex buffer pitch | Eric Engestrom | 2018-10-30 | 1 | -1/+1
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Bump the advertised patch version to 90 | Jason Ekstrand | 2018-10-30 | 1 | -1/+1
  Reviewed-by: Lionel Landwerlin <[email protected]>
* intel: tools: Add handling for video pipe | Toni Lönnberg | 2018-10-30 | 2 | -1/+30
  Preliminary work for adding handling of different pipes to gen_decoder. We need to be able to distinguish between different pipes in order to decode the packets correctly due to opcode re-use.

  Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/decoder: Use 'DWord Length' and 'bias' fields for packet length. | Toni Lönnberg | 2018-10-30 | 2 | -7/+25
  Use the 'DWord Length' and 'bias' fields from the instruction definition to parse the packet length from the command stream when possible. The hardcoded mechanism is used whenever an instruction doesn't have this field.

  Reviewed-by: Lionel Landwerlin <[email protected]>
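As a rough sketch of the length computation described above: read the 'DWord Length' field out of the packet header, add the bias, and fall back to a hardcoded value when the instruction definition has no such field. The `length_field` struct and both function names are hypothetical; the bit range and bias used in the usage note are illustrative assumptions, not values taken from the decoder's XML.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a 'DWord Length' field in an instruction. */
struct length_field {
   bool present;          /* false when the instruction has no such field */
   unsigned start, end;   /* inclusive bit range within the header dword */
   uint32_t bias;         /* added to the raw field value */
};

static uint32_t
extract_field(uint32_t dword, unsigned start, unsigned end)
{
   assert(start <= end && end < 32);
   unsigned width = end - start + 1;
   uint32_t mask = width == 32 ? UINT32_MAX : ((UINT32_C(1) << width) - 1);
   return (dword >> start) & mask;
}

/* Packet length in dwords: field value plus bias when the field exists,
 * otherwise the caller-supplied hardcoded length. */
static uint32_t
packet_length(uint32_t header, const struct length_field *f,
              uint32_t hardcoded_length)
{
   if (f == NULL || !f->present)
      return hardcoded_length;
   return extract_field(header, f->start, f->end) + f->bias;
}
```

For example, with a field in bits 7:0 and a bias of 2, a header whose low byte is 3 describes a 5-dword packet.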
* intel/batch-decoder: remove never-used function | Eric Engestrom | 2018-10-30 | 2 | -42/+0
  This function was there when the file was introduced in commit 38f10d5a03542c60a589 "intel: tools: add aubinator viewer", but was never actually used.

  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: add missing meson build dependency | Eric Engestrom | 2018-10-29 | 1 | -1/+1
  Fixes: e4538b93f5d5177318f2 "anv: Implement VK_KHR_driver_properties"
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Dylan Baker <[email protected]>
* anv: Use absolute timeouts in wait_for_bo_fences | Jason Ekstrand | 2018-10-27 | 1 | -42/+30
  We were previously using relative timeouts and decrementing the user-provided timeout as we waited. Instead, this commit refactors things to use absolute timeouts throughout. This should fix a subtle bug in the waitAll case where we aren't decrementing the timeout after a successful GPU wait. Since pthread_cond_timedwait already takes an absolute timeout, it's also significantly simpler.

  Reviewed-by: Lionel Landwerlin <[email protected]>
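The idea of converting once to an absolute deadline, rather than decrementing a relative timeout after every wait, might look like the sketch below. Both helpers are hypothetical names for illustration; the current time is passed in as a parameter so the arithmetic is easy to follow, and the addition saturates so that a "wait forever" timeout does not wrap around.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a relative timeout into an absolute deadline, saturating at
 * UINT64_MAX instead of overflowing. */
static uint64_t
absolute_timeout(uint64_t now_ns, uint64_t relative_ns)
{
   if (relative_ns > UINT64_MAX - now_ns)
      return UINT64_MAX;
   return now_ns + relative_ns;
}

/* Time left until the deadline, or 0 once it has passed. Each wait in a
 * loop can derive its own budget from the same deadline, so no step can
 * forget to decrement a shared counter. */
static uint64_t
remaining_ns(uint64_t now_ns, uint64_t deadline_ns)
{
   return deadline_ns > now_ns ? deadline_ns - now_ns : 0;
}
```

With a deadline computed up front, every wait in a wait-all loop shares the same end point regardless of how long earlier waits took.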
* anv: Flag semaphore BOs as external | Jason Ekstrand | 2018-10-27 | 1 | -2/+3
  It probably doesn't actually break anything but it does cause some assertions in debug builds.

  Fixes: 7a89a0d9edae6 "anv: Use separate MOCS settings for external BOs"
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Improve the asserts in anv_buffer_get_range | Jason Ekstrand | 2018-10-27 | 1 | -1/+2
  Reviewed-by: Lionel Landwerlin <[email protected]>
* Revert "anv/skylake: disable ForceThreadDispatchEnable" | Jason Ekstrand | 2018-10-26 | 1 | -35/+7
  This reverts commit 0fa9e6d7b304f6a8064ed78a4b9c557e1026e7e5.

  The real issue appears to have been that HiZ ops don't like having WM thread dispatch force-enabled. The previous commit fixes that problem so we can go back to using the ForceThreadDispatchEnable bit even on SKL+.

  Cc: [email protected]
  Reviewed-by: Kenneth Graunke <[email protected]>
* blorp: Emit a dummy 3DSTATE_WM prior to 3DSTATE_WM_HZ_OP | Jason Ekstrand | 2018-10-26 | 1 | -0/+9
  Cc: [email protected]
  Suggested-by: Francisco Jerez <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* anv: Return VK_ERROR_DEVICE_LOST from anv_device_set_lost | Jason Ekstrand | 2018-10-26 | 4 | -45/+32
  This lets us get rid of a bunch of duplicated error messages.

  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
* anv/util: Split a vk_errorv helper out of vk_errorf | Jason Ekstrand | 2018-10-26 | 2 | -6/+25
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
* intel/blorp: Define the clear value bounds for HiZ clears | Nanley Chery | 2018-10-26 | 1 | -0/+14
  Follow the restriction of making sure the clear value is between the min and max values defined in CC_VIEWPORT.

  Avoids a simulator warning for some piglit tests, one of them being:

    ./bin/depthstencil-render-miplevels 146 d=z32f_s8

  Jason found this to fix incorrect clearing on SKL.

  Fixes: 09948151ab1d5184b4dd9052bb1f710fa1e00a7b ("intel/blorp: Add the BDW+ optimized HZ_OP sequence to BLORP")
  Reviewed-by: Jason Ekstrand <[email protected]>
  Tested-by: Jason Ekstrand <[email protected]>
* vulkan: drop always-true param | Eric Engestrom | 2018-10-26 | 1 | -2/+0
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* intel/nir: Use the OPT macro for more passes | Jason Ekstrand | 2018-10-26 | 1 | -3/+3
  Reviewed-by: Ian Romanick <[email protected]>
* nir/builder: Add a nir_imm_true/false helpers | Jason Ekstrand | 2018-10-26 | 1 | -1/+1
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* nir/validate: Print when the validation failed | Jason Ekstrand | 2018-10-26 | 2 | -5/+5
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* anv: Handle the device loss abort in anv_device_set_lost | Jason Ekstrand | 2018-10-26 | 2 | -5/+11
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
* anv: Add helpers for setting/checking device lost | Jason Ekstrand | 2018-10-26 | 4 | -21/+36
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
* anv: Provide a error message with a DEVICE_LOST | Jason Ekstrand | 2018-10-26 | 1 | -1/+2
  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
* anv: Fix sanitization of stencil state when the depth test is disabled | Alex Smith | 2018-10-26 | 1 | -7/+7
  When depth testing is disabled, we shouldn't pay attention to the specified depthCompareOp, and just treat it as always passing. Before, if the depth test is disabled, but depthCompareOp is VK_COMPARE_OP_NEVER (e.g. from the app having zero-initialized the structure), then sanitize_stencil_face() would have incorrectly changed passOp to VK_STENCIL_OP_KEEP.

  v2: Roll the depthTestEnable check into the ds_aspect check below since they now both do the same thing.

  Fixes: 028e1137e6 "anv/pipeline: Be smarter about depth/stencil state"
  Signed-off-by: Alex Smith <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
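A simplified sketch of the corrected logic: when the depth test is disabled, the compare op is treated as "always" before deciding whether the stencil pass op can ever take effect. The enums and `sanitize_pass_op()` are hypothetical reductions for illustration; the real sanitize_stencil_face() handles more cases than this.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reduced versions of VkCompareOp and VkStencilOp. */
enum compare_op { COMPARE_NEVER, COMPARE_LESS, COMPARE_ALWAYS };
enum stencil_op { STENCIL_KEEP, STENCIL_REPLACE };

/* The pass op only matters if the depth test can pass. A disabled depth
 * test always passes, whatever depthCompareOp happens to contain. */
static enum stencil_op
sanitize_pass_op(bool depth_test_enable, enum compare_op depth_compare_op,
                 enum stencil_op pass_op)
{
   enum compare_op effective =
      depth_test_enable ? depth_compare_op : COMPARE_ALWAYS;

   if (effective == COMPARE_NEVER)
      return STENCIL_KEEP;   /* the pass op can never be reached */

   return pass_op;
}
```

With the depth test disabled and a zero-initialized compare op (NEVER), the application's pass op is preserved instead of being rewritten to KEEP.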
* intel/compiler: Print message descriptor as immediate source | Sagar Ghuge | 2018-10-26 | 1 | -1/+7
  While disassembling a send(c) instruction, print the message descriptor as an immediate source operand along with the message descriptor. This allows the assembler to read the immediate source operand and set bits accordingly.

  Signed-off-by: Sagar Ghuge <[email protected]>
  Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* intel/compiler: Print hex representation along with floating point value | Sagar Ghuge | 2018-10-26 | 1 | -3/+9
  While encoding immediate floating point values in an instruction we use values up to precision 9, but while disassembling we print only 6 places, which rounds the value and gives a wrong interpretation of the encoded immediate constant.

  To avoid misinterpretation of encoded immediate values between the instruction and the disassembled output, print the hex representation along with the floating point value, which can be used by the assembler in future.

  Signed-off-by: Sagar Ghuge <[email protected]>
  Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
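To illustrate why the hex form helps: the raw 32-bit pattern identifies the encoded constant exactly, while a 6-digit decimal print may round. The sketch below prints both. `format_float_imm()` is a hypothetical helper and the exact output layout is an assumption, not the disassembler's actual format.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Format a float immediate as its exact bit pattern followed by the
 * rounded decimal value in a comment. memcpy is the portable way to
 * reinterpret the float's bytes as an integer. */
static int
format_float_imm(char *buf, size_t size, float value)
{
   uint32_t bits;
   memcpy(&bits, &value, sizeof(bits));
   return snprintf(buf, size, "0x%08" PRIx32 "F /* %gF */", bits, (double)value);
}
```

For 0.1f this yields the pattern 0x3dcccccd next to the decimal 0.1, so a reader or an assembler can recover the exact encoded value from the hex.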
* util: use C99 declaration in the for-loop set_foreach() macro | Eric Engestrom | 2018-10-25 | 2 | -4/+0
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* util: use C99 declaration in the for-loop hash_table_foreach() macro | Eric Engestrom | 2018-10-25 | 4 | -8/+0
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* anv: move variable to proper scope and mark as MAYBE_UNUSED | Eric Engestrom | 2018-10-24 | 1 | -2/+1
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: use snprintf() instead of memset()+strcpy() | Eric Engestrom | 2018-10-24 | 1 | -4/+3
  snprintf() guarantees that it will not write more chars than allowed, and that the string will be null-terminated, without the need to fill the whole thing with zeroes to begin with.

  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
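A small sketch of the pattern this commit describes, using a hypothetical `copy_name()` helper: snprintf() bounds the write and terminates the string in one call, so the destination does not need to be zero-filled first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Copy src into a fixed-size buffer. snprintf() writes at most size bytes
 * including the terminating NUL, and returns the length the full string
 * would have had, so a result >= size signals truncation. */
static int
copy_name(char *dst, size_t size, const char *src)
{
   assert(size > 0);
   return snprintf(dst, size, "%s", src);
}
```

Passing the source through a "%s" format, rather than using it as the format string itself, keeps any '%' characters in the name from being interpreted.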
* anv: drop unused includes | Eric Engestrom | 2018-10-24 | 1 | -3/+0
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Allow presenting via a different GPU | Alex Smith | 2018-10-24 | 1 | -2/+2
  anv_GetPhysicalDeviceSurfaceSupportKHR will already return success for this, but anv_GetPhysicalDevice{Xcb,Xlib}PresentationSupportKHR do not. Apps which check for presentation support via the latter (all Feral Vulkan games at least) will therefore fail.

  This allows me to render on an Intel GPU and present to a display connected to an AMD card (tested HD 530 + Vega 64).

  v2: Rebase on current master.

  Signed-off-by: Alex Smith <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* intel/compiler: Change src1 reg type to unsigned doubleword | Sagar Ghuge | 2018-10-23 | 2 | -3/+3
  To have uniform behavior while disassembling a send(c) instruction, use a register type of unsigned doubleword for src1 when the message descriptor is an immediate value.

  Bspec does not specify anything for the src1 immediate default type.

  Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
  Signed-off-by: Sagar Ghuge <[email protected]>
* intel/decoders: fix end of batch limit | Lionel Landwerlin | 2018-10-23 | 3 | -9/+10
  Pointer arithmetic...

  v2: s/4/sizeof(uint32_t)/ (Eric)

  v3: Give bytes to print_batch() in error_decode (Lionel)
      Make clear what values we're dealing with in error_decode (Lionel)

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]> (v2)
  Reviewed-by: Kenneth Graunke <[email protected]>
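The class of bug hinted at by "Pointer arithmetic..." can be sketched as follows: adding a byte count to a `uint32_t *` advances by that many dwords, not bytes, so the end-of-batch pointer lands far past the buffer. `batch_end()` is a hypothetical helper for illustration, not the decoder's actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compute the one-past-the-end pointer of a batch whose size is given in
 * bytes. The division converts bytes to dwords before the addition, since
 * pointer arithmetic on uint32_t * already scales by sizeof(uint32_t). */
static const uint32_t *
batch_end(const uint32_t *batch, size_t size_bytes)
{
   assert(size_bytes % sizeof(uint32_t) == 0);
   return batch + size_bytes / sizeof(uint32_t);
}
```

Writing `batch + size_bytes` instead would place the limit four times too far along, which is why making the unit of each value explicit matters.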
* intel: Fix decoding for partial STATE_BASE_ADDRESS updates. | Kenneth Graunke | 2018-10-22 | 2 | -6/+42
  STATE_BASE_ADDRESS only modifies various bases if the "modify" bit is set. Otherwise, we want to keep the existing base address.

  Iris uses this for updating Surface State Base Address while leaving the others as-is.

  v2: Also update aubinator_viewer_decoder (caught by Lionel)

  Reviewed-by: Lionel Landwerlin <[email protected]>
* anv,radv: Trivially expose two new VK_GOOGLE extensions | Jason Ekstrand | 2018-10-22 | 1 | -0/+2
  This patch exposes support for the following two extensions:

   * VK_GOOGLE_decorate_string
   * VK_GOOGLE_hlsl_functionality1

  There's nothing for the driver to do; it's all handled in spirv_to_nir.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107971
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* anv: Define trampolines as the weak functions | Jason Ekstrand | 2018-10-19 | 2 | -49/+21
  Instead of having weak references to the anv functions and separate trampoline functions with their own dispatch table, just make the trampoline functions weak. This gets rid of a dispatch table and potentially lets the compiler delete the unused weak function.

  The end result is a reduction in the .text section of 5.7K and a reduction in the .data section of 1.4K.

  Before:

       text    data     bss     dec     hex filename
    3190329  282232    8960 3481521  351fb1 _install/lib64/libvulkan_intel.so

  After:

       text    data     bss     dec     hex filename
    3184548  280792    8960 3474300  35037c _install/lib64/libvulkan_intel.so

  Reviewed-by: Lionel Landwerlin <[email protected]>
* Revert "anv: Stop generating weak references for instance entrypoints" | Jason Ekstrand | 2018-10-18 | 1 | -0/+13
  This reverts commit 00bb42105d6edf6e432c0e3712ffb9d3eb0aece4.

  It was not as well thought out as I had intended and broke the build when VK_KHR_display is disabled in the build.
* vulkan/wsi: Use VK_EXT_pci_bus_info for DRM fd matching | Jason Ekstrand | 2018-10-18 | 2 | -5/+3
  This lets us avoid passing the DRM fd around all over the place and gets us closer to layer utopia.

  Reviewed-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Eric Engestrom <[email protected]>
* anv: Stop generating weak references for instance entrypoints | Jason Ekstrand | 2018-10-18 | 1 | -13/+0
  We don't need weak references to instance entrypoints because we never have more than one of each so we don't need the NULL fall-back. This also helps us avoid forgetting things because we now get link errors for missing instance entrypoints.

  Reviewed-by: Lionel Landwerlin <[email protected]>
* vulkan/wsi: Implement GetPhysicalDevicePresentRectanglesKHR | Jason Ekstrand | 2018-10-18 | 1 | -0/+14
  This got missed during 1.1 enabling because it was defined as an interaction between device groups and WSI and it wasn't obvious it was in the delta.

  The idea behind it is that it's supposed to provide a hint to the application in a multi-GPU setup to indicate which regions of the screen are being scanned out by which GPU so a multi-device split-screen rendering application can render each part of the screen on the GPU that will be presenting it and avoid extra bus traffic between GPUs.

  On a single-GPU setup or one which doesn't support this present mode, we need to do something. We choose to return the window size (or a max-size rect) if the compositor, X server, or crtc is associated with the given physical device and zero rectangles otherwise.

  Reviewed-by: Lionel Landwerlin <[email protected]>
* vulkan/wsi: Store the instance allocator in wsi_device | Jason Ekstrand | 2018-10-18 | 2 | -3/+0
  We already have wsi_device and we know the instance allocator at wsi_device_init time so there's no need to pass it into the physical device queries. This also fixes a memory allocation domain bug that can occur if CreateSwapchain gets called prior to any queries (not likely) in which case the cached connection gets allocated off the device instead of the instance.

  Reviewed-by: Eric Engestrom <[email protected]>
  Reviewed-by: Lionel Landwerlin <[email protected]>
* vulkan: Add VK_EXT_calibrated_timestamps extension (radv and anv) [v5] | Keith Packard | 2018-10-17 | 5 | -0/+150
  Offers three clocks, device, clock monotonic and clock monotonic raw. Could use some kernel support to reduce the deviation between clock values.

  v2: Ensure deviation is at least as big as the GPU time interval.

  v3: Set device->lost when returning DEVICE_LOST.
      Use MAX2 and DIV_ROUND_UP instead of open coding these.
      Delete spurious TIMESTAMP in radv version.
      Suggested-by: Jason Ekstrand <[email protected]>
      Suggested-by: Lionel Landwerlin <[email protected]>

  v4: Add anv_gem_reg_read to anv_gem_stubs.c
      Suggested-by: Jason Ekstrand <[email protected]>

  v5: Adjust maxDeviation computation to max(sampled_clock_period) + sample_interval.
      Suggested-by: Bas Nieuwenhuizen <[email protected]>
      Suggested-by: Jason Ekstrand <[email protected]>

  Signed-off-by: Keith Packard <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
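The v5 formula, max(sampled_clock_period) + sample_interval, can be written out as the sketch below. The function name and parameters are hypothetical, and the sketch assumes all values are in nanoseconds and that the sample interval is the time between the start and the end of the sampling sequence.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* maxDeviation = max(sampled_clock_period) + sample_interval
 *
 * clock_periods_ns holds the tick period of every clock that was sampled;
 * begin_ns and end_ns bracket the whole sampling sequence. */
static uint64_t
max_deviation(const uint64_t *clock_periods_ns, size_t count,
              uint64_t begin_ns, uint64_t end_ns)
{
   assert(end_ns >= begin_ns);

   uint64_t max_period = 0;
   for (size_t i = 0; i < count; i++) {
      if (clock_periods_ns[i] > max_period)
         max_period = clock_periods_ns[i];
   }

   return max_period + (end_ns - begin_ns);
}
```

For instance, sampling a clock with an 84 ns period and one with a 1 ns period over a 500 ns interval gives a maximum deviation of 584 ns.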
* intel/compiler/icl: Use invocation id bits 22:16 instead of 23:17 | Topi Pohjolainen | 2018-10-17 | 1 | -2/+6
  Identifier bits in the dispatch header have changed. See Bspec:

    SINGLE_PATCH Payload:
      3D Pipeline Stages - 3D Pipeline Geometry - Hull Shader (HS) Stage IVB+ - Payloads IVB+

  Fixes: KHR-GL46.tessellation_shader.tessellation_shader_tc_barriers.barrier_guarded_read_write_calls

  Reviewed-by: Anuj Phogat <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
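The bit-range change can be illustrated with a plain C sketch. In the compiler this extraction is done by generated shader instructions rather than host code, so `bits()` and `hs_invocation_id()` are hypothetical helpers, and which payload dword carries the field is left to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract the inclusive bit range high:low from a 32-bit value. */
static uint32_t
bits(uint32_t value, unsigned high, unsigned low)
{
   assert(low <= high && high < 32);
   unsigned width = high - low + 1;
   uint32_t mask = width == 32 ? UINT32_MAX : ((UINT32_C(1) << width) - 1);
   return (value >> low) & mask;
}

/* Invocation id from the dispatch header dword: bits 22:16 on ICL,
 * bits 23:17 on earlier hardware. */
static uint32_t
hs_invocation_id(uint32_t header_dword, bool is_icl)
{
   return is_icl ? bits(header_dword, 22, 16) : bits(header_dword, 23, 17);
}
```

Reading an ICL header with the old 23:17 range shifts the result by one bit, which halves the id and drops its lowest bit.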
* i965/fs: Add 64-bit int immediate support to dump_instructions() | Matt Turner | 2018-10-16 | 2 | -0/+8
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv/skylake: disable ForceThreadDispatchEnable | Sergii Romantsov | 2018-10-16 | 1 | -7/+35
  On Skylake, enabling ForceThreadDispatchEnable causes a GPU hang.

  v2: ForceThreadDispatchEnable is enabled only for gen8; for gen9 and higher, go back to enabling PixelShaderHasUAV.

  v3 (Jason Ekstrand): Rework the comments a bit.

  CC: Jason Ekstrand <[email protected]>
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107941
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107760
  Fixes: 79270d2140ec (anv: Stop setting 3DSTATE_PS_EXTRA::PixelShaderHasUAV)
  Signed-off-by: Sergii Romantsov <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Implement VK_EXT_pci_bus_info | Lionel Landwerlin | 2018-10-16 | 3 | -5/+26
  Even though Intel GPUs are always at the same PCI location, all the info we need is already provided by libdrm. Let's be future proof.

  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* intel: disable FS IR validation in release mode. | Kenneth Graunke | 2018-10-15 | 1 | -0/+2
  We probably don't need to iterate, fprintf, and abort in release mode.

  Reviewed-by: Matt Turner <[email protected]>
* intel/nir, freedreno/ir3: Use the separated dead write vars pass | Caio Marcelo de Oliveira Filho | 2018-10-15 | 1 | -0/+1
  No changes to shader-db for intel. No changes to shader-db expected for freedreno.

  Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Don't advertise ASTC support on BSW | Jason Ekstrand | 2018-10-15 | 1 | -0/+8
  Tested-by: Mark Janes <[email protected]>
  Reviewed-by: Nanley Chery <[email protected]>
* anv: Split dispatch tables into device and instance | Jason Ekstrand | 2018-10-15 | 3 | -91/+230
  There's no reason why we need to generate trampoline functions for instance functions or carry N copies of the instance dispatch table around for every hardware generation. Splitting the tables and being more conservative shaves about 34K off .text and about 4K off .data when built with clang.

  Before splitting dispatch tables:

       text    data     bss     dec     hex filename
    3224305  286216    8960 3519481  35b3f9 _install/lib64/libvulkan_intel.so

  After splitting dispatch tables:

       text    data     bss     dec     hex filename
    3190325  282232    8960 3481517  351fad _install/lib64/libvulkan_intel.so

  Reviewed-by: Lionel Landwerlin <[email protected]>