path: root/src/intel
Commit message | Author | Age | Files | Lines
* i965: Move clip program compilation to the compilerJason Ekstrand2017-05-269-0/+2347
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Move SF compilation to the compilerJason Ekstrand2017-05-264-0/+932
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/compiler: Make brw_disasm take const assemblyJason Ekstrand2017-05-263-15/+15
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/decoder: Handle the BLT ring in gen_group_get_lengthJason Ekstrand2017-05-261-0/+4
| | | | Reviewed-by: Jordan Justen <[email protected]>
* intel/decoder: Handle gen4 VF_STATISTICS and PIPELINE_SELECTJason Ekstrand2017-05-261-2/+7
| | | | | | | These need special handling because they have no "DWord Length" parameter and they have an unusual bias of 1. Reviewed-by: Jordan Justen <[email protected]>
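The special case in the commit above can be sketched as follows. This is a minimal illustration, not the decoder's actual code: the function name `packet_length`, the `has_length_field` flag, and the assumption that the "DWord Length" field sits in the low 8 bits of the header with a default bias of 2 are all hypothetical.

```python
def packet_length(header: int, has_length_field: bool, bias: int = 2) -> int:
    """Return the total packet length in DWords (hypothetical helper)."""
    if not has_length_field:
        # No "DWord Length" field: the packet has a fixed size equal to its bias.
        return bias
    # Assumed layout: the length field is the low 8 bits of the header DWord.
    return (header & 0xFF) + bias
```

Under these assumptions, a packet without a length field and with a bias of 1 always decodes as a single DWord, which is the behavior the commit message describes for gen4 VF_STATISTICS and PIPELINE_SELECT.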
* intel/genxml: Rename 3DSTATE_AA_LINE_PARAMS on gen5Jason Ekstrand2017-05-261-1/+1
| | | | | | All of the other gens use "PARAMETERS". Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/genxml: Use the right subtype for VF_STATISTICS on gen4Jason Ekstrand2017-05-261-1/+1
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/genxml: Iron Lake doesn't support non-normalized sampler coordinatesJason Ekstrand2017-05-261-1/+0
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/genxml: Add SAMPLER_STATE to gen 4.5Jason Ekstrand2017-05-261-0/+63
| | | | | | Somehow this got missed. Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/genxml: Rename the CC_VIEWPORT pointer on gen4-5Jason Ekstrand2017-05-263-3/+3
| | | | | | | It isn't a pointer to "color calc state", that's the packet it's in. It's a pointer to the CC viewport state. Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/genxml: Sampler state is a pointer on gen4-5Jason Ekstrand2017-05-263-9/+9
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/genxml: Suffix KSP0 fields on Iron LakeJason Ekstrand2017-05-261-5/+5
| | | | | | | | | Iron Lake introduced the multiple KSP thing and so you have KSP0-3. However, the genxml didn't have an index on the first "Kernel Start Pointer" or "GRF Register Count". Add one to match gen6+. While we're here, we drop the brackets from the other "GRF Register Count" fields. Reviewed-by: Matt Turner <[email protected]>
* intel/genxml: Make a bunch of things offsets on gen4-5Jason Ekstrand2017-05-263-15/+15
| | | | | | | | | Most things on gen4-5 are addresses because we don't have dynamic state base address and we don't have instruction state base on gen4. However, whoever converted things to addresses got a little over-excited and converted too much. Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/isl: Add gen4_filter_tilingJason Ekstrand2017-05-263-2/+57
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/isl: Add support for setting component write disablesJason Ekstrand2017-05-262-0/+26
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/isl: Add support for gen4 cube maps to get_image_offset_saJason Ekstrand2017-05-261-5/+18
| | | | | | | Gen4 cube maps are a 2-D surface with ISL_DIM_LAYOUT_GEN4_3D which is a bit weird but accurate nonetheless. Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/isl: Don't request space for stencil/hiz packets unless neededJason Ekstrand2017-05-261-7/+6
| | | | | | | On Iron Lake, the packets exist but we never emit them so there's no need for us to ask the driver to make batch space for them. Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Move the gen7 stencil format workaround to blorp_blitJason Ekstrand2017-05-262-5/+9
| | | | | | | | It's not needed for blorp_copy because it already overrides formats. It's also not needed for blorp_clear because it clears stencil as stencil. Reviewed-by: Topi Pohjolainen <[email protected]>
* aubinator: report error on unknown device idLionel Landwerlin2017-05-241-1/+1
| | | | | | | | | Since we're going to stop aubinator without a valid device id, better report an error. This also silences a Coverity warning. CID: 1405004 Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* aubinator: be consistent on exit codeLionel Landwerlin2017-05-241-5/+5
| | | | | | | | We're using both exit(1) & exit(EXIT_FAILURE), settle for one, same for success. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* aubinator: fix double freeLionel Landwerlin2017-05-241-1/+1
| | | | | | | Free previously allocated filename outside the for loop. CID: 1405014 Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv: Require vertex buffers to come from a 32-bit heapJason Ekstrand2017-05-231-0/+12
| | | | | Reviewed-by: Nanley Chery <[email protected]> Cc: "17.1" <[email protected]>
* anv: Advertise both 32-bit and 48-bit heaps when we have enough memoryJason Ekstrand2017-05-231-6/+36
| | | | | Reviewed-by: Nanley Chery <[email protected]> Cc: "17.1" <[email protected]>
* anv: Refactor memory type setupJason Ekstrand2017-05-231-36/+40
| | | | | | | | This makes us walk over the heaps one at a time and add the types for LLC and !LLC to each heap. Reviewed-by: Nanley Chery <[email protected]> Cc: "17.1" <[email protected]>
* anv: Make supports_48bit_addresses a heap propertyJason Ekstrand2017-05-232-3/+14
| | | | | Reviewed-by: Nanley Chery <[email protected]> Cc: "17.1" <[email protected]>
* anv: Stop setting BO flags in bo_init_newJason Ekstrand2017-05-234-15/+25
| | | | | | | | | | | | | | | | | | The idea behind doing this was to make it easier to set various flags. However, we have enough custom flag settings floating around the driver that this is more of a nuisance than a help. This commit has the following functional changes: 1) The workaround_bo created in anv_CreateDevice loses both flags. This shouldn't matter because it's very small and entirely internal to the driver. 2) The bo created in anv_CreateDmaBufImageINTEL loses the EXEC_OBJECT_ASYNC flag. In retrospect, it never should have gotten EXEC_OBJECT_ASYNC in the first place. Reviewed-by: Nanley Chery <[email protected]> Cc: "17.1" <[email protected]>
* anv: Set image memory types based on the type countJason Ekstrand2017-05-231-2/+4
| | | | | Reviewed-by: Nanley Chery <[email protected]> Cc: "17.1" <[email protected]>
* anv: Add valid_buffer_usage to the memory type metadataJason Ekstrand2017-05-232-8/+26
| | | | | | | | | Instead of returning valid types as just a number, we now walk the list and check the buffer's usage against the usage flags we store in the new anv_memory_type structure. Currently, valid_buffer_usage == ~0. Reviewed-by: Nanley Chery <[email protected]> Cc: "17.1" <[email protected]>
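The walk described in the commit above might look roughly like this. It is a sketch under stated assumptions: `MemoryType`, `memory_type_bits`, and the exact comparison are illustrative names and logic, not taken from the driver source; only the `valid_buffer_usage` field name and its current value of `~0` come from the commit message.

```python
from dataclasses import dataclass

ALL_USAGES = ~0  # every usage bit set, mirroring "valid_buffer_usage == ~0"


@dataclass
class MemoryType:
    # Bitmask of buffer usage flags this memory type may back (assumed meaning).
    valid_buffer_usage: int


def memory_type_bits(types, buffer_usage):
    """Build a bitmask with bit i set when types[i] allows buffer_usage."""
    bits = 0
    for index, mem_type in enumerate(types):
        # A type qualifies only if it permits every usage flag the buffer requests.
        if (buffer_usage & mem_type.valid_buffer_usage) == buffer_usage:
            bits |= 1 << index
    return bits
```

With every type set to `ALL_USAGES`, the result is simply a mask with one bit per type, which matches the old behavior of returning the valid types as a plain number.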
* anv: Determine the type of mapping based on type metadataJason Ekstrand2017-05-232-7/+7
| | | | | | | | | Before, we were just comparing the type index to 0. Now we actually look the type up in the table and check its properties to determine what kind of mapping we want to do. Reviewed-by: Nanley Chery <[email protected]> Cc: "17.1" <[email protected]>
* anv: Set up memory types and heaps during physical device initJason Ekstrand2017-05-232-44/+77
| | | | | Reviewed-by: Nanley Chery <[email protected]> Cc: "17.1" <[email protected]>
* anv: Predicate 48bit support on gen >= 8Jason Ekstrand2017-05-231-1/+6
| | | | | | | | | This doesn't matter right now since it only affects whether or not we set the kernel bit but, if we ever do anything else based on it, we'll want it to be correct per-gen. Reviewed-by: Nanley Chery <[email protected]> Cc: "17.1" <[email protected]>
* anv/image: Get rid of the memset(aux, 0, sizeof(aux)) hackJason Ekstrand2017-05-231-28/+0
| | | | | | | | | | | | | Up until now, we've been memsetting the auxiliary surface to 0 at BindImageMemory time to ensure that it is properly initialized. However, this isn't correct because apps are allowed to freely alias memory between different images and buffers so long as they properly track whether or not a particular image is valid and, if it isn't, transition from UNINITIALIZED to something else before using it. We now implement those transitions so we can drop the hack. Reviewed-by: Nanley Chery <[email protected]> Cc: "17.1" <[email protected]>
* anv: Handle transitioning depth from UNDEFINED to other layoutsJason Ekstrand2017-05-232-19/+19
| | | | | Reviewed-by: Nanley Chery <[email protected]> Cc: "17.1" <[email protected]>
* anv: Handle color layout transitions from the UNINITIALIZED layoutJason Ekstrand2017-05-233-2/+108
| | | | | | | | This causes dEQP-VK.api.copy_and_blit.resolve_image.partial.* to start failing due to test bugs. See CL 1031 for a test fix. Reviewed-by: Nanley Chery <[email protected]> Cc: "17.1" <[email protected]>
* intel/isl: Add ASTC HDR to format lists and helpersNanley Chery2017-05-223-2/+58
| | | | | Reviewed-by: Anuj Phogat <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
* android: add -Wl,--build-id=sha1 to LDFLAGS for libvulkan_intelTapani Pälli2017-05-201-0/+2
| | | | | | | Just like is done on desktop and what is expected by the build-id code. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* configure: check once for DRI3 dependenciesEmil Velikov2017-05-191-2/+1
| | | | | | | | | | | | | | | | Currently we are having the XCB_DRI3 dependencies duplicated, partially. Just do a once-off check and add all of the respective CFLAGS/LIBS where needed. As a nice side effect this helps us solve a couple of FIXMEs. DRI3 is not a thing w/o X11 so disable it in such cases. Cc: [email protected] Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* anv/formats: Update the three-channel BC1 mappingsNanley Chery2017-05-181-2/+2
| The procedure for decompressing an opaque BC1 Vulkan format is dependent on
| the comparison of two colors stored in the first 32 bits of the compressed
| block. Here's the specified OpenGL (and Vulkan) behavior for reference:
|
|     The RGB color for a texel at location (x,y) in the block is given by:
|
|         RGB0,              if color0 > color1 and code(x,y) == 0
|         RGB1,              if color0 > color1 and code(x,y) == 1
|         (2*RGB0+RGB1)/3,   if color0 > color1 and code(x,y) == 2
|         (RGB0+2*RGB1)/3,   if color0 > color1 and code(x,y) == 3
|
|         RGB0,              if color0 <= color1 and code(x,y) == 0
|         RGB1,              if color0 <= color1 and code(x,y) == 1
|         (RGB0+RGB1)/2,     if color0 <= color1 and code(x,y) == 2
|         BLACK,             if color0 <= color1 and code(x,y) == 3
|
| The sampling operation performed on an opaque DXT1 Intel format essentially
| hard-codes the comparison result of the two colors as color0 > color1. This
| means that the behavior is incompatible with OpenGL and Vulkan. This is stated
| in the SKL PRM, Vol 5: Memory Views:
|
|     Opaque Textures (DXT1_RGB)
|         Texture format DXT1_RGB is identical to DXT1, with the exception that
|         the One-bit Alpha encoding is removed. Color 0 and Color 1 are not
|         compared, and the resulting texel color is derived strictly from the
|         Opaque Color Encoding. The alpha channel defaults to 1.0.
|
|         Programming Note
|         Context: Opaque Textures (DXT1_RGB)
|         The behavior of this format is not compliant with the OGL spec.
|
| The opaque and non-opaque BC1 Vulkan formats are specified to be decoded in
| exactly the same way except the BLACK value must have a transparent alpha
| channel in the latter. Use the four-channel BC1 Intel formats with the alpha
| set to 1 to provide the behavior required by the spec.
|
| v2 (Kenneth Graunke):
| - Provide a more detailed commit message.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100925 Cc: <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Nanley Chery <[email protected]>
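The decode rule quoted in the commit message above can be written out as a small function. This is only a sketch of the specified selection logic: the name `bc1_texel_rgb` and its parameters are made up for illustration, the endpoints are assumed to be already expanded to floating-point RGB tuples, and `color0`/`color1` are assumed to be the raw endpoint values being compared.

```python
def bc1_texel_rgb(color0, color1, rgb0, rgb1, code):
    """Pick the RGB value for one texel of an opaque BC1 block.

    color0, color1: raw endpoint values as stored in the block (compared as integers).
    rgb0, rgb1:     the same endpoints decoded to (r, g, b) tuples.
    code:           the 2-bit selector for this texel, 0..3.
    """
    if code not in (0, 1, 2, 3):
        raise ValueError("code must be a 2-bit value")
    if code == 0:
        return rgb0
    if code == 1:
        return rgb1
    if color0 > color1:
        # Four-color mode: two interpolated colors at 1/3 and 2/3.
        if code == 2:
            return tuple((2 * a + b) / 3 for a, b in zip(rgb0, rgb1))
        return tuple((a + 2 * b) / 3 for a, b in zip(rgb0, rgb1))
    # Three-color mode: one midpoint color, and code 3 decodes to black.
    if code == 2:
        return tuple((a + b) / 2 for a, b in zip(rgb0, rgb1))
    return (0.0, 0.0, 0.0)
```

A format that always takes the `color0 > color1` branch, as the PRM text describes for DXT1_RGB, would never produce the midpoint or black results, which is the incompatibility the commit works around.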
* anv: Add an option to abort on device lossJason Ekstrand2017-05-181-0/+5
| | | | | | | | | | | | | | | | This is mostly for running in our CI system to prevent dEQP from continuing on to the next test if we get a GPU hang. As it currently stands, dEQP uses the same VkDevice for almost all tests and if one of the tests hangs, we set the anv_device::device_lost flag and report VK_ERROR_DEVICE_LOST for all queue operations from that point forward without sending anything to the GPU. dEQP will happily continue trying to run tests and reporting failures until it eventually gets a crash that forces the test runner to start over. This circumvents the problem by just aborting the process if we ever get a GPU hang. Since this is not the recommended behavior most of the time, we hide it behind an environment variable. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Wrap the device lost error in vk_error in QueueSubmitJason Ekstrand2017-05-181-1/+1
| | | | | | | | We weren't wrapping this before because anv_cmd_buffer_execbuf may throw a more meaningful error message. However, we do change the error code into VK_ERROR_DEVICE_LOST, so we should print a new message. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: fix multiview for clear commandsIago Toral Quiroga2017-05-181-0/+41
| | | | | | | | | | | | | | | | | | | | According to the VK_KHX_multiview spec: "Multiview causes all drawing and clear commands in the subpass to behave as if they were broadcast to each view, where each view is represented by one layer of the framebuffer attachments." This adds support for multiview clears, which were missing in the initial implementation. v2 (Jason): - split multiview from regular case - Use for_each_bit() macro Fixes new CTS multiview tests: dEQP-VK.multiview.clear_attachments.* Reviewed-by: Jason Ekstrand <[email protected]>
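The broadcast behavior quoted from the spec above can be illustrated with a short sketch. The helper names are hypothetical: `for_each_bit` here is a Python stand-in for the macro mentioned in the commit message, and `clear_layers` assumes that view index N maps directly to framebuffer layer N and that a view mask of 0 means multiview is off.

```python
def for_each_bit(mask):
    """Yield the index of every set bit in a non-negative mask, lowest first."""
    index = 0
    while mask:
        if mask & 1:
            yield index
        mask >>= 1
        index += 1


def clear_layers(view_mask, base_layer, layer_count):
    """Return the list of framebuffer layers a clear command would touch."""
    if view_mask == 0:
        # Regular (non-multiview) case: clear the requested layer range.
        return list(range(base_layer, base_layer + layer_count))
    # Multiview case: broadcast the clear to one layer per enabled view.
    return list(for_each_bit(view_mask))
```

Keeping the two cases separate mirrors the v2 note about splitting multiview from the regular path.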
* i965/vec4: load dvec3/4 uniforms first in the push constant bufferSamuel Iglesias Gonsálvez2017-05-181-27/+80
| | | | | | | | | | | | | | | | | | | | | | | | Reorder the uniforms to load first the dvec4-aligned variables in the push constant buffer and then push the vec4-aligned ones. It takes into account that the relocated uniforms should be aligned to their channel size. This fixes a bug where a dvec3/4 might be loaded partly in one GRF and partly in the next GRF, so the region parameters used to read it could break the HW rules. v2: - Fix broken logic. - Add a comment to explain what should be needed to optimise the usage of the push constant buffer slots, as this patch does not pack the uniforms. v3: - Implemented the push constant buffer usage optimization. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Cc: "17.1" <[email protected]> Acked-by: Francisco Jerez <[email protected]>
* i965/vec4: fix swizzle and writemask when loading a uniform with constant offsetSamuel Iglesias Gonsálvez2017-05-181-4/+11
| | | | | | | | | | | | | | | | | | | | | | | | | It was setting XYWZ swizzle and writemask to all uniforms, no matter if they were a vector or scalar, so this can lead to problems when loading them to the push constant buffer. Moreover, the 'shift' calculation was designed to calculate the offset in DWORDS, but it doesn't take into account DFs, so the calculated swizzle for the latter was wrong. The indirect case is not changed because MOV INDIRECT will write to all components. Added an assert to verify that these uniforms are aligned. v2: - Fix 'shift' calculation (Curro) - Set both swizzle and writemask. - Add assert(shift == 0) for the indirect case. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Cc: "17.1" <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* i965/vec4/gs: restore the uniform values which were overwritten by failed vec4_gs_visitor executionSamuel Iglesias Gonsálvez2017-05-181-0/+26
| | | | | | | | | | | | | | | | | We are going to add a packing feature to reduce the usage of the push constant buffer. One of the consequences is that 'nr_params' would be modified by vec4_visitor's run call, so we need to restore it if one of them failed before executing the fallback ones. The same thing happens to the uniform values that would be reordered afterwards. Fixes GL45-CTS.arrays_of_arrays_gl.InteractionFunctionCalls2 when the dvec4 alignment and packing patch is applied. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Cc: "17.1" <[email protected]> Acked-by: Francisco Jerez <[email protected]>
* Android: correct libz dependencyChih-Wei Huang2017-05-171-1/+1
| | | | | | | | | | | | | | | | | | | Commit 6facb0c0 ("android: fix libz dynamic library dependencies") unconditionally adds libz as a dependency to all shared libraries. That is unnecessary. Commit 85a9b1b5 introduced libz as a dependency to libmesa_util. So only the shared libraries that use libmesa_util need libz. Fix Android Lollipop build by adding the include path of zlib to libmesa_util explicitly instead of getting the path implicitly from zlib since it doesn't export the include path in Lollipop. Fixes: 6facb0c0 "android: fix libz dynamic library dependencies" Signed-off-by: Chih-Wei Huang <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Rob Herring <[email protected]>
* intel/isl/gen6: Fix combined depth stencil alignmentJason Ekstrand2017-05-161-7/+7
| | | | | | | | All combined depth stencil buffers (even those with just stencil) require a 4x4 alignment on Sandy Bridge. The only depth/stencil buffer type that requires 4x2 is separate stencil. Reviewed-by: Chad Versace <[email protected]>
* intel/isl: Refactor gen8_choose_image_alignment_elJason Ekstrand2017-05-161-140/+49
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* intel/isl: Refactor gen6_choose_image_alignment_elJason Ekstrand2017-05-161-18/+14
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* intel/isl: Refactor gen7_choose_image_alignment_elJason Ekstrand2017-05-161-83/+74
| | | | | | | | | | | | | | | The Ivy Bridge PRM provides a nice table that handles most of the alignment cases in one place. For standard color buffers we have a little freedom of choice but for most depth, stencil and compressed it's hard-coded. Chad's original functions split halign and valign apart and implemented them almost entirely based on restrictions and not the table. This makes things way more confusing than they need to be. This commit gets rid of the split and makes us implement the exact table up-front. If our surface isn't one of the ones in the table then we have to make real choices. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* intel/isl/gen7: Use stencil vertical alignment of 8 instead of 4Pohjolainen, Topi2017-05-161-23/+5
| | | | | | | | | | | | | | | | | | | | | The reasoning Chad gave in the comment for choosing a valign of 4 is entirely bunk. The fact that you have to multiply pitch by 2 is completely unrelated to the halign/valign parameters used for texture layout. (Not completely unrelated. W-tiling is just Y-tiling with a bit of extra swizzling which turns 8x8 W-tiled chunks into 16x4 Y-tiled chunks so it makes everything easier if miplevels are always aligned to 8x8.) The fact that RENDER_SURFACE_STATE::SurfaceVerticalAlignment doesn't have a VALIGN_8 option doesn't matter since this is gen7 and you can't do stencil texturing anyway. v2 (Jason Ekstrand): - Delete most of Chad's comment and add a more descriptive commit message. Signed-off-by: Topi Pohjolainen <[email protected]> Cc: "17.0 17.1" <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Chad Versace <[email protected]>