path: root/src/intel
Commit message | Author | Age | Files | Lines
* intel/isl: Allow creation of 1-D compressed textures | Jason Ekstrand | 2016-10-03 | 2 | -3/+11
| | | | | | | | | | | Compressed 1-D textures are not a well-defined thing in either GL or Vulkan. However, auxiliary surfaces are treated as compressed textures in ISL and we can do HiZ and CCS with 1-D so we need to be able to create them. In order to prevent actually using them (the docs say no), we assert in the state setup code. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/isl: Fix up asserts in calc_phys_level0_extent_sa | Jason Ekstrand | 2016-10-03 | 1 | -2/+4
| | | | | | | | | | | | | The assertion that a format is uncompressed in the multisample layouts isn't quite right. What we really want to assert is that the format supports multisampling, which is a slightly more complicated query. We also want to assert that it has a block size of 1x1 since we do nothing with the block size in the phys_level0_sa assignment. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/isl: Add a format_supports_multisampling helper | Jason Ekstrand | 2016-10-03 | 5 | -36/+33
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv/formats: Fix build on gcc-4 and earlier | Ville Syrjälä | 2016-10-03 | 1 | -4/+19
| | | | | | | | | | | | | | | | | | | | gcc-4 and earlier don't allow compound literals where a constant is required in -std=c99/gnu99 mode, so we can't use ISL_SWIZZLE() when populating the anv_formats[] array. There are a few ways around it: First one would be -std=c89/gnu89, but the rest of the code depends on c99 so it's not really an option. The second option would be to upgrade to gcc-5+ where the compiler behaviour was relaxed a bit [1]. And the third option is just to avoid using compound literals. I chose the last option since it keeps gcc-4 and earlier working. [1] https://gcc.gnu.org/gcc-5/porting_to.html Cc: Jason Ekstrand <[email protected]> Cc: Topi Pohjolainen <[email protected]> Fixes: 7ddb21708c80 ("intel/isl: Add an isl_swizzle structure and use it for isl_view swizzles") Signed-off-by: Ville Syrjälä <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
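The compound-literal restriction described above can be illustrated with a small sketch. The struct and macro names here are hypothetical stand-ins, not the real ISL or anv definitions, and the brace-initializer form is only one way of avoiding compound literals; the commit itself may have done it differently.

```c
#include <assert.h>

/* Hypothetical stand-ins for illustration; not the real ISL/anv types. */
struct swizzle { unsigned r, g, b, a; };
struct format_entry { int id; struct swizzle swizzle; };

/* Compound-literal form. Per the commit message, gcc-4 and earlier reject
 * this where a constant is required in c99/gnu99 mode. It is defined here
 * only for comparison and is deliberately left unused. */
#define SWIZZLE_LITERAL(r, g, b, a) ((struct swizzle) { r, g, b, a })

/* Plain brace-initializer form, which is a valid constant initializer in
 * every C99 compiler. */
#define SWIZZLE_INIT(r, g, b, a) { r, g, b, a }

static const struct format_entry formats[] = {
   { 1, SWIZZLE_INIT(0, 1, 2, 3) },
   { 2, SWIZZLE_INIT(2, 1, 0, 3) },
};

/* Small accessor so the table is exercised. */
unsigned
format_red_channel(unsigned index)
{
   return formats[index].swizzle.r;
}
```

The trade-off is that the brace form only works inside an initializer, whereas the compound-literal form is also usable as an expression.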
* i965: rename max_ds_* variable to max_tes_* | Timothy Arceri | 2016-10-03 | 3 | -27/+27
| | | | | | Using consistent naming allows us to create macros more easily. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: rename max_hs_* variables to max_tcs_* | Timothy Arceri | 2016-10-03 | 3 | -27/+27
| | | | | | Using consistent naming allows us to create macros more easily. Reviewed-by: Kenneth Graunke <[email protected]>
* aubinator: Fix the decoding of values that span two Dwords | Sirisha Gandikota | 2016-09-26 | 1 | -13/+37
| Fixed the way the values that span two Dwords are decoded. Based on the start and end indices of the field, the Dwords are fetched and decoded accordingly.
|
| v2: rename dw to qw in gen_field_iterator_next and remove extra white space (Anuj)
| v3: change all instances of dw to qw (Anuj)
|
| Earlier, 64-bit fields (such as most pointers on Gen8+) weren't decoded correctly. gen_field_iterator_next seemed to walk one DWord at a time, set v.dw, and then pass it to field(). So, even though field() takes a uint64_t, we're passing it a uint32_t (which gets promoted, so the top 32 bits will always be zero). This seems pretty bogus... (Ken)
|
| Signed-off-by: Sirisha Gandikota <[email protected]>
| Reviewed-by: Anuj Phogat <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
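The idea of fetching one or two Dwords based on a field's start and end bit can be sketched as below. This is a minimal illustration under stated assumptions, not aubinator's code: `extract_field` is a hypothetical name, bit indices are inclusive and counted from bit 0 of the first Dword, and a field is assumed to touch at most two adjacent Dwords.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return bits [start, end] (inclusive) of a packet
 * stored as 32-bit Dwords. The caller must guarantee that the second
 * Dword is readable whenever the field crosses a Dword boundary. */
uint64_t
extract_field(const uint32_t *p, unsigned start, unsigned end)
{
   unsigned index = start / 32;

   assert(start <= end);
   assert(end / 32 <= index + 1);   /* at most two adjacent Dwords */

   /* Build a 64-bit window; only read the high Dword when needed. */
   uint64_t qw = p[index];
   if (end / 32 > index)
      qw |= (uint64_t)p[index + 1] << 32;

   unsigned width = end - start + 1;
   uint64_t mask = width == 64 ? ~0ull : (1ull << width) - 1;

   return (qw >> (start % 32)) & mask;
}
```

Widening to 64 bits before shifting is the point: shifting a 32-bit value and promoting afterwards would leave the upper half zero, which is the failure mode the commit message describes.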
* aubinator: fix resource leak | Nayan Deshmukh | 2016-09-25 | 1 | -1/+3
| | | | | | | CovID: 1373370 Signed-off-by: Nayan Deshmukh <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* anv: Check for VK_WHOLE_SIZE in anv_CmdFillBuffer | Nicolas Koch | 2016-09-23 | 1 | -0/+6
| | | | | | | | | | | From the Vulkan spec: Size is the number of bytes to fill, and must be either a multiple of 4, or VK_WHOLE_SIZE to fill the range from offset to the end of the buffer. If VK_WHOLE_SIZE is used and the remaining size of the buffer is not a multiple of 4, then the nearest smaller multiple is used. Reviewed-by: Jason Ekstrand <[email protected]>
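The spec rule quoted above amounts to a small size computation. The sketch below is illustrative only: `resolve_fill_size` and `WHOLE_SIZE` are hypothetical local names (the Vulkan headers define `VK_WHOLE_SIZE` as `~0ULL`), and this is not the anv implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Local stand-in for VK_WHOLE_SIZE so the sketch needs no Vulkan headers. */
#define WHOLE_SIZE (~0ull)

/* Hypothetical helper: resolve the number of bytes a fill should cover. */
uint64_t
resolve_fill_size(uint64_t buffer_size, uint64_t offset, uint64_t size)
{
   assert(offset <= buffer_size);

   if (size == WHOLE_SIZE) {
      /* Fill from offset to the end of the buffer... */
      size = buffer_size - offset;
      /* ...rounded down to the nearest multiple of 4. */
      size &= ~(uint64_t)3;
   }

   return size;
}
```

For example, a 103-byte buffer filled from offset 0 with the whole-size sentinel resolves to 100 bytes.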
* anv: get rid of duplicated values from gen_device_info | Lionel Landwerlin | 2016-09-23 | 6 | -43/+28
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/i965: make gen_device_info mutable | Lionel Landwerlin | 2016-09-23 | 7 | -53/+59
| | | | | | | | | | | | Make gen_device_info a mutable structure so we can update the fields that can be refined by querying the kernel (like subslices and EU numbers). This patch does not make any functional change, it just makes gen_get_device_info() fill a structure rather than returning a const pointer. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* anv: pipeline: use correct number of thread for compute | Lionel Landwerlin | 2016-09-21 | 1 | -1/+4
| | | | | | | | | | | | | Reproduces this commit : commit 0fb85ac08d61d365e67c8f79d6955e9f89543560 Author: Kenneth Graunke <[email protected]> Date: Mon Jun 6 21:37:34 2016 -0700 i965: Use the correct number of threads for compute shaders. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* anv: allocator: correct scratch space for haswell | Lionel Landwerlin | 2016-09-21 | 1 | -1/+21
| | | | | | | | | | | | | This reproduces this commit : commit 2213ffdb4bb79856f0556bdf2bfd4bdf57720232 Author: Kenneth Graunke <[email protected]> Date: Mon Jun 6 21:37:34 2016 -0700 i965: Allocate scratch space for the maximum number of compute threads. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* anv: device: calculate compute thread numbers using subslices numbers | Lionel Landwerlin | 2016-09-21 | 6 | -18/+74
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* aubinator: add a custom handler for immediate register load | Lionel Landwerlin | 2016-09-20 | 3 | -3/+47
| Transforming this:
|
| 0x00c77084:  0x11000001:  MI_LOAD_REGISTER_IMM
| 0x00c77088:  0x0000b020 : Dword 1
|     Register Offset: 0x0000b020
| 0x00c7708c:  0x00880038 : Dword 2
|     Data DWord: 8912952
|
| Into this:
|
| 0x007880f0:  0x11000001:  MI_LOAD_REGISTER_IMM
| 0x007880f4:  0x0000b020 : Dword 1
|     Register Offset: 0x0000b020
| 0x007880f8:  0x00080040 : Dword 2
|     Data DWord: 524352
| register L3CNTLREG2 (0xb020) : 0x80040
|     SLM Enable: 0
|     URB Allocation: 32
|     URB Low Bandwidth: 0
|     RO Allocation: 32
|     RO Low Bandwidth: 0
|     DC Allocation: 0
|     DC Low Bandwidth: 0
|
| v2: Drop unused arguments (Sirisha)
|     Print out register name
|
| Signed-off-by: Lionel Landwerlin <[email protected]>
* isl: Finish tiling filtering for Gen6. | Kenneth Graunke | 2016-09-15 | 3 | -5/+15
| | | | | | | | | | | | | Gen6 only has one additional restriction over Gen7+, so we just add it to the existing gen7 function (which actually covers later gens too). This should stop FINISHME spew when running GL on Sandybridge. v2: Fix bytes per block vs. bits per block confusion (Jason) and rename function to gen6_filter_tiling (Jason and Chad). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add a flag to lower_io to force "sample" interpolation | Jason Ekstrand | 2016-09-15 | 1 | -1/+1
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* anv/cmd_buffer: Set the L3 atomic disable mask bit in CHICKEN3 on HSW | Jason Ekstrand | 2016-09-14 | 2 | -0/+2
| | | | | | | | | | | Without this bit set, the value in "L3 Atomic Disable" won't get applied by the hardware so we won't properly get L3 atomic caching. Fixes dEQP-VK.spirv_assembly.instruction.compute.opatomic.compex and 198 of the dEQP-VK.image.atomic_operations.* tests on HSW Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* intel/blorp: Stop setting 3DSTATE_DRAWING_RECTANGLE | Jason Ekstrand | 2016-09-14 | 2 | -20/+0
| | | | | | | | | | The Vulkan driver sets 3DSTATE_DRAWING_RECTANGLE once to MAX_INT x MAX_INT at GPU initialization time and never sets it again. The GL driver sets it every time the framebuffer changes. Originally, blorp set it to the size of the drawing area, but that meant we had to set it back in the Vulkan driver. Instead, we can easily just do that in the GL driver's blorp_exec implementation and not set it in blorp core. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* intel/blorp: Emit 3DSTATE_MULTISAMPLE directly | Jason Ekstrand | 2016-09-14 | 2 | -40/+45
| | | | | | | | | Previously, we relied on a driver hook for 3DSTATE_MULTISAMPLE. However, now that Vulkan and GL use the same sample positions, we can set up 3DSTATE_MULTISAMPLE directly in blorp and delete the driver hook. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* intel: Move Vulkan sample positions to common code | Jason Ekstrand | 2016-09-14 | 4 | -21/+21
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* aubinator: Remove bogus "end" parameter in gen_disasm_disassemble() | Sirisha Gandikota | 2016-09-13 | 3 | -10/+12
| | | | | | | Earlier, the loop pretended to iterate over instructions from "start" to "end", but the callers always passed 8192 for end, which is some huge bogus value. The real loop termination condition is send-with-EOT or 0. (Ken) Signed-off-by: Sirisha Gandikota <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* aubinator: Make gen_disasm_disassemble handle split sends | Sirisha Gandikota | 2016-09-13 | 1 | -7/+12
| | | | | | | | | | Skylake adds new SENDS and SENDSC opcodes, which should be handled in the send-with-EOT check. Make an is_send() helper that checks if the opcode is SEND/SENDC/SENDS/SENDSC (Ken) v2: Make is_send() much crisper and mix declarations and code to keep the code compact (Ken) Signed-off-by: Sirisha Gandikota <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
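A helper of the kind described above could look roughly like this. The enum and its numeric values are placeholders chosen for the sketch, not a statement of the hardware encodings or of the definitions the disassembler actually uses.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder opcode values for illustration only. */
enum opcode {
   OPCODE_SEND   = 49,
   OPCODE_SENDC  = 50,
   OPCODE_SENDS  = 51,   /* split send, new on Skylake per the commit */
   OPCODE_SENDSC = 52,   /* split send conditional, likewise */
};

/* True for every flavour of send, so that a send-with-EOT check covers
 * the split-send opcodes as well as the original two. */
bool
is_send(unsigned opcode)
{
   return opcode == OPCODE_SEND  || opcode == OPCODE_SENDC ||
          opcode == OPCODE_SENDS || opcode == OPCODE_SENDSC;
}
```

Listing the four opcodes explicitly, rather than testing a numeric range, keeps the helper correct even if the encodings are not contiguous.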
* aubinator: Simplify print_dword_val() method | Sirisha Gandikota | 2016-09-13 | 1 | -8/+4
| | | | | | | | Remove the float/dword union and use iter->p[f->start / 32] directly, as the printf formatter %08x expects a uint32_t (Ken) v2: Make the cleanup much crisper (Ken) Signed-off-by: Sirisha Gandikota <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* anv/image: Set correct base_array_layer and array_len for storage images | Jason Ekstrand | 2016-09-13 | 1 | -0/+4
| | | | | | | Since Vulkan doesn't allow single-slice 3D storage images, we need to just set the base_array_layer and array_len to the full size of the 3-D LOD. Signed-off-by: Jason Ekstrand <[email protected]>
* Revert "intel/isl: Ignore base_array_layer and array_len for 3D storage..." | Jason Ekstrand | 2016-09-13 | 1 | -6/+2
| | | | | | | This reverts commit 3943888c94beca69e575b8d3d1ec7a6cbf474ee4. It turns out that commit was pretty much bogus since it breaks binding a 3-D texture as a 2-D storage image. The correct fix for the Vulkan CTS tests needs to be in the Vulkan driver itself rather than ISL. Signed-off-by: Jason Ekstrand <[email protected]>
* anv: Use blorp for doing MSAA resolves | Jason Ekstrand | 2016-09-13 | 5 | -881/+121
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* anv: Use blorp for ClearColorImage | Jason Ekstrand | 2016-09-13 | 2 | -21/+56
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* anv: Delete meta_blit2d | Jason Ekstrand | 2016-09-13 | 4 | -1590/+0
| | | | | | | | Everything that we were once using the blit2d framework for is now done with blorp. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* anv/blorp: Add a gcd_pow2_u64 helper and use it for buffer alignments | Jason Ekstrand | 2016-09-13 | 1 | -24/+24
| | | | | | | | This is a lot cleaner and easier to read than the old piles of if statements. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
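One way to compute the largest power of two dividing two values, which is what a helper with this name suggests, is sketched below. The function name comes from the commit title, but the body is an assumption written for illustration and may differ from what the commit added.

```c
#include <assert.h>
#include <stdint.h>

/* Largest power of two that divides both a and b. A zero argument is
 * treated as divisible by everything, so the other argument decides. */
uint64_t
gcd_pow2_u64(uint64_t a, uint64_t b)
{
   assert(a > 0 || b > 0);

   /* The lowest set bit of (a | b) is the smaller of the two lowest set
    * bits, which is exactly the largest common power-of-two divisor. */
   uint64_t bits = a | b;
   return bits & (~bits + 1);
}
```

For instance, an offset of 64 and a size of 48 share an alignment of 16, which replaces a chain of "if divisible by 16, else if divisible by 8, ..." checks with a single expression.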
* anv: Use blorp for CopyBuffer and UpdateBuffer | Jason Ekstrand | 2016-09-13 | 3 | -181/+187
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv: Use blorp for CopyImage | Jason Ekstrand | 2016-09-13 | 2 | -158/+67
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv: Use blorp for CopyBufferToImage | Jason Ekstrand | 2016-09-13 | 2 | -125/+16
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv: Use blorp for CopyImageToBuffer | Jason Ekstrand | 2016-09-13 | 2 | -16/+134
| | | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv: Use blorp to implement VkBlitImage | Jason Ekstrand | 2016-09-13 | 5 | -750/+144
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* anv: Make image_get_surface_for_aspect_mask const | Jason Ekstrand | 2016-09-13 | 3 | -7/+8
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* anv: Add initial blorp support | Jason Ekstrand | 2016-09-13 | 7 | -0/+400
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/anv: Use #defines for all __gen_ helpers | Jason Ekstrand | 2016-09-13 | 1 | -5/+6
| | | | | | | This allows us to #undef them later if we don't want them to persist Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* anv: Generalize emit_urb_setup | Jason Ekstrand | 2016-09-13 | 2 | -20/+45
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* anv/pipeline: Roll compute_urb_partition into emit_urb_setup | Jason Ekstrand | 2016-09-13 | 3 | -156/+138
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Use #defines for all __gen_ helpers | Jason Ekstrand | 2016-09-13 | 1 | -5/+6
| | | | | | | This allows us to #undef them later if we don't want them to persist Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/isl: Divide QPitch by 2 for 3-D stencil textures on SKL+ | Jason Ekstrand | 2016-09-13 | 1 | -1/+14
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* isl/state: Don't set QPitch for GEN4_3D surfaces | Jason Ekstrand | 2016-09-13 | 1 | -1/+16
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* intel/blorp: Rework alloc_binding_table | Jason Ekstrand | 2016-09-13 | 1 | -5/+5
| | | | | | | | | | | | | The original blorp_alloc_binding_table helper was supposed to return the binding table offset and map along with the surface state maps. This isn't quite what we want, however. What we really want is the binding table offsets, surface state offsets, and surface state maps. In the GL driver, the binding table map *is* an array of surface state offsets. However, in Vulkan, this isn't quite true as the entries in the binding table are surface state offsets combined with another binding table block offset. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
* anv/allocator: Use VG_NOACCESS_WRITE in anv_bo_pool_free | Jason Ekstrand | 2016-09-13 | 1 | -2/+4
| | | | | | | | | | | | Previously, we were relying on the fact that VALGRIND_MEMPOOL_FREE came later on in the function to prevent "link->bo = bo" from causing an invalid write. However, in the case where the size requested by the user is very small (less than sizeof(struct anv_bo)), this isn't sufficient. Instead, we should call VALGRIND_MEMPOOL_FREE early and then use VG_NOACCESS_WRITE. We do, however, have to call VALGRIND_MEMPOOL_FREE after reading bo_in because it may be stored in the bo itself. Signed-off-by: Jason Ekstrand <[email protected]>
* intel/isl: Ignore base_array_layer and array_len for 3D storage surfaces | Jason Ekstrand | 2016-09-13 | 1 | -2/+6
| | | | | | | | | | | | The time we want to restrict the Z range of a 3-D surface is when rendering to it. For storage surfaces, we always want the full range. However, we still need to set MinimumArrayElement and RenderTargetViewExtent to sensible values so we'll just set them to the reasonable defaults we used before we started respecting the base_array_layer and array_len. This fixes a bunch of Vulkan CTS regressions caused by 48f195d7c6483ed. Signed-off-by: Jason Ekstrand <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97790 Reviewed-by: Chad Versace <[email protected]>
* intel/isl: Add support for RGB formats in X and Y-tiled memory | Jason Ekstrand | 2016-09-12 | 2 | -14/+55
| Normally, using a non-linear tiling format helps improve cache locality by ensuring that neighboring pixels are usually close-by in memory. For RGB formats, this still sort-of holds, but it can also lead to rather terrible memory access patterns where a single RGB pixel value crosses a tile boundary and gets split into two pieces in different 4K pages. It also makes for some rather awkward calculations because your tile size is no longer an even multiple of surface element size. For these reasons, we chose to simply never create tiled RGB images in the Vulkan driver. The GL driver, however, is not so kind so we need to support it somehow.
|
| I briefly toyed with a couple of different schemes but this is the best one I could come up with. The fundamental problem is that a tile no longer contains an integer number of surface elements. I briefly considered a couple of other options but found them wanting:
|
| 1) Using floats for the logical tile size. This leads to potential rounding error problems.
|
| 2) When presented with an RGB format, just make the tile three times as wide. This isn't so nice because now our tiles are no longer power-of-two size. Also, it can force the row_pitch to be larger than needed which, while not strictly a problem for ISL, causes incompatibility problems with the way the GL driver chooses surface pitches.
|
| The chosen method requires that you pay attention and not just assume that your tile_info is in the units you think it is. However, it's nice because it provides a nice "these are the units" declaration in isl_tile_info itself. Previously, the tile_info wasn't usable as a stand-alone structure because you had to also know the format. It also forces figuring out how to deal with inconsistencies between tiling and format back to the caller which is good because the two different consumers of isl_tile_info really want to deal with it differently: Computation of the surface size wants the fewest number of horizontal tiles possible while get_intratile_offset is far more concerned with things aligning nicely.
|
| Signed-off-by: Jason Ekstrand <[email protected]>
| Acked-by: Chad Versace <[email protected]>
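The "tile no longer contains an integer number of surface elements" problem, and the idea of a tile description that carries its own units, can be made concrete with a little arithmetic. Everything below is a hypothetical sketch: the struct, the function, and the choice of one third of the element size as the unit for RGB formats are assumptions made for illustration, not a description of isl_tile_info.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of a tile row: the width is expressed in
 * units of unit_bits, and the struct says what that unit is. */
struct tile_units {
   unsigned unit_bits;
   unsigned width_in_units;
};

static bool
is_pow2(unsigned v)
{
   return v != 0 && (v & (v - 1)) == 0;
}

/* Pick a unit that divides the tile width evenly. For a power-of-two
 * element size the unit is the element itself; for a three-channel
 * element (24, 48 or 96 bits) this sketch falls back to one channel. */
struct tile_units
tile_width_units(unsigned tile_width_bytes, unsigned format_bits)
{
   struct tile_units t;

   t.unit_bits = format_bits;
   if (!is_pow2(format_bits)) {
      assert(format_bits % 3 == 0 && is_pow2(format_bits / 3));
      t.unit_bits = format_bits / 3;
   }

   assert((tile_width_bytes * 8) % t.unit_bits == 0);
   t.width_in_units = tile_width_bytes * 8 / t.unit_bits;
   return t;
}
```

With a 512-byte-wide tile row, a 96-bit element does not fit a whole number of times (512 / 12 leaves a remainder), whereas 128 units of 32 bits do, so a consumer has to read `unit_bits` rather than assume the width is in whole elements.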
* intel/isl: Allow valign2 for texture-only Y-tiled surfaces on gen7 | Jason Ekstrand | 2016-09-12 | 1 | -1/+2
| | | | | | | | The restriction that Y-tiled surfaces must have valign == 4 only applies to render targets but we were applying it universally. This causes problems if ISL_FORMAT_R32G32B32_FLOAT is used because it requires valign == 2; this should be okay because you can't render to that format. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* intel/blorp: Work in terms of logical array layers | Jason Ekstrand | 2016-09-12 | 1 | -12/+2
| | | | | | | | | | | | | | | | | | When Ivy Bridge introduced array multisampling, someone made the decision to do lots of stuff throughout the driver in terms of physical array layers rather than logical array layers. In ISL, we use logical array layers most of the time and it really makes no sense to use physical array layers in the blorp API. Every time someone passes physical array layers into blorp for an array multisampled surface, they're always divisible by the number of samples and we divide right away. Eventually, I'd like to rework most of the GL driver internals to use logical array layers but that's going to be a big project and will probably happen as part of the ISL conversion. For now, we'll do the conversion in brw_blorp and let blorp just use the logical layers. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]>
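The conversion the message describes, where a physical layer index for an array multisampled surface is always a multiple of the sample count, reduces to one division. The helper name below is hypothetical and the sketch is not the code in brw_blorp.

```c
#include <assert.h>

/* Hypothetical helper: convert a physical array layer index to a logical
 * one for an array multisampled surface. For a single-sampled surface
 * (num_samples == 1) the two are the same. */
unsigned
physical_to_logical_layer(unsigned physical_layer, unsigned num_samples)
{
   assert(num_samples > 0);
   /* The commit message notes that callers always pass a multiple. */
   assert(physical_layer % num_samples == 0);

   return physical_layer / num_samples;
}
```

Doing this once at the driver boundary means the shared code never has to know which convention its caller uses.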
* intel/blorp: Increase the presision of coordinate transform calculations | Jason Ekstrand | 2016-09-12 | 1 | -3/+3
| | | | | | | | | | | | The result of this calculation goes into an fma() in the shader and we would like it to be as precise as possible. The division in particular was a source of imprecision whenever dst1 - dst0 was not a power of two. This prevents regressions in some of the new Vulkan CTS tests for blitting using a filtering of NEAREST. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
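One plausible way to tighten a calculation like this is to do the division and the offset in double precision and round to float only once at the end. The sketch below assumes a simple linear mapping from destination to source coordinates; the struct, function name, and exact formula are illustrative assumptions rather than the commit's code.

```c
#include <assert.h>

/* Hypothetical pair of values consumed by the shader as
 * src = fma(dst, multiplier, offset). */
struct coord_transform {
   float multiplier;
   float offset;
};

struct coord_transform
make_coord_transform(float src0, float src1, float dst0, float dst1)
{
   struct coord_transform xform;

   assert(dst1 != dst0);

   /* Divide in double so that a non-power-of-two (dst1 - dst0) loses as
    * little precision as possible before the single rounding to float. */
   double scale = ((double)src1 - (double)src0) /
                  ((double)dst1 - (double)dst0);

   xform.multiplier = (float)scale;
   xform.offset = (float)((double)src0 - (double)dst0 * scale);
   return xform;
}
```

Computing the offset from the double-precision scale, rather than from the already rounded float multiplier, avoids compounding the rounding error in the second term.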