summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* radeonsi: merge si_set_streamout_targets with si_common_set_streamout_targetsMarek Olšák2017-10-093-126/+109
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add si_so_target_referenceMarek Olšák2017-10-091-2/+8
| | | | | | The src type is different on purpose. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: import r600_streamout from drivers/radeonMarek Olšák2017-10-0915-177/+181
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add performance thresholds for CP DMA, decrease it for clearsMarek Olšák2017-10-091-1/+7
| | | | | | The first one isn't used yet. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: disable primitive binning on Vega10 (v2)Marek Olšák2017-10-093-4/+19
| | | | | | | | | | | | | | | Our driver implementation is known to decrease performance for some tests, but we don't know if any apps and benchmarks (e.g. those tested by Phoronix) are affected. This disables the feature just to be safe. Set this to enable partial primitive binning: R600_DEBUG=dpbb Set this to enable full primitive binning: R600_DEBUG=dpbb,dfsm v2: add new debug options Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: enumerize DBG flagsMarek Olšák2017-10-0912-144/+155
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* drirc: whitelist glthread for Spec Ops: The LineMarek Olšák2017-10-091-0/+6
| | | | | | On i7 4790k and a 280X, there is a boost of about 10% more FPS. Nominated by John Ettedgui.
* radv: configure VGT_VERTEX_REUSE at pipeline creationSamuel Pitoiset2017-10-093-10/+16
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: do not need to zero-init ds/raster statesSamuel Pitoiset2017-10-091-3/+0
| | | | | | | Already done when creating the pipeline. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: remove unused fields in radv_raster_stateSamuel Pitoiset2017-10-091-3/+0
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: set ALPHA_TO_MASK_ENABLE at blend state initSamuel Pitoiset2017-10-091-7/+7
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: emit PA_SU_POINT_{SIZE,MINMAX} in si_emit_config()Samuel Pitoiset2017-10-092-16/+15
| | | | | | | | | These registers don't change during the lifetime of the command buffer, there is no need to re-emit them when binding a new pipeline. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: allow launching waves out-of-order for computeSamuel Pitoiset2017-10-091-1/+9
| | | | | | | | Ported from RadeonSI, and -pro seems to enable it as well. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* anv/wsi: Allocate enough memory for the entire imageJason Ekstrand2017-10-071-3/+4
| | | | | | | | | | | | | | | | Previously, we allocated memory for image->plane[0].surface.isl.size which is great if there is no compression. However, on BDW, we can do CCS_D on X-tiled images so we also have to allocate space for the auxiliary buffer. This fixes hangs in some of the WSI CTS tests and should also reduce hangs in real applications. In particular, it fixes the dEQP-VK.wsi.*.incremental_present.* test group. When we hand the image off to X11 or Wayland, it will ignore the CCS entirely which is ok because we do a resolve when it's transitioned to VK_IMAGE_LAYOUT_PRESENT_SRC_KHR. Reviewed-by: Lionel Landwerlin <[email protected]> Cc: [email protected]
* anv: fix nir.h includeLionel Landwerlin2017-10-071-1/+1
| | | | | | | | | | | | All over mesa we include "nir/nir.h", we should probably do the same here. This fixes the meson build that was broken by the ycbcr series. Thanks to Dylan for finding the issue. Signed-off-by: Lionel Landwerlin <[email protected]> Fixes: f3e91e78a337 ("anv: add nir lowering pass for ycbcr textures") Reviewed-by: Dylan Baker <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: Don't warn on the ImageCubeArray capabilityJason Ekstrand2017-10-071-1/+1
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* mesa: make glFramebuffer* check immutable texture level boundsKenneth Graunke2017-10-071-6/+14
| | | | | | | | | | | | | | When a texture is immutable, we can't tack on extra levels after-the-fact like we could with glTexImage. So check against that level limit and return an error if it's surpassed. This fixes: KHR-GL45.geometry_shader.layered_fbo.fb_texture_invalid_level_number (Based on a patch by Ilia Mirkin.) Reviewed-by: Antia Puentes <[email protected]> [imirkin v2] Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't change viewport for blits, use window-space positionsMarek Olšák2017-10-077-19/+6
| | | | | | The viewport state was an identity anyway. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set correct PA_SC_VPORT_ZMIN/ZMAX when viewport is disabledMarek Olšák2017-10-071-2/+19
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: minor cleanup of si_update_vs_writes_viewport_indexMarek Olšák2017-10-073-5/+9
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't save and restore vertex buffers and elements for u_blitterMarek Olšák2017-10-072-8/+9
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use new VS blit shaders (VS inputs in SGPRs)Marek Olšák2017-10-075-57/+51
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add VS blit shader creationMarek Olšák2017-10-077-2/+217
| | | | | | no users yet Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: split declare_default_desc_pointersMarek Olšák2017-10-071-9/+14
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/u_blitter: let drivers decide which VS to use for draw_rectangleMarek Olšák2017-10-078-42/+72
| | | | | | | This approach allows drivers to set their own vertex shader and skip compilation of u_blitter vertex shaders. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/u_blitter: let drivers set the vertex elements stateMarek Olšák2017-10-078-28/+45
| | | | | | radeonsi won't set it. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/u_blitter: remove blitter_context_priv::viewportMarek Olšák2017-10-071-10/+8
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't use util_draw_arrays_instanced in si_draw_rectangleMarek Olšák2017-10-071-3/+7
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move si_draw_rectangle into si_state_draw.cMarek Olšák2017-10-074-88/+85
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove wrappers si_decompress_xx_texturesMarek Olšák2017-10-074-15/+4
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: remove r600_atom::num_dwMarek Olšák2017-10-074-39/+2
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: remove old r600g code checking chip_class and familyMarek Olšák2017-10-077-226/+53
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* st/va: Implement vaExportSurfaceHandle()Mark Thompson2017-10-073-1/+153
| | | | | | | | | | | | | | | This is a new interface in libva2 to support wider use-cases of passing surfaces to external APIs. In particular, this allows export of NV12 and P010 surfaces. v2: Convert surfaces to progressive before exporting them (Christian). v3: Set destination rectangle to match source when converting (Leo). Add guards to allow building with libva1. Signed-off-by: Mark Thompson <[email protected]> Acked-by: Christian König <[email protected]> Acked-and-Tested-by: Leo Liu <[email protected]>
* gallivm: don't use pabs intrinsic with llvm version >= 6Roland Scheidegger2017-10-071-9/+4
| | | | | | | | | | | | The intrinsic is gone, causing shader compilation to crash. While here, also change the fallback code to match what llvm's auto-updater of these intrinsics would do (except that there will still be zext/trunc instructions in there), which should ensure that the sequence gets recognized and fused back into a pabs in the end (I didn't test this, and it's possible even the old sequence would get recognized, but I don't see a reason why we shouldn't use the same sequence in any case). Tested-by: Vinson Lee <[email protected]>
* swr/rast: use proper alignment for debug transposedPrimsTim Rowley2017-10-061-2/+2
| | | | | | | | Causing a crash in ParaView waveletcontour.py test when _DEBUG defined due to vector aligned copy with unaligned address. Reviewed-by: Bruce Cherniak <[email protected]>
* anv/cmd_buffer: Reset state in cmd_buffer_destroyLionel Landwerlin2017-10-061-9/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This ensures that everything gets cleaned up properly. In particular, it fixes a memory leak where we were leaking the push constants structs. Valgrind stats on dEQP-VK.pipeline.push_constant.graphics_pipeline.range_size_128 : Before: HEAP SUMMARY: in use at exit: 2,467,513 bytes in 1,305 blocks total heap usage: 697,853 allocs, 696,530 frees, 138,466,600 bytes allocated LEAK SUMMARY: definitely lost: 1,068 bytes in 11 blocks indirectly lost: 24,669 bytes in 412 blocks possibly lost: 0 bytes in 0 blocks still reachable: 2,441,776 bytes in 882 blocks suppressed: 0 bytes in 0 blocks After: HEAP SUMMARY: in use at exit: 2,467,381 bytes in 1,304 blocks total heap usage: 697,853 allocs, 696,531 frees, 138,466,600 bytes allocated LEAK SUMMARY: definitely lost: 936 bytes in 10 blocks indirectly lost: 24,669 bytes in 412 blocks possibly lost: 0 bytes in 0 blocks still reachable: 2,441,776 bytes in 882 blocks suppressed: 0 bytes in 0 blocks Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Cc: "17.2 17.1" <[email protected]>
* anv/cmd_buffer: fix push descriptors with set > 0Lionel Landwerlin2017-10-062-13/+49
| | | | | | | | | | | | | | | | | When writing to set > 0, we were just wrongly writing to set 0. This commit fixes this by lazily allocating each set as we write to them. We didn't go for having them directly into the command buffer as this would require an additional ~45Kb per command buffer. v2: Allocate push descriptors from system memory rather than in BO streams. (Lionel) Cc: "17.2 17.1" <[email protected]> Fixes: 9f60ed98e501 ("anv: add VK_KHR_push_descriptor support") Reported-by: Daniel Ribeiro Maciel <[email protected]> Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: enable VK_KHR_sampler_ycbcr_conversionLionel Landwerlin2017-10-066-18/+186
| | | | | | | | | | | | v2: Make GetImageMemoryRequirements2KHR() iterate over all pInfo structs (Lionel) Handle VkSamplerYcbcrConversionImageFormatPropertiesKHR (Andrew/Jason) Iterator over BindImageMemory2KHR's pNext structs correctly (Jason) v3: Revert GetImageMemoryRequirements2KHR() change from v2 (Jason) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: enable multiple planes per image/imageViewLionel Landwerlin2017-10-069-484/+915
| | | | | | | | | | | | | | | | | | | | | This change introduce the concept of planes for image & views. It matches the planes available in new formats. We also refactor depth & stencil support through the usage of planes for the sake of uniformity. In the backend (genX_cmd_buffer.c) we have to take some care though with regard to auxilliary surfaces. Multiplanar color buffers can have multiple auxilliary surfaces but depth & stencil share the same HiZ one (only store in the depth plane). v2: by Jason Remove unused aspect parameters from anv_blorp.c Assert when attempting to resolve YUV images Drop redundant logic for plane offset in make_surface() Rework anv_foreach_plane_aspect_bit() Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Take an image in can_sample_with_hizJason Ekstrand2017-10-063-9/+10
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Take a single aspect in anv_layout_to_aux_usageJason Ekstrand2017-10-063-18/+15
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/cmd_buffer: Make get_fast_clear_state return an addressJason Ekstrand2017-10-061-22/+24
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/blorp: Add a concept of default aux usageJason Ekstrand2017-10-061-11/+16
| | | | | | | A good chunk of anv_blorp just wants the aux usage from the image. This magic aux_usage value means just that. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: add nir lowering pass for ycbcr texturesLionel Landwerlin2017-10-066-2/+496
| | | | | | | | | | | | | | | | | | This pass implements all the implicit conversions required by the VK_KHR_sampler_ycbcr_conversion specification. It also inserts plane sources onto sampling instructions that we then let the pipeline layout pass deal with, when mapping things correctly to descriptors. v2: Add new file to meson build (Lionel) Use nir_frcp() rather than (1.0f / x) (Jason) Reuse nir_tex_instr_dest_size() rather than handwritten one (Jason) Return progress (Jason) Account for array of samplers (Jason) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: prepare sampler emission code for multiplanar imagesLionel Landwerlin2017-10-063-41/+43
| | | | | | | | | | New settings from the KHR_sampler_ycbcr_conversion specifications might require different sampler settings for luma and chroma planes. This change makes the sampler table emission ready to handle multiple planes. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/apply_pipeline_layout: Prepare for multi-planar imagesLionel Landwerlin2017-10-064-22/+116
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: add new formats KHR_sampler_ycbcr_conversionLionel Landwerlin2017-10-063-6/+189
| | | | | | | Adding new downsampling factors for each planes. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: modify the internal concept of format to express multiple planesLionel Landwerlin2017-10-064-256/+331
| | | | | | | | | | | | | | | A given Vulkan format can now be decomposed into a set of planes. We now use 'struct anv_format_plane' to represent the format of those planes. v2: by Jason Rename anv_get_plane_format() to anv_get_format_plane() Don't rename anv_get_isl_format() Replace ds_fmt() by fmt2() Introduce fmt_unsupported() Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: prepare formats to handle disjoints setsLionel Landwerlin2017-10-061-10/+27
| | | | | | | | | | | Newer format enums start at offset 1000000000, making it impossible to have them all in one table. This change splits the formats into sets that we then access through indirection. v2: rename format_extract to vk_to_anv_format (Chad/Jason) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* isl: fill out layout descriptions for yuv formatsLionel Landwerlin2017-10-061-4/+4
| | | | | | | Some description was missing. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>