Commit message    Author    Age    Files    Lines
* r600/atomic: fix ATOMCAS instruction.Dave Airlie2018-02-071-1/+31
| | | | | | | | | | This has 4 srcs. This fixes: KHR-GL45.shader_atomic_counter_ops_tests.ShaderAtomicCounterOpsExchangeTestCase Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600/sb/cayman: fix indirect ubo access on caymanDave Airlie2018-02-071-1/+1
| | | | | | | | | | | | | | With sb enabled on cayman, this was overwriting the proper cf index value with random ones if the dst gpr was 2 or 3. Only save the value for a MOVA instruction. Fixes: KHR-GL45.gpu_shader5.uniform_blocks_array_indexing (on cayman with sb) Cc: <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600/eg: use texture target to pick array size not view target (v2)Dave Airlie2018-02-071-7/+10
| | | | | | | | | | | | | | This fixes a few CTS cases in: KHR-GL45.texture_view.view_sampling. Some multisample cases are still broken, but it is not clear that this is the same problem. v2: fix more cases Cc: <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: don't support tc-compat on multisample d32s8 at all.Dave Airlie2018-02-061-2/+2
| | | | | | | | | | | | RX550 fails dEQP-VK.renderpass.suballocation.multisample.d32_sfloat_s8_uint.samples_2 So increase the range of the workaround. Fixes: f4c534ef6 (radv: don't enable tc compat for d32s8 + 4/8 samples (v1.1)) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* winsys/amdgpu: allow non page-aligned size bo creation from pointerMichal Navratil2018-02-061-4/+7
| | | | | | | | | Fix INVALID_OPERATION caused by BufferData with target EXTERNAL_VIRTUAL_MEMORY_BUFFER_AMD when the buffer size is not page aligned. Signed-off-by: Marek Olšák <[email protected]> Cc: 17.3 18.0 <[email protected]>
* meson: ensure xmlpool/options.h is generated for libgalliumJon Turney2018-02-061-1/+1
| | | | | | | | | | | In file included from ../src/gallium/targets/dri/target.c:1: In file included from ../src/gallium/auxiliary/target-helpers/drm_helper.h:8: ../src/util/xmlpool.h:103:10: fatal error: 'xmlpool/options.h' file not found See also 26bde1e3. Signed-off-by: Jon Turney <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* vbo: provide 64bits support to print_draw_arraysAndres Gomez2018-02-061-2/+19
| | | | | | | Cc: Mathias Fröhlich <[email protected]> Cc: Brian Paul <[email protected]> Signed-off-by: Andres Gomez <[email protected]> Reviewed-by: Mathias Fröhlich <[email protected]>
* vbo: take into account the size when printing VAO elementsAndres Gomez2018-02-061-1/+1
| | | | | | | | | | | | | When using print_draw_arrays for debugging, we printed "n" vertices, but depending on the stride used, not all of each vertex's size was printed. Now we print the whole size of each of the "n" vertices. Cc: Mathias Fröhlich <[email protected]> Cc: Brian Paul <[email protected]> Signed-off-by: Andres Gomez <[email protected]> Reviewed-by: Mathias Fröhlich <[email protected]>
* vbo: print first element of the VAO when the binding stride is 0Andres Gomez2018-02-061-3/+4
| | | | | | | Cc: Mathias Fröhlich <[email protected]> Cc: Brian Paul <[email protected]> Signed-off-by: Andres Gomez <[email protected]> Reviewed-by: Mathias Fröhlich <[email protected]>
* anv/device: initialize the list of enabled extensions properlyIago Toral Quiroga2018-02-061-1/+1
| | | | | | | | | | | | | | | The loop goes through the list of enabled extensions marking them as enabled in the list, but this relies on every other extension being initialized to false by default. This bug would make us, for example, advertise certain device extension entry points as available even when the corresponding extensions had not been enabled. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Fixes: abc62282b5c "anv: Add a per-device table of enabled extensions" Cc: "18.0" <[email protected]>
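The initialization problem described above can be sketched in C. This is only an illustration under stated assumptions: `enabled_table`, `ext_names`, `EXT_COUNT` and `init_enabled` are hypothetical names, and the real driver table is generated and much larger. The point is that the table is cleared before the loop marks the requested entries, so entries that were never requested cannot read as enabled.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, tiny extension table used only for illustration. */
enum { EXT_COUNT = 4 };

static const char *const ext_names[EXT_COUNT] = {
    "VK_KHR_swapchain",
    "VK_KHR_maintenance1",
    "VK_KHR_multiview",
    "VK_KHR_push_descriptor",
};

struct enabled_table {
    bool enabled[EXT_COUNT];
};

/* Mark the requested extensions as enabled.  The memset is the important
 * part: without it, entries that are not requested keep whatever value
 * the memory happened to hold. */
static void init_enabled(struct enabled_table *t,
                         const char *const *requested, int n)
{
    memset(t, 0, sizeof(*t));
    for (int i = 0; i < n; i++) {
        for (int e = 0; e < EXT_COUNT; e++) {
            if (strcmp(requested[i], ext_names[e]) == 0)
                t->enabled[e] = true;
        }
    }
}
```

Calling `init_enabled` on a table that previously held garbage leaves only the requested entries set.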
* spirv: split constant initializers on in/out structsIago Toral Quiroga2018-02-061-0/+8
| | | | | | | | | The SPIR-V parser splits in/out struct variables and creates a separate variable for each first-level member of the struct. When the struct variable has an initializer this means that we also need to split the initializer. Reviewed-by: Jason Ekstrand <[email protected]>
* i965/nir: do int64 lowering before optimizationIago Toral Quiroga2018-02-061-4/+12
| | | | | | | | | | | | | | | | | | | | | | | | Otherwise loop unrolling will fail to see the actual cost of the unrolling operations when the loop body contains 64-bit integer instructions, especially when the divmod64 lowering applies, since its lowering is quite expensive. Without this change, some in-development CTS tests for int64 get stuck forever trying to register allocate a shader with over 50K SSA values. The large number of SSA values is the result of NIR first unrolling multiple seemingly simple loops that involve int64 instructions, only to then lower these instructions to produce a massive pile of code (due to the divmod64 lowering in the unrolled instructions). With this change, loop unrolling will see the loops with the int64 code already lowered and will realize that it is too expensive to unroll. v2: Run nir_algebraic first so we can hopefully get rid of some of the int64 instructions before we even attempt to lower them. Reviewed-by: Matt Turner <[email protected]>
* mesa: add OES_EGL_image_external_essl3 supportIlia Mirkin2018-02-066-2/+24
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* r600/fp64: Fix build.Vinson Lee2018-02-051-1/+1
| | | | | | | | | | | | | CC r600_shader.lo r600_shader.c: In function ‘egcm_int_to_double’: r600_shader.c:4543:12: error: ‘ctx’ is a pointer; did you mean to use ‘->’? if (ctx.bc->chip_class == CAYMAN) ^ -> Fixes: 35b430157776 ("r600/fp64: fix integer->double conversion") Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* r600/fp64: fix integer->double conversionDave Airlie2018-02-061-28/+93
| | | | | | | | | | | | | Doing a straight uint/int->fp32->fp64 conversion causes some precision issues. Roland suggested splitting the integer into two portions, doing two separate int->fp32->fp64 conversions, and then adding the results. This passes the tests in CTS and piglit. [airlied: fix cypress conversion opcodes] Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
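The precision problem and the split-conversion idea can be reproduced on the CPU with a small C sketch. This is not the r600 shader code: the 16/16 bit split and the function names are assumptions chosen for illustration. Each half has at most 16 significant bits, so it fits exactly in the 24-bit fp32 significand, and the final addition is exact in fp64.

```c
#include <stdint.h>

/* Straight uint -> fp32 -> fp64: loses the low bits of values that need
 * more than 24 significant bits. */
static double u32_to_double_naive(uint32_t v)
{
    return (double)(float)v;
}

/* Split the integer into two portions that are each exactly representable
 * in fp32, convert each one through fp32, then add the results in fp64.
 * The 16/16 split is an illustrative assumption. */
static double u32_to_double_split(uint32_t v)
{
    uint32_t hi = v & 0xffff0000u;
    uint32_t lo = v & 0x0000ffffu;
    return (double)(float)hi + (double)(float)lo;
}
```

For example, 16777217 (2^24 + 1) is not representable in fp32, so the naive path rounds it, while the split path returns it unchanged.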
* ac/nir: remove emission of nir_op_fdivSamuel Pitoiset2018-02-051-5/+0
| | | | | | | | RadeonSI and RADV lower fdiv. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* travis: add macOS meson buildJon Turney2018-02-051-0/+5
| | | | | | | | v2: Simplify set of options now we have better defaults Signed-off-by: Jon Turney <[email protected]> Reviewed-by: Dylan Baker <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* meson: osx ld doesn't support --build-idJon Turney2018-02-052-1/+5
| | | | | Signed-off-by: Jon Turney <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* meson: build src/glx/appleJon Turney2018-02-052-0/+65
| | | | | | Signed-off-by: Jon Turney <[email protected]> Reviewed-by: Dylan Baker <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* meson: set apple glx definesDylan Baker2018-02-051-0/+2
| | | | | Reviewed-by: Jon Turney <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* meson: better defaults for osx, windows and cygwinJon Turney2018-02-051-6/+15
| | | | | | | | | | | set suitable defaults for 'dri-drivers', 'gallium-drivers', 'vulkan-drivers' and 'platforms' options for osx, windows and cygwin, adding cygwin where appropriate. v2: error() for unknown OS Signed-off-by: Jon Turney <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* i965: Move mistakenly placed lineMatt Turner2018-02-051-1/+1
| | | | | | | Ken called this out in review, but it seems I forgot to make the change. I noticed that the control flow annotations in the fragment shader disassembly of tests/shaders/glsl-fs-loop-continue.shader_test were not correct, and moving this line to the correct place fixes it.
* glsl/linker: check same name is not used in block and outsideJuan A. Suarez Romero2018-02-051-23/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | According to the OpenGL GLSL 3.20 spec, section 4.3.9: "It is a link-time error if any particular shader interface contains: - two different blocks, each having no instance name, and each having a member of the same name, or - a variable outside a block, and a block with no instance name, where the variable has the same name as a member in the block." This fixes a previous commit 9b894c8 ("glsl/linker: link-error using the same name in unnamed block and outside") that covered this case, but did not take into account that precision qualifiers are ignored when comparing blocks with no instance name. With this commit, the original tests KHR-GL*.shaders.uniform_block.common.name_matching remain fixed, and the regression in dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision, which was introduced by the previous commit, is also fixed. v2: use helper variables (Matteo Bruni) Fixes: 9b894c8 ("glsl/linker: link-error using the same name in unnamed block and outside") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104668 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104777 CC: Mark Janes <[email protected]> CC: "18.0" <[email protected]> Tested-by: Matteo Bruni <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Signed-off-by: Juan A. Suarez Romero <[email protected]>
* mesa: enable ASTC format for CompressedTexSubImage3DJuan A. Suarez Romero2018-02-051-8/+33
| | | | | | | | | | | | If extensions GL_KHR_texture_compression_astc_hdr or GL_KHR_texture_compression_astc_sliced_3d are implemented then ASTC formats are supported in CompressedTex*Image3D. Fixes KHR-GLES2.texture_3d.* with this format. CC: Eric Anholt <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Juan A. Suarez Romero <[email protected]>
* util/build-id: Fix address comparison for binaries with LOAD vaddr > 0Stephan Gerhold2018-02-051-1/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | build_id_find_nhdr_for_addr() fails to find the build-id if the first LOAD segment has a virtual address other than 0x0. For most shared libraries, the first LOAD segment has vaddr=0x0: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align LOAD 0x000000 0x00000000 0x00000000 0x2d2e26 0x2d2e26 R E 0x1000 LOAD 0x2d2e54 0x002d3e54 0x002d3e54 0x2e248 0x2f148 RW 0x1000 However, compiling the Intel Vulkan driver as 32-bit binary on Android produces the following ELF header with vaddr=0x8000 instead: Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align PHDR 0x000034 0x00008034 0x00008034 0x00100 0x00100 R 0x4 LOAD 0x000000 0x00008000 0x00008000 0x224a04 0x224a04 R E 0x1000 LOAD 0x225710 0x0022e710 0x0022e710 0x25988 0x27364 RW 0x1000 build_id_find_nhdr_callback() compares the address of dli_fbase from dladdr() and dlpi_addr from dl_iterate_phdr(). With vaddr > 0, these point to a different memory address, e.g.: dli_fbase=0xd8395000 (offset 0x8000) dlpi_addr=0xd838d000 At least on glibc and bionic (Android) dli_fbase refers to the address where the shared object is mapped into the process space, whereas dlpi_addr is just the base address for the vaddrs declared in the ELF header. To compare them correctly, we need to calculate the start of the mapping by adding the vaddr of the first LOAD segment to the base address. Note: musl users will need the following patch. https://git.musl-libc.org/cgit/musl/commit/?id=b3ae7beabb9f0c219bb8a8b63567a01c6530c1ac Cc: Chad Versace <[email protected]> Cc: <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104642 Fixes: 5c98d38 "util: Query build-id by symbol address, not library name" Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
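The address arithmetic in this message can be checked with a small C sketch. It is a simplified stand-in, not the Mesa code: `struct segment` and `mapping_start` are hypothetical names, and only the segment type and virtual address are modelled. The function adds the vaddr of the first LOAD segment to the base address, which gives the value that should be compared against dli_fbase.

```c
#include <stddef.h>
#include <stdint.h>

/* ELF segment type values: PT_LOAD is 1 and PT_PHDR is 6. */
enum { SEG_LOAD = 1, SEG_PHDR = 6 };

/* Minimal stand-in for a program header entry (illustrative only). */
struct segment {
    uint32_t type;
    uint64_t vaddr;
};

/* Compute the start of the mapping: base address plus the vaddr of the
 * first LOAD segment.  Returns 1 on success, 0 if there is no LOAD
 * segment. */
static int mapping_start(uint64_t base, const struct segment *segs,
                         size_t n, uint64_t *start)
{
    for (size_t i = 0; i < n; i++) {
        if (segs[i].type == SEG_LOAD) {
            *start = base + segs[i].vaddr;
            return 1;
        }
    }
    return 0;
}
```

With the Android values quoted above (base 0xd838d000, first LOAD vaddr 0x8000), the result is 0xd8395000, which matches the reported dli_fbase. With a first LOAD vaddr of 0x0, the result is the base address itself, which is why the bug was not visible for most shared libraries.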
* radeonsi: enable vcn encode for HEVC mainBoyuan Zhang2018-02-051-1/+3
| | | | | | | Enable vcn encode for HEVC main profile on Raven. Signed-off-by: Boyuan Zhang <[email protected]> Acked-by: Christian König <[email protected]>
* st/va: implement HEVC encode functionsBoyuan Zhang2018-02-051-6/+144
| | | | | | | Implement HEVC encode functions based on VAAPI HEVC encode interface. Signed-off-by: Boyuan Zhang <[email protected]> Acked-by: Christian König <[email protected]>
* st/va: add HEVC encode functionsBoyuan Zhang2018-02-055-4/+111
| | | | | | | Add a separate file for HEVC encode functions. Signed-off-by: Boyuan Zhang <[email protected]> Acked-by: Christian König <[email protected]>
* st/va: enable dual instances encode only for H264Boyuan Zhang2018-02-052-11/+15
| | | | | | | | Logic related to dual-instance encode should only be applied for H264, not other codecs. Signed-off-by: Boyuan Zhang <[email protected]> Acked-by: Christian König <[email protected]>
* st/va: add entrypoint check for HEVCBoyuan Zhang2018-02-051-10/+12
| | | | | | | Add entrypoint check for HEVC to differentiate decode and encode jobs. Signed-off-by: Boyuan Zhang <[email protected]> Acked-by: Christian König <[email protected]>
* st/va: add HEVC picture descBoyuan Zhang2018-02-052-4/+23
| | | | | | | | Add HEVC picture desc, and add codec check when creating and destroying context. Signed-off-by: Boyuan Zhang <[email protected]> Acked-by: Christian König <[email protected]>
* st/va: move H264 enc functions into separate fileBoyuan Zhang2018-02-055-139/+260
| | | | | | | | | Move all H264 encode related functions into separate file. Similar to VAAPI decode side, there will be separate file for each codec on encode side as well. Signed-off-by: Boyuan Zhang <[email protected]> Acked-by: Christian König <[email protected]>
* radeon/vcn: add header implementations for HEVCBoyuan Zhang2018-02-051-1/+347
| | | | | | | | Implement encoding of sps, pps, vps, aud, and slice headers for HEVC based on HEVC specs. Signed-off-by: Boyuan Zhang <[email protected]> Acked-by: Christian König <[email protected]>
* radeon/vcn: add ib implementations for HEVCBoyuan Zhang2018-02-051-45/+222
| | | | | | | Implement required ibs for vcn HEVC encode. Signed-off-by: Boyuan Zhang <[email protected]> Acked-by: Christian König <[email protected]>
* radeon/vcn: support picture parameters for HEVCBoyuan Zhang2018-02-053-21/+64
| | | | | | | | | Pass pipe_picture_desc instead of pipe_h264_enc_picture_desc so that it can be used for different codecs. Add functions to handle picture parameters that will be used for HEVC encode. Signed-off-by: Boyuan Zhang <[email protected]> Acked-by: Christian König <[email protected]>
* radeon/vcn: add vcn encode interface for HEVCBoyuan Zhang2018-02-051-2/+79
| | | | | | | | Add vcn encode interface for HEVC, and rename radeon_enc_h264_enc_pic to radeon_enc_pic since radeon_enc_pic is used by both H264 and HEVC. Signed-off-by: Boyuan Zhang <[email protected]> Acked-by: Christian König <[email protected]>
* vl: add parameters for HEVC encodeBoyuan Zhang2018-02-051-0/+99
| | | | | | | Add HEVC encode interface Signed-off-by: Boyuan Zhang <[email protected]> Acked-by: Christian König <[email protected]>
* broadcom/vc5: Ignore samplers for finding uniform offsets.Eric Anholt2018-02-051-1/+12
| | | | | | | | Fixes: KHR-GLES3.shaders.struct.uniform.sampler_array_fragment KHR-GLES3.shaders.struct.uniform.sampler_array_vertex KHR-GLES3.shaders.struct.uniform.sampler_nested_fragment KHR-GLES3.shaders.struct.uniform.sampler_nested_vertex
* broadcom/vc5: Fix non-mipfiltered sampling.Eric Anholt2018-02-051-1/+6
| | | | | We need to clamp the LOD to 0 if mip filtering is disabled. This is part of fixing KHR-GLES3.shaders.struct.uniform.sampler_array_fragment.
* broadcom/vc5: Fix "hardwrae" typo in a field name in XML.Eric Anholt2018-02-052-2/+2
|
* ac/nir: fix a crash in load_gs_input() on pre-GFX9 chipsSamuel Pitoiset2018-02-051-0/+3
| | | | | | Fixes: df1d5174fcc ("ac/nir: replace SI.buffer.load.dword with amdgcn.buffer.load") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* broadcom/vc5: Try to merge more than 2 QPU instructions together.Eric Anholt2018-02-051-5/+13
| | | | | | | | Obviously it would be good to have an ADD and a MUL and a signal together, but we can even potentially have multiple signals merged, as well. total instructions in shared programs: 100423 -> 97874 (-2.54%) instructions in affected programs: 78812 -> 76263 (-3.23%)
* broadcom/vc5: Remove no-op MOVs after register allocation.Eric Anholt2018-02-051-1/+60
| | | | | | | | We emit some MOVs to track lifetimes of payload registers, but we don't need there to be actual MOV instructions for them. total instructions in shared programs: 101045 -> 100423 (-0.62%) instructions in affected programs: 37083 -> 36461 (-1.68%)
* broadcom/vc5: Add missing shader-db instruction counting.Eric Anholt2018-02-051-0/+7
| | | | I must have misplaced it in the instruction packing rework.
* r600: fix resq for buffer images.Dave Airlie2018-02-051-1/+4
| | | | | | | | | | | If this is an image buffer, we need to calculate the correct resource id. Fixes: KHR-GL45.shader_image_size.* Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600/eg: fix cube map array buffer images.Dave Airlie2018-02-051-1/+1
| | | | | | | | This fixes a crash in: KHR-GL45.texture_cube_map_array.texture_size_compute_sh. Reviewed-by: Roland Scheidegger <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* mesa: change ctx->Color.ColorMask into a 32-bit bitmaskMarek Olšák2018-02-0433-232/+174
| | | | | | | | 4 bits per draw buffer, 8 draw buffers in total --> 32 bits. This is easier to work with. Reviewed-by: Eric Anholt <[email protected]>
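The packing described here (4 bits per draw buffer, 8 draw buffers, 32 bits in total) can be sketched in C as follows. The helper names and the channel-to-bit order (R in bit 0 through A in bit 3) are assumptions for illustration and are not taken from the commit.

```c
#include <stdint.h>

/* Assumed channel order inside each 4-bit group. */
enum { CHAN_R = 0, CHAN_G = 1, CHAN_B = 2, CHAN_A = 3 };

/* Return the 4-bit RGBA write mask of draw buffer `buf` (0..7). */
static inline uint32_t colormask_get(uint32_t mask, unsigned buf)
{
    return (mask >> (4 * buf)) & 0xfu;
}

/* Return 1 if channel `chan` of draw buffer `buf` is writable. */
static inline int colormask_get_bit(uint32_t mask, unsigned buf,
                                    unsigned chan)
{
    return (int)((mask >> (4 * buf + chan)) & 1u);
}

/* Return `mask` with the 4-bit group of draw buffer `buf` replaced. */
static inline uint32_t colormask_set(uint32_t mask, unsigned buf,
                                     uint32_t rgba)
{
    unsigned shift = 4 * buf;
    return (mask & ~(0xfu << shift)) | ((rgba & 0xfu) << shift);
}
```

One benefit of a single integer is that whole-state checks become one comparison: a mask of 0xffffffff means every channel of every draw buffer is writable.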
* i965: Create new program cache bo when clearing the program cacheJordan Justen2018-02-031-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | When the disk shader cache CI testing was enabled, we started noticing occasional failures on deqp test runs. (Mainly SNB, rarely HSW) Before this change, when we cleared the (in memory) program cache we reused the same bo. Since the disk shader cache quickly restores programs, it appears that this would lead to overwrites of the older program binaries in the in memory program cache that apparently were still executing in some cases. If these programs were still executing, this could cause a GPU hang. This issue is probably not disk shader cache specific, but may have been hidden due to the compiler taking time to recompile programs after the cache was cleared. v2: * Don't add `copy` param to brw_cache_new_bo (Ken) * Call from brw_program_cache_check_size (Ken) Cc: Kenneth Graunke <[email protected]> Cc: [email protected] Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* aubinator: Multiply count by 4 to compute buffer sizesJason Ekstrand2018-02-021-1/+1
| | | | | | The count field is in terms of dwords and not bytes. In 7d4007d58ab7c0c1796e116b55814f8be4e699a9, I fixed one instance of this but missed another.
* broadcom/vc5: Enable UIF XOR on textures.Eric Anholt2018-02-023-7/+40
| | | | | | | | | | This should increase performance by reducing SDRAM bank conflicts when crossing between UIF columns (particularly on power-of-two height textures). The uif_xor_disable setup is dropped, since we need to allow XOR on lower miplevels even when level 0 is XOR. The level 0 force UIF and level 0 XOR flags should handle setting XOR properly on imported buffers.