path: root/src
Commit message  Author  Age  Files  Lines
* i965: fix resource leak  Eric Engestrom  2018-06-11  1  -1/+3
| | | | | | | | | | v2: intel_miptree_release() already takes care of the planes, no need to hand-code the loop (Lionel) Coverity ID: 1436909 Fixes: 3352f2d746d3959b22ca4 "i965: Create multiple miptrees for planar YUV images" Reviewed-by: Lionel Landwerlin <[email protected]> Signed-off-by: Eric Engestrom <[email protected]>
* freedreno/ir3: use pipe_image_view's cpp  Rob Clark  2018-06-11  1  -1/+6
| | | | | | | At least for PIPE_BUFFER, we could get the resource used as (for example) R32F imageBuffer. So using cpp=1 from the rsc is wrong. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix image dimensions offset  Rob Clark  2018-06-11  1  -1/+1
| | | | | | copy-pasta fail from how SSBO sizes are handled. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: correct image/ssbo offset  Rob Clark  2018-06-11  1  -1/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: use saml always if we have lod  Rob Clark  2018-06-11  1  -1/+1
| | | | | | | In some cases we get plain tex opcodes (but w/ a lod argument).. in this case always use the saml instruction. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: don't cp absneg into meta:fi  Rob Clark  2018-06-11  1  -0/+4
| | | | | | | | | | | If using a fanin (collect) to collect consecutive registers together, we can CP mov's into the fanin, but not (abs) or (neg). No places that allow those modifiers are consuming a fanin anyways. But this caused an absneg to be lost between a ldgb and stgb for shaders like: outputs[n] = abs(input[n]) Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: rework size/type conversion instructions  Rob Clark  2018-06-11  1  -10/+156
| | | | | | With 8b and 16b, there are a lot more to handle. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: propagate HALF flag across fanout  Rob Clark  2018-06-11  1  -1/+4
| | | | | | | | | | If we have a fanout (split) meta instruction to split the result of a vector instruction, propagate the HALF flag back to the original instruction. Otherwise result ends up in a full precision register while instruction(s) that use the result look in a half-precision register. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: add sample-id/sample-mask-in  Rob Clark  2018-06-11  1  -3/+12
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add sample-id/sample-mask-in  Rob Clark  2018-06-11  1  -0/+21
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headers  Rob Clark  2018-06-11  8  -87/+213
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: image atomics use image-store path  Rob Clark  2018-06-11  1  -0/+8
| | | | | | | | image reads are handled via tex state, whereas image writes and atomics are handled via SSBO state block. Previously we were only considering image write, and not image atomics which also uses the SSBO state block. Signed-off-by: Rob Clark <[email protected]>
* egl/glvnd: Fix a segfault in eglGetProcAddress.  Kyle Brenneman  2018-06-11  1  -17/+13
| | | | | | | | | | | | If FindProcIndex in egldispatchstubs.c is called with a name that's less than the first entry in the array, it would end up trying to store an index of -1 in an unsigned integer, wrap around to 2^32 - 1, and then crash when it tries to look that up. Change FindProcIndex so that it uses bsearch(3) instead of implementing its own binary search, like the GLX equivalent FindGLXFunction does. Reviewed-by: Eric Engestrom <[email protected]>
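The lookup described above can be sketched with bsearch(3). This is a minimal illustration, not the code from egldispatchstubs.c: the table contents and the names proc_names, compare_proc_name and find_proc_index are assumptions made for the example. Because bsearch() returns a pointer (or NULL), no signed "low - 1" index is ever stored in an unsigned variable.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical table of entry-point names, sorted in strcmp() order. */
static const char *const proc_names[] = {
    "eglBindAPI",
    "eglCreateContext",
    "eglGetProcAddress",
    "eglSwapBuffers",
};
#define PROC_COUNT (sizeof(proc_names) / sizeof(proc_names[0]))

/* bsearch() passes the key first and a pointer to an array element second. */
static int compare_proc_name(const void *key, const void *elem)
{
    return strcmp((const char *)key, *(const char *const *)elem);
}

/* Returns the index of `name`, or PROC_COUNT when it is not in the table. */
static size_t find_proc_index(const char *name)
{
    const char *const *hit = bsearch(name, proc_names, PROC_COUNT,
                                     sizeof(proc_names[0]), compare_proc_name);
    return hit ? (size_t)(hit - proc_names) : PROC_COUNT;
}
```

A name that sorts before the first entry simply yields the "not found" value instead of a wrapped index.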
* mesa/program_binary: add implicit UseProgram after successful ProgramBinary  Jordan Justen  2018-06-10  1  -0/+31
| | | | | | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106810 Fixes: b4c37ce2140 "i965: Add ARB_get_program_binary support using nir_serialization" Ref: 3fe8d04a6d6 "mesa: don't always set _NEW_PROGRAM when linking" Ref: c505d6d8522 "mesa: use gl_program for CurrentProgram rather than gl_shader_program" Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Plamena Manolova <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* virgl: enable ARB_gpu_shader_fp64  Dave Airlie  2018-06-11  1  -1/+2
| | | | | | | This enables ARB_gpu_shader_fp64 if the host provides it. Tested-by: Gurchetan Singh <[email protected]> Reviewed-by: Gurchetan Singh <[email protected]>
* radv: add a workaround for DXVK hangs by setting amdgpu-skip-threshold  Samuel Pitoiset  2018-06-09  1  -1/+78
| | | | | | | | | | | | | | | | | | | | | | | | | | Workaround for bug in llvm that causes the GPU to hang in presence of nested loops because there is an exec mask issue. The proper solution is to fix LLVM but this might require a bunch of work. This fixes a bunch of GPU hangs that happen with DXVK. Vega10: Totals from affected shaders: SGPRS: 110456 -> 110456 (0.00 %) VGPRS: 122800 -> 122800 (0.00 %) Spilled SGPRs: 7478 -> 7478 (0.00 %) Spilled VGPRs: 36 -> 36 (0.00 %) Code Size: 9901104 -> 9922928 (0.22 %) bytes Max Waves: 7143 -> 7143 (0.00 %) Code size slightly increases because it inserts more branch instructions but that's expected. I don't see any real performance changes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105613 Cc: [email protected] Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix missing ZRANGE_PRECISION(1) for GFX9+  Samuel Pitoiset  2018-06-09  1  -1/+2
| | | | | | | | | | | | | | | ZRANGE_PRECISION(1) seems to be the default optimal value, but it was only set for VI and older chips. This fixes a rendering issue with Banished through DXVK, and might fix more than that. There is still the ZRANGE_PRECISION bug that we need to handle but that can be fixed later. Cc: [email protected] Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* anv: enable VK_EXT_shader_stencil_export  Gustavo Lima Chaves  2018-06-08  3  -0/+3
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: add/hookup SpvCapabilityStencilExportEXT  Gustavo Lima Chaves  2018-06-08  3  -0/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | v2: An attempt to support SpvExecutionModeStencilRefReplacingEXT's behavior also follows, with the interpretation to said mode being we prevent writes to the built-in FragStencilRefEXT variable when the execution mode isn't set. v3: A more cautious reading of 1db44252d01bf7539452ccc2b5210c74b8dcd573 led me to a missing change that would stop (what I later discovered were) GPU hangs on the CTS test written to exercise this. v4: Turn FragStencilRefEXT decoration usage without StencilRefReplacingEXT mode into a warning, instead of trying to make the variable read-only. If we are to follow the originating extension on GL, the built-in variable in question should never be readable anyway. v5/v6: rebases. v7: Fix check for gen9 lost in rebase. (Ilia) Reduce the scope of the bool used to track whether SpvExecutionModeStencilRefReplacingEXT was used. Was in shader_info, moved to vtn_builder. (Jason) v8: Assert for fragment shader handling StencilRefReplacingEXT execution mode. (Caio) Remove warning logic, since an entry point might not have StencilRefReplacingEXT execution mode, but the global output variable might still exist for another entry point in the module. (Jason) Reviewed-by: Jason Ekstrand <[email protected]>
* autotools/meson: compile against wayland-egl-*backend*  Eric Engestrom  2018-06-08  1  -1/+1
| | | | | | | | Bug: https://bugs.freedesktop.org/show_bug.cgi?id=106861 Fixes: 1db4ec05462914096b1f "egl: rewire the build systems to use libwayland-egl" Suggested-by: Emil Velikov <[email protected]> Tested-by: Andreas Hartmetz <[email protected]> Signed-off-by: Eric Engestrom <[email protected]>
* vulkan/wsi: Destroy swapchain images after terminating FIFO queues  Cameron Kumar  2018-06-08  1  -3/+3
| | | | | | | | | The queue_manager thread can access the images from x11_present_to_x11, hence this reorder prevents dereferencing of dangling pointers. Cc: "18.1" <[email protected]> Fixes: e73d136a023080 ("vulkan/wsi/x11: Implement FIFO mode.") Reviewed-by: Lionel Landwerlin <[email protected]>
* radeonsi: emit_dpbb_state packets optimization  Sonny Jiang  2018-06-07  2  -21/+26
| | | | | | | Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets Signed-off-by: Sonny Jiang <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: emit_clip_state packets optimization  Sonny Jiang  2018-06-07  2  -3/+7
| | | | | | | Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets Signed-off-by: Sonny Jiang <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: emit_msaa_sample_locs packets optimization  Sonny Jiang  2018-06-07  2  -2/+6
| | | | | | | Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets Signed-off-by: Sonny Jiang <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: emit_msaa_config packets optimization  Sonny Jiang  2018-06-07  2  -28/+28
| | | | | | | Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets Signed-off-by: Sonny Jiang <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: emit_cb_render_state packets optimization  Sonny Jiang  2018-06-07  3  -9/+48
| | | | | | | Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets Signed-off-by: Sonny Jiang <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: emit_db_render_state packets optimization  Sonny Jiang  2018-06-07  5  -29/+95
| | | | | | | Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets Signed-off-by: Sonny Jiang <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
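The idea shared by the six radeonsi commits above, remembering the last value written to each register and skipping the packet when nothing changed, might look roughly like the sketch below. The struct layout, NUM_TRACKED_REGS and set_context_reg_opt are hypothetical names for illustration and are not taken from the driver; a counter stands in for the real command stream.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_TRACKED_REGS 4 /* hypothetical number of tracked registers */

struct tracked_regs {
    uint32_t saved_mask;              /* bit i set: value[i] is known */
    uint32_t value[NUM_TRACKED_REGS]; /* last value written to register i */
    unsigned packets_emitted;         /* stand-in for the command stream */
};

/* Emits a (simulated) SET_CONTEXT_REG packet only if the value changed.
 * Returns true when a packet was emitted. */
static bool set_context_reg_opt(struct tracked_regs *t, unsigned reg, uint32_t v)
{
    uint32_t bit = UINT32_C(1) << reg;

    if ((t->saved_mask & bit) && t->value[reg] == v)
        return false; /* redundant write: skip the packet */

    t->packets_emitted++;
    t->value[reg] = v;
    t->saved_mask |= bit;
    return true;
}
```

The saved mask matters because the first write must always be emitted, even when the new value happens to equal the zero-initialized cache entry.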
* drisw: Fix invalid pointer arithmetic  Jan Vesely  2018-06-07  1  -1/+1
| | | | | | | | Use of void * in pointer arithmetic is illegal, use char * instead. Fixes: cf54bd5e8381dba18d52fe438acda20cc1685bf3 ("drisw: use shared memory when possible") Reviewed-by: Dave Airlie <[email protected]> Signed-off-by: Jan Vesely <[email protected]>
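As a small illustration of the point above: ISO C does not define arithmetic on void *, although GCC accepts it as an extension that treats the element size as 1. Casting to char * first is the portable form. The helper name row_address and its parameters are made up for this sketch and do not come from drisw.

```c
#include <stddef.h>

/* Hypothetical helper: address of row `y` in a buffer with `stride` bytes
 * per row. */
static void *row_address(void *base, size_t stride, size_t y)
{
    /* "base + stride * y" on a void * is not valid ISO C; use char *. */
    return (char *)base + stride * y;
}
```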
* radeonsi: fix possible truncation on renderer string  Timothy Arceri  2018-06-08  1  -1/+1
| | | | | | | Fixes truncation warning in gcc 8.1 Fixes: 8539c9bf3158 ("gallium/radeon: add the kernel version into the renderer string") Reviewed-by: Michel Dänzer <[email protected]>
* ac: fix possible truncation of intrinsic name  Timothy Arceri  2018-06-08  1  -1/+1
| | | | | | Fixes the gcc warning: ‘snprintf’ output between 26 and 33 bytes into a destination of size 32 Fixes: d5f7ebda3ec0 ("ac: add LLVM build functions for subgroup instrinsics") Reviewed-by: Bas Nieuwenhuizen <[email protected]>
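The warning above is about snprintf() output that may not fit the destination. Whatever the actual one-line fix was (the log does not show it), truncation can be detected at run time from the return value, as in this sketch. The helper name and the "example.reduce.%s.i%u" format string are invented for illustration.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical helper: formats a made-up intrinsic name into `dst`.
 * Returns false if the name did not fit (the output is then truncated but
 * still NUL-terminated, provided dst_size > 0). */
static bool build_intrinsic_name(char *dst, size_t dst_size,
                                 const char *op, unsigned bits)
{
    int n = snprintf(dst, dst_size, "example.reduce.%s.i%u", op, bits);

    /* snprintf() returns the length it wanted to write, excluding the NUL. */
    return n >= 0 && (size_t)n < dst_size;
}
```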
* amd/common: Fix number of coords for getlod.  Bas Nieuwenhuizen  2018-06-07  1  -3/+18
| | | | | | | | | | The LLVM 6 code reduced it to a non-array call. We need to do that with the new code too. This fixes dEQP-VK.glsl.texture_functions.query.texturequerylod.*array* for radv. Fixes: a9a79934412 "amd/common: use the dimension-aware image intrinsics on LLVM 7+" Reviewed-by: Dave Airlie <[email protected]>
* i965/screen: Sanity check that all formats we advertise are useable  Jason Ekstrand  2018-06-07  1  -4/+20
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* i965/screen: Use RGBA non-sRGB formats for images  Jason Ekstrand  2018-06-07  1  -0/+9
| | | | | | | | | | Not all of the MESA_FORMAT and ISL_FORMAT helpers we use can properly handle RGBX formats. Also, we don't want to make decisions based on those in the first place because we can't render to RGBA and we use the non-sRGB version to determine whether or not to allow CCS_E. Cc: [email protected] Reviewed-by: Lionel Landwerlin <[email protected]>
* i965/screen: Return false for unsupported formats in query_modifiers  Jason Ekstrand  2018-06-07  1  -2/+14
| | | | | Cc: [email protected] Reviewed-by: Lionel Landwerlin <[email protected]>
* i965/screen: Refactor query_dma_buf_formats  Jason Ekstrand  2018-06-07  1  -12/+13
| | | | | | | | | This reworks it to work like query_dma_buf_modifiers and, in particular, makes it more flexible so that we can disallow a non-static set of formats. Cc: [email protected] Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/isl: Add bounds-checking assertions for the format_info table  Jason Ekstrand  2018-06-07  1  -8/+16
| | | | | | | | | We follow the same convention as isl_format_get_layout in having two assertions to ensure that only valid formats are passed in. We also check against the array size of the table because some valid formats such as CCS formats may be past the end of the table. This fixes some potential out-of-bounds array access even in valid cases. Cc: [email protected] Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/isl: Add bounds-checking assertions in isl_format_get_layout  Jason Ekstrand  2018-06-07  2  -12/+22
| | | | | | | | | | | | We add two assertions instead of one because the first assertion that format != ISL_FORMAT_UNSUPPORTED is more descriptive and checks for a real but unsupported enumerant while the second ensures that they don't pass in garbage values. We also update some other helpers to use isl_format_get_layout instead of using the table directly so that they get bounds checking too. Cc: [email protected] Reviewed-by: Lionel Landwerlin <[email protected]>
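The two-assertion pattern described above can be sketched as follows. The enum, table and function names are placeholders chosen for the example and are not the isl definitions; only the shape of the checks follows the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical format enum with a sentinel for "real but unsupported". */
enum example_format {
    EXAMPLE_FORMAT_R8 = 0,
    EXAMPLE_FORMAT_RGBA8 = 1,
    EXAMPLE_FORMAT_UNSUPPORTED = 0xffff,
};

struct example_layout {
    unsigned bits_per_pixel;
};

static const struct example_layout example_layouts[] = {
    [EXAMPLE_FORMAT_R8] = { 8 },
    [EXAMPLE_FORMAT_RGBA8] = { 32 },
};
#define EXAMPLE_LAYOUT_COUNT \
    (sizeof(example_layouts) / sizeof(example_layouts[0]))

static const struct example_layout *example_get_layout(enum example_format fmt)
{
    /* First assertion: a descriptive failure for the known sentinel. */
    assert(fmt != EXAMPLE_FORMAT_UNSUPPORTED);
    /* Second assertion: reject garbage values that would index past the table. */
    assert((size_t)fmt < EXAMPLE_LAYOUT_COUNT);
    return &example_layouts[fmt];
}
```

Routing other helpers through one accessor like this, rather than indexing the table directly, gives them the same checks for free.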
* anv: Set fence/semaphore types to NONE in impl_cleanup  Jason Ekstrand  2018-06-07  1  -13/+16
| | | | | | | | | | There were some places that were calling anv_semaphore_impl_cleanup and neither deleting the semaphore nor setting the type back to NONE. Just set it to NONE in impl_cleanup to avoid these issues. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106643 Fixes: 031f57eba "anv: Add a basic implementation of VK_KHX_external..." Reviewed-by: Lionel Landwerlin <[email protected]>
* nir: Add global invocation id intrinsic.  Plamena Manolova  2018-06-07  2  -0/+5
| | | | | | | | Add the missing nir intrinsic for the gl_GlobalInvocationID compute shader variable. Signed-off-by: Plamena Manolova <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Require softpin support for Cannonlake and later.  Kenneth Graunke  2018-06-06  1  -0/+10
| | | | | | | | | | | This isn't strictly necessary, but anyone running Cannonlake will already have Kernel 4.5 or later, so there's no reason to support the relocation model on Gen10+. This will let us avoid dealing with them for new features. Reviewed-by: Scott D Phillips <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* i965: Allocate VMA in userspace for full-PPGTT systems.  Kenneth Graunke  2018-06-06  1  -1/+1
| | | | | | | | | | | | | | | | This patch enables soft-pinning of all buffers, allowing us to skip relocation processing entirely. All systems with full PPGTT and > 4GB of VMA should gain these benefits. This should be most Gen8+. Unfortunately, this excludes a few systems: - Cherryview (only has 32-bit addressing, despite 48-bit pointers) - Broadwell with a 32-bit kernel - Anybody running pre-4.5 kernel. We may enable it for Cherryview in the future, but it would require some tweaks to the memory zone. Reviewed-by: Jordan Justen <[email protected]>
* intel/blorp: Emit VF cache invalidates for 48-bit bugs with softpin.  Kenneth Graunke  2018-06-06  3  -5/+51
| | | | | | | | | | | | | | | | | | commit 92f01fc5f914fd500497d0c3aed75f3ac8dc054d made i965 start emitting VF cache invalidates when the high bits of vertex buffers change. But we were not tracking vertex buffers emitted by BLORP. This was papered over by a mistake where I emitted VF cache invalidates all the time, which Chris fixed in commit 3ac5fbadfd8644d30fce9ff267cb811ad157996a. This patch adds a new hook which allows the driver to track addresses and request a VF cache invalidate as appropriate. v2: Make the driver do the PIPE_CONTROL so it can apply workarounds (caught by Jason Ekstrand). Rebase on anv bug fix. v3: Don't screw up the boolean (caught by Jason Ekstrand). Fixes: 92f01fc5f914 ("i965: Emit VF cache invalidates for 48-bit addressing bugs with softpin.") Reviewed-by: Jason Ekstrand <[email protected]>
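A rough sketch of the tracking described above: remember the high address bits last used for each vertex buffer slot and report when they change, so the driver can emit a VF cache invalidate. The struct, MAX_VB_SLOTS and needs_vf_invalidate are hypothetical names, and the hook added by the patch may be shaped quite differently.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_VB_SLOTS 4 /* hypothetical number of vertex buffer slots */

struct vb_tracker {
    /* Bits 47:32 of the address last bound at each slot. */
    uint16_t last_high_bits[MAX_VB_SLOTS];
};

/* Records `address` for `slot` and returns true when its high bits differ
 * from the previous binding, i.e. when an invalidate should be requested. */
static bool needs_vf_invalidate(struct vb_tracker *t, unsigned slot,
                                uint64_t address)
{
    uint16_t high = (uint16_t)(address >> 32);
    bool changed = high != t->last_high_bits[slot];

    t->last_high_bits[slot] = high;
    return changed;
}
```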
* nir: add opt_if_loop_terminator()  Timothy Arceri  2018-06-07  1  -0/+68
| | | | | | | | | | | | | | | | | | | | | | | | This pass detects potential loop terminators and moves instructions from the non-breaking branch after the if-statement. This enables both the new opt_if_simplification() pass and loop unrolling to potentially progress further. Unexpectedly this change sped up shader-db run times by ~3%. Ivy Bridge shader-db results (all changes in dolphin/ubershaders): total instructions in shared programs: 9995662 -> 9995338 (-0.00%) instructions in affected programs: 87845 -> 87521 (-0.37%) helped: 27 HURT: 0 total cycles in shared programs: 230931495 -> 230925015 (-0.00%) cycles in affected programs: 56391385 -> 56384905 (-0.01%) helped: 27 HURT: 0 Reviewed-by: Ian Romanick <[email protected]>
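The transformation can be pictured at the source level with two equivalent C loops. This is only an analogy for what the pass does on NIR control flow; the function names and the summing workload are invented for the example.

```c
/* Before: the branch that does not break carries part of the loop body. */
static int sum_until_before(const int *v, int n, int limit)
{
    int sum = 0;
    for (int i = 0; ; i++) {
        if (i >= n || v[i] > limit) {
            break;
        } else {
            sum += v[i];
        }
    }
    return sum;
}

/* After: the else-branch contents follow the if-statement, leaving a bare
 * "if (cond) break;" that is easy to recognize as the loop terminator. */
static int sum_until_after(const int *v, int n, int limit)
{
    int sum = 0;
    for (int i = 0; ; i++) {
        if (i >= n || v[i] > limit)
            break;
        sum += v[i];
    }
    return sum;
}
```

Both functions compute the same result; the second form is the shape that later passes such as loop unrolling can analyze more readily.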
* nir: move ends_in_break() helper to nir_loop_analyze.h  Timothy Arceri  2018-06-07  2  -13/+13
| | | | | | | We will use the helper while simplifying potential loop terminators in the following patch. Reviewed-by: Ian Romanick <[email protected]>
* radv: fix Coverity no effect control flow issue  Timothy Arceri  2018-06-07  1  -1/+1
| | | | | swizzle is unsigned so "desc->swizzle[c] < 0" is never true. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
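To illustrate the class of issue flagged above: comparing an unsigned value against zero with "<" can never be true, so such a check is dead code. The sketch below shows a range check that only tests the upper bound. It is an assumption about what a meaningful check could look like, not the actual one-line change, and swizzle_is_valid is a made-up name.

```c
#include <stdbool.h>

/* Hypothetical validation of a 4-component swizzle with unsigned entries. */
static bool swizzle_is_valid(const unsigned char swizzle[4],
                             unsigned channel_count)
{
    for (unsigned c = 0; c < 4; c++) {
        /* No "swizzle[c] < 0" test here: for an unsigned type it is always
         * false. Only the upper bound can actually fail. */
        if (swizzle[c] >= channel_count)
            return false;
    }
    return true;
}
```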
* intel/blorp: Don't vertex fetch directly from clear values  Jason Ekstrand  2018-06-06  1  -44/+41
| | | | | | | | | | | | | | On gen8+, we have to VF cache flush whenever a vertex binding aliases a previous binding at the same index modulo 4GiB. We deal with this in Vulkan by ensuring that vertex buffers and the dynamic state (from which BLORP pulls its vertex buffers) are in the same 4GiB region of the address space. That doesn't work if we're reading clear colors with the VF unit. In order to work around this we switch to using MI commands to copy the clear value into the vertex buffer we allocate for the normal constant data. Cc: [email protected] Reviewed-by: Kenneth Graunke <[email protected]>
* dri: add missing 16bits formats mapping  Lionel Landwerlin  2018-06-07  1  -0/+16
| | | | | | | | | | | | | | | | | | i965 advertises the 16-bit R and RG formats through eglQueryDmaBufFormatsEXT but falls over when a client tries to use or asks more information about such a format because driImageFormatToGLFormat returns MESA_FORMAT_NONE. Found by Eero Tamminen. v2: Add G16R16 formats (Lionel) v3: Fix G16R16 mapping to mesa format (Jason) Signed-off-by: Lionel Landwerlin <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106642 Reviewed-by: Plamena Manolova <[email protected]> (v2) Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Look into uniform structs for samplers when counting num_textures.  Eric Anholt  2018-06-06  1  -12/+44
| | | | | | | | | | | | | | mesa/st decides whether to update samplers after a program change based on whether num_textures is nonzero. By not counting samplers in a uniform struct, we would segfault in KHR-GLES3.shaders.struct.uniform.sampler_vertex if it was run in the same context after a non-vertex-shader-uniform testcase (as is the case during a full conformance run). v2: Implement using two separate pure functions instead of updating pointers. Reviewed-by: Jason Ekstrand <[email protected]>
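Counting samplers that are nested inside structs and arrays is naturally recursive. The sketch below uses a simplified, hypothetical type description (struct example_type) rather than the real GLSL type API, and a single function rather than the two pure functions the v2 note mentions.

```c
#include <stddef.h>

enum example_base_type {
    EXAMPLE_FLOAT,
    EXAMPLE_SAMPLER,
    EXAMPLE_STRUCT,
    EXAMPLE_ARRAY,
};

/* Hypothetical, simplified type description used only for this sketch. */
struct example_type {
    enum example_base_type base;
    unsigned length;                          /* array length or field count */
    const struct example_type *element;       /* EXAMPLE_ARRAY: element type */
    const struct example_type *const *fields; /* EXAMPLE_STRUCT: field types */
};

/* Pure function: number of samplers contained anywhere inside `t`. */
static unsigned count_samplers(const struct example_type *t)
{
    switch (t->base) {
    case EXAMPLE_SAMPLER:
        return 1;
    case EXAMPLE_ARRAY:
        return t->length * count_samplers(t->element);
    case EXAMPLE_STRUCT: {
        unsigned n = 0;
        for (unsigned i = 0; i < t->length; i++)
            n += count_samplers(t->fields[i]);
        return n;
    }
    default:
        return 0;
    }
}
```

With a count like this, a uniform struct holding one sampler and an array of three samplers contributes four textures instead of zero.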
* v3d: Work around GFXH-1461/GFXH-1689 by using CLEAR_TILE_BUFFERS.  Eric Anholt  2018-06-06  1  -10/+17
| | | | | | | This doesn't seem to have done anything to my test results. However, given that we've still got a class of GPU hangs, following the workarounds that the closed driver does so that we get the same command sequences seems like a good idea.
* v3d: Enable the new NIR bitfield operation lowering paths.  Eric Anholt  2018-06-06  1  -2/+19
| | | | | | | | | | These together get the GLSL 3.00 unorm/snorm pack functions and MESA_shader_integer operations working. v2: Fix commit message typo. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]>