summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers
Commit message (Collapse)AuthorAgeFilesLines
* freedreno/a6xx: fix srgbRob Clark2018-08-171-7/+13
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: fix dEQP-GLES3.functional.fence_sync.*Rob Clark2018-08-171-0/+4
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: Add a6xx backendKristian H. Kristensen2018-08-1638-17/+6368
| | | | | | | | | | This adds a freedreno backend for the a6xx generation GPUs, which at the time of this commit is about 98% GLES2 conformant. Much remains to be done - both performance work and feature work towards more recent GLES versions, but this is a good start. Signed-off-by: Kristian H. Kristensen <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headersRob Clark2018-08-167-66/+4928
| | | | | | pull in a6xx registers Signed-off-by: Rob Clark <[email protected]>
* freedreno: Fix warningsKristian H. Kristensen2018-08-165-15/+9
| | | | | Signed-off-by: Kristian H. Kristensen <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* svga: simplify Mesa version stringEric Engestrom2018-08-161-1/+1
| | | | | | Suggested-by: Emil Velikov <[email protected]> Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* bin: always define MESA_GIT_SHA1 to make it directly usable in codeEric Engestrom2018-08-161-5/+1
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* virgl: report actual max-texture sizesErik Faye-Lund2018-08-152-0/+10
| | | | | | | | Instead of doing conservative guesses, we should report the max levels based on the max sizes we get from GL on the host. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Jakob Bornecrantz <[email protected]>
* virgl: do not use SP_MAX_TEXTURE_*_LEVELS definesErik Faye-Lund2018-08-151-7/+3
| | | | | | | | | | | | | These macro-names are also used for softpipe, so let's avoid confusion by avoiding them. Besides, they are just used in one place in virgl, so let's just inline them into the place they are used instead. While we're at it, fixup an error in the comment for the 3D version. Mesa subtracts computes max-size by doing by 2^(n-1), which means this should be 256 cubed, not 512 cubed. The other comments are correct. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Jakob Bornecrantz <[email protected]>
* radv: disable the auto-waitcnt-before-barrier LLVM optionSamuel Pitoiset2018-08-151-0/+1
| | | | | | | | | | | | | | This option allows us to remove additional s_waitcnt instructions because s_barrier internally does s_waitcnt 0. Though, apparently there is a problem with LDS accesses that causes rendering issues with FFXV and DXVK. Disable this optimization for now (RadeonSI still uses it). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107460 CC: 18.2 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: enable 1 missing PS_SU perf counter on PolarisMarek Olšák2018-08-141-1/+1
|
* radeonsi: use radeon_info::nameMarek Olšák2018-08-143-40/+12
| | | | Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi: split si_clear_buffer to remove enum si_methodMarek Olšák2018-08-146-53/+60
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: replace CP_DMA_USE_L2 with enum si_cache_policyMarek Olšák2018-08-142-26/+41
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: declare coher in si_copy_bufferMarek Olšák2018-08-141-8/+7
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: make PFP_SYNC_ME an explicit CP DMA flagMarek Olšák2018-08-141-17/+25
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: don't use emit_data->args in load_emitMarek Olšák2018-08-141-94/+37
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: don't use emit_data->args in store_emitMarek Olšák2018-08-141-92/+71
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: don't use emit_data->args in atomic_emitMarek Olšák2018-08-143-36/+47
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: don't use emit_data->args in build_interp_intrinsicMarek Olšák2018-08-141-19/+13
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: inline atomic_fetch_argsMarek Olšák2018-08-141-74/+51
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: inline store_fetch_argsMarek Olšák2018-08-141-61/+42
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: inline load_fetch_argsMarek Olšák2018-08-141-39/+28
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: merge txq_emit and resq_emitMarek Olšák2018-08-141-48/+45
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: inline resq_fetch_argsMarek Olšák2018-08-141-62/+34
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: inline txq_fetch_argsMarek Olšák2018-08-141-26/+7
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: use get_resinfo directly in lower_gather4_integerMarek Olšák2018-08-141-13/+12
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: inline tex_fetch_args into build_tex_intrinsicMarek Olšák2018-08-141-222/+188
| | | | | | | The diff looks like it moves code that I didn't touch. Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: remove fetch_args callbacks for ALU instructionsMarek Olšák2018-08-142-103/+55
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: move internal TGSI shaders into si_shaderlib_tgsi.cMarek Olšák2018-08-148-319/+348
| | | | | Tested-by: Dieter Nützel <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radeonsi: implement EXT_window_rectanglesMarek Olšák2018-08-147-2/+95
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* freedreno/ir3: add support for a6xx 'merged' register setRob Clark2018-08-142-2/+24
| | | | | | | | | | Starting with a6xx, half and full precision registers conflict. Which makes things a bit more efficient, ie. if some parts of the shader are heavy on half-precision and others on full precision, you don't have to allocate the worst case for both. But it means we need to setup some additional conflicts. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: small RA cleanupRob Clark2018-08-142-13/+8
| | | | | | | Collapse is_temp() into it's only callsite, and pass compiler object as struct rather than void. Just cleanups to reduce noise in next patch. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: stop hard-coding FS input regsRob Clark2018-08-147-183/+103
| | | | | | | | | | We originally did this because at the time we didn't know all the bitfields to configure where various frag shader sysval's went. But we do. So switch to using sysvals for all the frag shader inputs. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: use r63.x for unused inputsRob Clark2018-08-141-3/+3
| | | | | | | This way, unused sysval inputs, like frag_vcoord, get the correct regid value to disable the input. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: create all inputs in first blockRob Clark2018-08-141-17/+17
| | | | | | | | create_input()/create_input_compmask() should take the ctx as arg, rather than block, to enforce that all inputs are created in the first block, so that RA sees them as live at the start of the shader. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: rename s/frag_pos/frag_vcoord/gRob Clark2018-08-142-17/+22
| | | | | | | Make it more clear that this is varying fetch related. Also fixup some comments. Just cleanup for next patches. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: move per-generation compiler configRob Clark2018-08-143-43/+52
| | | | | | | Move it from the compile ctx to the compiler object, before adding new things for a6xx. Signed-off-by: Rob Clark <[email protected]>
* freedreno: move free() into fdN_context_destroy()Rob Clark2018-08-145-2/+7
| | | | | | | | Following patches will be doing further cleanup after calling fd_context_destroy() so it is easier if we move the free() into the per-gen backend code. Signed-off-by: Rob Clark <[email protected]>
* freedreno: a2xx: ir2 updateJonathan Marek2018-08-145-545/+615
| | | | | | | | | | | | | | | | | | | this patch brings a number of changes to ir2: -ir2 now generates CF clauses as necessary during assembly. this simplifies fd2_program/fd2_compiler and is necessary to implement optimization passes -ir2 now has separate vector/scalar instructions. this will make it easier to implementing scheduling of scalar+vector instructions together. dst_reg is also now seperate from src registers instead of a single list -ir2 now implements register allocation. this makes it possible to compile shaders which have more than 64 TGSI registers -ir2 now implements the following optimizations: removal of IN/OUT MOV instructions generated by TGSI and removal of unused instructions when some exports are disabled -ir2 now allows full 8-bit index for constants -ir2_alloc no longer allocates 4 times too many bytes Signed-off-by: Jonathan Marek <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* virgl: ARB_texture_barrier supportDave Airlie2018-08-146-3/+24
| | | | Reviewed-by: Tomeu Vizoso <[email protected]>
* meson: Build with Python 3Mathieu Bridon2018-08-106-12/+12
| | | | | | | | | | | | Now that all the build scripts are compatible with both Python 2 and 3, we can flip the switch and tell Meson to use the latter. Since Meson already depends on Python 3 anyway, this means we don't need two different Python stacks to build Mesa. Signed-off-by: Mathieu Bridon <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* vc4: Implement texture_subdata() to directly upload tiled data.Eric Anholt2018-08-081-1/+39
| | | | | | This avoids a memcpy into a temporary in the upload path. Improves x11perf -putimage100 performance by 12.1586% +/- 1.38155% (n=145)
* vc4: Handle partial loads/stores of tiled textures.Eric Anholt2018-08-083-60/+155
| | | | | | | | | | | | | | | | Previously, we would load out the tile-aligned area, update the raster copy, and store it back. This was a huge cost for XPutImage calls to the screen under glamor. Instead, implement a general load/store path that walks over the source x/y writing into the corresponding pixel of the destination (using clever math from https://fgiesen.wordpress.com/2011/01/17/texture-tiling-and-swizzling/). If things are aligned, we go through the previous utile-at-a-time loop. Improves x11perf -putimage10 performance by 139.777% +/- 2.83464% (n=5) Improves x11perf -putimage100 performance by 383.908% +/- 22.6297% (n=11) Improves x11perf -getimage10 performance by 2.75731% +/- 0.585054% (n=145)
* vc4: Compile the LT image helper per cpp we might load/store.Eric Anholt2018-08-081-2/+31
| | | | | | | | For the partial load/store support I'm about to add, we want the memcpy to be compiled out to a single load/store. This should also eliminate the calls to vc4_utile_width/height(). Improves x11perf -putimage100 performance by 3.76344% +/- 1.16978% (n=15)
* vc4: Refactor to reuse the LT tile walking code.Eric Anholt2018-08-081-24/+34
|
* svga: use pipe_sampler_view::target in svga_set_sampler_views()Brian Paul2018-08-081-1/+1
| | | | | | | | | | | | | | | | | | instead of the underlying texture's target. This fixes an issue where the TGSI sampler type was not agreeing with the sampler view target/type. In particular, this fixes a Mint 19 XFCE desktop scaling issue because the TGSI code was using a RECT sampler but the sampler view's underlying texture was PIPE_TEXTURE_2D. We want to use the sampler view's type rather than the underlying resource, as we do for the view's surface format. No piglit regressions. VMware issue 2156696. Reviewed-by: Neha Bhende <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* svga: use SVGA3D_RS_FILLMODE for vgpu9Brian Paul2018-08-083-26/+37
| | | | | | | | | | | | | I'm not sure why we didn't support this in the past, but fillmode is supported by all renderers nowadays. Also fix the logic in svga_create_rasterizer_state() to avoid a few swtnl case. No piglit regressions Reviewed-by: Neha Bhende <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* svga: add TGSI_SEMANTIC_FACE switch case in svga_swtnl_update_vdecl()Brian Paul2018-08-081-0/+1
| | | | | | | | Fixes failed assertion running Piglit polygon-mode-face test. Though, the test still does not pass. Reviewed-by: Neha Bhende <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* vc4: Fix vc4_fence_server_sync() on pre-syncobj kernels.Eric Anholt2018-08-071-1/+2
| | | | | | | | | We won't have an FD if we're just having the server wait on a fence created by eglCreateSyncKHR(). Our seqno fences will happen in order, so server-side waits are no-ops in that case. Fixes dEQP-EGL.functional.sharing.gles2.multithread.simple_egl_server_sync.buffers.gen_delete Fixes: b0acc3a5628c ("broadcom/vc4: Native fence fd support")