summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers
Commit message (Collapse)AuthorAgeFilesLines
* gallium: add PIPE_CAP_TGSI_FS_FBFETCHIlia Mirkin2017-01-1615-2/+17
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* nvc0: true up exposing of the HW_METRIC_QUERY_GROUP for maxwellIlia Mirkin2017-01-161-2/+2
| | | | | | | This had been updated in one place but not the other. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nv50/ir: handle new DDIV op which will be used for double divisionsIlia Mirkin2017-01-161-0/+3
| | | | | | | The existing lowering is in place to lower that to RCP + MUL, or fancier things down the line if necessary. Signed-off-by: Ilia Mirkin <[email protected]>
* radeonsi: fix R600_DEBUG=nooptvariantNicolai Hähnle2017-01-161-1/+1
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Vedran Miletić <[email protected]>
* radeonsi: implement GL_FIXED vertex formatMarek Olšák2017-01-163-7/+20
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: implement 32-bit SNORM/UNORM/SSCALED/USCALED vertex formatsMarek Olšák2017-01-163-18/+90
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: make fix_fetch 64-bitMarek Olšák2017-01-165-9/+9
| | | | | | v2: add u_bit_consecutive64 Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: add GPU-shaders-busy HUD queryMarek Olšák2017-01-164-1/+31
| | | | | | | It should be close to the GPU load, but it can be much lower if something is stalling shader execution (e.g. CP DMA). Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: make the GPU load / GRBM_STATUS monitoring extensibleMarek Olšák2017-01-163-32/+53
| | | | | | The next patch will add SPI_BUSY monitoring. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: show average results per frame for perf counters in HUDMarek Olšák2017-01-161-1/+1
| | | | | | so that the graphs are independent from FPS. Reviewed-by: Nicolai Hähnle <[email protected]>
* nvc0/ir: emit FMZ flag when requested on FFMAIlia Mirkin2017-01-151-0/+4
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* radeonsi: replace si_shader_context::soa by bld_baseSamuel Pitoiset2017-01-133-82/+78
| | | | | | | | | | We no longer need to use lp_build_tgsi_soa_context. No regressions founds with full piglit run. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: replace ctx->soa.outputs by ctx->outputsSamuel Pitoiset2017-01-132-23/+26
| | | | | | | | | The plan is to replace si_shader_context::soa with its parent structure (ie. bld_base). Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move si_shader_context::soa::addr to si_shader_contextSamuel Pitoiset2017-01-133-11/+12
| | | | | | | | | The plan is to replace si_shader_context::soa with its parent structure (ie. bld_base). Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: allocate the array of immediates dynamicallySamuel Pitoiset2017-01-133-13/+24
| | | | | | | | | | | | | | | Currently, we can store up to 256 immediates in a static array, but this is not always enough. Instead, allocate a dynamic array like what we currently do for temps. This fixes a segfault with dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23 No regressions found with full piglit run. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nvc0/ir: only try to check for zero LOD if we aren't already forcing itIlia Mirkin2017-01-121-1/+1
| | | | | | | | | | There's a levelZero flag which forces texturing to pick level zero (and not consume an explicit LOD argument). This is set for MS targets, but could also be set for any other incoming instruction. As that is what determines whether a LOD argument is present, check that rather than the more indirect isMS logic. Signed-off-by: Ilia Mirkin <[email protected]>
* nouveau: take extra push space into account for pushbuf_space callsIlia Mirkin2017-01-1215-56/+26
| | | | | | | | | | | | | | | | | | | | | | Ever since a long time ago when I messed around with fences, I ensure that after a PUSH_SPACE call there is enough space to write a fence out into the pushbuf. However the PUSH_SPACE macro is not all-knowing, and so sometimes we have to invoke nouveau_pushbuf_space manually with the relocs/pushes args set. If we don't take the extra allocation from PUSH_SPACE into account, then we will end up accidentally flushing when the code was not expecting a flush. This can lead to various runtime and rendering failures. The amount of extra allocation isn't that important - it has to be at least 8 based on the current nouveau_winsys.h setting, but even more won't hurt. I just rounded up to powers of 2. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99354 Cc: "12.0 13.0" <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Ben Skeggs <[email protected]>
* radeonsi: remove unused si_prepare_cube_coordsNicolai Hähnle2017-01-132-200/+0
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* amd/common: unify cube map coordinate handling between radeonsi and radvNicolai Hähnle2017-01-133-1/+11
| | | | | | | | | | | | | | | Code is taken from a combination of radv (for the more basic functions, to avoid gallivm dependencies) and radeonsi (for the new and improved derivative calculations). v2: add 0.5 offset to tex coords only after derivative calculation v3: - really only touch the first three coordinates - rebase on the removal of the 1.5 --> 0.5 offset change Reviewed-by: Bas Nieuwenhuizen <[email protected]> (v2) Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: only touch first three coordinates in si_prepare_cube_coordsNicolai Hähnle2017-01-131-12/+1
| | | | | | | | Sourcing coords_arg[4] is actually never correct, since bias is handled differently in tex_fetch_args anyway. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: remove unused si_llvm_cube_to_2d_coordsNicolai Hähnle2017-01-131-28/+0
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: restrict cube map derivative computations to the correct planeNicolai Hähnle2017-01-131-23/+107
| | | | | | | | | | | | | | | | | | | | As remarked by the comment in the original code, the old algorithm fails when (tc + deriv) points at a different cube face. Instead, simply project the derivative directly to the plane of the selected cube face. The new code is based on exactly differentiating (using the chain rule) the projection onto a plane corresponding to a fixed cube map face (which is still selected in the usual way based on the texture coordinate itself). The computations end up fairly involved, but we do save two reciprocal computations. Fixes GL45-CTS.texture_cube_map_array.sampling. v2: add 0.5 offset to tex coords only after derivative calculation v3: go back to 1.5 offset Reviewed-by: Bas Nieuwenhuizen <[email protected]> (v2) Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: communicate cube map coordinates more explicitlyNicolai Hähnle2017-01-131-33/+43
| | | | | | | v2: fix compile error that snuck in during rebase Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac/debug: move .gitignore for sid_tables.h tooGrazvydas Ignotas2017-01-131-1/+0
| | | | | | | | b838f642 "ac/debug: Move sid_tables.h generation to common code." moved sid_tables.h but forgot the corresponding .gitignore. Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* ac, radeonsi: automake: add missing builddir includeEmil Velikov2017-01-121-0/+1
| | | | | | | | | | The generated file is correctly stored in the builddir as of earlier commit. Yet the commit forgot to add the respective include flag thus the compiler would error out failing to find sid_tables.h Bugzila: https://bugs.freedesktop.org/show_bug.cgi?id=99389 Fixes: d1dc22eb466 "ac: automake: rework sid_tables.h generation" Signed-off-by: Emil Velikov <[email protected]>
* etnaviv: automake: include all files in the sources listsEmil Velikov2017-01-121-1/+9
| | | | | | Note: the currently mentioned etnaviv_utils.h is typo. Signed-off-by: Emil Velikov <[email protected]>
* imx: gallium driver for imx-drm scanout driverChristian Gmeiner2017-01-122-0/+17
| | | | | | | | | | Changes from V1 -> V2: - updated Copyright - added $(top_srcdir)/src/gallium/winsys to include path (suggested by Emil) - adapted driver to new renderonly API Signed-off-by: Christian Gmeiner <[email protected]> Acked-by: Emil Velikov <[email protected]>
* etnaviv: gallium driver for Vivante GPUsThe etnaviv authors2017-01-1261-0/+14672
| | | | | | | | | | | | | | | | | This driver supports a wide range of Vivante IP cores like GC880, GC1000, GC2000 and GC3000. Changes from V1 -> V2: - added missing files to actually integrate the driver into build system. - adapted driver to new renderonly API Signed-off-by: Christian Gmeiner <[email protected]> Signed-off-by: Lucas Stach <[email protected]> Signed-off-by: Philipp Zabel <[email protected]> Signed-off-by: Rob Herring <[email protected]> Signed-off-by: Russell King <[email protected]> Signed-off-by: Wladimir J. van der Laan <[email protected]> Acked-by: Emil Velikov <[email protected]>
* Always defer memory free in swr_resource_destroyGeorge Kyriazis2017-01-121-12/+5
| | | | | | | Defer delete on regular resources. This ensures that any work being done on the resource is completed before freeing up the resource's memory. Reviewed-by: Bruce Cherniak <[email protected]>
* nvc0: enable GL 4.3 on gm107+Samuel Pitoiset2017-01-121-7/+4
| | | | | | | | | | | | Although, arb_shader_image_load_store-atomicity will most likely hang your box, I think it's now quite reasonable to enable GL 4.3 on Maxwell/Pascal GPUs. I suspect that test to be wrong because it doesn't even work on the NVIDIA blob. I have tested a bunch of benchmarks (UE4 demos) and real games like Shadow of Mordor and they all work fine. Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: use sched control codes for gm107 MP counters codeSamuel Pitoiset2017-01-121-44/+44
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Pierre Moreau <[email protected]>
* nvc0: use sched control codes for gm107 blitter shaderSamuel Pitoiset2017-01-121-6/+14
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Pierre Moreau <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* nv50/ir: use sched control codes for gm107 builtinsSamuel Pitoiset2017-01-122-40/+40
| | | | | | | | | | Yes, IMUL/IMAD require dependency barriers and we should definitely replace these instructions by XMAD but the different flags need to be figured out. Note that XMAD only supports 16-bits integers. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Pierre Moreau <[email protected]>
* nv50/ir: improve instruction pipelining on gm107Samuel Pitoiset2017-01-123-4/+1027
| | | | | | | | | | | | | | | | | | | | | This makes use of scheduling control codes which are very useful for improving the instruction pipelining. This patch will increase performance on Maxwell GPUs by, at least, x1.5 up to x3.5 for some benchmarks. Although this has been fairly well tested, I would not be suprised if someone hit a corner case somewhere. That way, the scheduler is enabled by default but it can be deactivated by using NV50_PROG_SCHED=0. Thanks to Scott Gray for the reverse engineering work available from https://github.com/NervanaSystems/maxas/wiki/Control-Codes. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Pierre Moreau <[email protected]> Tested-by: Alexandre Courbot <[email protected]> Tested-by: Jan Vesely <[email protected]>
* nv50/ir: do not insert texture barriers on gm107Samuel Pitoiset2017-01-121-1/+2
| | | | | | | | | | It's actually useless to insert those texture barriers post RA because the current control code (ie. st 0x0) will wait for all dependencies before issuing a new instruction. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Pierre Moreau <[email protected]>
* radeonsi: num_records is in units of stride for swizzled buffers even on VINicolai Hähnle2017-01-121-2/+0
| | | | | | The old setting didn't hurt, but this is cleaner. Reviewed-by: Marek Olšák <[email protected]>
* freedreno: add "nogrow" debug paramRob Clark2017-01-103-1/+4
| | | | | | | Sometimes it is useful to disable the "growable" cmdstream buffers for debugging. (See 419a154d in libdrm) Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: remove hack for glamorRob Clark2017-01-101-3/+0
| | | | | | | Now that issues glamor was hitting w/ glsl>=130 (aka missing INSTANCED bit in vertex attribute state) is fixed, remove hack. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: fixed instancedRob Clark2017-01-101-0/+1
| | | | | | Add missing bit, now that we know where it is. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: use the non-_ZERO_BASE for vertexidRob Clark2017-01-104-6/+20
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: add texture MIPLVLSRob Clark2017-01-101-3/+3
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: fix fragcoord related hangsRob Clark2017-01-102-2/+6
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headersRob Clark2017-01-106-13/+22
| | | | Signed-off-by: Rob Clark <[email protected]>
* ac/debug: Dump indirect buffers.Bas Nieuwenhuizen2017-01-091-3/+6
| | | | | | | | | | | | | | This is for handling chained command buffers and secondary command buffers. It doesn't handle the trace id for secondary command buffers yet, but I don't think that is possible in general with just writes, as we could call a secondary command buffer multiple times. I think this is good enough for now, as the most useful case is the chaining when we grow an IB. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* ac/debug: Move IB decode to common code.Bas Nieuwenhuizen2017-01-093-332/+15
| | | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* ac/debug: Move sid_tables.h generation to common code.Bas Nieuwenhuizen2017-01-093-308/+1
| | | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: fix the Witcher 2 black transitionsMarek Olšák2017-01-091-2/+13
| | | | | | | | v2: do it properly Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98238 Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set si_shader_context::input_decls for ranged decls correctlyMarek Olšák2017-01-091-1/+4
| | | | | | | | This has no effect because no code uses those members with ranged decls. Tested-by: Edmondo Tommasina <[email protected]> Acked-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: cleanly communicate whether si_shader_dump should check R600_DEBUGMarek Olšák2017-01-095-13/+15
| | | | | | Tested-by: Edmondo Tommasina <[email protected]> Acked-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: use the internal clear_buffer callback to fix r600gMarek Olšák2017-01-061-1/+3
| | | | | | | | r600g doesn't set pipe_context::clear_buffer. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99303 Reviewed-by: Alex Deucher <[email protected]>