aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers
Commit message (Collapse)AuthorAgeFilesLines
* radeonsi: restore si_emit_cache_flush call at the end of IBsMarek Olšák2018-04-131-0/+2
| | | | Fixes: 918b798668c "radeonsi: make sure CP DMA is idle at the end of IBs"
* gallium: move ddebug, noop, rbug, trace to auxiliary to improve build timesMarek Olšák2018-04-1349-13220/+2
| | | | which also simplifies the build scripts.
* radeonsi: make sure CP DMA is idle at the end of IBsMarek Olšák2018-04-133-2/+16
|
* radeonsi: always prefetch later shaders after the draw packetMarek Olšák2018-04-133-26/+75
| | | | | | | | | so that the draw is started as soon as possible. v2: only prefetch the API VS and VBO descriptors Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: emit shader pointers before cache flushes & waitsMarek Olšák2018-04-131-13/+7
| | | | | | | | This code was written with the constant engine in mind. We can simplify it now. Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi/gfx9: don't use the workaround for gather4 + stencilMarek Olšák2018-04-131-2/+11
| | | | | | | it doesn't seem to be needed. Acked-by: Samuel Pitoiset <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: disable TC-compat HTILE on Tonga and IcelandMarek Olšák2018-04-131-0/+7
| | | | | Acked-by: Samuel Pitoiset <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: force 2D tiling on VI only when TC-compat HTILE is really enabledMarek Olšák2018-04-131-9/+7
| | | | | | | just pass the flag that indicates it. Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: don't flush HTILE if there is no HTILE clearMarek Olšák2018-04-131-2/+2
| | | | | Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: merge 2 identical if statements in si_clearMarek Olšák2018-04-131-9/+2
| | | | | | | and other cleanups Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: don't do GFX-specific texture decompression for computeMarek Olšák2018-04-131-10/+10
| | | | | Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* radeonsi: simplify generating the renderer stringMarek Olšák2018-04-131-11/+8
| | | | | | | HAVE_LLVM > 0 is a tautology. Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* broadcom/vc5: Fix a stray '`' in a comment.Eric Anholt2018-04-121-1/+1
|
* broadcom/vc5: Update the UABI for in/out syncobjsEric Anholt2018-04-129-90/+55
| | | | | | | | | This is the ABI I'm hoping to stabilize for merging the driver. seqnos are eliminated, which allows for the GPU scheduler to task-switch between DRM fds even after submission to the kernel. In/out sync objects are introduced, to allow the Android fencing extension (not yet implemented, but should be trivial), and to also allow the driver to tell the kernel to not start a bin until a previous render is complete.
* broadcom/vc5: Drop the finished_seqno optimization.Eric Anholt2018-04-122-11/+0
| | | | | With the DRM scheduler changes, I'm about to remove all seqnos from the UABI.
* broadcom/vc5: Drop the throttling code.Eric Anholt2018-04-121-9/+0
| | | | | Since I'll be using the DRM scheduler, we won't run into the problem of a runaway client starving other clients of GPU time.
* broadcom/vc5: Move flush_last_load into load_general, like for stores.Eric Anholt2018-04-121-28/+29
| | | | | | | This should avoid mistakes with not flushing as we change the series of loads. Already, it fixes a hopefully unreachable case where we were emitting just the TILE_COORDINATES and not the dummy store that needs to go with it.
* broadcom/vc5: Rename read_but_not_cleared to loads_pending.Eric Anholt2018-04-121-13/+13
| | | | | This is a more obvious name for what the variable means, and matches what it's called for stores.
* broadcom/vc5: Refactor the implicit coords/stores_pending logic.Eric Anholt2018-04-121-23/+13
| | | | | Since I just fixed a bug due to forgetting to do these right, do it once in the helper func.
* broadcom/vc5: Emit missing TILE_COORDINATES_IMPLICIT in separate z/s stores.Eric Anholt2018-04-121-5/+16
| | | | | Fixes a simulator assertion failure in KHR-GLES3.packed_depth_stencil.blit.depth32f_stencil8
* broadcom/vc5: Add checks that we don't try to do raw Z+S load/stores.Eric Anholt2018-04-121-0/+8
| | | | | | | This was dying in the simulator on GTF-GLES3.gtf.GL3Tests.packed_depth_stencil.packed_depth_stencil_blit. We'll need to do basically the same thing as Z32F/S8 does in the MSAA Z24S8 case.
* broadcom/vc5: Fix MSAA depth/stencil size setup.Eric Anholt2018-04-121-2/+4
| | | | | | | The v3dX(get_internal_type_bpp_for_output_format)() call only handles color output formats (which overlap in enum numbers with depth output formats), so for depth we just need to take the normal cpp times the number of samples.
* radeonsi: use PIPE_FORMAT_P016 format for VP9 profile2Leo Liu2018-04-121-1/+2
| | | | | Signed-off-by: Leo Liu <[email protected]> Acked-by: Christian König <[email protected]>
* radeon/vcn: add VP9 profile2 supportLeo Liu2018-04-121-0/+16
| | | | | Signed-off-by: Leo Liu <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: cap VP9 support to progressive bufferLeo Liu2018-04-121-0/+2
| | | | | Signed-off-by: Leo Liu <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: cap VP9 support to RavenLeo Liu2018-04-121-0/+4
| | | | | Signed-off-by: Leo Liu <[email protected]> Acked-by: Christian König <[email protected]>
* radeon/vcn: add VP9 context bufferLeo Liu2018-04-121-0/+26
| | | | | Signed-off-by: Leo Liu <[email protected]> Acked-by: Christian König <[email protected]>
* radeon/vcn: get VP9 msg bufferLeo Liu2018-04-122-1/+176
| | | | | Signed-off-by: Leo Liu <[email protected]> Acked-by: Christian König <[email protected]>
* radeon/vcn: fill probability table to prob buffersLeo Liu2018-04-121-0/+38
| | | | | Signed-off-by: Leo Liu <[email protected]> Acked-by: Christian König <[email protected]>
* radeon/vcn: add VP9 message buffer interfaceLeo Liu2018-04-121-0/+134
| | | | | Signed-off-by: Leo Liu <[email protected]> Acked-by: Christian König <[email protected]>
* radeon/vcn: add VP9 prob table bufferLeo Liu2018-04-122-18/+37
| | | | | Signed-off-by: Leo Liu <[email protected]> Acked-by: Christian König <[email protected]>
* radeon/vcn: add VP9 dpb buffer sizeLeo Liu2018-04-121-0/+6
| | | | | | | | | The current FW has restricted the size to the worse case, and the new dynamic dpb buffer support is on the way from firmware side, we will change accordingly. Signed-off-by: Leo Liu <[email protected]> Acked-by: Christian König <[email protected]>
* radeon/vcn: add VP9 stream type for decoderLeo Liu2018-04-122-0/+4
| | | | | Signed-off-by: Leo Liu <[email protected]> Acked-by: Christian König <[email protected]>
* radeonsi: correctly parse disassembly with labelsNicolai Hähnle2018-04-111-31/+32
| | | | | | | | | | LLVM now emits labels as part of the disassembly string, which is very useful but breaks the old parsing approach. Use the semicolon to detect the boundary of instructions instead of going by line breaks. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: pass -O halt_waves to umr for hang debuggingNicolai Hähnle2018-04-111-2/+2
| | | | | | | | | | | This will give us meaningful wave information in the case of a hang where shaders are still running in an infinite loop. Note that we call umr multiple times for different sections of the ddebug hang dump, and so the wave information will not necessarily match up between sections. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add shader binary padding for UMRMarek Olšák2018-04-101-3/+15
|
* radeonsi: autotools: add si_build_pm4.h in dist tarballJuan A. Suarez Romero2018-04-101-0/+1
| | | | | | | | Fixes: 5777488406c ("radeonsi: move r600_cs.h contents into si_pipe.h, si_build_pm4.h") Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* radeonsi/nir: tidy up si_nir_load_sampler_desc()Timothy Arceri2018-04-101-5/+3
| | | | | | | | This makes it easier to follow the code, and also initialises dynamic_index which will be useful for adding bindless textures support. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: set uses_bindless_images for imagesTimothy Arceri2018-04-101-1/+16
| | | | | | V2: add missing intrinsics (Spotted-by: Samuel Pitoiset) Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/nir: don't add bindless samplers/images to declared bitmasksTimothy Arceri2018-04-101-6/+6
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: convert dispatch packet to little endianBas Vermeulen2018-04-091-12/+12
| | | | | | | | | | | | | | | The parameters for the compute engine are wrong when using an E8860 on a big endian machine. To fix this, convert the contents of struct dispatch_packet to little endian. This ensures that get_global_id(0) and similar functions in the OpenCL code get the correct endian values, and makes my simple OpenCL program work correctly. Signed-off-by: Bas Vermeulen <[email protected]> Signed-off-by: Marek Olšák <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: correct si_vgt_param_key on big endian machinesBas Vermeulen2018-04-091-0/+13
| | | | | | | | | | | | Using mesa OpenCL failed on a big endian PowerPC machine because si_vgt_param_key is using bitfields and a 32 bit int for an index into an array. Fix si_vgt_param_key to work correctly on both little endian and big endian machines. Signed-off-by: Bas Vermeulen <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* radeonsi: don't set RB+ registers on GFX9 chips without RB+Marek Olšák2018-04-091-6/+1
| | | | | | CLEAR_STATE initializes them properly. Reviewed-by: Samuel Pitoiset <[email protected]>
* etnaviv: meson: add etnaviv_query_pm.[ch] to the sourcesEmil Velikov2018-04-091-0/+2
| | | | | | | | | | | Otherwise building the driver will fail with unresolved symbols. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105960 Fixes: 72d2043be06 ("etnaviv: add perfmon query implementation") Cc: Christian Gmeiner <[email protected]> Cc: Clayton Craft <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: expose perfmon query groupsChristian Gmeiner2018-04-081-2/+6
| | | | | Signed-off-by: Christian Gmeiner <[email protected]> Tested-by: Chris Healy <[email protected]>
* etnaviv: add query_group_info for perfmon countersChristian Gmeiner2018-04-082-0/+50
| | | | | Signed-off-by: Christian Gmeiner <[email protected]> Tested-by: Chris Healy <[email protected]>
* etnaviv: assign group_ids to perfmon queriesChristian Gmeiner2018-04-082-1/+56
| | | | | | | Prep work for AMD_performance_monitor support. Signed-off-by: Christian Gmeiner <[email protected]> Tested-by: Chris Healy <[email protected]>
* etnaviv: support MC performance countersChristian Gmeiner2018-04-082-0/+25
| | | | | Signed-off-by: Christian Gmeiner <[email protected]> Tested-by: Chris Healy <[email protected]>
* etnaviv: support TX performance countersChristian Gmeiner2018-04-082-0/+73
| | | | | Signed-off-by: Christian Gmeiner <[email protected]> Tested-by: Chris Healy <[email protected]>
* etnaviv: support RA performance countersChristian Gmeiner2018-04-082-0/+57
| | | | | Signed-off-by: Christian Gmeiner <[email protected]> Tested-by: Chris Healy <[email protected]>