aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/nouveau/nvc0
Commit message (Collapse)AuthorAgeFilesLines
* gallium/drivers/nouveau: Make use of ARRAY_SIZE macroEdward O'Callaghan2015-12-066-9/+8
| | | | | Signed-off-by: Edward O'Callaghan <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* nvc0: expose a group of performance metrics for SM30 (Kepler)Samuel Pitoiset2015-12-052-2/+8
| | | | | | | This allows to monitor these performance metrics through GL_AMD_performance_monitor. Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: re-introduce performance metrics for SM30 (Kepler)Samuel Pitoiset2015-12-052-5/+188
| | | | | | | This implements more performance metrics than the previous support, but some other metrics still need to be figured out. Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: remove useless counting operations for MP countersSamuel Pitoiset2015-12-051-101/+5
| | | | | | Those bits were related to old performance metrics support. Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: remove old performance metrics support on KeplerSamuel Pitoiset2015-12-052-37/+0
| | | | | | | These performance metrics will be re-introduced in an upcoming patch that will follow the same design as Fermi. Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: remove wrong inst_issued HW SM perf counter on KeplerSamuel Pitoiset2015-12-052-3/+0
| | | | | | | inst_issued is performance metric not a hardware event on Kepler (SM30). It will be re-introduced in an upcoming patch. Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: add missing HW SM perf counters for SM30 (Kepler)Samuel Pitoiset2015-12-053-0/+10
| | | | | | | SM30 is the compute capability version for GK104/GK106/GK107. This also introduces a new signal group selection called UNK0F. Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: fix the comment that describe MP counters storage on KeplerSamuel Pitoiset2015-12-051-0/+5
| | | | Signed-off-by: Samuel Pitoiset <[email protected]>
* nv50,nvc0: allow to create resources other than buffersSamuel Pitoiset2015-12-012-2/+4
| | | | | | | | | | | | For the compute support, we might stick buffers as surfaces. This fixes an assertion when executing src/gallium/tests/trivial/compute. To avoid using these "restricted" surfaces as render targets, these assertions have been moved. Note that it's already handled for the framebuffer thing on nvc0. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: properly handle buffer storage invalidation on dsa bufferIlia Mirkin2015-11-221-8/+9
| | | | | | | | | In case that the buffer has no bind at all, assume it can be a regular buffer. This can happen on buffers created through the ARB_dsa interfaces. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.0 11.1" <[email protected]>
* gallium: add the concept of batch queriesNicolai Hähnle2015-11-201-0/+1
| | | | | | | | | | | | | | | | | Some drivers (in particular radeon[si], but also freedreno judging from a quick grep) may want to expose performance counters that cannot be individually enabled or disabled. Allow such drivers to mark driver-specific queries as requiring a new type of batch query object that is used to start and stop a list of queries simultaneously. v3: adjust recently added nv50 queries v2: documentation for create_batch_query Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Samuel Pitoiset <[email protected]>
* gallium: remove pipe_driver_query_group_info field typeNicolai Hähnle2015-11-201-4/+0
| | | | | | | | | | | | | | | | | | | This was only used to implement an unnecessarily restrictive interpretation of the spec of AMD_performance_monitor. The spec says A performance monitor consists of a number of hardware and software counters that can be sampled by the GPU and reported back to the application. I guess one could take this as a requirement that counters _must_ be sampled by the GPU, but then why are they called _software_ counters? Besides, there's not much reason _not_ to expose all counters that are available, and this simplifies the code. v3: add a missing change in the nouveau driver (thanks Samuel Pitoiset) Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Samuel Pitoiset <[email protected]>
* nv50,nvc0: disable render condition around clear_* functionsIlia Mirkin2015-11-142-0/+13
| | | | | | | Only the regular "clear" call is supposed to respect the render condition. The rest should ignore it. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: reduce the number of GPR used when reading MP perf countersSamuel Pitoiset2015-11-141-1/+2
| | | | | | | No need to allocate more GPR than used in the compute kernel which reads MP performance counters on Fermi. Signed-off-by: Samuel Pitoiset <[email protected]>
* nv50,nvc0: add ARB_clear_texture supportIlia Mirkin2015-11-112-1/+6
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* gallium: add PIPE_CAP_CLEAR_TEXTURE and clear_texture prototypeIlia Mirkin2015-11-111-0/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nvc0: enable compute support on FermiSamuel Pitoiset2015-11-081-2/+2
| | | | | | | | | | | Altough the compute support is still not complete because textures and surfaces need to be implemented, it allows to launch very simple compute kernel like one which reads reading MP performance counters. This turns on PIPE_CAP_COMPUTE and PIPE_SHADER_COMPUTE. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: reintroduce BGRA4 format supportIlia Mirkin2015-11-061-1/+1
| | | | | | | | | | | | | | Commit 342e68dc60 (nvc0: remove BGRA4 format support) removed the support to fix a WoW trace. However after further experimentation, I was able to get the blit to work by using a different "fake" format in the 2d engine. The reason why this worked on nv50 is that nv50 falls back to the 3d blit path in case either the src or the dst aren't "faithfully" supported, while nvc0 only does it for the dst format. RG8 is better supported by the nvc0 2d engine than R16. Signed-off-by: Ilia Mirkin <[email protected]>
* nouveau: send back a debug message when waiting for a fence to completeIlia Mirkin2015-11-052-3/+3
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: provide debug messages with shader compilation statsIlia Mirkin2015-11-055-5/+13
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nouveau: add support for sending debug messages via KHR_debugIlia Mirkin2015-11-051-0/+1
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nouveau: relax fence emit space assertIlia Mirkin2015-11-041-1/+1
| | | | | | | | | We also have the "reserved for kick" space available. Some of my earlier changes can probably be removed, but this is a quick fix for some of the rarer fallout. Signed-off-by: Ilia Mirkin <[email protected]> Cc: <[email protected]>
* nvc0: add missing compute parameters required by cloverSamuel Pitoiset2015-11-031-1/+10
| | | | | | | This fixes crashes with some piglit OpenCL tests. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: handle NULL pointer in nvc0_get_compute_param()Samuel Pitoiset2015-11-031-24/+21
| | | | | | | | | To get the size (in bytes) of a compute parameter, clover first calls get_compute_param() with a NULL data pointer. The RET() macro is based on nv50. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nouveau: get rid of tabsIlia Mirkin2015-10-312-75/+75
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: expose a group of performance metrics on FermiSamuel Pitoiset2015-10-293-3/+16
| | | | | | | This allows to monitor those performance metrics through GL_AMD_performance_monitor. Signed-off-by: Samuel Pitoiset <[email protected]>
* nv50/ir: adapt to new method for passing in cull/clip distance masksIlia Mirkin2015-10-291-6/+4
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: share shaders between contexts and build immediatelyIlia Mirkin2015-10-293-1/+7
| | | | | | | Avoid deferring building shaders until draw time, should hopefully reduce any stuttering, as well as enable shader-db style analysis. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: do upload-time fixups for interpolation parametersIlia Mirkin2015-10-298-10/+82
| | | | | | | | | | | | | | | | | Unfortunately flatshading is an all-or-nothing proposition on nvc0, while GL 3.0 calls for the ability to selectively specify explicit interpolation parameters on gl_Color/gl_SecondaryColor which would override the flatshading setting. This allows us to fix up the interpolation settings after shader generation based on rasterizer settings. While we're at it, we can add support for dynamically forcing all (non-flat) shader inputs to be interpolated per-sample, which allows st/mesa to not generate variants for these. Fixes the remaining failing glsl-1.30/execution/interpolation piglits. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: add ARB_copy_image supportIlia Mirkin2015-10-282-7/+11
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: fix crash when nv50_miptree_from_handle failsJulien Isorce2015-10-281-1/+2
| | | | | Signed-off-by: Julien Isorce <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* gallium: add PIPE_CAP_COPY_BETWEEN_COMPRESSED_AND_PLAIN_FORMATSMarek Olšák2015-10-281-0/+1
| | | | | | For ARB_copy_image. Reviewed-by: Brian Paul <[email protected]>
* nvc0: respect edgeflag attribute widthIlia Mirkin2015-10-231-7/+33
| | | | | | | | | | | | The edgeflag comes in as ubyte with glEdgeFlagPointer but as float with plain immediate glEdgeFlag. Avoid reading bytes that weren't meant for the edgeflag in the pointer case. Fixes intermittent failures with gl-2.0-edgeflag piglit (and valgrind complaints about reading uninitialized memory). Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* gallium: add PIPE_SHADER_CAP_MAX_UNROLL_ITERATIONS_HINTMarek Olšák2015-10-201-0/+2
| | | | | | | | | | | | | | This avoids a serious r600g bug leading to a GPU hang. The chances this bug will get fixed are pretty low now. I deeply regret listening to others and not pushing this patch, leaving other users with a GPU-crashing driver. Yes, it should be fixed in the compiler and it's ugly, but users couldn't care less about that. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86720 Cc: 11.0 10.6 <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallium: add PIPE_CAP_SHAREABLE_SHADERSMarek Olšák2015-10-201-0/+1
| | | | | | I'll let drivers figure out how to do it. Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: do not bind input params at compute state init on FermiSamuel Pitoiset2015-10-181-8/+0
| | | | | | | | | | | | | | | | | | It looks like binding a constant buffer on compute overwrites the 3D state. To avoid that, we already re-bind all the 3D constant buffers after launching a compute grid but this is not enough. Binding the constant buffer of input parameters for the compute state at initialization corrupts the 3D constant buffers, and it's just useless to bind it because this is not needed until we really launch a grid. This fixes some piglit regressions related to interpolation tests introduced in "nvc0: enable compute support by default on Fermi". Fixes: 00d6186 (nvc0: enable compute support by default on Fermi) Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: add support for performance monitoring metrics on FermiSamuel Pitoiset2015-10-173-3/+498
| | | | | | | | | As explained in the CUDA toolkit documentation, "a metric is a characteristic of an application that is calculated from one or more event values." Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: add a note about MP counters on GF100/GF110Samuel Pitoiset2015-10-161-0/+5
| | | | | | | | | MP counters on GF100/GF110 (compute capability 2.0) are buggy because there is a context-switch problem that we need to fix. Results might be wrong sometimes, be careful! Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: add MP counters variants for GF100/GF110Samuel Pitoiset2015-10-162-77/+483
| | | | | | | | | GF100 and GF110 chipsets are compute capability 2.0, while the other Fermi chipsets are compute capability 2.1. That's why, some MP counters are different between these chipsets and we need to handle variants. Signed-off-by: Samuel Pitoiet <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: move SW/HW queries info to their respective filesSamuel Pitoiset2015-10-167-178/+228
| | | | | | | This will help for handling HW SM queries variants on Fermi. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: enable compute support by default on FermiSamuel Pitoiset2015-10-162-8/+2
| | | | | | | | | | Compute support was not enabled by default because weird effects on 3D state happened, but I can't reproduce them anymore. This also enables MP performance counters by default on Fermi. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: allow only one active query for the MP counters groupSamuel Pitoiset2015-10-161-11/+9
| | | | | | | | | | | | | | Because we can't expose the number of hardware counters needed for each different query, we don't want to allow more than one active query simultaneously to avoid failure when the maximum number of counters is reached. Note that these groups of GPU counters are currently only used by AMD_performance_monitor. Like for Kepler, this limits the maximum number of active queries to 1 on Fermi. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: read MP counters of all GPCs on FermiSamuel Pitoiset2015-10-161-1/+1
| | | | | | | | | | | | When a card has more than one GPC, the grid used by the compute kernel which reads MP performance counters seems to be too small. The consequence is that the kernel is not launched on all TPCs. Increasing the grid size using the number of GPCs now launches enough blocks and we can read MP performance counters of all TPCs. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: store the number of GPCs to nvc0_screenSamuel Pitoiset2015-10-162-0/+2
| | | | | | | | | | | | NOUVEAU_GETPARAM_GRAPH_UNITS param returns the number of GPCs, the total number of TPCs and the number of ROP units. Note that when the DRM version is too old the default number of GPCs is fixed to 4. This will be used to launch the compute kernel which is used to read MP performance counters over all GPCs. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: fix unaligned mem access when reading MP counters on FermiSamuel Pitoiset2015-10-161-6/+12
| | | | | | | | | | | | | Memory access have to be aligned to 128-bits. Note that this doesn't happen when the card only has TPC. This patch fixes the following dmesg fail: gr: GPC0/TPC1/MP trap: global 00000004 [MULTIPLE_WARP_ERRORS] warp 000f [UNALIGNED_MEM_ACCESS] Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: fix monitoring multiple MP counters queries on FermiSamuel Pitoiset2015-10-161-76/+87
| | | | | | | | | For strange reasons, the signal id depends on the slot selected on Fermi but not on Kepler. Fortunately, the signal ids are just offseted by the slot id! Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: fix queries which use multiple MP counters on FermiSamuel Pitoiset2015-10-161-47/+81
| | | | | | | | | | | | | | Queries which use more than one MP counters was misconfigured and computing the final result was also wrong because sources need to be configured on different hardware counters instead. According to the blob, computing the result is now as follows: FOR i..n val += ctr[i] * pow(2, i) Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: allow to use 8 MP counters on FermiSamuel Pitoiset2015-10-162-19/+13
| | | | | | | | On Fermi, we have one domain of 8 MP counters while we have two domains of 4 MP counters on Kepler. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: fix sequence field init for MP counters on FermiSamuel Pitoiset2015-10-161-2/+4
| | | | | | | | Sequence fields are located at MP[i] + 0x20 in the buffer object. This is used to check if result is available for MP[i]. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: correctly enable the MP counters' multiplexer on FermiSamuel Pitoiset2015-10-161-4/+1
| | | | | | | | | Writing 0x408000 to 0x419e00 (like on Kepler) has no effect on Fermi because we only have one domain of 8 counters. Instead, we have to write 0x80000000. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>