| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
0 is PIPE_QUERY_OCCLUSION_COUNTER, which is not what we want.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
just use the new scratch buffer.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
This reverts commit 7088b655e8828bb960f528dd33132de27c505b8f.
It breaks performance counters. If you use them with this commit, they hang
the machine hard. Sysrq and ssh don't work.
|
|
|
|
|
|
| |
This reverts commit 485ece83aceb3a35792efbc6fe2bca57ba46c04a.
It's needed to revert 7088b655e8828bb960f528dd33132de27c505b8f.
|
|
|
|
|
|
|
| |
Fixes: 7088b655e8 ("radeonsi: constify a bunch of the perfcounter structs.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100937
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
This moves the structs from the data segment to the rodata segment,
which seems like the more correct place for them.
Reviewed-by: Nicolai Hähnle <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is only needed for r600 which doesn't have ARB_query_buffer_object and
therefore wouldn't really need the fences, but let's be optimistic about
filling in this feature gap eventually.
Cc: Dieter Nützel <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
For bottom-of-pipe fences inside the gfx command stream.
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
This is useful for shader-related counters, since they tend to quickly
exceed 32 bits.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
I'm not sure about this. This will make the engines go idle, but the caches
will be unflushed. This should match app behavior without performance
counters, which can be a good thing.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Yet another change motivated by AMD GPUPerfStudio compatibility. These groups
are not directly accessible from userspace, and AMD GPUPerfStudio does not
actually query them - it just requires them to be there. Hence, adding
a placeholder for now.
Reviewed-by: Edward O'Callaghan <[email protected]>
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
This is yet another change motivated by appeasing AMD GPUPerfStudio's
hardcoding of performance counter group numbers.
Reviewed-by: Edward O'Callaghan <[email protected]>
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
As documented in the comment, AMD GPUPerfStudio unfortunately hardcodes the
order of performance counter groups. Let's do the pragmatic thing and present
the same order as Catalyst/Crimson.
Reviewed-by: Edward O'Callaghan <[email protected]>
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
The incorrectly computed register count caused lockups.
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
| |
This shaves a bit more time off the startup of programs that don't
actually use performance counters.
Reviewed-by: Marek Olšák <[email protected]>
|
|
Expose most of the performance counter groups that are exposed by Catalyst.
Ideally, the driver will work with GPUPerfStudio at some point, but we are not
quite there yet. In any case, this is the reason for grouping multiple
instances of hardware blocks in the way it is implemented.
The counters can also be shown using the Gallium HUD. If one is interested to
see how work is distributed across multiple shader engines, one can set the
environment variable RADEON_PC_SEPARATE_SE=1 to obtain finer-grained performance
counter groups.
Part of the implementation is in radeon because an implementation for
older hardware would largely follow along the same lines, but exposing
a different set of blocks which are programmed slightly differently.
Reviewed-by: Marek Olšák <[email protected]>
|