| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Acked-by: Tapani Pälli <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
Reviewed-by: Mark Janes <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
|
|
|
|
|
|
|
|
|
|
|
| |
Where they belong.
Signed-off-by: Lionel Landwerlin <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
Acked-by: Tapani Pälli <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
Reviewed-by: Mark Janes <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This stuff is somewhat specific to the GL extension & drivers. On
Vulkan we won't use this, it also made a rather large file.
v2: Fix Android build (Lionel)
Signed-off-by: Lionel Landwerlin <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
Acked-by: Tapani Pälli <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
Reviewed-by: Mark Janes <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
|
|
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Acked-by: Tapani Pälli <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
Reviewed-by: Mark Janes <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A few hash_table users roll their own integer hash functions which
call _mesa_hash_data to perform the hashing which ultimately calls
into XXH32 with a dynamic key length. When using small keys with a
constant size the hash rate can be greatly improved by inlining
XXH32 and providing it a constant key length, see:
https://fastcompression.blogspot.com/2018/03/xxhash-for-small-keys-impressive-power.html
Additionally, this patch removes calls to _mesa_key_hash_string and
makes them instead call _mesa_has_string directly, matching the new
integer hash functions.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Signed-off-by: Alan Coopersmith <[email protected]>
[Eric: factor out the is_dir_or_link() check and fix a bug in v1]
Signed-off-by: Eric Engestrom <[email protected]>
v3: include directory path when lstat'ing files
v4: fix inverted check in enumerate_sysfs_metrics()
Reviewed-by: Eric Engestrom <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2258>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2258>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also forgotten in the initial implementation.
v2: Report begin timestamp scaled by the timestamp frequency (Windows
behavior)
v3: Rename split to disjoint to match GL terminology (Tapani)
Signed-off-by: Lionel Landwerlin <[email protected]>
Cc: <[email protected]>
Acked-by: Tapani Pälli <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3112>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3112>
|
|
|
|
|
|
|
|
|
|
|
| |
This was forgotten in the initial implementation.
v2: ensure the value is written for both GL & Vulkan queries
Signed-off-by: Lionel Landwerlin <[email protected]>
Cc: <[email protected]>
Acked-by: Tapani Pälli <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3112>
|
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Mark Janes <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3113>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3113>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Our current implementation of performance queries is fairly harsh
because it completely flushes and invalidates the 3d pipeline caches
at the beginning and end of each query. An argument can be made that
this is how performance should be measured but it probably doesn't
reflect what the application is actually doing and the actual cost of
draw calls.
A more appropriate approach is to just stall the pipeline at
scoreboard, so that we measure the effect of a draw call without
having the pipeline in a completely pristine state for every draw
call.
v2: Use end of pipe PIPE_CONTROL instruction for Iris (Ken)
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was initially intended to fix issues with the query timings going
occassionally high.
It turns out there was a bug in the attribution of OA reports to our
context when parsing the OA data. This led to reports flagged with
other context IDs to be included in our queries results.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This expression was unused by the macro, probably why it didn't
register in the compilation.
Signed-off-by: Lionel Landwerlin <[email protected]>
Cc: <[email protected]>
Reviewed-by: Mark Janes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a more accurate description of what happens in processing the
OA reports.
Previously we only had a somewhat difficult to parse state machine
tracking the context ID.
What we really only need to do to decide if the delta between 2
reports (r0 & r1) should be accumulated in the query result is :
* whether the r0 is tagged with the context ID relevant to us
* if r0 is not tagged with our context ID and r1 is: does r0 have a
invalid context id? If not then we're in a case where i915 has
resubmitted the same context for execution through the execlist
submission port
v2: Update comment (Ken)
Signed-off-by: Lionel Landwerlin <[email protected]>
Cc: <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we read the OA reports late enough after the query happens, we can
get a timestamp in the report that is significantly in the past
compared to the start timestamp of the query. The current code must
deal with the wraparound of the timestamp value (every ~6 minute). So
consider that if the difference is greater than half that wraparound
period, we're probably dealing with an old report and make the caller
aware it should read more reports when they're available.
Signed-off-by: Lionel Landwerlin <[email protected]>
Cc: <[email protected]>
Reviewed-by: Mark Janes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We always add an empty buffer in the list when creating the query.
Let's set the len appropriately so that we can recognize it when we
read OA reports up to the end of a query.
We were using an 0 timestamp value associated with the empty buffer
and incorrectly assuming this was a valid value. In turn that led to
not reading enough reports and resulted in deltas added to our counter
values which should have been discarded because those would be flagged
for a different context.
Signed-off-by: Lionel Landwerlin <[email protected]>
Cc: <[email protected]>
Reviewed-by: Mark Janes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Accumulation happens between 2 reports, it can be between a start/end
report from another context. So only consider updating the hw_id of
the results when it's not already valid and that we have a valid value
to put in there.
Signed-off-by: Lionel Landwerlin <[email protected]>
Fixes: 41b54b5faf ("i965: move OA accumulation code to intel/perf")
Reviewed-by: Mark Janes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Acked-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
A few equations/programming changes for ICL.
v2: Fix a couple of issues in naming and floating/integer operations (Ken)
Signed-off-by: Lionel Landwerlin <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Those are not part of the OA reports.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
| |
We use this as a communication mechanism between MDAPI & Anv.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
| |
Will conflict with the genxml RPSTAT register.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
|
| |
We want to query the content of register configurations from the
kernel. Let's pull this out of the query.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The Vulkan performance query extension is a bit lower level than the
GL one. Expose some of the functions to do the result accumulation
directly in the Anv driver.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
| |
A simple utility to put the marker at the right location.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Eric Engestrom <[email protected]>
Fixes: 134e750e16bfc53480e0 ("i965: extract performance query metrics")
|
|
|
|
|
|
| |
Misspelling was seen with INTEL_DEBUG=perfmon.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Encapsulate the details of this structure within the perf implemenation.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Encapsulate the details of this data structure.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
INTEL_DEBUG=perfmon will iterate over the perf queries, printing
information about the state of each query. Some of this information
will be private to intel/perf, and needs to a dump routine that can be
called from i965.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Now that all references from i965 have been moved to perf, we can make
internal methods private again.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
All references to this data structure have been moved inside the perf
subsystem.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
By encapsulating this implementation within perf, we can eventually
make struct gen_perf_ctx private.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This refactor moves several helper functions for get_query_data as
well:
- accumulate_oa_reports
- read_gt_frequency
- get_pipeline_stats_data
- get_oa_counter_data
Functions which are no longer referenced in brw_performance_query.c
have been removed.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The following methods have duplicate implementation of read_oa_samples_until in
brw_performance_query.c:
- read_oa_samples_for_query
- read_oa_samples_until
They ar still referenced by other methods in the file and will be
removed on the subsequent commit.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Iris and i965 variants of this method need to be called by perf
routines.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Iris and i965 variants of this method need to be called by perf
routines.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Iris and i965 variants of this method need to be called by perf
routines.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
To move more operations into intel/perf, several state items are
needed. Save references to that state in the perf_ctxt, rather than
passing them in for every operation.
This commit includes an initializer for gen_perf_context, to set those
references and also encapsulate the initialization of the sample
buffer state.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
These operations are needed to refactor subsequent methods into perf
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
This method is needed to move subsequent methods into perf.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|