aboutsummaryrefslogtreecommitdiffstats
path: root/src/intel/perf/gen_perf.c
Commit message (Collapse)AuthorAgeFilesLines
* intel/perf: Move perf query register programming to static tables.Eric Anholt2020-07-171-6/+6
| | | | | | | | | | | And now that they're static tables, we don't need to ralloc a copy in non-shared memory. Saves ~210k in the built intel drivers. Bug: https://bugs.chromium.org/p/chromium/issues/detail?id=1048434 Reviewed-by: Lionel Landwerlin <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5829>
* intel/perf: Fix unused var warning in release builds.Eric Anholt2020-07-171-1/+1
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5829>
* intel/perf: move query_mask and location out of gen_perf_query_counterMarcin Ślusarz2020-07-061-26/+39
| | | | | | | Signed-off-by: Marcin Ślusarz <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Mark Janes <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5399>
* iris: remove iris_monitor_configMarcin Ślusarz2020-07-061-0/+2
| | | | | | | | | | | | | | | | | | | perf_cfg is enough - it already contains almost all necessary information and is constructed in a more optimal way (O(n) vs O(n^2) - it uses hash table to build the unique counter list). "Almost all", because it doesn't contain OA raw counters, but we should have not exposed them anyway. Quoting Mark Janes: "I see no reason to include the OA raw counters in the list that are provided to the user. They are unusable. The MDAPI library can be used to configure raw counters in a way that provides esoteric metrics, but that library is written against INTEL_performance_query." Signed-off-by: Marcin Ślusarz <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Mark Janes <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5399>
* anv: disable i915_perf warning on non-LinuxJan Beich2020-06-301-2/+2
| | | | | | | | $ vkcube INTEL-MESA: warning: Performance support disabled, consider sysctl dev.i915.perf_stream_paranoid=0 Reviewed-by: Lionel Landwerlin <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5461>
* intel/perf: repurpose INTEL_DEBUG=no-oaconfigLionel Landwerlin2020-05-201-9/+30
| | | | | | | | | | | | | We initially used this debug option to mean "don't bother registering the OA configuration into the kernel". This change makes this option suppress any interaction with the i915/perf interface. This is useful when debugging self modifying batches with performance queries while running on the intel_mi_runner. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2775>
* intel/perf: reuse offset specified in the queryLionel Landwerlin2020-05-201-12/+28
| | | | | | | | | | | | | The current code relies on the order of the function gen_perf_query_result_accumulate() to match the descriptions written by gen_perf.py. Let's just reuse the offset specified in the python script. v2: Use accumlator offsets more (Jason) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2775>
* intel/perf: report whether the platform supportedLionel Landwerlin2020-05-201-0/+2
| | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2775>
* intel/perf: add helper to compute metrics from countersLionel Landwerlin2020-05-201-0/+24
| | | | | | | | The produced array tells use what metric to enable for a given pass. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2775>
* intel/perf: compute number of passes for a set of countersLionel Landwerlin2020-05-201-0/+53
| | | | | | | | | We want to compute the number of passes required to gather performance data about a set of counters. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2775>
* intel/perf: create a unique list of countersLionel Landwerlin2020-05-201-1/+58
| | | | | | | | | | | | | | For a future extension we want to be able to list the counters. Our existing sets counters might contain the same counters multiple times. This is a side effect of the fixed OA counters in the HW. We track thoses with a mask so that we know when a counter is available from multiple metrics. v2: Use BITFIELD64_BIT() (Jason) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2775>
* intel/perf: store the appropriate OA formats in queriesLionel Landwerlin2020-05-201-7/+13
| | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2775>
* intel/perf: make pipeline statistic query loading optionalLionel Landwerlin2020-05-201-3/+6
| | | | | | | | | On Vulkan most of those are already covered by standard queries so add the ability to skip them. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2775>
* intel/perf: specify sseu configuration when supportedLionel Landwerlin2020-04-231-5/+0
| | | | | | | | | | | | | | Because of functional requirements for Gen11, when perf is enabled we only power half the EU array. This change forces it to enable everything. Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Tapani Pälli <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]> Reviewed-by: Mark Janes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4021>
* intel/perf: store default sseu configurationLionel Landwerlin2020-04-231-0/+15
| | | | | | | | | | | | This is the powergating configuration of the EU array. The default is everything powered. Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Tapani Pälli <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]> Reviewed-by: Mark Janes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4021>
* intel/perf: store the probed i915-perf versionLionel Landwerlin2020-03-271-0/+18
| | | | | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Tapani Pälli <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]> Reviewed-by: Mark Janes <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
* intel/perf: move mdapi query definitions to their own fileLionel Landwerlin2020-03-271-345/+68
| | | | | | | | | | | Where they belong. Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Acked-by: Tapani Pälli <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]> Reviewed-by: Mark Janes <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
* intel/perf: break GL query stuff awayLionel Landwerlin2020-03-271-1549/+6
| | | | | | | | | | | | | | This stuff is somewhat specific to the GL extension & drivers. On Vulkan we won't use this, it also made a rather large file. v2: Fix Android build (Lionel) Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Acked-by: Tapani Pälli <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]> Reviewed-by: Mark Janes <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
* intel/perf: move register definition to special fileLionel Landwerlin2020-03-271-19/+0
| | | | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Tapani Pälli <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]> Reviewed-by: Mark Janes <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
* util/hash_table: added hash functions for integer typesAnthony Pesch2020-01-231-1/+1
| | | | | | | | | | | | | | | | | A few hash_table users roll their own integer hash functions which call _mesa_hash_data to perform the hashing which ultimately calls into XXH32 with a dynamic key length. When using small keys with a constant size the hash rate can be greatly improved by inlining XXH32 and providing it a constant key length, see: https://fastcompression.blogspot.com/2018/03/xxhash-for-small-keys-impressive-power.html Additionally, this patch removes calls to _mesa_key_hash_string and makes them instead call _mesa_has_string directly, matching the new integer hash functions. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>
* intel/perf: adapt to platforms like Solaris without d_type in struct direntAlan Coopersmith2020-01-221-5/+20
| | | | | | | | | | | | Signed-off-by: Alan Coopersmith <[email protected]> [Eric: factor out the is_dir_or_link() check and fix a bug in v1] Signed-off-by: Eric Engestrom <[email protected]> v3: include directory path when lstat'ing files v4: fix inverted check in enumerate_sysfs_metrics() Reviewed-by: Eric Engestrom <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2258> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2258>
* intel/perf: report query split for mdapiLionel Landwerlin2020-01-161-0/+6
| | | | | | | | | | | | | | | Also forgotten in the initial implementation. v2: Report begin timestamp scaled by the timestamp frequency (Windows behavior) v3: Rename split to disjoint to match GL terminology (Tapani) Signed-off-by: Lionel Landwerlin <[email protected]> Cc: <[email protected]> Acked-by: Tapani Pälli <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3112> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3112>
* intel/perf: expose timestamp begin for mdapiLionel Landwerlin2020-01-161-0/+2
| | | | | | | | | | | This was forgotten in the initial implementation. v2: ensure the value is written for both GL & Vulkan queries Signed-off-by: Lionel Landwerlin <[email protected]> Cc: <[email protected]> Acked-by: Tapani Pälli <[email protected]> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3112>
* i965/iris/perf: factor out frequency register captureLionel Landwerlin2019-12-181-11/+23
| | | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Mark Janes <[email protected]> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3113> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3113>
* i965/iris: perf-queries: don't invalidate/flush 3d pipelineLionel Landwerlin2019-12-131-11/+5
| | | | | | | | | | | | | | | | | | | Our current implementation of performance queries is fairly harsh because it completely flushes and invalidates the 3d pipeline caches at the beginning and end of each query. An argument can be made that this is how performance should be measured but it probably doesn't reflect what the application is actually doing and the actual cost of draw calls. A more appropriate approach is to just stall the pipeline at scoreboard, so that we measure the effect of a draw call without having the pipeline in a completely pristine state for every draw call. v2: Use end of pipe PIPE_CONTROL instruction for Iris (Ken) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: drop batchbuffer flushing at query beginLionel Landwerlin2019-12-131-8/+0
| | | | | | | | | | | | This was initially intended to fix issues with the query timings going occassionally high. It turns out there was a bug in the attribution of OA reports to our context when parsing the OA data. This led to reports flagged with other context IDs to be included in our queries results. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: fix improper pointer accessLionel Landwerlin2019-12-041-1/+1
| | | | | | | | | | This expression was unused by the macro, probably why it didn't register in the compilation. Signed-off-by: Lionel Landwerlin <[email protected]> Cc: <[email protected]> Reviewed-by: Mark Janes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: simplify the processing of OA reportsLionel Landwerlin2019-12-041-28/+36
| | | | | | | | | | | | | | | | | | | | | | | | This is a more accurate description of what happens in processing the OA reports. Previously we only had a somewhat difficult to parse state machine tracking the context ID. What we really only need to do to decide if the delta between 2 reports (r0 & r1) should be accumulated in the query result is : * whether the r0 is tagged with the context ID relevant to us * if r0 is not tagged with our context ID and r1 is: does r0 have a invalid context id? If not then we're in a case where i915 has resubmitted the same context for execution through the execlist submission port v2: Update comment (Ken) Signed-off-by: Lionel Landwerlin <[email protected]> Cc: <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: take into account that reports read can be fairly oldLionel Landwerlin2019-12-041-3/+4
| | | | | | | | | | | | | | | If we read the OA reports late enough after the query happens, we can get a timestamp in the report that is significantly in the past compared to the start timestamp of the query. The current code must deal with the wraparound of the timestamp value (every ~6 minute). So consider that if the difference is greater than half that wraparound period, we're probably dealing with an old report and make the caller aware it should read more reports when they're available. Signed-off-by: Lionel Landwerlin <[email protected]> Cc: <[email protected]> Reviewed-by: Mark Janes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: set read buffer len to 0 to identify empty bufferLionel Landwerlin2019-12-041-2/+3
| | | | | | | | | | | | | | | | | We always add an empty buffer in the list when creating the query. Let's set the len appropriately so that we can recognize it when we read OA reports up to the end of a query. We were using an 0 timestamp value associated with the empty buffer and incorrectly assuming this was a valid value. In turn that led to not reading enough reports and resulted in deltas added to our counter values which should have been discarded because those would be flagged for a different context. Signed-off-by: Lionel Landwerlin <[email protected]> Cc: <[email protected]> Reviewed-by: Mark Janes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: fix invalid hw_id in query resultsLionel Landwerlin2019-12-041-2/+6
| | | | | | | | | | | | Accumulation happens between 2 reports, it can be between a start/end report from another context. So only consider updating the hw_id of the results when it's not already valid and that we have a valid value to put in there. Signed-off-by: Lionel Landwerlin <[email protected]> Fixes: 41b54b5faf ("i965: move OA accumulation code to intel/perf") Reviewed-by: Mark Janes <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: add EHL performance query supportLionel Landwerlin2019-11-151-1/+4
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Rafael Antognolli <[email protected]>
* intel/perf: add TGL supportLionel Landwerlin2019-10-311-0/+10
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* intel/perf: add support for querying kernel loaded configurationsLionel Landwerlin2019-10-231-25/+163
| | | | | | | We use this as a communication mechanism between MDAPI & Anv. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/perf: move registers to their own headerLionel Landwerlin2019-10-231-0/+1
| | | | | | | Will conflict with the genxml RPSTAT register. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/perf: extract register configurationLionel Landwerlin2019-10-231-6/+6
| | | | | | | | We want to query the content of register configurations from the kernel. Let's pull this out of the query. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/perf: expose some utility functionsLionel Landwerlin2019-10-231-28/+30
| | | | | | | | | The Vulkan performance query extension is a bit lower level than the GL one. Expose some of the functions to do the result accumulation directly in the Anv driver. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/perf: fix debug typoMark Janes2019-08-071-5/+5
| | | | | | Misspelling was seen with INTEL_DEBUG=perfmon. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: make gen_perf_query_object privateMark Janes2019-08-071-0/+78
| | | | | | Encapsulate the details of this structure within the perf implemenation. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: make perf context privateMark Janes2019-08-071-0/+102
| | | | | | Encapsulate the details of this data structure. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: print debug informationMark Janes2019-08-071-0/+30
| | | | | | | | | INTEL_DEBUG=perfmon will iterate over the perf queries, printing information about the state of each query. Some of this information will be private to intel/perf, and needs to a dump routine that can be called from i965. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: make internal methods privateMark Janes2019-08-071-62/+62
| | | | | | | Now that all references from i965 have been moved to perf, we can make internal methods private again. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: make oa_sample_buffers privateMark Janes2019-08-071-0/+120
| | | | | | | All references to this data structure have been moved inside the perf subsystem. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: expose method to create queryMark Janes2019-08-071-0/+17
| | | | | | | By encapsulating this implementation within perf, we can eventually make struct gen_perf_ctx private. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move initialization of pipeline statistics metrics to gen_perfMark Janes2019-08-071-66/+216
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move get_query_data into gen_perfMark Janes2019-08-071-0/+359
| | | | | | | | | | | | | | | This refactor moves several helper functions for get_query_data as well: - accumulate_oa_reports - read_gt_frequency - get_pipeline_stats_data - get_oa_counter_data Functions which are no longer referenced in brw_performance_query.c have been removed. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move delete_query to gen_perfMark Janes2019-08-071-0/+91
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move is_query_ready to gen_perfMark Janes2019-08-071-0/+28
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move wait_query to perfMark Janes2019-08-071-0/+164
| | | | | | | | | | | | | The following methods have duplicate implementation of read_oa_samples_until in brw_performance_query.c: - read_oa_samples_for_query - read_oa_samples_until They ar still referenced by other methods in the file and will be removed on the subsequent commit. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: refactor gen_perf_end_query into gen_perfMark Janes2019-08-071-0/+54
| | | | Reviewed-by: Kenneth Graunke <[email protected]>