path: root/src/intel/perf
Commit message | Author | Age | Files | Lines
* intel/perf: document meaning of query field | Lionel Landwerlin | 2020-03-27 | 1 | -0/+1
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Tapani Pälli <[email protected]>
  Reviewed-by: Rafael Antognolli <[email protected]>
  Reviewed-by: Mark Janes <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
* intel/perf: move mdapi query definitions to their own file | Lionel Landwerlin | 2020-03-27 | 4 | -346/+387
  Where they belong.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Jason Ekstrand <[email protected]>
  Acked-by: Tapani Pälli <[email protected]>
  Reviewed-by: Rafael Antognolli <[email protected]>
  Reviewed-by: Mark Janes <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
* intel/perf: break GL query stuff away | Lionel Landwerlin | 2020-03-27 | 5 | -1595/+1679
  This stuff is somewhat specific to the GL extension & drivers. On Vulkan we won't use this, and it also made for a rather large file.
  v2: Fix Android build (Lionel)
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Jason Ekstrand <[email protected]>
  Acked-by: Tapani Pälli <[email protected]>
  Reviewed-by: Rafael Antognolli <[email protected]>
  Reviewed-by: Mark Janes <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
* intel/perf: move register definition to special file | Lionel Landwerlin | 2020-03-27 | 2 | -19/+8
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Tapani Pälli <[email protected]>
  Reviewed-by: Rafael Antognolli <[email protected]>
  Reviewed-by: Mark Janes <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4344>
* util/hash_table: added hash functions for integer types | Anthony Pesch | 2020-01-23 | 1 | -1/+1
  A few hash_table users roll their own integer hash functions which call _mesa_hash_data to perform the hashing, which ultimately calls into XXH32 with a dynamic key length. When using small keys with a constant size, the hash rate can be greatly improved by inlining XXH32 and providing it a constant key length, see: https://fastcompression.blogspot.com/2018/03/xxhash-for-small-keys-impressive-power.html
  Additionally, this patch removes calls to _mesa_key_hash_string and makes them instead call _mesa_hash_string directly, matching the new integer hash functions.
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Iago Toral Quiroga <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3475>
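The constant-key-length idea in the commit above can be illustrated with a minimal sketch. It assumes the small-input path of XXH32 as described in the public xxHash reference (the prime constants and rotation amounts below should be checked against `xxhash.h` before relying on them), and the names `sketch_hash_u32` and `sketch_rotl32` are hypothetical, not Mesa's functions. With the length fixed at 4 bytes, the loop over the input disappears and the compiler can inline the whole hash.

```c
#include <assert.h>
#include <stdint.h>

/* Prime constants as published with the xxHash reference code
 * (assumption: verify against xxhash.h). */
#define SKETCH_PRIME32_2 0x85EBCA77u
#define SKETCH_PRIME32_3 0xC2B2AE3Du
#define SKETCH_PRIME32_4 0x27D4EB2Fu
#define SKETCH_PRIME32_5 0x165667B1u

static inline uint32_t
sketch_rotl32(uint32_t x, unsigned r)
{
   /* A rotation by 0 or 32 would shift by the full width, which is undefined. */
   assert(r > 0 && r < 32);
   return (x << r) | (x >> (32 - r));
}

/* Hash one 32-bit key. The key length (4 bytes) is a compile-time constant,
 * so there is no length-dependent loop or branch left to execute. */
static inline uint32_t
sketch_hash_u32(uint32_t key)
{
   const uint32_t seed = 0;
   uint32_t h = seed + SKETCH_PRIME32_5 + 4u; /* 4 = constant key length */

   /* Single 4-byte round of the small-input path. */
   h += key * SKETCH_PRIME32_3;
   h = sketch_rotl32(h, 17) * SKETCH_PRIME32_4;

   /* Final avalanche. */
   h ^= h >> 15;
   h *= SKETCH_PRIME32_2;
   h ^= h >> 13;
   h *= SKETCH_PRIME32_3;
   h ^= h >> 16;
   return h;
}
```

Every step here is invertible on 32-bit values, so two distinct keys never collide before the table reduces the hash to a bucket index.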
* intel/perf: adapt to platforms like Solaris without d_type in struct dirent | Alan Coopersmith | 2020-01-22 | 1 | -5/+20
  Signed-off-by: Alan Coopersmith <[email protected]>
  [Eric: factor out the is_dir_or_link() check and fix a bug in v1]
  Signed-off-by: Eric Engestrom <[email protected]>
  v3: include directory path when lstat'ing files
  v4: fix inverted check in enumerate_sysfs_metrics()
  Reviewed-by: Eric Engestrom <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2258>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2258>
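On platforms where `struct dirent` has no `d_type` member, the entry type has to come from a `stat`-family call instead. The following is a minimal sketch of such a fallback using only POSIX `lstat`; the function name `sketch_is_dir_or_link` and the fixed buffer size are illustrative assumptions, not the code from the commit. As the v3 note says, the directory path must be joined with the entry name, because the bare name would resolve relative to the current working directory.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>

/* Return true if dir_path/entry_name is a directory or a symbolic link.
 * lstat() is used (not stat()) so that a symlink is reported as a link
 * rather than being followed to its target. */
static bool
sketch_is_dir_or_link(const char *dir_path, const char *entry_name)
{
   char full_path[4096]; /* assumed large enough for this sketch */
   struct stat st;

   assert(dir_path != NULL && entry_name != NULL);

   int n = snprintf(full_path, sizeof(full_path), "%s/%s", dir_path, entry_name);
   if (n < 0 || (size_t)n >= sizeof(full_path))
      return false; /* formatting error or truncated path */

   if (lstat(full_path, &st) != 0)
      return false; /* entry vanished or is not accessible */

   return S_ISDIR(st.st_mode) || S_ISLNK(st.st_mode);
}
```

A caller iterating with `readdir` would pass the directory it opened plus `entry->d_name`, and could keep a faster `d_type` check behind a build-time feature test where that member exists.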
* intel/perf: report query split for mdapi | Lionel Landwerlin | 2020-01-16 | 3 | -2/+18
  Also forgotten in the initial implementation.
  v2: Report begin timestamp scaled by the timestamp frequency (Windows behavior)
  v3: Rename split to disjoint to match GL terminology (Tapani)
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Cc: <[email protected]>
  Acked-by: Tapani Pälli <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3112>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3112>
* intel/perf: expose timestamp begin for mdapi | Lionel Landwerlin | 2020-01-16 | 3 | -0/+9
  This was forgotten in the initial implementation.
  v2: ensure the value is written for both GL & Vulkan queries
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Cc: <[email protected]>
  Acked-by: Tapani Pälli <[email protected]>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3112>
* i965/iris/perf: factor out frequency register capture | Lionel Landwerlin | 2019-12-18 | 2 | -14/+24
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Mark Janes <[email protected]>
  Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3113>
  Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3113>
* i965/iris: perf-queries: don't invalidate/flush 3d pipeline | Lionel Landwerlin | 2019-12-13 | 2 | -12/+6
  Our current implementation of performance queries is fairly harsh because it completely flushes and invalidates the 3d pipeline caches at the beginning and end of each query. An argument can be made that this is how performance should be measured, but it probably doesn't reflect what the application is actually doing or the actual cost of draw calls.
  A more appropriate approach is to just stall the pipeline at scoreboard, so that we measure the effect of a draw call without having the pipeline in a completely pristine state for every draw call.
  v2: Use end of pipe PIPE_CONTROL instruction for Iris (Ken)
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: drop batchbuffer flushing at query begin | Lionel Landwerlin | 2019-12-13 | 1 | -8/+0
  This was initially intended to fix issues with the query timings occasionally going high. It turns out there was a bug in the attribution of OA reports to our context when parsing the OA data. This led to reports flagged with other context IDs being included in our query results.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: fix improper pointer access | Lionel Landwerlin | 2019-12-04 | 1 | -1/+1
  This expression was unused by the macro, which is probably why it didn't register in the compilation.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Cc: <[email protected]>
  Reviewed-by: Mark Janes <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: simplify the processing of OA reports | Lionel Landwerlin | 2019-12-04 | 1 | -28/+36
  This is a more accurate description of what happens in processing the OA reports. Previously we only had a somewhat difficult to parse state machine tracking the context ID.
  What we really only need to do to decide if the delta between 2 reports (r0 & r1) should be accumulated in the query result is:
    * whether r0 is tagged with the context ID relevant to us
    * if r0 is not tagged with our context ID and r1 is: does r0 have an invalid context id? If not then we're in a case where i915 has resubmitted the same context for execution through the execlist submission port
  v2: Update comment (Ken)
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Cc: <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: take into account that reports read can be fairly old | Lionel Landwerlin | 2019-12-04 | 1 | -3/+4
  If we read the OA reports late enough after the query happens, we can get a timestamp in the report that is significantly in the past compared to the start timestamp of the query.
  The current code must deal with the wraparound of the timestamp value (every ~6 minutes). So consider that if the difference is greater than half that wraparound period, we're probably dealing with an old report, and make the caller aware it should read more reports when they're available.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Cc: <[email protected]>
  Reviewed-by: Mark Janes <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
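The half-period rule described above can be sketched with unsigned modular arithmetic. This assumes a free-running 32-bit tick counter, so that one wraparound period is 2^32 ticks; the function name `sketch_report_is_old` is hypothetical and the real code may compare scaled time values rather than raw ticks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decide whether a report timestamp lies before the query start, given
 * 32-bit timestamps that wrap around.
 *
 * The subtraction is done modulo 2^32. A report taken shortly after the
 * start yields a small delta even across a wrap; a report taken before
 * the start yields a delta close to 2^32. Anything beyond half the
 * wraparound period is therefore treated as an old report. */
static bool
sketch_report_is_old(uint32_t start_ts, uint32_t report_ts)
{
   const uint32_t half_period = UINT32_MAX / 2;
   uint32_t delta = report_ts - start_ts; /* wraps modulo 2^32 */

   return delta > half_period;
}
```

When the check returns true, the caller would skip the report and keep reading, since newer reports for the query may not have been delivered yet. The rule cannot distinguish a report more than half a period in the future from one in the past, which is the trade-off the commit message accepts with "probably".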
* intel/perf: set read buffer len to 0 to identify empty buffer | Lionel Landwerlin | 2019-12-04 | 1 | -2/+3
  We always add an empty buffer in the list when creating the query. Let's set the len appropriately so that we can recognize it when we read OA reports up to the end of a query.
  We were using a 0 timestamp value associated with the empty buffer and incorrectly assuming this was a valid value. In turn that led to not reading enough reports and resulted in deltas added to our counter values which should have been discarded because those would be flagged for a different context.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Cc: <[email protected]>
  Reviewed-by: Mark Janes <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: fix invalid hw_id in query results | Lionel Landwerlin | 2019-12-04 | 1 | -2/+6
  Accumulation happens between 2 reports, and it can be between a start/end report from another context. So only consider updating the hw_id of the results when it's not already valid and we have a valid value to put in there.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Fixes: 41b54b5faf ("i965: move OA accumulation code to intel/perf")
  Reviewed-by: Mark Janes <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: add EHL performance query support | Lionel Landwerlin | 2019-11-15 | 3 | -2/+11807
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Rafael Antognolli <[email protected]>
* intel/perf: add TGL support | Lionel Landwerlin | 2019-10-31 | 4 | -0/+8611
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* intel/perf: update ICL configurations | Lionel Landwerlin | 2019-10-29 | 1 | -59/+28
  A few equations/programming changes for ICL.
  v2: Fix a couple of issues in naming and floating/integer operations (Ken)
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Kenneth Graunke <[email protected]>
* intel/perf: add mdapi writes for register perf counters | Lionel Landwerlin | 2019-10-23 | 1 | -0/+36
  Those are not part of the OA reports.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Rafael Antognolli <[email protected]>
* intel/perf: add support for querying kernel loaded configurations | Lionel Landwerlin | 2019-10-23 | 2 | -27/+181
  We use this as a communication mechanism between MDAPI & Anv.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Rafael Antognolli <[email protected]>
* intel/perf: move registers to their own header | Lionel Landwerlin | 2019-10-23 | 3 | -25/+55
  Will conflict with the genxml RPSTAT register.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Rafael Antognolli <[email protected]>
* intel/perf: extract register configuration | Lionel Landwerlin | 2019-10-23 | 3 | -16/+24
  We want to query the content of register configurations from the kernel. Let's pull this out of the query.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Rafael Antognolli <[email protected]>
* intel/perf: expose some utility functions | Lionel Landwerlin | 2019-10-23 | 2 | -31/+55
  The Vulkan performance query extension is a bit lower level than the GL one. Expose some of the functions to do the result accumulation directly in the Anv driver.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Rafael Antognolli <[email protected]>
* intel/perf: add mdapi maker helper | Lionel Landwerlin | 2019-10-23 | 1 | -0/+28
  A simple utility to put the marker at the right location.
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Rafael Antognolli <[email protected]>
* intel/perf: use MAJOR_IN_SYSMACROS/MAJOR_IN_MKDEV | Greg V | 2019-08-08 | 1 | -0/+4
  Reviewed-by: Eric Engestrom <[email protected]>
  Fixes: 134e750e16bfc53480e0 ("i965: extract performance query metrics")
* intel/perf: fix debug typo | Mark Janes | 2019-08-07 | 1 | -5/+5
  Misspelling was seen with INTEL_DEBUG=perfmon.
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: make gen_perf_query_object private | Mark Janes | 2019-08-07 | 2 | -72/+80
  Encapsulate the details of this structure within the perf implementation.
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: make perf context private | Mark Janes | 2019-08-07 | 2 | -64/+109
  Encapsulate the details of this data structure.
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: print debug information | Mark Janes | 2019-08-07 | 2 | -0/+35
  INTEL_DEBUG=perfmon will iterate over the perf queries, printing information about the state of each query. Some of this information will be private to intel/perf, and needs a dump routine that can be called from i965.
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: make internal methods private | Mark Janes | 2019-08-07 | 2 | -95/+62
  Now that all references from i965 have been moved to perf, we can make internal methods private again.
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: make oa_sample_buffers private | Mark Janes | 2019-08-07 | 2 | -119/+120
  All references to this data structure have been moved inside the perf subsystem.
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: expose method to create query | Mark Janes | 2019-08-07 | 2 | -0/+19
  By encapsulating this implementation within perf, we can eventually make struct gen_perf_ctx private.
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move initialization of pipeline statistics metrics to gen_perf | Mark Janes | 2019-08-07 | 2 | -124/+219
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move get_query_data into gen_perf | Mark Janes | 2019-08-07 | 2 | -0/+376
  This refactor moves several helper functions for get_query_data as well:
    - accumulate_oa_reports
    - read_gt_frequency
    - get_pipeline_stats_data
    - get_oa_counter_data
  Functions which are no longer referenced in brw_performance_query.c have been removed.
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move delete_query to gen_perf | Mark Janes | 2019-08-07 | 2 | -0/+93
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move is_query_ready to gen_perf | Mark Janes | 2019-08-07 | 2 | -0/+31
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move wait_query to perf | Mark Janes | 2019-08-07 | 2 | -0/+167
  The following methods have duplicate implementations in brw_performance_query.c:
    - read_oa_samples_for_query
    - read_oa_samples_until
  They are still referenced by other methods in the file and will be removed in the subsequent commit.
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: create a vtable entry for bo_busy | Mark Janes | 2019-08-07 | 1 | -0/+1
  Iris and i965 variants of this method need to be called by perf routines.
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: create a vtable entry for bo_wait_rendering | Mark Janes | 2019-08-07 | 1 | -0/+1
  Iris and i965 variants of this method need to be called by perf routines.
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: create a vtable entry for batch_references | Mark Janes | 2019-08-07 | 1 | -0/+1
  Iris and i965 variants of this method need to be called by perf routines.
  Reviewed-by: Kenneth Graunke <[email protected]>
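The vtable entries added in the three commits above follow a common C pattern: shared code calls driver-specific behaviour through a struct of function pointers that each driver fills in. The sketch below illustrates that pattern only. The struct layout, the `void *` signatures, the toy driver, and `sketch_results_ready` are all assumptions made for the example, not the definitions used in intel/perf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical callback table; each driver supplies its own variants. */
struct sketch_perf_vtbl {
   bool (*bo_busy)(void *bo);
   void (*bo_wait_rendering)(void *bo);
   bool (*batch_references)(void *batch, void *bo);
};

/* Toy types standing in for a driver's buffer object and batch. */
struct toy_bo { bool busy; };
struct toy_batch { const struct toy_bo *referenced; };

static bool toy_bo_busy(void *bo)
{
   return ((struct toy_bo *)bo)->busy;
}

static void toy_bo_wait_rendering(void *bo)
{
   /* A real driver would block until the GPU is done with the buffer. */
   ((struct toy_bo *)bo)->busy = false;
}

static bool toy_batch_references(void *batch, void *bo)
{
   return ((struct toy_batch *)batch)->referenced == bo;
}

static const struct sketch_perf_vtbl toy_vtbl = {
   .bo_busy = toy_bo_busy,
   .bo_wait_rendering = toy_bo_wait_rendering,
   .batch_references = toy_batch_references,
};

/* Shared code: it only sees the table, never the driver's types. Results
 * are treated as ready when the buffer is neither queued in the pending
 * batch nor still in use by the GPU. */
static bool
sketch_results_ready(const struct sketch_perf_vtbl *vtbl, void *batch, void *bo)
{
   assert(vtbl != NULL && vtbl->bo_busy != NULL && vtbl->batch_references != NULL);
   return !vtbl->batch_references(batch, bo) && !vtbl->bo_busy(bo);
}
```

A second driver would provide its own three functions and its own table, and the shared routine would work unchanged, which is the point of routing the Iris and i965 variants through one table.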
* intel/perf: refactor gen_perf_end_query into gen_perf | Mark Janes | 2019-08-07 | 2 | -0/+56
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: refactor gen_perf_begin_query into gen_perf | Mark Janes | 2019-08-07 | 2 | -0/+259
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move perf-related state into gen_perf_context | Mark Janes | 2019-08-07 | 2 | -0/+51
  To move more operations into intel/perf, several state items are needed. Save references to that state in the perf_ctxt, rather than passing them in for every operation.
  This commit includes an initializer for gen_perf_context, to set those references and also encapsulate the initialization of the sample buffer state.
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: create a vtable entries for buffer object map/unmap | Mark Janes | 2019-08-07 | 2 | -0/+5
  These operations are needed to refactor subsequent methods into perf.
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move client reference counts into perf | Mark Janes | 2019-08-07 | 2 | -0/+32
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move open_perf into perf | Mark Janes | 2019-08-07 | 2 | -0/+47
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move close_perf into perf | Mark Janes | 2019-08-07 | 2 | -0/+18
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: create a vtable entry for emit_mi_flush | Mark Janes | 2019-08-07 | 1 | -0/+1
  This method is needed to move subsequent methods into perf.
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move snapshot_statistics_registers into perf | Mark Janes | 2019-08-07 | 2 | -0/+30
  Reviewed-by: Kenneth Graunke <[email protected]>