| Commit message | Author | Age | Files | Lines |
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of asking spirv_to_nir to lower workgroup (shared) memory
to offsets, keep it as derefs longer and lower it later on.
Because Workgroup memory doesn't have explicit offsets, we need to set
those using nir_lower_vars_to_explicit_types before calling the I/O
lowering pass.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Danylo Piliaiev <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Fixes: 8ae6667992ccca41d088 ("intel/perf: move query_object into perf")
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Mark Janes <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
| |
v5: add patch
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
This avoids a warning about implicitly casting away the constness of the
pointer.
Signed-off-by: Erik Faye-Lund <[email protected]>
Acked-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is an object-level preemption workaround which requires this.
However, even without object-level preemption, we seem to have issues
with geometry flickering when 3D and compute are combined in the same
batch and this appears to fix it.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110395
Suggested-by: Jason Ekstrand <[email protected]>
Signed-off-by: Danylo Piliaiev <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: [email protected]
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Encapsulate the details of this data structure.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
INTEL_DEBUG=perfmon will iterate over the perf queries, printing
information about the state of each query. Some of this information
will be private to intel/perf, so it needs a dump routine that can be
called from i965.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
By encapsulating this implementation within perf, we can eventually
make struct gen_perf_ctx private.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This refactor moves several helper functions for get_query_data as
well:
- accumulate_oa_reports
- read_gt_frequency
- get_pipeline_stats_data
- get_oa_counter_data
Functions which are no longer referenced in brw_performance_query.c
have been removed.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The following methods have duplicate implementations of read_oa_samples_until in
brw_performance_query.c:
- read_oa_samples_for_query
- read_oa_samples_until
They are still referenced by other methods in the file and will be
removed in the subsequent commit.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Iris and i965 variants of this method need to be called by perf
routines.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Iris and i965 variants of this method need to be called by perf
routines.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Iris and i965 variants of this method need to be called by perf
routines.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
To move more operations into intel/perf, several state items are
needed. Save references to that state in the perf_ctxt, rather than
passing them in for every operation.
This commit includes an initializer for gen_perf_context, to set those
references and also encapsulate the initialization of the sample
buffer state.
Reviewed-by: Kenneth Graunke <[email protected]>
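
A minimal sketch of the idea described above, assuming a simplified context: the struct, its fields, and both function names are hypothetical stand-ins, not the actual gen_perf_context definition. The initializer stores the references once, so later operations take only the context.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified context: the fields are illustrative only. */
struct perf_context_sketch {
   void *driver_ctx;     /* opaque driver context (assumed) */
   void *bufmgr;         /* opaque buffer manager (assumed) */
   int   drm_fd;         /* DRM file descriptor (assumed) */
   int   n_sample_bufs;  /* stand-in for the sample buffer state */
};

/* Store the references once and reset the sample buffer state. */
static void
perf_init_context_sketch(struct perf_context_sketch *ctx,
                         void *driver_ctx, void *bufmgr, int drm_fd)
{
   assert(ctx != NULL);
   ctx->driver_ctx = driver_ctx;
   ctx->bufmgr = bufmgr;
   ctx->drm_fd = drm_fd;
   ctx->n_sample_bufs = 0;
}

/* A later operation needs only the context, not each piece of state. */
static int
perf_sample_buf_count_sketch(const struct perf_context_sketch *ctx)
{
   return ctx->n_sample_bufs;
}
```

With this shape, adding another piece of saved state changes the initializer's signature but not the signature of every operation.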
|
|
|
|
|
|
| |
These operations are needed to refactor subsequent methods into perf.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
This method is needed to move subsequent methods into perf.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Most accesses to perf state were made through repeated dereferences of
brw_context members. Preferring temporary variables of perf_ctx and
perf_cfg has the following advantages:
- more concise implementation
- easier refactor when moving subsequent methods to perf
Reviewed-by: Kenneth Graunke <[email protected]>
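
The difference can be illustrated with a small sketch. All struct layouts and function names below are hypothetical and much smaller than the real brw_context; only the local variable names perf_ctx and perf_cfg follow the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced structures for illustration only. */
struct perf_config_sketch  { int n_queries; };
struct perf_context_sketch {
   struct perf_config_sketch *perf;
   int n_active_oa_queries;
};
struct driver_context_sketch { struct perf_context_sketch perf_ctx; };

/* Before: every access repeats the full dereference chain. */
static int
idle_queries_verbose(struct driver_context_sketch *brw)
{
   return brw->perf_ctx.perf->n_queries -
          brw->perf_ctx.n_active_oa_queries;
}

/* After: temporaries name the two objects once. */
static int
idle_queries_concise(struct driver_context_sketch *brw)
{
   struct perf_context_sketch *perf_ctx = &brw->perf_ctx;
   struct perf_config_sketch *perf_cfg = perf_ctx->perf;

   assert(perf_cfg != NULL);
   return perf_cfg->n_queries - perf_ctx->n_active_oa_queries;
}
```

Once a function body refers only to perf_ctx and perf_cfg, moving it out of the driver mostly amounts to turning those two locals into parameters.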
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Query objects can now be encapsulated within the perf subsystem.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
This method is needed to move subsequent methods into perf.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The "context" that is necessary to submit and process perf commands to
the hardware was previously present in the brw_context.perfquery
struct. This commit moves it into perf and provides a more
understandable name.
The intention is for this struct to be private, when all methods that
access it are migrated into perf.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
oa_sample_buf holds the data provided by the kernel that will be
collated into performance metrics. Since this functionality will be
implemented in perf, the struct needs to be defined there.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Iris and i965 both need to enumerate the available metrics, so these
routines must be located in perf.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
The perf subsystem needs several macro definitions that were
duplicated in Iris and i965 headers. Place these macros within perf,
if the perf implementation contains the only references to the values.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
In preparation for calling both Iris and i965 implementations from perf.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
In preparation for calling both Iris and i965 implementations from perf.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
In preparation for calling both Iris and i965 implementations from perf.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
In preparation for calling both Iris and i965 implementations from perf.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Performance metrics collection requires several actions (e.g. bo_map())
that have different implementations for Iris and i965. The perf
subsystem needs a vtable for each of these actions, so it can invoke
the corresponding implementation for each driver.
The first call to be added to the table is bo_alloc.
Reviewed-by: Kenneth Graunke <[email protected]>
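
A rough sketch of the vtable pattern described above, assuming a single bo_alloc entry. The struct names, the function signature, and the calloc-backed "driver" are all hypothetical; the real table and the drivers' allocators differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical vtable: one function pointer per driver-specific action. */
struct perf_vtable_sketch {
   void *(*bo_alloc)(void *bufmgr, const char *name, uint64_t size);
};

/* Hypothetical shared configuration that carries the vtable. */
struct perf_config_sketch {
   struct perf_vtable_sketch vtbl;
};

/* Stand-in for a driver's allocator. A real driver would call into its
 * own buffer manager here; calloc just keeps the sketch self-contained. */
static void *
demo_driver_bo_alloc(void *bufmgr, const char *name, uint64_t size)
{
   (void)bufmgr;
   (void)name;
   return calloc(1, (size_t)size);
}

/* Shared code: allocates through the table without knowing the driver. */
static void *
perf_alloc_sample_bo(struct perf_config_sketch *cfg, void *bufmgr,
                     uint64_t size)
{
   assert(cfg->vtbl.bo_alloc != NULL);
   return cfg->vtbl.bo_alloc(bufmgr, "perf sample buffer", size);
}
```

Each driver would fill in the table once at initialization, for example `cfg.vtbl.bo_alloc = demo_driver_bo_alloc;`, and further actions such as bo_map would become additional function pointers in the same struct.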
|
|
|
|
|
|
|
|
| |
There were multiple ioctl-wrapper functions, so a common
implementation was put in gen_gem.h. With a common implementation,
perf no longer needs the caller to configure one for it.
Reviewed-by: Kenneth Graunke <[email protected]>
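
For illustration, a common shape for such a wrapper is a loop that retries the call when it is interrupted. The function names below are hypothetical, and this is a sketch of the general pattern rather than a copy of the helper in gen_gem.h.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/ioctl.h>

/* Retry the ioctl while it fails with EINTR or EAGAIN; any other result
 * (success or a different error) is returned to the caller unchanged. */
static inline int
ioctl_retry_sketch(int fd, unsigned long request, void *arg)
{
   int ret;

   do {
      ret = ioctl(fd, request, arg);
   } while (ret == -1 && (errno == EINTR || errno == EAGAIN));

   return ret;
}

/* Example caller: an invalid descriptor fails immediately with EBADF
 * instead of looping, because EBADF is not a retryable error. */
static void
ioctl_retry_sketch_demo(void)
{
   errno = 0;
   int ret = ioctl_retry_sketch(-1, 0, NULL);
   assert(ret == -1 && errno == EBADF);
}
```

Because the wrapper is a static inline function in a shared header, any component can call it directly, which is why no caller has to pass one in.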
|