path: root/src/mesa
Commit message | Author | Age | Files | Lines
* i965/gen11: fix genX_bits.h include path | Mauro Rossi | 2019-08-13 | 1 | -1/+1
  Instead of "genX_bits.h" use "genxml/genX_bits.h", as already done in
  other similar cases.
  Besides being more correct, it also fixes a build error on Android.

  Fixes: f0d2923 ("i965/gen11: Emit SLICE_HASH_TABLE when pipes are unbalanced.")
  Signed-off-by: Mauro Rossi <[email protected]>
  Reviewed-by: Rafael Antognolli <[email protected]>
* i965/gen11: Emit SLICE_HASH_TABLE when pipes are unbalanced. | Rafael Antognolli | 2019-08-12 | 2 | -0/+88
  If the pixel pipes have a different number of subslices, emit a slice
  hashing table that will ensure proper workload distribution.

  v2: Set Mask field to 0xffff for workaround (Ken).
* intel/compiler: Fill a compiler statistics struct | Jason Ekstrand | 2019-08-12 | 6 | -6/+7
  This commit is all annoying plumbing work which just adds support for a
  new brw_compile_stats struct. This struct provides a binary,
  driver-readable form of the same statistics we dump out to stderr when
  INTEL_DEBUG is set with a shader stage.

  Reviewed-by: Lionel Landwerlin <[email protected]>
* i965/gen9: Optimize slice and subslice load balancing behavior. | Francisco Jerez | 2019-08-12 | 5 | -6/+109
  The default pixel hashing mode settings used for slice and subslice load
  balancing are far from optimal under certain conditions (see the comments
  below for the gory details). The top-of-the-line GT4 parts suffer from a
  particularly severe performance problem currently due to a subslice load
  balancing issue. Fixing this seems to improve graphics performance across
  the board for most of the benchmarks in my test set, up to ~20% in some
  cases, e.g. from SKL GT4:

    unigine/valley:             3.44% ±0.11%
    gfxbench/gl_manhattan31:    3.99% ±0.13%
    gputest/pixmark_piano:      7.95% ±0.33%
    synmark/OglTexFilterAniso: 15.22% ±0.07%
    synmark/OglTexMem128:      22.26% ±0.06%

  Lower-end platforms are also affected by some subslice load imbalance to
  a lesser degree, especially during CCS resolve and fast clear operations,
  which are handled specially here due to rasterization occurring in
  reduced CCS coordinates, which changes the semantics of the pixel hashing
  mode settings.

  No regressions seen during my tests on some SKL, KBL and BXT
  configurations.

  Additional benchmark reports welcome on any Gen9 platforms (that includes
  anything with Skylake, Broxton, Kabylake, Geminilake, Coffeelake, Whiskey
  Lake, Comet Lake or Amber Lake in your renderer string).

  P.S.: A similar problem is likely to be present on other non-Gen9
  platforms, especially for CCS resolve and fast clear operations. Will
  follow up with additional patches fixing the hashing mode for those once
  I have enough performance data to justify it.

  Reviewed-by: Kenneth Graunke <[email protected]>
* st/mesa: don't allocate mipmapped texture for NEAREST_MIPMAP_LINEAR | Marek Olšák | 2019-08-12 | 1 | -0/+12
  Reviewed-by: Brian Paul <[email protected]>
* i965/spirv: Lower shared memory later | Caio Marcelo de Oliveira Filho | 2019-08-10 | 2 | -1/+20
  Instead of asking spirv_to_nir to lower the workgroup (shared memory) to
  offsets, keep them as derefs longer, then lower it later on. Because
  Workgroup memory doesn't have explicit offsets, we need to set those
  using nir_lower_vars_to_explicit_types before calling the I/O lowering
  pass.

  Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Use force_compat_profile driconf option | Danylo Piliaiev | 2019-08-10 | 2 | -2/+8
  Signed-off-by: Danylo Piliaiev <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: fix mem leak in error path | Eric Engestrom | 2019-08-10 | 1 | -1/+3
  Fixes: 8ae6667992ccca41d088 ("intel/perf: move query_object into perf")
  Signed-off-by: Eric Engestrom <[email protected]>
  Reviewed-by: Mark Janes <[email protected]>
* mesa: be consistent on GL_TRUE/GL_FALSE & TRUE/FALSE | Lionel Landwerlin | 2019-08-09 | 2 | -3/+3
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Eric Engestrom <[email protected]>
* mesa: drop some p_compiler.h types | Lionel Landwerlin | 2019-08-09 | 1 | -2/+2
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
  Acked-by: Eric Engestrom <[email protected]>
* mesa: add stddef include in preparation for dropping p_compiler.h | Lionel Landwerlin | 2019-08-09 | 2 | -0/+2
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Reviewed-by: Alyssa Rosenzweig <[email protected]>
  Acked-by: Eric Engestrom <[email protected]>
* i965: don't use p_compiler.h types | Lionel Landwerlin | 2019-08-09 | 1 | -1/+1
  Signed-off-by: Lionel Landwerlin <[email protected]>
  Acked-by: Eric Engestrom <[email protected]>
* meson: define ETIME to ETIMEDOUT if not present | Greg V | 2019-08-08 | 1 | -3/+0
  Reviewed-by: Eric Engestrom <[email protected]>
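
  The fallback named in the subject line above can be sketched in C as
  follows. This is only an illustration of the idea, not the actual change
  (the "meson:" prefix suggests the real definition is supplied by the
  build system); wait_timed_out is a hypothetical helper added here to
  show the macro in use.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Not every platform defines ETIME; fall back to ETIMEDOUT when it is
 * missing so that code testing for ETIME still compiles. */
#ifndef ETIME
#define ETIME ETIMEDOUT
#endif

/* errno constants are positive, so callers can return them negated. */
static_assert(ETIME > 0, "errno values are expected to be positive");

/* Hypothetical helper (not from Mesa): true when a negative errno-style
 * return code reports a timeout. */
static bool
wait_timed_out(int ret)
{
   return ret == -ETIME;
}
```

  With the fallback in place, callers can keep comparing against ETIME
  regardless of which of the two names the platform provides.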
* anv,i965,iris: deduplicate setting of total_shared | Rhys Perry | 2019-08-08 | 1 | -2/+0
  v5: add patch

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* mesa/main: cast away constness | Erik Faye-Lund | 2019-08-08 | 1 | -1/+1
  This avoids a warning about implicitly casting away the constness of the
  pointer.

  Signed-off-by: Erik Faye-Lund <[email protected]>
  Acked-by: Eric Engestrom <[email protected]>
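
  The kind of warning described above typically appears when a const
  pointer is passed to an interface that takes a non-const one. A minimal
  sketch of the explicit-cast fix, assuming a callee that never writes
  through the pointer; legacy_len_fn, legacy_len and name_length are
  hypothetical names, not taken from the Mesa change:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical legacy interface: takes a non-const pointer even though
 * it only reads from it. */
typedef size_t (*legacy_len_fn)(char *s);

static size_t
legacy_len(char *s)
{
   return strlen(s);
}

static size_t
name_length(const char *name, legacy_len_fn fn)
{
   assert(name != NULL && fn != NULL);

   /* Passing `name` directly would discard the const qualifier
    * implicitly and draw a compiler warning. The explicit cast states
    * the intent; it is only safe because fn does not modify the data. */
   return fn((char *)name);
}
```

  The cast silences the diagnostic but does not make writes through the
  pointer legal, so it is appropriate only when the callee is known to be
  read-only.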
* i965: Emit a dummy MEDIA_VFE_STATE before switching from GPGPU to 3D | Danylo Piliaiev | 2019-08-08 | 1 | -0/+21
  There is an object-level preemption workaround which requires this.
  However, even without object-level preemption, we seem to have issues
  with geometry flickering when 3D and compute are combined in the same
  batch and this appears to fix it.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110395
  Suggested-by: Jason Ekstrand <[email protected]>
  Signed-off-by: Danylo Piliaiev <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
  Cc: [email protected]
* st/mesa: eliminate unnecessary redirection | Mark Janes | 2019-08-07 | 1 | -1/+1
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: make perf context private | Mark Janes | 2019-08-07 | 3 | -39/+27
  Encapsulate the details of this data structure.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: print debug information | Mark Janes | 2019-08-07 | 1 | -25/+8
  INTEL_DEBUG=perfmon will iterate over the perf queries, printing
  information about the state of each query. Some of this information
  will be private to intel/perf, and needs a dump routine that can be
  called from i965.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: expose method to create query | Mark Janes | 2019-08-07 | 1 | -10/+3
  By encapsulating this implementation within perf, we can eventually make
  struct gen_perf_ctx private.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move initialization of pipeline statistics metrics to gen_perf | Mark Janes | 2019-08-07 | 1 | -93/+3
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move get_query_data into gen_perf | Mark Janes | 2019-08-07 | 2 | -399/+2
  This refactor moves several helper functions for get_query_data as well:
  - accumulate_oa_reports
  - read_gt_frequency
  - get_pipeline_stats_data
  - get_oa_counter_data

  Functions which are no longer referenced in brw_performance_query.c have
  been removed.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move delete_query to gen_perf | Mark Janes | 2019-08-07 | 1 | -39/+1
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move is_query_ready to gen_perf | Mark Janes | 2019-08-07 | 1 | -133/+1
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move wait_query to perf | Mark Janes | 2019-08-07 | 1 | -38/+1
  The following methods have duplicate implementations in
  brw_performance_query.c:
  - read_oa_samples_for_query
  - read_oa_samples_until

  They are still referenced by other methods in the file and will be
  removed in the subsequent commit.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: create a vtable entry for bo_busy | Mark Janes | 2019-08-07 | 1 | -4/+5
  Iris and i965 variants of this method need to be called by perf
  routines.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: create a vtable entry for bo_wait_rendering | Mark Janes | 2019-08-07 | 1 | -1/+3
  Iris and i965 variants of this method need to be called by perf
  routines.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: create a vtable entry for batch_references | Mark Janes | 2019-08-07 | 1 | -6/+11
  Iris and i965 variants of this method need to be called by perf
  routines.

  Reviewed-by: Kenneth Graunke <[email protected]>
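
  The vtable approach used by the three commits above (bo_busy,
  bo_wait_rendering, batch_references) can be sketched as a struct of
  function pointers that each driver fills in, so shared perf code never
  calls a driver directly. The entry names mirror the commit subjects,
  but the struct name, the signatures and the toy driver below are
  assumptions made for illustration, not Mesa's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical vtable: one set of entries per driver (e.g. Iris, i965). */
struct perf_vtable_sketch {
   bool (*bo_busy)(void *bo);
   void (*bo_wait_rendering)(void *bo);
   bool (*batch_references)(void *batch, void *bo);
};

/* Shared code: a query result is readable once its buffer is neither
 * referenced by the pending batch nor still busy on the GPU. */
static bool
perf_query_is_ready(const struct perf_vtable_sketch *vtbl,
                    void *batch, void *bo)
{
   assert(vtbl != NULL && vtbl->bo_busy != NULL &&
          vtbl->batch_references != NULL);
   return !vtbl->batch_references(batch, bo) && !vtbl->bo_busy(bo);
}

/* Toy driver used only to exercise the sketch. */
struct toy_bo {
   bool busy;
   bool referenced;
};

static bool
toy_bo_busy(void *bo)
{
   return ((struct toy_bo *)bo)->busy;
}

static void
toy_bo_wait_rendering(void *bo)
{
   /* A real driver would block here; the toy just clears the flag. */
   ((struct toy_bo *)bo)->busy = false;
}

static bool
toy_batch_references(void *batch, void *bo)
{
   (void)batch;
   return ((struct toy_bo *)bo)->referenced;
}

static const struct perf_vtable_sketch toy_vtbl = {
   .bo_busy = toy_bo_busy,
   .bo_wait_rendering = toy_bo_wait_rendering,
   .batch_references = toy_batch_references,
};
```

  A second driver would provide its own table with the same entries, and
  perf_query_is_ready would work unchanged.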
* intel/perf: refactor gen_perf_end_query into gen_perf | Mark Janes | 2019-08-07 | 1 | -45/+1
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: refactor gen_perf_begin_query into gen_perf | Mark Janes | 2019-08-07 | 1 | -247/+1
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move perf-related state into gen_perf_context | Mark Janes | 2019-08-07 | 1 | -49/+29
  To move more operations into intel/perf, several state items are needed.
  Save references to that state in the perf_ctxt, rather than passing them
  in for every operation.

  This commit includes an initializer for gen_perf_context, to set those
  references and also encapsulate the initialization of the sample buffer
  state.

  Reviewed-by: Kenneth Graunke <[email protected]>
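
  The pattern described above, storing references once in a context
  instead of passing them to every operation, might look roughly like
  this. All names and fields here (perf_context_sketch, driver_ctx,
  bufmgr, drm_fd, n_sample_bufs) are assumptions for illustration; the
  real gen_perf_context layout is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical context: references are saved once at init time. */
struct perf_context_sketch {
   void *driver_ctx;    /* opaque driver context                  */
   void *bufmgr;        /* opaque buffer manager                  */
   int drm_fd;          /* device file descriptor                 */
   int n_sample_bufs;   /* sample-buffer state owned by the ctx   */
};

/* Initializer: sets the saved references and resets the sample-buffer
 * state in one place. */
static void
perf_context_sketch_init(struct perf_context_sketch *ctx,
                         void *driver_ctx, void *bufmgr, int drm_fd)
{
   assert(ctx != NULL);
   ctx->driver_ctx = driver_ctx;
   ctx->bufmgr = bufmgr;
   ctx->drm_fd = drm_fd;
   ctx->n_sample_bufs = 0;
}

/* Later operations take only the context rather than each piece of
 * state as a separate argument. */
static int
perf_context_sketch_fd(const struct perf_context_sketch *ctx)
{
   assert(ctx != NULL);
   return ctx->drm_fd;
}
```

  The benefit is a narrower call signature for every subsequent
  operation, which is what makes moving those operations into a shared
  module practical.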
* intel/perf: create a vtable entries for buffer object map/unmap | Mark Janes | 2019-08-07 | 1 | -6/+12
  These operations are needed to refactor subsequent methods into perf.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move client reference counts into perf | Mark Janes | 2019-08-07 | 1 | -36/+5
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move open_perf into perf | Mark Janes | 2019-08-07 | 1 | -48/+2
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move close_perf into perf | Mark Janes | 2019-08-07 | 1 | -18/+2
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: create a vtable entry for emit_mi_flush | Mark Janes | 2019-08-07 | 1 | -2/+4
  This method is needed to move subsequent methods into perf.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: use temporary pointers to simplify access to perf state | Mark Janes | 2019-08-07 | 1 | -78/+92
  Most accesses to perf state were made through repeated dereferences of
  brw_context members. Preferring temporary variables of perf_ctx and
  perf_cfg has the following advantages:
  - more concise implementation
  - easier refactor when moving subsequent methods to perf

  Reviewed-by: Kenneth Graunke <[email protected]>
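
  The before/after shape of that refactor can be sketched as follows. The
  variable names perf_ctx and perf_cfg come from the commit message; the
  struct layouts and both functions are hypothetical and exist only to
  show the two access styles side by side.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layouts standing in for the driver and perf state. */
struct perf_cfg_sketch {
   int n_queries;
};

struct perf_ctx_sketch {
   struct perf_cfg_sketch *cfg;
   int n_active;
};

struct driver_ctx_sketch {
   struct perf_ctx_sketch perf;
};

/* Before: every access repeats the full chain from the driver context. */
static int
idle_queries_verbose(struct driver_ctx_sketch *drv)
{
   assert(drv != NULL);
   return drv->perf.cfg->n_queries - drv->perf.n_active;
}

/* After: temporaries name the perf state once. The body no longer
 * mentions the driver context, so it can later move into shared code
 * that receives perf_ctx directly. */
static int
idle_queries(struct driver_ctx_sketch *drv)
{
   assert(drv != NULL);
   struct perf_ctx_sketch *perf_ctx = &drv->perf;
   struct perf_cfg_sketch *perf_cfg = perf_ctx->cfg;

   return perf_cfg->n_queries - perf_ctx->n_active;
}
```

  Both functions compute the same value; only the second isolates the
  dependency on the driver context to its first two lines.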
* intel/perf: move snapshot_statistics_registers into perf | Mark Janes | 2019-08-07 | 1 | -28/+3
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move query_object into perf | Mark Janes | 2019-08-07 | 3 | -158/+63
  Query objects can now be encapsulated within the perf subsystem.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: create a vtable entry for store_register_mem64 | Mark Janes | 2019-08-07 | 1 | -3/+7
  This method is needed to move subsequent methods into perf.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move free_sample_bufs into perf | Mark Janes | 2019-08-07 | 1 | -15/+1
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move reap_old_sample_buffers into perf | Mark Janes | 2019-08-07 | 1 | -25/+1
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move get_free_sample_buf into perf | Mark Janes | 2019-08-07 | 1 | -21/+2
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move the perf context into perf | Mark Janes | 2019-08-07 | 2 | -147/+92
  The "context" that is necessary to submit and process perf commands to
  the hardware was previously present in the brw_context.perfquery struct.
  This commit moves it into perf and provides a more understandable name.

  The intention is for this struct to be private, when all methods that
  access it are migrated into perf.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move get_metric_id to perf | Mark Janes | 2019-08-07 | 1 | -36/+1
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move oa_sample_buf structure to perf | Mark Janes | 2019-08-07 | 1 | -139/+16
  oa_sample_buf holds the data provided by the kernel that will be
  collated into performance metrics. Since this functionality will be
  implemented in perf, the struct needs to be defined there.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: enumerate query-based metrics in perf | Mark Janes | 2019-08-07 | 5 | -266/+4
  Iris and i965 both need to enumerate the available metrics, so these
  routines must be located in perf.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move perf-related constants to common location | Mark Janes | 2019-08-07 | 3 | -22/+1
  The perf subsystem needs several macro definitions that were duplicated
  in Iris and i965 headers. Place these macros within perf, if the perf
  implementation contains the only references to the values.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: create a vtable entry for capture_frequency_stat_register | Mark Janes | 2019-08-07 | 1 | -2/+9
  In preparation for calling both Iris and i965 implementations from perf.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: create a vtable entry for batchbuffer_flush | Mark Janes | 2019-08-07 | 1 | -8/+18
  In preparation for calling both Iris and i965 implementations from perf.

  Reviewed-by: Kenneth Graunke <[email protected]>