aboutsummaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers
Commit message (Collapse)AuthorAgeFilesLines
* glsl/nir: Fill in the Parameters in NIR linkerCaio Marcelo de Oliveira Filho2019-09-101-1/+1
| | | | | | | | | | | | | | | | | | | | | The parameter lists were not being created nor filled since i965 doesn't use them. In Gallium they are used for uniform handling, so add a way to fill them. The gl_uniform_storage struct got two new fields that let us go - from a Parameter to the matching UniformStorage and, - from the variable to the *first* UniformStorage without relying on names -- since they are optional for ARB_gl_spirv. Later patches will make use of them. v2: Do not fill parameters for i965. (Timothy) Use uint32_t for the new attributes. (Marek) v3: Serialize the new fields. (Timothy) Reviewed-by: Timothy Arceri <[email protected]>
* mesa: Eliminate gl_config::rgbModeAdam Jackson2019-09-092-16/+3
| | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* mesa: Eliminate gl_config::have{Accum,Depth,Stencil}BufferAdam Jackson2019-09-095-20/+12
| | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* mesa: Remove unused gl_config::indexBitsAdam Jackson2019-09-091-1/+0
| | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* intel: Stop redirecting state cache to command streamer cache sectionKenneth Graunke2019-09-061-5/+0
| | | | | | | | | | | | | | | | | | This bit redirects the state cache from the unified/RO sections of the L3 cache to the "CS command buffer" section of the cache, which would be set up via TCCNTLREG. The documentation says: "Additionaly, this redirection should be enabled only if there is a non-zero allocation for the CS command buffer section." We don't allocate any cache to the CS command buffer section, so enabling this redirection effectively disabled the state cache. The Windows driver only sets up that section when using POSH, which we do not currently use. So, leave it unallocated and disable the redirection to get a functional state cache again. Improves performance in Civilization VI by 18%, Manhattan 3.0 by 6%, and Car Chase by 2%.
* intel/dri: finish proper glthreadSergii Romantsov2019-09-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | KWin was able to get NULL-context in the call intelUnbindContext. But a call _mesa_glthread_finish is not resistent to such case. Case can be catched with steps: 1. Create both glx and egl contexts 2. Make glx as current 3. Make egl as current 4. Reset glx context 5. Make egl as current Solution adds proper finishing of glthread-context (context will be taken from the requested dri-context for unbinding, but not from the saved current context). Piglit-test: https://gitlab.freedesktop.org/mesa/piglit/merge_requests/87 Cc: 19.1 19.2 <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110814 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111271 Fixes: dca36d5516d0 (i965: Implement threaded GL support) Signed-off-by: Sergii Romantsov <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: initialize bo_reuse when creating brw_bufmgrTapani Pälli2019-08-294-26/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes a possible data race spotted while debugging on other EGL related failures where glFinish and eglCreateContext are going on at the same time: ==11558== Possible data race during read of size 1 at 0x5E78CD0 by thread #23 ==11558== Locks held: 1, at address 0x5E77CA8 ==11558== at 0x61B71D4: bo_alloc_internal (brw_bufmgr.c:639) ==11558== by 0x61B7328: brw_bo_alloc (brw_bufmgr.c:669) ==11558== by 0x61EF975: recreate_growing_buffer (intel_batchbuffer.c:231) ==11558== by 0x61EFAAE: intel_batchbuffer_reset (intel_batchbuffer.c:255) ==11558== by 0x61EFB85: intel_batchbuffer_reset_and_clear_render_cache (intel_batchbuffer.c:280) ==11558== by 0x61F0507: brw_new_batch (intel_batchbuffer.c:551) ==11558== by 0x61F12C1: _intel_batchbuffer_flush_fence (intel_batchbuffer.c:888) ==11558== by 0x61BDD6B: intel_glFlush (brw_context.c:296) ==11558== by 0x61BDDB9: intel_finish (brw_context.c:307) ==11558== by 0x623831B: _mesa_Finish (context.c:1906) ==11558== by 0x46D556: deqp::egl::GLES2ThreadTest::Operation::execute(tcu::ThreadUtil::Thread&) ==11558== by 0x721502: tcu::ThreadUtil::Thread::run() ==11558== ==11558== This conflicts with a previous write of size 1 by thread #26 ==11558== Locks held: 1, at address 0x5D09878 ==11558== at 0x61B98A9: brw_bufmgr_enable_reuse (brw_bufmgr.c:1541) ==11558== by 0x61BF09D: brw_process_driconf_options (brw_context.c:854) ==11558== by 0x61BF6CA: brwCreateContext (brw_context.c:993) ==11558== by 0x621181F: driCreateContextAttribs (dri_util.c:473) ==11558== by 0x53FE87B: dri2_create_context (egl_dri2.c:1388) ==11558== by 0x53EE7BE: eglCreateContext (eglapi.c:807) ==11558== by 0x5C8AB9: eglw::FuncPtrLibrary::createContext(void*, void*, void*, int const*) const ==11558== by 0x46E027: deqp::egl::GLES2ThreadTest::CreateContext::exec(tcu::ThreadUtil::Thread&) Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Exit with error if gen12+ is detectedJordan Justen2019-08-281-0/+5
| | | | | | | For OpenGL support on gen12, the iris driver should be used. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glx: Fix incompatible function pointer types.Jose Fonseca2019-08-281-1/+1
| | | | | | | | | | | I don't know how Meson didn't hit this issue, when it too already uses -Werror=incompatible-pointer-types Fixes: 3dd299c3d5b88114894e ("glx: Sync <GL/glxext.h> with Khronos") Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Adam Jackson <[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* intel/fs: Drop the gl_program from fs_visitorJason Ekstrand2019-08-252-2/+2
| | | | | | | | | It's not used by anything anymore now that so much lowering has been moved into NIR. Sadly, we still need on in brw_compile_gs() for geometry shaders on Sandy Bridge. Short of a lot of pointless work, that one's probably not going away. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Silence brw_blorp uninitialized warningCaio Marcelo de Oliveira Filho2019-08-231-1/+1
| | | | | | | | | | The variables level and start_layer are not initialized, then initialized if we have a BUFFER_BIT_DEPTH set. We assert on them later using the same check. This should be enough but GCC 9.1.1 is not convinced, so let's initialize the variables. Acked-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* glx: Sync <GL/glxext.h> with KhronosAdam Jackson2019-08-223-9/+8
| | | | | | | | Minor fixups required to keep the prototypes matching and to remove mention of retired enums. Acked-by: Eric Engestrom <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: honor scanout requirement from DRILionel Landwerlin2019-08-211-1/+3
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* i965: Add handling for fp16 configsKevin Strasser2019-08-211-1/+24
| | | | | | | | | | | | Expose configs when allow_fp16_configs has been enabled and DRI_LOADER_CAP_FP16 is set in the loader. Also, define a new dri configuration option so users can disable exposure of fp16 formats. Make fp16 opt-in for i965. Signed-off-by: Kevin Strasser <[email protected]> Reviewed-by: Adam Jackson <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* dri: Add fp16 formatsKevin Strasser2019-08-212-0/+22
| | | | | | | | | | | Add dri formats for RGBA ordered 64 bpp IEEE 754 half precision floating point. Leverage existing offscreen render support for MESA_FORMAT_RGBA_FLOAT16 and MESA_FORMAT_RGBX_FLOAT16. Signed-off-by: Kevin Strasser <[email protected]> Reviewed-by: Adam Jackson <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* dri: Handle configs with floating point pixel dataKevin Strasser2019-08-211-0/+5
| | | | | | | | | | | In order to handle pixel formats that consist of floating point data, enable floatMode field in the dri config, and set __DRI_ATTRIB_FLOAT_BIT in the render type attribute. Signed-off-by: Kevin Strasser <[email protected]> Reviewed-by: Adam Jackson <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* dri: Add config attributes for color channel shiftKevin Strasser2019-08-211-19/+49
| | | | | | | | | | | | | | | The existing mask attributes can only support up to 32 bpp. Introduce per-channel SHIFT attributes that indicate how many bits, from lsb towards msb, the bit field is offset. A shift of -1 will indicate that there is no bit field set for the channel. As old loaders will still be looking for masks, we set the masks to 0 for any formats wider than 32 bpp. Signed-off-by: Kevin Strasser <[email protected]> Reviewed-by: Adam Jackson <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* i965: Add helper function for allowed config formatsKevin Strasser2019-08-211-31/+34
| | | | | | | | | The driver checks dri config options and loader caps to filter out certain formats during config creation. Fold 4 call sites under a single helper function. Signed-off-by: Kevin Strasser <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* nir: Add explicit signs to image min/max intrinsicsJason Ekstrand2019-08-211-2/+4
| | | | | | | | | | | This better matches all the other atomic intrinsics such as those for SSBOs and shared variables where the sign is part of the intrinsic opcode. Both generators (GLSL and SPIR-V) know the sign from the type of the image variable or handle. In SPIR-V, signed min/max are separate opcodes from unsigned. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Enable OpenGL 4.6 for Gen8+Alejandro Piñeiro2019-08-212-2/+2
| | | | | | | | | The last remaining stuff was ARB_gl_spirv and ARB_spirv_extensions. Note that it is really likely that we can enable it for some Gen7 (as 4.5 was), but it was not tested yet. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* i965: enable ARB_gl_spirv extension and ARB_spirv_extensions for gen7+Alejandro Piñeiro2019-08-211-0/+3
| | | | | | v2: squashed the two enable patches with the docs one (Jason) Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* i965/gen11: fix genX_bits.h include pathMauro Rossi2019-08-131-1/+1
| | | | | | | | | | | Instead of "genX_bits.h" use "genxml/genX_bits.h" as already done in other similar cases Besides being more correct, it also fixes building error in Android. Fixes: f0d2923 ("i965/gen11: Emit SLICE_HASH_TABLE when pipes are unbalanced.") Signed-off-by: Mauro Rossi <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* i965/gen11: Emit SLICE_HASH_TABLE when pipes are unbalanced.Rafael Antognolli2019-08-122-0/+88
| | | | | | | If the pixel pipes have a different number of subslices, emit a slice hashing table that will ensure proper workload distribution. v2: Set Mask field to 0xffff for workaround (Ken).
* intel/compiler: Fill a compiler statistics structJason Ekstrand2019-08-126-6/+7
| | | | | | | | | This commit is all annoying plumbing work which just adds support for a new brw_compile_stats struct. This struct provides a binary driver readable form of the same statistics we dump out to stderr when we INTEL_DEBUG is set with a shader stage. Reviewed-by: Lionel Landwerlin <[email protected]>
* i965/gen9: Optimize slice and subslice load balancing behavior.Francisco Jerez2019-08-125-6/+109
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The default pixel hashing mode settings used for slice and subslice load balancing are far from optimal under certain conditions (see the comments below for the gory details). The top-of-the-line GT4 parts suffer from a particularly severe performance problem currently due to a subslice load balancing issue. Fixing this seems to improve graphics performance across the board for most of the benchmarks in my test set, up to ~20% in some cases, e.g. from SKL GT4: unigine/valley: 3.44% ±0.11% gfxbench/gl_manhattan31: 3.99% ±0.13% gputest/pixmark_piano: 7.95% ±0.33% synmark/OglTexFilterAniso: 15.22% ±0.07% synmark/OglTexMem128: 22.26% ±0.06% Lower-end platforms are also affected by some subslice load imbalance to a lesser degree, especially during CCS resolve and fast clear operations, which are handled specially here due to rasterization ocurring in reduced CCS coordinates, which changes the semantics of the pixel hashing mode settings. No regressions seen during my tests on some SKL, KBL and BXT configurations. Additional benchmark reports welcome on any Gen9 platforms (that includes anything with Skylake, Broxton, Kabylake, Geminilake, Coffeelake, Whiskey Lake, Comet Lake or Amber Lake in your renderer string). P.S.: A similar problem is likely to be present on other non-Gen9 platforms, especially for CCS resolve and fast clear operations. Will follow-up with additional patches fixing the hashing mode for those once I have enough performance data to justify it. Reviewed-by: Kenneth Graunke <[email protected]>
* i965/spirv: Lower shared memory laterCaio Marcelo de Oliveira Filho2019-08-101-0/+20
| | | | | | | | | | | Instead of asking spirv_to_nir to lower the workgroup (shared memory) to offsets, keep them as derefs longer, then lower it later on. Because Workgroup memory doesn't have explicit offsets, we need to set those using nir_lower_vars_to_explicit_types before calling the I/O lowering pass. Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Use force_compat_profile driconf optionDanylo Piliaiev2019-08-102-2/+8
| | | | | Signed-off-by: Danylo Piliaiev <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: fix mem leak in error pathEric Engestrom2019-08-101-1/+3
| | | | | | Fixes: 8ae6667992ccca41d088 ("intel/perf: move query_object into perf") Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Mark Janes <[email protected]>
* i965: don't use p_compiler.h typesLionel Landwerlin2019-08-091-1/+1
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Eric Engestrom <[email protected]>
* meson: define ETIME to ETIMEDOUT if not presentGreg V2019-08-081-3/+0
| | | | Reviewed-by: Eric Engestrom <[email protected]>
* anv,i965,iris: deduplicate setting of total_sharedRhys Perry2019-08-081-2/+0
| | | | | | | | v5: add patch Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Emit a dummy MEDIA_VFE_STATE before switching from GPGPU to 3DDanylo Piliaiev2019-08-081-0/+21
| | | | | | | | | | | | | There is an object-level preemption workaround which requires this. However, even without object-level preemption, we seem to have issues with geometry flickering when 3D and compute are combined in the same batch and this appears to fix it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110395 Suggested-by: Jason Ekstrand <[email protected]> Signed-off-by: Danylo Piliaiev <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: [email protected]
* intel/perf: make perf context privateMark Janes2019-08-073-39/+27
| | | | | | Encapsulate the details of this data structure. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: print debug informationMark Janes2019-08-071-25/+8
| | | | | | | | | INTEL_DEBUG=perfmon will iterate over the perf queries, printing information about the state of each query. Some of this information will be private to intel/perf, and needs to a dump routine that can be called from i965. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: expose method to create queryMark Janes2019-08-071-10/+3
| | | | | | | By encapsulating this implementation within perf, we can eventually make struct gen_perf_ctx private. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move initialization of pipeline statistics metrics to gen_perfMark Janes2019-08-071-93/+3
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move get_query_data into gen_perfMark Janes2019-08-072-399/+2
| | | | | | | | | | | | | | | This refactor moves several helper functions for get_query_data as well: - accumulate_oa_reports - read_gt_frequency - get_pipeline_stats_data - get_oa_counter_data Functions which are no longer referenced in brw_performance_query.c have been removed. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move delete_query to gen_perfMark Janes2019-08-071-39/+1
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move is_query_ready to gen_perfMark Janes2019-08-071-133/+1
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move wait_query to perfMark Janes2019-08-071-38/+1
| | | | | | | | | | | | | The following methods have duplicate implementation of read_oa_samples_until in brw_performance_query.c: - read_oa_samples_for_query - read_oa_samples_until They ar still referenced by other methods in the file and will be removed on the subsequent commit. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: create a vtable entry for bo_busyMark Janes2019-08-071-4/+5
| | | | | | | Iris and i965 variants of this method need to be called by perf routines. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: create a vtable entry for bo_wait_renderingMark Janes2019-08-071-1/+3
| | | | | | | Iris and i965 variants of this method need to be called by perf routines. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: create a vtable entry for batch_referencesMark Janes2019-08-071-6/+11
| | | | | | | Iris and i965 variants of this method need to be called by perf routines. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: refactor gen_perf_end_query into gen_perfMark Janes2019-08-071-45/+1
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: refactor gen_perf_begin_query into gen_perfMark Janes2019-08-071-247/+1
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move perf-related state into gen_perf_contextMark Janes2019-08-071-49/+29
| | | | | | | | | | | | To move more operations into intel/perf, several state items are needed. Save references to that state in the perf_ctxt, rather than passing them in for every operation. This commit includes an initializer for gen_perf_context, to set those references and also encapsulate the initialization of the sample buffer state. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: create a vtable entries for buffer object map/unmapMark Janes2019-08-071-6/+12
| | | | | | These operations are needed to refactor subsequent methods into perf Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move client reference counts into perfMark Janes2019-08-071-36/+5
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move open_perf into perfMark Janes2019-08-071-48/+2
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move close_perf into perfMark Janes2019-08-071-18/+2
| | | | Reviewed-by: Kenneth Graunke <[email protected]>