summaryrefslogtreecommitdiffstats
path: root/src/mesa
Commit message (Collapse)AuthorAgeFilesLines
* i965: Add handling for fp16 configsKevin Strasser2019-08-211-1/+24
| | | | | | | | | | | | Expose configs when allow_fp16_configs has been enabled and DRI_LOADER_CAP_FP16 is set in the loader. Also, define a new dri configuration option so users can disable exposure of fp16 formats. Make fp16 opt-in for i965. Signed-off-by: Kevin Strasser <[email protected]> Reviewed-by: Adam Jackson <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* dri: Add fp16 formatsKevin Strasser2019-08-212-0/+22
| | | | | | | | | | | Add dri formats for RGBA ordered 64 bpp IEEE 754 half precision floating point. Leverage existing offscreen render support for MESA_FORMAT_RGBA_FLOAT16 and MESA_FORMAT_RGBX_FLOAT16. Signed-off-by: Kevin Strasser <[email protected]> Reviewed-by: Adam Jackson <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* dri: Handle configs with floating point pixel dataKevin Strasser2019-08-211-0/+5
| | | | | | | | | | | In order to handle pixel formats that consist of floating point data, enable floatMode field in the dri config, and set __DRI_ATTRIB_FLOAT_BIT in the render type attribute. Signed-off-by: Kevin Strasser <[email protected]> Reviewed-by: Adam Jackson <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* dri: Add config attributes for color channel shiftKevin Strasser2019-08-213-22/+56
| | | | | | | | | | | | | | | The existing mask attributes can only support up to 32 bpp. Introduce per-channel SHIFT attributes that indicate how many bits, from lsb towards msb, the bit field is offset. A shift of -1 will indicate that there is no bit field set for the channel. As old loaders will still be looking for masks, we set the masks to 0 for any formats wider than 32 bpp. Signed-off-by: Kevin Strasser <[email protected]> Reviewed-by: Adam Jackson <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* i965: Add helper function for allowed config formatsKevin Strasser2019-08-211-31/+34
| | | | | | | | | The driver checks dri config options and loader caps to filter out certain formats during config creation. Fold 4 call sites under a single helper function. Signed-off-by: Kevin Strasser <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* nir: Add explicit signs to image min/max intrinsicsJason Ekstrand2019-08-211-2/+4
| | | | | | | | | | | This better matches all the other atomic intrinsics such as those for SSBOs and shared variables where the sign is part of the intrinsic opcode. Both generators (GLSL and SPIR-V) know the sign from the type of the image variable or handle. In SPIR-V, signed min/max are separate opcodes from unsigned. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Enable OpenGL 4.6 for Gen8+Alejandro Piñeiro2019-08-212-2/+2
| | | | | | | | | The last remaining stuff was ARB_gl_spirv and ARB_spirv_extensions. Note that it is really likely that we can enable it for some Gen7 (as 4.5 was), but it was not tested yet. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* mesa/version: uncomment SPIR-V extensionsAlejandro Piñeiro2019-08-211-2/+2
| | | | | | As they are implemented on i965, so we can expose 4.6. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* i965: enable ARB_gl_spirv extension and ARB_spirv_extensions for gen7+Alejandro Piñeiro2019-08-211-0/+3
| | | | | | v2: squashed the two enable patches with the docs one (Jason) Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* mesa/compiler: rework tear down of builtin/typesLionel Landwerlin2019-08-214-12/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | The issue we're running into when running CTS is that glsl types are deleted while builtins depending on them are not. This happens because on one hand we have glsl types ref counted, but builtins are not. Instead builtins are destroyed when unloading libGL or explicitly calling glReleaseShaderCompiler(). This change removes almost entirely any dealing with glsl types ref/unref by letting the builtins deal with it instead. In turn we introduce a builtin ref count mechanism. Each GL context takes a reference on the builtins when compiling a shader for the first time. It releases the reference when the context is destroyed. It can also explicitly release those when glReleaseShaderCompiler() is called. Finally we also take a reference on the glsl types when loading libGL to avoid recreating glsl types too often. v2: Ensure we take a reference if we don't have one in link step (Lionel) Signed-off-by: Lionel Landwerlin <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110796 Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* mesa: reverse no_error on compressed_tex_sub_image for TEX_MODE_CURRENTJose Maria Casanova Crespo2019-08-201-2/+2
| | | | | | | | | This fixes the regression introduced on "mesa: refactor compressed_tex_sub_image function" that started to crash KHR-GLES2.texture_3d.compressed_texture.negative_compressed_tex_sub_image Fixes: 7df233d68dc ("mesa: refactor compressed_tex_sub_image function") Reviewed-by: Eric Anholt <[email protected]>
* mesa/program: Take ARB_framebuffers_no_attachments into account in wpos ↵Gert Wollny2019-08-201-2/+2
| | | | | | | | | | | | | correction If a drawbuffer is an fbo without an attachment then its 'Height' will be zero, and we have to take its 'DefaultGeometry.Height' into account. Fixes on softpipe (with the exception of tests that use multisample): dEQP-GLES31.functional.fbo.no_attachments.* Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* mesa: add ext_dsa GetMultiTexLevelParameterEXTPierre-Eric Pelloux-Prayer2019-08-193-2/+57
| | | | Reviewed-by: Marek Olšák <[email protected]>
* mesa: add EXT_dsa glCompressedMultiTex* functions display list supportPierre-Eric Pelloux-Prayer2019-08-191-0/+276
| | | | Reviewed-by: Marek Olšák <[email protected]>
* mesa: add EXT_dsa glCompressedMultiTex* functionsPierre-Eric Pelloux-Prayer2019-08-195-12/+196
| | | | Reviewed-by: Marek Olšák <[email protected]>
* mesa: add EXT_dsa glCompressedTex* functions display list supportPierre-Eric Pelloux-Prayer2019-08-191-0/+239
| | | | Reviewed-by: Marek Olšák <[email protected]>
* mesa: add EXT_dsa glCompressedTexture(Sub)Image1D/2D/3D functionsPierre-Eric Pelloux-Prayer2019-08-195-6/+152
| | | | Reviewed-by: Marek Olšák <[email protected]>
* mesa: refactor compressed_tex_sub_image functionPierre-Eric Pelloux-Prayer2019-08-191-101/+115
| | | | | | | | | | Combine compressed_tex_sub_image, compressed_tex_sub_image_error and compressed_tex_sub_image_no_error in a single function. The added "enum tex_mode mode" parameter allows to implement the DSA / non-DSA variants and their error/no_error combination. Reviewed-by: Marek Olšák <[email protected]>
* swrast: Make the fetch funcs table sparse.Eric Anholt2019-08-191-191/+24
| | | | | | | | | This shrinks the table, avoids needing to update the table with NULL entries on every MESA_FORMAT addition, and removes a surprising, non-unit-tested format number ordering dependency. Acked-by: Jose Fonseca <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* mesa: add support for CET to x86/x86-64 asm files.Dave Airlie2019-08-1615-73/+96
| | | | | | | | | | | | | Control-flow enforcement technology is a new instructions on x86 processors to denote where indirect jumps can land. Gcc auto adds the instruction (which encodes as a NOP on older CPUs) to entrypoints but assembler files need manual adding. This adds it to all the entry points in the mesa x86/x86-64 assembler files. This will only happen if mesa is built with the -fcf-protection flag to gcc as some distros are wanting to do. Acked-by: Eric Anholt <[email protected]>
* win32: unify strcasecmp definitionsErik Faye-Lund2019-08-151-3/+0
| | | | | | | | | There was two incompatible definitions of strcasecmp, which lead to a compiler warning. Let's clean this up by only leaving one of them, and using that one all the time. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* mesa/main: avoid warning when casting offset to pointerErik Faye-Lund2019-08-151-1/+1
| | | | | | | | This generates a warning on some 64-bit systems, so let's cast to a properly sized integer first. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/gen11: fix genX_bits.h include pathMauro Rossi2019-08-131-1/+1
| | | | | | | | | | | Instead of "genX_bits.h" use "genxml/genX_bits.h" as already done in other similar cases Besides being more correct, it also fixes building error in Android. Fixes: f0d2923 ("i965/gen11: Emit SLICE_HASH_TABLE when pipes are unbalanced.") Signed-off-by: Mauro Rossi <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* i965/gen11: Emit SLICE_HASH_TABLE when pipes are unbalanced.Rafael Antognolli2019-08-122-0/+88
| | | | | | | If the pixel pipes have a different number of subslices, emit a slice hashing table that will ensure proper workload distribution. v2: Set Mask field to 0xffff for workaround (Ken).
* intel/compiler: Fill a compiler statistics structJason Ekstrand2019-08-126-6/+7
| | | | | | | | | This commit is all annoying plumbing work which just adds support for a new brw_compile_stats struct. This struct provides a binary driver readable form of the same statistics we dump out to stderr when we INTEL_DEBUG is set with a shader stage. Reviewed-by: Lionel Landwerlin <[email protected]>
* i965/gen9: Optimize slice and subslice load balancing behavior.Francisco Jerez2019-08-125-6/+109
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The default pixel hashing mode settings used for slice and subslice load balancing are far from optimal under certain conditions (see the comments below for the gory details). The top-of-the-line GT4 parts suffer from a particularly severe performance problem currently due to a subslice load balancing issue. Fixing this seems to improve graphics performance across the board for most of the benchmarks in my test set, up to ~20% in some cases, e.g. from SKL GT4: unigine/valley: 3.44% ±0.11% gfxbench/gl_manhattan31: 3.99% ±0.13% gputest/pixmark_piano: 7.95% ±0.33% synmark/OglTexFilterAniso: 15.22% ±0.07% synmark/OglTexMem128: 22.26% ±0.06% Lower-end platforms are also affected by some subslice load imbalance to a lesser degree, especially during CCS resolve and fast clear operations, which are handled specially here due to rasterization ocurring in reduced CCS coordinates, which changes the semantics of the pixel hashing mode settings. No regressions seen during my tests on some SKL, KBL and BXT configurations. Additional benchmark reports welcome on any Gen9 platforms (that includes anything with Skylake, Broxton, Kabylake, Geminilake, Coffeelake, Whiskey Lake, Comet Lake or Amber Lake in your renderer string). P.S.: A similar problem is likely to be present on other non-Gen9 platforms, especially for CCS resolve and fast clear operations. Will follow-up with additional patches fixing the hashing mode for those once I have enough performance data to justify it. Reviewed-by: Kenneth Graunke <[email protected]>
* st/mesa: don't allocate mipmapped texture for NEAREST_MIPMAP_LINEARMarek Olšák2019-08-121-0/+12
| | | | Reviewed-by: Brian Paul <[email protected]>
* i965/spirv: Lower shared memory laterCaio Marcelo de Oliveira Filho2019-08-102-1/+20
| | | | | | | | | | | Instead of asking spirv_to_nir to lower the workgroup (shared memory) to offsets, keep them as derefs longer, then lower it later on. Because Workgroup memory doesn't have explicit offsets, we need to set those using nir_lower_vars_to_explicit_types before calling the I/O lowering pass. Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Use force_compat_profile driconf optionDanylo Piliaiev2019-08-102-2/+8
| | | | | Signed-off-by: Danylo Piliaiev <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: fix mem leak in error pathEric Engestrom2019-08-101-1/+3
| | | | | | Fixes: 8ae6667992ccca41d088 ("intel/perf: move query_object into perf") Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Mark Janes <[email protected]>
* mesa: be consistent on GL_TRUE/GL_FALSE & TRUE/FALSELionel Landwerlin2019-08-092-3/+3
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Eric Engestrom <[email protected]>
* mesa: drop some p_compiler.h typesLionel Landwerlin2019-08-091-2/+2
| | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Acked-by: Eric Engestrom <[email protected]>
* mesa: add stddef include in preparation for dropping p_compiler.hLionel Landwerlin2019-08-092-0/+2
| | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Acked-by: Eric Engestrom <[email protected]>
* i965: don't use p_compiler.h typesLionel Landwerlin2019-08-091-1/+1
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Eric Engestrom <[email protected]>
* meson: define ETIME to ETIMEDOUT if not presentGreg V2019-08-081-3/+0
| | | | Reviewed-by: Eric Engestrom <[email protected]>
* anv,i965,iris: deduplicate setting of total_sharedRhys Perry2019-08-081-2/+0
| | | | | | | | v5: add patch Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* mesa/main: cast away constnessErik Faye-Lund2019-08-081-1/+1
| | | | | | | | This avoids a warning about implicitly casting away the constness of the pointer. Signed-off-by: Erik Faye-Lund <[email protected]> Acked-by: Eric Engestrom <[email protected]>
* i965: Emit a dummy MEDIA_VFE_STATE before switching from GPGPU to 3DDanylo Piliaiev2019-08-081-0/+21
| | | | | | | | | | | | | There is an object-level preemption workaround which requires this. However, even without object-level preemption, we seem to have issues with geometry flickering when 3D and compute are combined in the same batch and this appears to fix it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110395 Suggested-by: Jason Ekstrand <[email protected]> Signed-off-by: Danylo Piliaiev <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Cc: [email protected]
* st/mesa: eliminate unnecessary redirectionMark Janes2019-08-071-1/+1
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: make perf context privateMark Janes2019-08-073-39/+27
| | | | | | Encapsulate the details of this data structure. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: print debug informationMark Janes2019-08-071-25/+8
| | | | | | | | | INTEL_DEBUG=perfmon will iterate over the perf queries, printing information about the state of each query. Some of this information will be private to intel/perf, and needs to a dump routine that can be called from i965. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: expose method to create queryMark Janes2019-08-071-10/+3
| | | | | | | By encapsulating this implementation within perf, we can eventually make struct gen_perf_ctx private. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move initialization of pipeline statistics metrics to gen_perfMark Janes2019-08-071-93/+3
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move get_query_data into gen_perfMark Janes2019-08-072-399/+2
| | | | | | | | | | | | | | | This refactor moves several helper functions for get_query_data as well: - accumulate_oa_reports - read_gt_frequency - get_pipeline_stats_data - get_oa_counter_data Functions which are no longer referenced in brw_performance_query.c have been removed. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move delete_query to gen_perfMark Janes2019-08-071-39/+1
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move is_query_ready to gen_perfMark Janes2019-08-071-133/+1
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: move wait_query to perfMark Janes2019-08-071-38/+1
| | | | | | | | | | | | | The following methods have duplicate implementation of read_oa_samples_until in brw_performance_query.c: - read_oa_samples_for_query - read_oa_samples_until They ar still referenced by other methods in the file and will be removed on the subsequent commit. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: create a vtable entry for bo_busyMark Janes2019-08-071-4/+5
| | | | | | | Iris and i965 variants of this method need to be called by perf routines. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: create a vtable entry for bo_wait_renderingMark Janes2019-08-071-1/+3
| | | | | | | Iris and i965 variants of this method need to be called by perf routines. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/perf: create a vtable entry for batch_referencesMark Janes2019-08-071-6/+11
| | | | | | | Iris and i965 variants of this method need to be called by perf routines. Reviewed-by: Kenneth Graunke <[email protected]>