summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* nir/spirv: cast shift operand to u32Karol Herbst2018-11-142-0/+31
| | | | | | | | v2: fix for specialization constants as well Reviewed-by: Jason Ekstrand <[email protected]> Cc: [email protected] Signed-off-by: Karol Herbst <[email protected]>
* nir: replace nir_load_system_value calls with appropiate builder functionsKarol Herbst2018-11-149-30/+27
| | | | | | | | | this helps reduce the overall code changes when a bit_size parameter is added to nir_load_system_value Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Karol Herbst <[email protected]>
* nir: add const_index parameters to system value builder functionKarol Herbst2018-11-141-2/+19
| | | | | | | | | this allows to replace some nir_load_system_value calls with the specific system value constructor Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Karol Herbst <[email protected]>
* radv: make use of nir_move_out_const_to_consumer()Timothy Arceri2018-11-141-0/+4
| | | | | | | | | | | | | | | | | | vkpipeline-db results: Totals from affected shaders: SGPRS: 28400 -> 28576 (0.62 %) VGPRS: 27916 -> 27692 (-0.80 %) Spilled SGPRs: 140 -> 138 (-1.43 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 1534456 -> 1520560 (-0.91 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 3541 -> 3582 (1.16 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Samuel Pitoiset <[email protected]>
* anv: move helper function internallyLionel Landwerlin2018-11-132-22/+22
| | | | | | | | It's only used in anv_image.c Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: use image aspects rather than computed onesLionel Landwerlin2018-11-131-1/+1
| | | | | | | | | | | This shouldn't make any difference but I feel uneasy to use the expanded aspects that do not represent the image in its entirety. If we ever change the implementation of the anv_image_aspect_to_plane() helper, this is safer. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: associate vulkan formats with aspectsLionel Landwerlin2018-11-132-41/+69
| | | | | | | | This will make it easier to associate an aspect with a plane number. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/lower_ycbcr: make sure to set 0s on all componentsLionel Landwerlin2018-11-131-5/+5
| | | | | | | | | To play around with debugging, we might want to disable one or the other component. Having 0s as default values makes this work. Otherwise we might have NULL components, leading to crashes. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/image: remove unused parameterLionel Landwerlin2018-11-131-3/+2
| | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: simplify internal address offsetLionel Landwerlin2018-11-131-2/+1
| | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* meson: fix wayland-less buildsEric Engestrom2018-11-131-1/+3
| | | | | | | | | | | | | | | | Those empty variables in the !wayland case are useless and running that meson.build with them breaks the build: [287/850] Generating wayland-drm-client-protocol.h with a custom command. FAILED: src/egl/wayland/wayland-drm/wayland-drm-client-protocol.h client-header ../src/egl/wayland/wayland-drm/wayland-drm.xml src/egl/wayland/wayland-drm/wayland-drm-client-protocol.h /bin/sh: client-header: command not found ninja: build stopped: subcommand failed. Fixes: d1992255bb29054fa5176 "meson: Add build Intel "anv" vulkan driver" Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* gbm: remove unnecessary meson includeEric Engestrom2018-11-131-1/+0
| | | | | | | | | | `inc_wayland_drm` is only used if wayland is built, and it's already added in that case a few lines below. Fixes: a29869e8720b385d3692f "gbm: Don't traverse backwards for includes" Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* meson: only run vulkan's meson.build when building vulkanEric Engestrom2018-11-131-1/+3
| | | | | | | Fixes: d1992255bb29054fa5176 "meson: Add build Intel "anv" vulkan driver" Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* xmlpool: update translation po filesEric Engestrom2018-11-136-193/+950
| | | | | | | | | | | These files are close to 4 years out of date; a lot's changed since. Let's just check in a recently-regenerated version. Changes generated by running `ninja xmlpool-{pot,update-po,gmo}`. Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Dylan Baker <[email protected]> Acked-by: Emil Velikov <[email protected]>
* intel/genxml: Add engine definition to render engine instructions (gen11)Toni Lönnberg2018-11-131-116/+116
| | | | | | | | | | | | | | Instructions meant for the render engine now have a definition specifying that so that can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added additional engine definitions. v4: Added missing engine definition to MI_TOPOLOGY_FILTER. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/genxml: Add engine definition to render engine instructions (gen10)Toni Lönnberg2018-11-131-113/+113
| | | | | | | | | | | | | | Instructions meant for the render engine now have a definition specifying that so that can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added additional engine definitions. v4: Added missing engine definition to MI_TOPOLOGY_FILTER. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/genxml: Add engine definition to render engine instructions (gen9)Toni Lönnberg2018-11-131-117/+117
| | | | | | | | | | | | | | Instructions meant for the render engine now have a definition specifying that so that can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added additional engine definitions. v4: Added more missing engine definitions. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/genxml: Add engine definition to render engine instructions (gen8)Toni Lönnberg2018-11-131-116/+116
| | | | | | | | | | | | | | Instructions meant for the render engine now have a definition specifying that so that can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added additional engine definitions. v4: Added missing engine tag for MI_TOPOLOGY_FILTER and MI_LOAD_URB_MEM. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/genxml: Add engine definition to render engine instructions (gen75)Toni Lönnberg2018-11-131-107/+107
| | | | | | | | | | | | Instructions meant for the render engine now have a definition specifying that so that can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added additional engine definitions. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/genxml: Add engine definition to render engine instructions (gen7)Toni Lönnberg2018-11-131-83/+83
| | | | | | | | | | | | Instructions meant for the render engine now have a definition specifying that so that can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added additional engine definitions. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/genxml: Add engine definition to render engine instructions (gen6)Toni Lönnberg2018-11-131-54/+54
| | | | | | | | | | | | | | Instructions meant for the render engine now have a definition specifying that so that can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added additional engine definitions v4: Added missing engine to MEDIA_GATEWAY_STATE Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/genxml: Add engine definition to render engine instructions (gen5)Toni Lönnberg2018-11-131-30/+30
| | | | | | | | | | | | Instructions meant for the render engine now have a definition specifying that so that can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added additional engine definitions. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/genxml: Add engine definition to render engine instructions (gen45)Toni Lönnberg2018-11-131-27/+27
| | | | | | | | | | | | Instructions meant for the render engine now have a definition specifying that so that can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added addition engine definitions. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/genxml: Add engine definition to render engine instructions (gen4)Toni Lönnberg2018-11-131-25/+25
| | | | | | | | | | | | Instructions meant for the render engine now have a definition specifying that so that can differentiate instructions meant for different engines due to shared opcodes. v2: Divided into individual patches for each gen v3: Added additional engine definitions. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/decoder: tools: Use engine for decoding batch instructionsToni Lönnberg2018-11-138-53/+69
| | | | | | | | | | | | | | | | The engine to which the batch was sent to is now set to the decoder context when decoding the batch. This is needed so that we can distinguish between instructions as the render and video pipe share some of the instruction opcodes. v2: The engine is now in the decoder context and the batch decoder uses a local function for finding the instruction for an engine. v3: Spec uses engine_mask now instead of engine, replaced engine class enums with the definitions from UAPI. v4: Fix up aubinator_viewer (Lionel) Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/decoder: tools: gen_engine to drm_i915_gem_engine_classToni Lönnberg2018-11-134-25/+19
| | | | | | | | | | Removed the gen_engine enum and changed the involved functions to use the drm_i915_gem_engine_class enum from UAPI instead. v3: Wrong engine was being used for blocks in video ring v4: Fixed aubinator_viewer.cpp Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/decoder: Engine parameter for instructionsToni Lönnberg2018-11-132-0/+31
| | | | | | | | | | | | | | | | | | | Preliminary work for adding handling of different pipes to gen_decoder. Each instruction needs to have a definition describing which engine it is meant for. If left undefined, by default, the instruction is defined for all engines. v2: Changed to use the engine class definitions from UAPI v3: Changed I915_ENGINE_CLASS_TO_MASK to use BITSET_BIT, change engine to engine_mask, added check for incorrect engine and added the possibility to define an instruction to multiple engines using the "|" as a delimiter in the engine attribute. v4: Fixed the memory leak. v5: Removed an unnecessary ralloc_free(). Reviewed-by: Lionel Landwerlin <[email protected]>
* virgl: Add command and flags to initiate debugging on the host (v2)Gert Wollny2018-11-135-0/+37
| | | | | | | | | | | | On the host VREND_DEBUG=guestallow must be set to let the guest override the debug flags. v2: Send flag string instead of flags, this avoids the need to keep the flags in sync. v3: Only request host logging if the host actually understands the command Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Erik Faye-Lund <[email protected]>
* mesa: Reference count shaders that are used by transform feedback objectsGert Wollny2018-11-131-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Transform feedback objects may hold a pointer to a shader program, and at least in Gallium, this must be a valid pointer until ctx->Driver.EndTransformFeedback in glEndTransformFeedback has been called - which is conform with the spec that any program that is part of a current rendering state should only be flagged for deletion by glDeleteProgram. This was not handled properly for the transform feedback objects so that a call sequence glUseProgram(x) glBeginTransformFreedback(...) glPauseTransformFeedback(...) glDeleteProgram(x) glEndTransformFeedback(...) would result in a use after free bug. With this patch the transform feedback object also updates the reference count to the used program thereby keeping the program valid as long as the transform feedback objects links to it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108713 Fixes: 654587696b4234d09a6b471b70e9629cf2887c27 mesa: add end_transform_feedback() helper Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* radv: set optimal OVERWRITE_COMBINER_WATERMARK on GFX9Samuel Pitoiset2018-11-132-3/+21
| | | | | | | Ported from RadeonSI. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: set PA.SC_CONSERVATIVE_RASTERIZATION.NULL_SQUAD_AA_MASK_ENABLESamuel Pitoiset2018-11-131-1/+1
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: binding streamout buffers doesn't change context regsSamuel Pitoiset2018-11-131-2/+7
| | | | | | Cc: 18.3 <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: Don't lower the local work group size if it's variable.Plamena Manolova2018-11-131-6/+18
| | | | | | | | | | If the local work group size is variable it won't be available at compile time so we can't lower it in nir_lower_system_values(). Signed-off-by: Plamena Manolova <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Karol Herbst <[email protected]>
* util/ralloc: Make sizeof(linear_header) a multiple of 8Matt Turner2018-11-121-2/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prior to this patch sizeof(linear_header) was 20 bytes in a non-debug build on 32-bit platforms. We do some pointer arithmetic to calculate the next available location with ptr = (linear_size_chunk *)((char *)&latest[1] + latest->offset); in linear_alloc_child(). The &latest[1] adds 20 bytes, so an allocation would only be 4-byte aligned. On 32-bit SPARC a 'sttw' instruction (which stores a consecutive pair of 4-byte registers to memory) requires an 8-byte aligned address. Such an instruction is used to store to an 8-byte integer type, like intmax_t which is used in glcpp's expression_value_t struct. As a result of the 4-byte alignment returned by linear_alloc_child() we would generate a SIGBUS (unaligned exception) on SPARC. According to the GNU libc manual malloc() always returns memory that has at least an alignment of 8-bytes [1]. I think our allocator should do the same. So, simple fix with two parts: (1) Increase SUBALLOC_ALIGNMENT to 8 unconditionally. (2) Mark linear_header with an aligned attribute, which will cause its sizeof to be rounded up to that alignment. (We already do this for ralloc_header) With this done, all Mesa's unit tests now pass on SPARC. [1] https://www.gnu.org/software/libc/manual/html_node/Aligned-Memory-Blocks.html Fixes: 47e17586924f ("glcpp: use the linear allocator for most objects") Bug: https://bugs.gentoo.org/636326 Reviewed-by: Eric Anholt <[email protected]>
* util/ralloc: Switch from DEBUG to NDEBUGMatt Turner2018-11-121-14/+4
| | | | | | | The debug code is all asserts, so protect it with the same thing that controls assert. Reviewed-by: Eric Anholt <[email protected]>
* nir: add support for removing redundant stores to copy prop varTimothy Arceri2018-11-131-10/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For example the following type of thing is seen in TCS from a number of Vulkan and DXVK games: vec1 32 ssa_557 = deref_var &oPatch (shader_out float) vec1 32 ssa_558 = intrinsic load_deref (ssa_557) () vec1 32 ssa_559 = deref_var &oPatch@42 (shader_out float) vec1 32 ssa_560 = intrinsic load_deref (ssa_559) () vec1 32 ssa_561 = deref_var &oPatch@43 (shader_out float) vec1 32 ssa_562 = intrinsic load_deref (ssa_561) () intrinsic store_deref (ssa_557, ssa_558) (1) /* wrmask=x */ intrinsic store_deref (ssa_559, ssa_560) (1) /* wrmask=x */ intrinsic store_deref (ssa_561, ssa_562) (1) /* wrmask=x */ No shader-db changes on i965 (SKL). vkpipeline-db results RADV (VEGA): Totals from affected shaders: SGPRS: 7832 -> 7728 (-1.33 %) VGPRS: 6476 -> 6740 (4.08 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 469572 -> 456596 (-2.76 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 989 -> 960 (-2.93 %) Wait states: 0 -> 0 (0.00 %) The Max Waves and VGPRS changes here are misleading. What is happening is a bunch of TCS outputs are being optimised away as they are now recognised as unused. This results in more varyings being compacted via nir_compact_varyings() which can result in more register pressure when they are not packed in an optimal way. This is an existing problem independent of this patch. I've run some benchmarks and haven't noticed any performance regressions in affected games. Reviewed-by: Jason Ekstrand <[email protected]>
* anv/i965: make use of nir_link_constant_varyings()Timothy Arceri2018-11-131-0/+3
| | | | | | | | | | | | | | | | | | | shader-db results for SLK: total instructions in shared programs: 13106498 -> 13091573 (-0.11%) instructions in affected programs: 1186244 -> 1171319 (-1.26%) helped: 6186 HURT: 0 total cycles in shared programs: 332062633 -> 331961653 (-0.03%) cycles in affected programs: 8537165 -> 8436185 (-1.18%) helped: 5371 HURT: 862 LOST: 6 GAINED: 14 Reviewed-by: Jason Ekstrand <[email protected]>
* egl: Improve the debugging of gbm format matching in DRI configs.Eric Anholt2018-11-121-2/+3
| | | | | | | | | | | | | | | | | Previously the debug would be: libEGL debug: No DRI config supports native format 0x20203852 libEGL debug: No DRI config supports native format 0x38385247 but libEGL debug: No DRI config supports native format R8 libEGL debug: No DRI config supports native format GR88 is a lot easier to understand. Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Daniel Stone <[email protected]>
* gbm: Introduce a helper function for printing GBM format names.Eric Anholt2018-11-122-0/+26
| | | | | | | | | | This requires that the caller make a little (stack) allocation to store the string. v2: Use gbm_format_canonicalize (suggested by Daniel) Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Daniel Stone <[email protected]>
* gbm: Move gbm_format_canonicalize() to the core.Eric Anholt2018-11-123-16/+19
| | | | | | | I want it for the format name debugging code. Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Daniel Stone <[email protected]>
* mesa: mark GL_SR8_EXT non-renderable on GLESMarek Olšák2018-11-121-0/+1
| | | | | | Fixes: dEQP-GLES3.functional.fbo.completeness.renderable.texture.color0.sr8_ext Reviewed-by: Ilia Mirkin <[email protected]>
* st/mesa: disable L3 thread pinningMarek Olšák2018-11-121-9/+0
| | | | | | | This implementation can have massive drawbacks. Cc: 18.3 <[email protected]> Reviewed-by: Edmondo Tommasina <[email protected]>
* nir: add lowering for ffloorChristian Gmeiner2018-11-122-0/+4
| | | | | Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* util: Fix warning in u_cpu_detect on non-x86Alyssa Rosenzweig2018-11-121-2/+2
| | | | | | | | | regs is only set and used on x86; on other platforms (like ARM), this code causes a trivial warning, solved by moving the regs declaration to the architecture-dependent usage. Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Alyssa Rosenzweig <[email protected]>
* freedreno/drm: fix unused 'entry' warningsRob Clark2018-11-121-2/+0
| | | | | | | Looks like importing libdrm_freedreno into mesa crossed paths with e27902a2613. Signed-off-by: Rob Clark <[email protected]>
* i965: add support for sampling from AYUVLionel Landwerlin2018-11-124-0/+11
| | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* dri: add AYUV formatLionel Landwerlin2018-11-122-0/+2
| | | | | | | | v2: Add a AYUV entry android in the android backend (Tapani) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* nir/lower_tex: Add AYUV lowering supportLionel Landwerlin2018-11-122-0/+20
| | | | | | | | | | | | | | | Byte ordering is : 0: V 1: U 2: Y 3: A v2: Split refactoring of alpha channel (Lionel) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> (v1) Acked-by: Eric Engestrom <[email protected]> (v2)
* nir/lower_tex: add alpha channel parameter for yuv loweringLionel Landwerlin2018-11-121-6/+11
| | | | | | | | | We're about to introduce AYUV support which provides its own alpha channel. So give alpha as a parameter and set it to 1 on exising formats. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* radv: make use of num_good_cu_per_sh in si_emit_graphics() tooSamuel Pitoiset2018-11-121-2/+1
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>