Flutter Analysis and Practice: Design and Practices of the High-Availability Framework

4.3.1 Why Do We Monitor Flutter Performance?

Many Application Performance Monitoring (APM) solutions are now available. For native development, we have built many SDKs to monitor online performance data. However, Flutter differs fundamentally from native development, so native performance monitoring does not work on Flutter pages. To solve this problem, Xianyu launched a project called “Flutter High-Availability SDK” in December 2018. This project aims to make Flutter pages as measurable as native pages.

4.3.2 Required SDKs

Performance monitoring is a mature field, so there is plenty of prior work to reference. Xianyu referred to the EMAS high-availability SDK of Taobao Mobile, Matrix of WeChat, and Hertz of Meituan, and then determined the performance metrics to collect and the features the SDK requires based on the actual situation of Flutter.

4.3.2.1 Performance Metrics

1) Page sliding smoothness

Frames Per Second (FPS) is the main metric for page sliding smoothness. However, FPS alone cannot distinguish between a large number of slight frame drops and a small number of severe ones, even though the two affect the user experience differently. Therefore, in addition to FPS, we also introduced the sliding duration and the frame-dropping duration to measure page sliding smoothness.
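As a rough illustration of how these three numbers can be derived from the same per-frame durations, here is a minimal sketch. It is not the SDK's code: the names `SmoothnessStats` and `measureSmoothness` are hypothetical, a 60 Hz display is assumed, and "frame-dropping duration" is assumed to mean the total time by which frames overran their budget.

```dart
/// Hypothetical summary of one sliding session (not part of the SDK).
class SmoothnessStats {
  final double averageFps;
  final double slidingMs; // total time spent sliding
  final double droppedMs; // total time by which frames overran their budget
  const SmoothnessStats(this.averageFps, this.slidingMs, this.droppedMs);
}

/// Summarizes per-frame durations given in milliseconds.
/// Assumes a 60 Hz display by default, so the frame budget is 1000 / 60 ms.
SmoothnessStats measureSmoothness(List<double> frameMs,
    {double budgetMs = 1000 / 60}) {
  if (frameMs.isEmpty) return const SmoothnessStats(0, 0, 0);
  var total = 0.0;
  var dropped = 0.0;
  for (final d in frameMs) {
    total += d;
    if (d > budgetMs) dropped += d - budgetMs; // only the overrun counts
  }
  return SmoothnessStats(frameMs.length * 1000 / total, total, dropped);
}
```

Reporting the sliding and frame-dropping durations in absolute time gives context that a bare FPS value lacks, for example whether a given FPS was measured over a short flick or a long scroll.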

2) Page loading duration

For page loading, the interaction duration best reflects what users actually experience. The interaction duration is the period from the moment the user triggers route redirection to the moment the page finishes loading and becomes interactive.
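The measurement itself is a difference between two timestamps. The sketch below is only an illustration under stated assumptions: `PageLoadTimer`, `onRouteStart`, and `onInteractive` are hypothetical names, and the end of the period is assumed to be the moment the page becomes usable. How the SDK detects that moment is not described here.

```dart
/// Hypothetical timer for one page load (illustrative names, not SDK API).
class PageLoadTimer {
  final DateTime Function() now; // injectable clock, handy for tests
  DateTime? _routeStart;
  Duration? interactionDuration;

  PageLoadTimer({DateTime Function()? now})
      : now = now ?? (() => DateTime.now());

  /// Call when the user triggers the route redirection (for example, a push).
  void onRouteStart() {
    _routeStart = now();
    interactionDuration = null;
  }

  /// Call when the page becomes interactive; only the first call is recorded.
  void onInteractive() {
    final start = _routeStart;
    if (start == null || interactionDuration != null) return;
    interactionDuration = now().difference(start);
  }
}
```

Injecting the clock keeps the timer testable without waiting for real time to pass.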

3) Page exceptions

Exceptions are a basic metric for evaluating user experience. We determine whether a version is acceptable by collecting page exceptions and calculating the page exception rate.

4.3.2.2 SDK Features

1) Accuracy

Accuracy is a basic requirement of a performance monitoring SDK because developers may spend a lot of unnecessary time on troubleshooting due to false positives.

2) Online monitoring

Online monitoring requires that the cost of collecting data be low and that the monitoring not affect app performance.

3) Easy to develop

The ultimate goal of an open-source project is that everyone gets involved and contributes to the community. Therefore, the SDK must be easy to develop, and a series of specifications are needed for development.

4.3.3 Overall Design in Terms of a Single Metric

This section takes the collection of the instantaneous FPS as an example to describe the overall design of the SDK.

First, a class that inherits from the Recorder base class needs to be implemented. This class obtains the times of page pop and push events from the business layer, as well as the times at which frame rendering starts and ends and tap events occur, and calculates the source data from these times. For the instantaneous FPS, the source data is the duration of each frame.

The duration of each frame is measured from the frame-start event to the frame-end event. Each collected duration is encapsulated as a source-data object and pushed to the PerformanceDataCenter, which distributes it to the Processors that have subscribed to that data type. Therefore, we need to create a Processor to subscribe to and process the source data.

The Processor collects the duration of each frame and calculates the instantaneous FPS for each 1-second interval. Similarly, the calculated FPS is encapsulated as a processed-data object and pushed to the PerformanceDataCenter, which sends it to the Uploaders that have subscribed to it. Therefore, we need to create an Uploader to subscribe to and process the data.

The Uploader selects the data types to subscribe to through a subscription method, and receives and reports the data through a callback method. In principle, one Uploader corresponds to one upload channel, so users can report data to different destinations as needed by implementing multiple Uploaders.
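The calculation step in the middle of this flow, turning per-frame durations into an FPS value for each one-second interval, can be sketched as follows. This is a hypothetical helper, not the SDK's Processor: the name `instantaneousFps` and the choice to drop a trailing partial window are assumptions made for illustration.

```dart
/// Groups consecutive frame durations into windows of at least [window] and
/// returns the frames-per-second rate measured over each window.
/// A hypothetical helper, not the SDK's Processor.
List<int> instantaneousFps(Iterable<Duration> frames,
    {Duration window = const Duration(seconds: 1)}) {
  final result = <int>[];
  var elapsed = Duration.zero;
  var count = 0;
  for (final frame in frames) {
    elapsed += frame;
    count++;
    if (elapsed >= window) {
      // Divide by the real elapsed time so that a long final frame,
      // which stretches the window, does not inflate the rate.
      result.add((count * 1e6 / elapsed.inMicroseconds).round());
      elapsed = Duration.zero;
      count = 0;
    }
  }
  return result; // a trailing partial window is dropped
}
```

In the flow described above, each returned value would be wrapped as processed data and pushed on to the subscribed Uploaders.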

4.3.4 Overall Structural Design

The SDK is divided into four layers and uses the publish/subscribe model extensively. The four layers communicate with each other through the PerformanceDataCenter and the PerformanceEventCenter, as shown in Figure 4–12. This model fully decouples the layers and makes data processing more flexible.

Figure 4–12

4.3.4.1 API

The API layer contains the externally exposed APIs. For example, the SDK's initialization method needs to be called before the app is started, and the business layer needs to call the SDK's event-pushing method to notify the SDK at the appropriate moments.

4.3.4.2 Recorder

The Recorder layer uses the moments signaled by events to collect source data and submits the data to the Processors that have subscribed to it. For example, the duration of each frame in FPS collection is a piece of source data. This layer exists so that the same source data can be used in different places. For example, the duration of each frame can be used to calculate both the FPS and the number of lagging seconds.

To collect source data, a Recorder must inherit the Recorder base class and select the events to subscribe to through its subscription method. It then processes each received event through its event callback.
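The shape of such a Recorder can be sketched as below. All names here (`PerfEvent`, `RecorderSketch`, `FrameDurationRecorder`, `subscribedEvents`, `onEvent`) are hypothetical stand-ins, since the SDK's real base class and method names are not shown in this text. The sketch pairs frame-start and frame-end events and emits the duration of each frame as source data.

```dart
/// Illustrative event kinds; the SDK's real event types are not shown here.
enum PerfEvent { frameBegin, frameEnd }

/// Hypothetical Recorder base class: declares subscriptions, handles events.
abstract class RecorderSketch {
  Set<PerfEvent> get subscribedEvents;
  void onEvent(PerfEvent event, Duration timestamp);
}

/// Pairs frameBegin/frameEnd events and emits each frame's duration.
class FrameDurationRecorder extends RecorderSketch {
  final void Function(Duration frameDuration) emit;
  Duration? _begin;

  FrameDurationRecorder(this.emit);

  @override
  Set<PerfEvent> get subscribedEvents =>
      {PerfEvent.frameBegin, PerfEvent.frameEnd};

  @override
  void onEvent(PerfEvent event, Duration timestamp) {
    if (event == PerfEvent.frameBegin) {
      _begin = timestamp;
      return;
    }
    final begin = _begin;
    if (begin == null) return; // stray frameEnd without a matching begin
    emit(timestamp - begin);
    _begin = null;
  }
}
```

Passing `emit` in as a callback keeps the Recorder unaware of who consumes the source data, which mirrors the decoupling the layer is meant to provide.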

4.3.4.3 Processor

At this layer, the source data is processed into the final data that can be reported and is submitted to the Uploader that has subscribed to the data for reporting. For example, in the FPS collection, the FPS over a period of time can be obtained based on the collected duration of each frame.

A Processor must inherit the Processor base class and select the data types to subscribe to through its subscription method. It then processes the received data through its data callback.

4.3.4.4 Uploader

This layer is implemented by users because different users report data to different destinations. Therefore, the SDK only provides a base class. You only need to follow the specifications of the base class to obtain the subscribed data.

An Uploader must inherit the Uploader base class and select the data types to subscribe to through its subscription method. It then processes the received data through its data callback.
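A custom Uploader might look like the following minimal sketch. The names `FpsSample`, `UploaderSketch`, `JsonFpsUploader`, `subscribedType`, and `onData` are hypothetical, as is the JSON payload format; the actual base class and its members are defined by the SDK and are not reproduced here.

```dart
import 'dart:convert';

/// Hypothetical processed-data type; stands in for the SDK's data classes.
class FpsSample {
  final String page;
  final int fps;
  const FpsSample(this.page, this.fps);
}

/// Hypothetical Uploader base class: one subclass per upload channel.
abstract class UploaderSketch<T> {
  Type get subscribedType => T;
  void onData(T data);
}

/// Encodes each sample as JSON and hands it to a channel-specific sender.
class JsonFpsUploader extends UploaderSketch<FpsSample> {
  final void Function(String payload) send; // e.g. an HTTP client or a logger
  JsonFpsUploader(this.send);

  @override
  void onData(FpsSample data) => send(
      jsonEncode({'metric': 'fps', 'page': data.page, 'value': data.fps}));
}
```

Because the sender is injected, the same Uploader shape can target a different backend by swapping only the `send` function.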

4.3.4.5 PerformanceDataCenter

This layer receives BaseData (source data) and processed data, and distributes them to the Processors and Uploaders that have subscribed to them for processing.

In the constructors of the Processor and Uploader base classes, the register method of the PerformanceDataCenter is called for registration. This operation stores the Processor and Uploader instances in two corresponding maps of the PerformanceDataCenter. This data structure allows one data type to map to multiple subscribers.

As shown in Figure 4–13, when data is pushed to the PerformanceDataCenter, it is distributed based on its data type and sent to all the Processors or Uploaders that have subscribed to that type.

Figure 4–13
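The type-based distribution can be illustrated with a small publish/subscribe sketch. It is a simplification under stated assumptions: `DataCenterSketch` and its `register` and `push` signatures are hypothetical, and it keeps a single map from data type to handlers instead of separate maps for Processors and Uploaders.

```dart
/// Hypothetical stand-in for the data center: a map from data type to the
/// handlers subscribed to that type.
class DataCenterSketch {
  final _subscribers = <Type, List<void Function(Object)>>{};

  /// Registers [handler] for every pushed value whose runtime type is [T].
  void register<T extends Object>(void Function(T data) handler) {
    _subscribers.putIfAbsent(T, () => []).add((data) => handler(data as T));
  }

  /// Delivers [data] to all handlers registered for its runtime type and
  /// returns how many handlers were called.
  int push(Object data) {
    final handlers = _subscribers[data.runtimeType] ?? const [];
    for (final handler in handlers) {
      handler(data);
    }
    return handlers.length;
  }
}
```

Storing a list per type is what lets one data type fan out to several subscribers, and pushing a type nobody subscribed to is simply a no-op.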

4.3.4.6 PerformanceEventCenter

The design of this layer is similar to that of the PerformanceDataCenter layer. However, this layer receives the events (timing signals) provided by the business layer and distributes them to the Recorders that have subscribed to them for processing. The main types of events are:

  • App Status: The app is switched between the foreground and background.
  • Page Status: The frame rendering starts or ends.
  • Business Status: Pop or push events occur on pages, pages slide, or exceptions occur in the business.

4.3.5 Use of the SDK

Note that the Flutter High-Availability SDK only collects data. Data reporting and data presentation must be customized based on actual conditions. With that in mind, the following briefly introduces how to use the SDK.

To use the SDK, you only need to focus on the API and Uploader layers and perform the following operations:

  • Reference the high-availability SDK in pubspec.yaml
  • Call the SDK's initialization method before starting the app
  • In the business code, call the SDK's event-pushing method to notify the SDK at the necessary moments, such as the pop and push events of the route
  • Customize an Uploader class to report the data to the data collection platform in your desired format

4.3.6 Implementation of the SDK

Xianyu has optimized the data accuracy of the Flutter High-Availability SDK many times, solved many problems in exception scenarios, and refactored the SDK. So far, the SDK has been running stably on Xianyu. No stability issues caused by the SDK have occurred, and the data collection accuracy has stabilized after several rounds of optimization.

The Flutter High-Availability SDK focuses on the collection of performance data. Therefore, Xianyu relies on the existing capabilities of Alibaba Group for data reporting and data presentation. Xianyu uses the EMAS backend data processing and frontend data presentation capabilities of Taobao Mobile to report and display the data collected online by the SDK. This puts the monitoring of Flutter pages on par with that of native pages.
