Flutter Analysis and Practice: Same Layer External Texture Rendering

In 2013, we were working on a group video call project, and multi-channel video rendering was a major performance bottleneck. The cause was the high-frequency on-screen operation of each channel (PresentRenderBuffer or SwapBuffer, which displays the contents of the rendering buffer on the screen), which consumed a lot of CPU and GPU resources.

At the time, we separated the render and on-screen operations and abstracted the multiple channels into a rendering tree. We then traversed the tree and rendered each node. Once rendering was complete, the Vertical Synchronization (VSync) signal triggered the on-screen operation for all channels at once instead of one by one. This greatly reduced the performance overhead.

We also considered rendering the entire UI with OpenGL to further reduce the performance overhead of UI animations, such as sound spectrum and breathing effects. However, due to various limitations, we did not put this into practice.

2.2.1 Flutter Rendering Framework

Figure 2–7 shows a simple Flutter rendering framework.

LayerTree is a tree data structure output by Dart Runtime. Each leaf node on the tree represents a UI element, such as a button or an image.

Skia is a cross-platform rendering framework released by Google. It uses OpenGL as its rendering backend. At the time of writing, its Vulkan support is limited, and it does not support Metal at all.

Shell is a part of the platform, which includes implementations for iOS and Android, such as EAGLContext management, on-screen operations, and external texture implementations.

As shown in Figure 2–7, after Dart Runtime completes the layout and outputs LayerTree, the pipeline traverses each leaf node on LayerTree. Each leaf node eventually calls the Skia engine to render its UI element. After the traversal is complete, presentRenderbuffer (iOS) or eglSwapBuffers (Android) is called to complete the on-screen operation.
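The traversal described above can be illustrated with a small model. This is only a sketch: the `Layer` class, the `render_frame` function, and the string "commands" are hypothetical stand-ins for the engine's layers, Skia draw calls, and the on-screen operation, not Flutter's actual API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Layer:
    """Hypothetical layer node; a node without children is a leaf."""
    name: str
    children: List["Layer"] = field(default_factory=list)

    def paint(self, commands: List[str]) -> None:
        if not self.children:
            # Leaf node: hand the UI element to the renderer (Skia in Flutter).
            commands.append(f"draw {self.name}")
            return
        for child in self.children:
            child.paint(commands)


def render_frame(root: Layer) -> List[str]:
    """Traverse the whole tree, then present exactly once."""
    commands: List[str] = []
    root.paint(commands)
    commands.append("present")  # single on-screen operation per frame
    return commands


tree = Layer("root", [
    Layer("row", [Layer("button"), Layer("image")]),
    Layer("texture"),
])
print(render_frame(tree))
```

However many leaves the tree has, the on-screen operation runs once per frame, which is the same idea as the batched presentation used in the video call project.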

Based on this basic principle, Flutter separates the UI from the native platform: the Flutter engine draws all of the UI itself. Developers do not need to care about platform implementation when writing UI code, which makes cross-platform apps possible.

2.2.2 Known Issues

While this design separates the Flutter UI from native, it also completely separates the Flutter engine from native. It is difficult for Flutter to obtain memory-intensive images from the native side, such as camera frames, video frames, and album images. React Native and Weex can obtain such data directly through native APIs, but Flutter cannot because of its basic principles. The channel mechanism defined by Flutter is, in essence, a message transmission mechanism. Using it to transmit large data such as images inevitably causes high memory usage and CPU utilization.

2.2.3 Solutions

To solve the problem, Flutter provides a special mechanism: external textures, as shown in Figure 2–8.

As shown in Figure 2–8, each leaf node represents a widget written in Dart. The last node is the TextureLayer. This node corresponds to the Texture widget in Flutter, which is not the same thing as a GPU texture. Creating a Texture widget in Flutter means that the data this widget displays must be provided by native.

The process of rendering the TextureLayer node on iOS (similar to Android, with a slight difference in acquiring textures) is:

  1. Call copyPixelBuffer on external_texture to obtain a CVPixelBuffer.
  2. Call CVOpenGLESTextureCacheCreateTextureFromImage to create an OpenGL texture. This is a real GPU texture.
  3. Encapsulate the OpenGL texture into an SkImage and call DrawImage of Skia to complete the rendering.

The key question is where the external_texture object comes from.

Before the native side calls RegisterExternalTexture, it must create an object that implements FlutterTexture. This object is eventually assigned to external_texture. external_texture is therefore the bridge between Flutter and native, and the engine can use it to obtain image data throughout the rendering process.

As shown in Figure 2–9, the carrier of the data transmitted between Flutter and native is PixelBuffer. The data source on the native side (such as the camera or player) writes the data to PixelBuffer. Flutter acquires the data from PixelBuffer, converts it into an OpenGL ES texture, and submits it to Skia for rendering.
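The registration and the three rendering steps can be modeled as follows. This is a simplified sketch, not the engine's code: `CameraSource`, `TextureRegistry`, and `paint_texture_layer` are hypothetical names, a `bytes` object stands in for CVPixelBuffer, and the GPU texture and the Skia draw call are represented by plain Python values.

```python
from typing import Callable, Dict, Optional

PixelBuffer = bytes  # stand-in for CVPixelBuffer


class CameraSource:
    """Hypothetical native data source (camera or player)."""

    def __init__(self) -> None:
        self._latest: Optional[PixelBuffer] = None

    def write_frame(self, pixels: PixelBuffer) -> None:
        # The native side writes each new frame into the pixel buffer.
        self._latest = pixels

    def copy_pixel_buffer(self) -> Optional[PixelBuffer]:
        # The engine calls this to pull the latest frame.
        return self._latest


class TextureRegistry:
    """Hypothetical engine-side registry of external textures."""

    def __init__(self) -> None:
        self._next_id = 1
        self._sources: Dict[int, Callable[[], Optional[PixelBuffer]]] = {}

    def register(self, copy_pixel_buffer: Callable[[], Optional[PixelBuffer]]) -> int:
        texture_id = self._next_id
        self._next_id += 1
        self._sources[texture_id] = copy_pixel_buffer
        return texture_id

    def paint_texture_layer(self, texture_id: int) -> Optional[str]:
        pixels = self._sources[texture_id]()        # step 1: obtain the pixel buffer
        if pixels is None:
            return None                             # no frame available yet
        gl_texture = ("gl_texture", pixels)         # step 2: upload to a GPU texture
        return f"draw_image({len(gl_texture[1])} bytes)"  # step 3: draw with the renderer


camera = CameraSource()
registry = TextureRegistry()
texture_id = registry.register(camera.copy_pixel_buffer)
camera.write_frame(b"\x00\x01\x02\x03")
print(registry.paint_texture_layer(texture_id))
```

The identifier returned by `register` plays the role of the value that the Dart side passes to the Texture widget, so the engine knows which native object to ask for frames.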

Using this process, Flutter can easily render any data provided by the native side. In addition to dynamic image data from cameras and players, image data from other image widgets can also be rendered this way. This is particularly useful if the native side already has a large image loading library, because reimplementing it in Dart on the Flutter side takes time and effort. This process seems to be the perfect solution for displaying large amounts of native data in Flutter. However, there are still many issues.

Figure 2–10 shows the flow of processing video and image data in a project. To improve performance, the native side usually processes the data on the GPU, while Flutter uses the copyPixelBuffer API. This means the data is transmitted from the GPU to the CPU and then back to the GPU. This memory exchange between the CPU and the GPU is the most time-consuming operation and takes more time than the rest of the processing pipeline.

The Skia rendering engine requires GPU textures, and the output of native data processing is precisely that. Can we use that texture directly? The answer is yes, if EAGLContext resources are shared. EAGLContext is the context that manages the current GL environment and keeps resources in different environments separate.

Figure 2–11 shows the thread structure of Flutter.

In general, Flutter creates four runners. A runner, similar to the Grand Central Dispatch (GCD) on iOS, is a mechanism for running tasks in a queue. In most cases, a runner corresponds to a thread. The following runners are related to this article: the GPU runner, the I/O runner, and the platform runner.

  • The GPU runner is responsible for GPU rendering.
  • The I/O runner is responsible for resource loading.
  • The platform runner runs on the main thread and is responsible for all interactions between native and the Flutter engine.
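A runner can be pictured as a queue of tasks drained by one thread. The sketch below is only an illustration of that idea: the `TaskRunner` class and its methods are hypothetical, and unlike Flutter's platform runner, every runner here gets its own background thread.

```python
import queue
import threading
from typing import Callable, List, Optional, Tuple


class TaskRunner:
    """Hypothetical runner: one thread draining a FIFO queue of tasks."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._tasks: "queue.Queue[Optional[Callable[[], None]]]" = queue.Queue()
        self._thread = threading.Thread(target=self._loop, name=name, daemon=True)
        self._thread.start()

    def _loop(self) -> None:
        while True:
            task = self._tasks.get()
            if task is None:  # sentinel posted by shutdown()
                break
            task()

    def post(self, task: Callable[[], None]) -> None:
        self._tasks.put(task)

    def shutdown(self) -> None:
        # Tasks posted earlier still run before the thread exits.
        self._tasks.put(None)
        self._thread.join()


log: List[Tuple[str, str]] = []
io_runner = TaskRunner("io")
io_runner.post(lambda: log.append(("load image", threading.current_thread().name)))
io_runner.post(lambda: log.append(("decode image", threading.current_thread().name)))
io_runner.shutdown()
print(log)
```

Because each runner executes its tasks one at a time on a single thread, the state owned by that runner, such as a GL context, is never touched by two tasks concurrently.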

A typical OpenGL-based app has one thread for loading resources (images and textures) and one thread for rendering. To make the textures created by the loading thread available to the rendering thread, the simplest approach is for both threads to share the same EAGLContext. However, this is not thread-safe. If multiple threads access the same object under a lock, performance suffers, and improper code handling may even cause deadlocks. Therefore, Flutter uses another mechanism: each thread uses its own EAGLContext, and the two contexts share texture data with each other through ShareGroup (shareContext on Android).

A native module that uses OpenGL also creates its own context on its own thread. To deliver the textures created in that context to Flutter and submit them to Skia for rendering, Flutter must expose its ShareGroup when it creates its two internal contexts, and the native side must save it. The native side then uses this ShareGroup to create its own contexts. This enables texture sharing between native and Flutter, as shown in Figure 2–12.
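The effect of a share group can be modeled in a few lines. This is a conceptual sketch only: the `ShareGroup` and `GLContext` classes below are hypothetical Python stand-ins, not the EAGLSharegroup or EGL APIs, and a dictionary plays the role of GPU texture storage.

```python
from typing import Dict, Optional


class ShareGroup:
    """Hypothetical share group: owns the texture objects of its contexts."""

    def __init__(self) -> None:
        self.textures: Dict[int, bytes] = {}
        self._next_name = 1

    def allocate(self, pixels: bytes) -> int:
        name = self._next_name
        self._next_name += 1
        self.textures[name] = pixels
        return name


class GLContext:
    """Hypothetical context; joins an existing share group or creates its own."""

    def __init__(self, label: str, share_group: Optional[ShareGroup] = None) -> None:
        self.label = label
        self.share_group = share_group if share_group is not None else ShareGroup()

    def create_texture(self, pixels: bytes) -> int:
        return self.share_group.allocate(pixels)

    def can_sample(self, texture_name: int) -> bool:
        return texture_name in self.share_group.textures


# Flutter creates two contexts that share one group and exposes that group.
flutter_gpu = GLContext("flutter-gpu")
flutter_io = GLContext("flutter-io", flutter_gpu.share_group)

# The native module creates its context from the exposed group.
native_player = GLContext("native-player", flutter_gpu.share_group)
isolated = GLContext("isolated")  # a context that did not join the group

frame_texture = native_player.create_texture(b"decoded video frame")
print(flutter_gpu.can_sample(frame_texture), isolated.can_sample(frame_texture))
```

Only contexts created from the same group see the texture, which is why the native side has to hold on to the group that Flutter exposes.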

This method of using external_texture has two advantages:

  1. It saves time. According to our test results, it takes about 5 ms to read a 720p video frame in RGBA format from the GPU to the CPU on Android, and another 5 ms to upload it from the CPU to the GPU. The cost is still about 5 ms after Pixel Buffer Object (PBO) is introduced. This is not acceptable for High Frame Rate (HFR) scenarios.
  2. It reduces CPU and memory usage. All data stays on the GPU, which is especially suitable for image scenarios where many images need to be displayed at the same time.
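To put the measured copy times in perspective, the short calculation below compares them with the time available per frame. The 5 ms figures come from the test results above. The frame rates of 30, 60, and 120 fps and the 1280×720 resolution assumed for "720p" are illustrative assumptions, not values from the tests.

```python
READBACK_MS = 5.0  # GPU -> CPU, from the measurement above
UPLOAD_MS = 5.0    # CPU -> GPU, from the measurement above


def frame_budget_ms(fps: float) -> float:
    """Time available to produce one frame at the given frame rate."""
    return 1000.0 / fps


def copy_share_of_budget(fps: float) -> float:
    """Fraction of the frame budget spent on the GPU-CPU-GPU round trip."""
    return (READBACK_MS + UPLOAD_MS) / frame_budget_ms(fps)


def rgba_frame_bytes(width: int, height: int) -> int:
    """Size of one uncompressed RGBA frame (4 bytes per pixel)."""
    return width * height * 4


for fps in (30, 60, 120):  # assumed frame rates, for illustration only
    print(f"{fps} fps: budget {frame_budget_ms(fps):.2f} ms, "
          f"copies use {copy_share_of_budget(fps):.0%}")

print(f"720p RGBA frame: {rgba_frame_bytes(1280, 720) / 2**20:.2f} MiB")
```

Under these assumptions, the round trip alone takes more than half of the frame budget at 60 fps and exceeds the whole budget at 120 fps, before any actual rendering is done.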

The preceding sections introduced the basic principles and optimization strategies of Flutter external textures. You may wonder why Flutter still uses PixelBuffer when sharing textures directly works so well. To share textures, you need to expose the ShareGroup, which is equivalent to opening up the GL environment of Flutter. If the environment is isolated, operations such as deleteFrameBuffer and deleteTexture do not affect objects in other environments. However, once the environment is opened up, these operations may affect objects that belong to the Flutter context. Therefore, from the framework designer's point of view, keeping the framework isolated and intact comes first.

During development, Xianyu encountered a strange problem, and it took a long time to locate the cause: glDeleteFramebuffers was called on the main thread without setCurrentContext being called first. As a result, Flutter's FrameBuffer was deleted by mistake, which caused Flutter to crash during rendering. Therefore, to use this solution, try not to perform GL-related operations on the main thread on the native side, and always call setCurrentContext before performing any GL operations.
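The failure mode can be reproduced in a small model. This is a sketch under stated assumptions: the `GLContext` class and the functions `set_current_context`, `delete_framebuffer`, and `made_current` are hypothetical stand-ins that mimic how GL calls act on whichever context is current on the calling thread. They are not real OpenGL or EAGL calls.

```python
import threading
from contextlib import contextmanager
from typing import Iterator, Optional, Set

_thread_state = threading.local()  # the "current context" is per thread


class GLContext:
    """Hypothetical context that owns its own framebuffer names."""

    def __init__(self, label: str) -> None:
        self.label = label
        self.framebuffers: Set[int] = set()
        self._next_name = 1

    def gen_framebuffer(self) -> int:
        name = self._next_name
        self._next_name += 1
        self.framebuffers.add(name)
        return name


def set_current_context(ctx: Optional[GLContext]) -> None:
    _thread_state.ctx = ctx


def get_current_context() -> Optional[GLContext]:
    return getattr(_thread_state, "ctx", None)


def delete_framebuffer(name: int) -> None:
    # Like a GL call, this acts on whatever context is current on this thread.
    ctx = get_current_context()
    if ctx is not None:
        ctx.framebuffers.discard(name)


@contextmanager
def made_current(ctx: GLContext) -> Iterator[GLContext]:
    """Make ctx current for the block, then restore the previous context."""
    previous = get_current_context()
    set_current_context(ctx)
    try:
        yield ctx
    finally:
        set_current_context(previous)


def flutter_framebuffer_survives(safe: bool) -> bool:
    flutter, native = GLContext("flutter"), GLContext("native")
    flutter_fb = flutter.gen_framebuffer()
    native_fb = native.gen_framebuffer()  # same numeric name, different context
    set_current_context(flutter)          # the engine's context is current here
    if safe:
        with made_current(native):
            delete_framebuffer(native_fb)
    else:
        delete_framebuffer(native_fb)     # bug: deletes Flutter's framebuffer
    return flutter_fb in flutter.framebuffers


print(flutter_framebuffer_survives(safe=False), flutter_framebuffer_survives(safe=True))
```

Object names are just small integers that are reused across contexts, so a delete call issued under the wrong current context silently destroys another module's object. Wrapping native GL work in something like `made_current` keeps the deletion inside the native context and restores the previous context afterwards.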

Most of the logic in this article uses the iOS implementation as an example. The overall principles on Android are the same, with slight differences in implementation. The external texture of Flutter on Android is implemented through SurfaceTexture by replicating data from the CPU to the GPU. OpenGL on Android uses ShareContext instead of ShareGroup to share the context. In addition, GL at the Shell layer on Android is implemented in C++, so the context is a C++ object. To share this C++ object with the Java Context object on the native side of Android, the call must go through the Java Native Interface (JNI) layer.
