Intelligently Generate Frontend Code from Design Files: imgcook

Background Analysis

Machine learning is trending across the industry, and there is broad consensus that AI will shape the future. In “AI future”, Kai-Fu Lee predicts that artificial intelligence will replace nearly 50% of human work within 15 years, especially simple and repetitive tasks. Moreover, white-collar work will be easier to replace than blue-collar work: replacing blue-collar workers may require breakthroughs in robotics and related technologies across both software and hardware, whereas replacing white-collar workers requires breakthroughs in software alone. Will our frontend “white-collar” work be replaced? If so, when, and how much of it?

Competitive Product Analysis

In 2017, Pix2Code, a paper on generating code from images, attracted the industry’s attention. It describes generating source code directly from a design image with deep learning. Similar ideas have since emerged regularly in the community. For instance, Microsoft AI Lab launched Sketch2Code in 2018, an open source tool for converting sketches into code. At the end of the same year, Yotako drew attention as a platform for converting design drafts into code. Since then, machine learning has firmly caught the attention of frontend developers.

Problem Resolution

The goal of generating code from design documents is to improve web developers’ efficiency and eliminate repetitive work. For regular frontend developers, especially client-side developers, daily work generally breaks down into the following parts.

View Code

In view code development, HTML and CSS code is generally written based on a design document. How can efficiency be improved here? Faced with the repetitive work of UI view development, it is natural to think of packaging and reusing materials such as components and modules. Various UI libraries have emerged from this approach, and there are even higher-level encapsulations in the form of platforms for building websites visually. However, reused materials cannot cover all scenarios, and many business scenarios need personalized views. Going back to the problem itself, is it possible to generate reliable HTML and CSS code directly?

  • Element self-adaptation: Extensibility of the element itself, alignment between elements, maximum width, and high fault tolerance of elements.
  • Semantics: Multi-level semantics of class names.
  • CSS expression: Background color, rounded corners, lines, etc.

The industry has been working in this direction for a long time. Basic information about the elements in a design document can be exported through a design tool’s plugin. However, this approach places high requirements on the design document, and the generated code is hard to maintain.
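To make the “CSS expression” point concrete, here is a minimal sketch of turning exported layer information into a CSS rule. The layer record and its keys (`name`, `width`, `height`, `fill`, `radius`, `border`) are assumptions made for illustration; they do not reflect the export format of any particular design tool plugin or of imgcook.

```python
from typing import Any, Dict


def layer_to_css(layer: Dict[str, Any]) -> str:
    """Translate a hypothetical exported layer record into one CSS rule."""
    decls = [
        f"width: {layer['width']}px",
        f"height: {layer['height']}px",
    ]
    # Optional visual properties are emitted only when the layer has them.
    if "fill" in layer:
        decls.append(f"background-color: {layer['fill']}")
    if layer.get("radius"):
        decls.append(f"border-radius: {layer['radius']}px")
    if "border" in layer:
        border = layer["border"]
        decls.append(f"border: {border['width']}px solid {border['color']}")
    body = "".join(f"  {decl};\n" for decl in decls)
    return f".{layer['name']} {{\n{body}}}"


# Hypothetical layer, as a plugin might export it.
button = {"name": "buy-button", "width": 120, "height": 40,
          "fill": "#ff5000", "radius": 20}
css = layer_to_css(button)
print(css)
```

A direct translation like this also shows why maintainability suffers: fixed pixel sizes and whatever layer names the designer happened to use end up in the stylesheet unless later steps add adaptation and semantics.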

Logical Code

Usually, web development also includes logic code, including data binding, dynamic effects, and business logic. The part that can be improved is reuse: dynamic effects and business logic code can be abstracted into basic components.

  • Dynamic effects: The input for this part is a design document, but dynamic effects are delivered in many forms. Some come as animated GIF demonstrations, while others are text descriptions or even spoken explanations. Dynamic-effect code is better suited to visual generation tools. There is no precedent for direct, intelligent generation, and given the input-output ratio, it is not a problem to tackle in the short term.
  • Business logic: This part of the development is mainly based on the PRD, and sometimes on logic that exists only in the product manager’s head. Generating this logic code intelligently would require too much input, so we need to look at which specific problems intelligent generation can solve in this sub-field.

Logical Code Generation

The ideal plan is to learn from historical data, as has been done in creative fields such as poetry, painting, and music, and to generate new logic code directly from the PRD as input. But can the generated code run directly without errors?

  • Logic points inferred from the design draft: We can guess some of the reusable logic points from the design draft. For instance, to bind image or text data to the view, we can use NLP classification or image classification to recognize the elements’ contents.
  • Reusable business logic points: These are intelligently identified based on views. They include small logic points (a one-line expression, or several lines of code that are generally too small to be encapsulated into components), basic components, and business components.
  • New business logic that cannot be reused: Collecting PRD requirements in a structured (visualized) form is a difficult task and is still being explored.
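As an illustration of the first point, the sketch below guesses which data field a text layer should bind to. It uses hand-written regular expressions as a stand-in for the NLP classifier mentioned above, so it shows the shape of the step, not the actual model. The field names, the layer IDs, and the `{{item.field}}` binding syntax are all hypothetical.

```python
import re
from typing import Dict

# Assumption: simple patterns stand in for a trained text classifier.
PRICE = re.compile(r"[¥$€]\s?\d+(\.\d{1,2})?")
DISCOUNT = re.compile(r"\d+(\.\d+)?%\s*off", re.IGNORECASE)


def guess_field(text: str) -> str:
    """Return a hypothetical field name for the content of a text layer."""
    stripped = text.strip()
    if PRICE.fullmatch(stripped):
        return "price"
    if DISCOUNT.fullmatch(stripped):
        return "discount"
    return "title"  # fallback when nothing more specific matches


def bind_fields(text_layers: Dict[str, str]) -> Dict[str, str]:
    """Map each layer ID to a template expression for its guessed field."""
    return {
        layer_id: "{{item." + guess_field(text) + "}}"
        for layer_id, text in text_layers.items()
    }


bindings = bind_fields({
    "text_1": "Wireless headphones",
    "text_2": "¥199.00",
    "text_3": "20% OFF",
})
print(bindings)
```

Rules like these are brittle outside the cases they were written for, which is the motivation for using a classifier instead.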


From the above analysis, we have described strategies to intelligently generate HTML, CSS, part of the JS, and part of the data. This is the primary process of D2C (Design2Code), and the product we developed from this idea is imgcook. In recent years, third-party plugins for popular design tools (Sketch, PS, XD, etc.) have matured, and deep learning has developed rapidly, in some recognition tasks even outperforming humans. This is the vital background for D2C’s birth and continuous evolution.

Technical Solution

Based on the general analysis of intelligent frontend development above, we have organized the existing D2C technology system into an overall architecture, which is mainly divided into the following three parts:

  • Expression ability: Mainly outputs the data and connects to the engineering side.
    • Uses DSLs to express the standard structured description as code (Schema2Code).
    • Performs project access through IDE plugins.
  • Algorithm engineering: To better support the intelligence D2C requires, high-frequency capabilities are provided as services, mainly including data generation, data processing, and model services.
    • Sample generation: Mainly processes the sample data from each channel and generates samples.
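The Schema2Code idea can be sketched as a small renderer that walks a structured description and emits markup. The schema shape used here (`tag`, `className`, `text`, `children`) is an assumption for illustration and is not imgcook’s actual schema; a different DSL would be another `render` function over the same tree.

```python
from html import escape
from typing import Any, Dict


def render(node: Dict[str, Any], indent: int = 0) -> str:
    """Render a hypothetical schema node (and its children) as HTML."""
    pad = "  " * indent
    tag = node["tag"]
    class_name = node.get("className")
    attrs = f' class="{class_name}"' if class_name else ""
    children = node.get("children", [])
    if not children:
        # Leaf node: inline its text (escaped) on a single line.
        text = escape(node.get("text", ""))
        return f"{pad}<{tag}{attrs}>{text}</{tag}>"
    inner = "\n".join(render(child, indent + 1) for child in children)
    return f"{pad}<{tag}{attrs}>\n{inner}\n{pad}</{tag}>"


schema = {
    "tag": "div",
    "className": "card",
    "children": [
        {"tag": "span", "className": "title", "text": "Hello"},
        {"tag": "span", "className": "price", "text": "¥199"},
    ],
}
markup = render(schema)
print(markup)
```

Keeping the description separate from the renderer is the design choice that matters: the recognition layers only have to produce one structure, and each target framework gets its own output function.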

Intelligent Identification Layer

In the entire D2C project, the core is the recognition capability. This layer breaks down as follows, and subsequent articles in this series will focus on these sub-layers.

  • Layer processing layer: Mainly separates the layers in the design document or image, and combines the previous layer’s recognition results to sort out the layer meta information.
  • Layer reprocessing layer: Further normalizes the data from the previous layers.
  • Layout algorithm layer: Converts absolutely positioned layers into relative positioning and Flex layout.
  • Semantic layer: Uses the layers’ multi-dimensional features to give the generated code semantic expression.
  • Field binding layer: Binds and maps the static data in the layers to the actual backend data.
  • Business logic layer: Generates business logic code through business logic identification and an expresser.
  • Output engine layer: Finally, outputs the code produced by the preceding layers through the various DSLs.
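To give a feel for what the layout algorithm layer has to do, the sketch below groups absolutely positioned boxes into rows, which could then become Flex containers. It is a deliberately simplified, single-pass heuristic under assumed inputs (boxes with `id`, `x`, `y`, `w`, `h`), not the algorithm imgcook uses; a fuller version would recurse into each row to find columns and derive margins.

```python
from typing import Any, Dict, List

Box = Dict[str, Any]  # assumed keys: id, x, y, w, h (pixels, top-left origin)


def group_rows(boxes: List[Box]) -> List[List[Box]]:
    """Group boxes whose vertical extents overlap into left-to-right rows."""
    rows: List[Dict[str, Any]] = []
    for box in sorted(boxes, key=lambda b: (b["y"], b["x"])):
        if rows and box["y"] < rows[-1]["bottom"]:
            # Starts above the current row's bottom edge: same row.
            rows[-1]["items"].append(box)
            rows[-1]["bottom"] = max(rows[-1]["bottom"], box["y"] + box["h"])
        else:
            rows.append({"bottom": box["y"] + box["h"], "items": [box]})
    return [sorted(row["items"], key=lambda b: b["x"]) for row in rows]


boxes = [
    {"id": "title", "x": 140, "y": 230, "w": 400, "h": 40},
    {"id": "banner", "x": 0, "y": 0, "w": 750, "h": 200},
    {"id": "price", "x": 140, "y": 280, "w": 200, "h": 30},
    {"id": "icon", "x": 20, "y": 220, "w": 100, "h": 100},
]
row_ids = [[box["id"] for box in row] for row in group_rows(boxes)]
print(row_ids)
```

In this example the title and price end up side by side in the icon’s row even though they are stacked vertically, which is exactly the case a recursive column pass would need to resolve.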

Technical Difficulties

Of course, incomplete recognition and low recognition accuracy have always been major issues for D2C, and addressing them is our core technical challenge. We try to analyze the factors that cause this problem from the following perspectives:

Problem Definition

At present, computer vision models in deep learning are best suited to classification and object detection problems. To decide whether a deep model should be used for a recognition problem, we first ask whether we can judge and understand the problem ourselves, whether the problem is ambiguous, and so on. If we cannot judge it accurately ourselves, the recognition problem may not be appropriate for a deep model.

  • Step 2: Reasonably summarize and classify the types of pictures. This is where controversy arises most easily, and poor definitions or ambiguity will cause problems in the model.
  • Step 3: Analyze the features of each type of picture, checking whether these features are typical and whether they are the core feature points, because they affect the inference and generalization ability of subsequent models.
  • Step 4: Check whether sample data for each type of picture is available and, if not, whether it can be created automatically. If sample data cannot be obtained, a model is not suitable, and you can use hard-coded rules instead to see the effect first.

Sample Quality

To improve sample quality, we need to establish standard specifications for these datasets, build multi-dimensional datasets for different scenarios, and process and provide the collected data in a uniform way. The aim is to establish a standardized data system.

Data sample engineering system


To address low recall and misjudgment in the models, we try to narrow down the scenarios. Samples from different scenarios often share similar features, or have key features that affect local feature points, which leads to misjudgment and a low recall rate. We expect that building models for a narrowed set of scenarios will improve model accuracy. We have narrowed the scope to three scenarios: the wireless client-side marketing scenario, the mini-app scenario, and the PC scenario. Each of these scenarios has its own characteristics, and designing a separate recognition model for each can effectively improve the recognition accuracy within a single scenario.

D2C scenario

Thoughts of the Process

Since a deep model is used, a practical problem is that the model cannot identify data whose features fall outside what it learned from the training samples, and its accuracy will never be 100% satisfactory to the user. Besides improving the samples, what can we do? Taking loop body identification as an example:

  • Based on the layer’s context information, you can apply rule-based judgments to determine whether it is a loop body.
  • You can apply machine learning to the layers’ features to try to optimize the rules.
  • You can generate positive and negative samples of loops and learn from them with a deep learning model.
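The first, rule-based option can be sketched as follows: consecutive sibling layers with nearly identical size and the same child structure are treated as repetitions of one loop body. The layer records (`w`, `h`, `children` as a list of child types), the pixel tolerance, and the minimum repeat count are assumptions for illustration, not the rules imgcook applies.

```python
from typing import Any, Dict, List, Optional, Tuple

Layer = Dict[str, Any]  # assumed keys: w, h, children (list of child types)


def same_shape(a: Layer, b: Layer, tol: int) -> bool:
    """Two layers match if their sizes are within tol px and children agree."""
    return (abs(a["w"] - b["w"]) <= tol
            and abs(a["h"] - b["h"]) <= tol
            and a["children"] == b["children"])


def find_loop(siblings: List[Layer], tol: int = 2,
              min_repeat: int = 2) -> Optional[Tuple[int, int]]:
    """Return [start, end) of the longest run of repeated siblings, if any."""
    best: Optional[Tuple[int, int]] = None
    start = 0
    for i in range(1, len(siblings) + 1):
        run_ends = (i == len(siblings)
                    or not same_shape(siblings[i - 1], siblings[i], tol))
        if run_ends:
            length = i - start
            if length >= min_repeat and (best is None
                                         or length > best[1] - best[0]):
                best = (start, i)
            start = i
    return best


card = ["image", "text", "text"]
siblings = [
    {"w": 750, "h": 88, "children": ["text"]},
    {"w": 345, "h": 400, "children": card},
    {"w": 345, "h": 401, "children": card},
    {"w": 344, "h": 400, "children": card},
    {"w": 750, "h": 100, "children": ["text"]},
]
loop_range = find_loop(siblings)
print(loop_range)
```

A rule like this misses loops whose items differ slightly in structure, for example a card with an optional badge, which is where learned features or a model trained on positive and negative samples can help.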

Business Landing: 2019 Double 11

After nearly two years of optimization, D2C was used for the first closed-loop development of a marketing module. The loop covers module creation, view code generation, logical code generation, writing supplementary logical code, and debugging.

D2C code generation user changes

Overall Landing Situation

As of 09 Nov 2019, the data is as follows:

  • Number of users: 4,315, with about 150 new users added every week
  • Number of teams: 24
  • Custom DSLs: 109

Follow-up Planning

  • Continue to lower the requirements on design documents by improving the intelligent identification accuracy of grouping and loops, reducing the cost of manual intervention in the design document.
  • Improve component identification accuracy. Currently, the accuracy is only 72%, which makes it hard to use in business applications.
  • Improve page-level and project-level restoration capabilities, which depend on the accuracy of page segmentation.
  • Improve page-level restoration for mini-apps and PC programs, and improve overall restoration of complex forms, tables, and charts.
  • Improve the ability to generate code from static images so that it can be used in the production environment.
  • Improve the algorithm engineering products and diversify the sample generation channels.
  • Open source.
