How Hundreds of Millions of Items are Filtered and Organized on Xianyu
By Yi Jing, from Xianyu Technology.
Xianyu is a popular customer-to-customer buy-and-sell platform for secondhand goods in China, connected with the ever-popular Taobao. For Western audiences, Xianyu is somewhat similar to Facebook Marketplace or Craigslist, or more specialized second-hand buy-and-sell platforms like LetGo or Decluttr in America.
On Xianyu, you can trade relatively freely and easily. To post an item for sale, all you need to do is enter the name, price, and the number of items you want to sell. Of course, you may also want to add pictures or a video of your item. Conveniently, you can modify your item’s specifics, such as the price, even after posting. This flexibility and convenience is one of the reasons for Xianyu’s success in China.
Technically speaking, behind all of this convenience and flexibility lies one of the largest pain points of any e-commerce platform: commodity structuralization. Coming up with a solution that allows for intelligent recommendations yet remains seamless and non-intrusive to the user’s experience isn’t as easy as you may think.
Why is commodity structuralization so important and yet so difficult? Well, structured information is the basis for any e-commerce platform to understand the items listed on it. The platform can only accurately recommend suitable commodities to target users after it understands all the commodities and all their properties. This is the only way to create value for buyers and sellers through intelligence on the platform.
However, it is difficult to structuralize commodities on a customer-to-customer platform. A heavy structuralization burden can discourage sellers from using the platform, so the challenge is to structuralize commodities at minimal cost to the user experience. Therefore, it’s important to create a solution that is based on the merchant console. We need a solution that is simple, efficient, and flexible.
The Technical Solution
What can we do to resolve this issue? First, let’s conduct a comparative analysis of solutions across the entire customer-to-customer commodity publishing cycle or, in simpler terms, the process by which customers post items they want to sell.
Approach 1: Offline Solutions
Offline solutions include an algorithm-based association solution and a socialization solution. The algorithm-based association solution uses technical means to analyze items posted by users, and then either associates identical commodities with each other or tags them with certain properties. By contrast, the key to the socialization solution is to package commodity structuralization into an activity done by users, so that commodities can be associated and structured according to the way users answer questions. The key disadvantages of the offline solutions are that the processing chain is too long and the data flows back slowly. More importantly, the data exported after analysis is not confirmed by users and cannot be used in display fields.
Approach 2: The Manual Association Solution
This approach is used in the whole publishing process, and could be the most intuitive solution. It is designed to provide guidance for users to tag commodities with properties or to associate identical commodities when posting an item. It is a relatively simple and intuitive process. However, it also has clear disadvantages: The cost is completely passed on to the platform users. Specifically for sellers, this means that each additional posting option may cause the user to leave the platform, annoyed by the process. Therefore, this solution can be used as a supplement to structuralization, but it is definitely not the best solution to the problem.
Given these challenges, we must consider whether an efficient and cost-effective solution can be achieved that doesn’t waste the user’s time. The answer is the solution given in this article: the smart publishing solution. If a commodity to be published by a Xianyu user can be automatically associated with a commodity in the library of Taobao (Xianyu’s larger sibling e-commerce platform) during the publishing stage, the item can reuse the structured information of the identical commodity already on that platform. In this way, commodity structuralization can be resolved easily and quickly.
From this comparison, it is clear that the smart publishing solution is the most cost-effective solution.
First, let’s learn about the core product logic from the following table, which uses posting an item by recording a video as the example.
The Steps Involved
1. Focusing on the Item
At the very beginning of the intelligent recognition process, we need to use an algorithm on the client side to recognize the item in the video. At the same time, we need to track and stay focused on the item by using a tracking algorithm. This step ensures that the commodity recognized is consistent with the user’s item.
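To illustrate the consistency check in this step, here is a minimal sketch. It assumes the detector emits one bounding box per frame and treats the item as unchanged when consecutive boxes overlap enough. The `Box` type, the `iou` and `is_same_item` helpers, and the 0.5 threshold are hypothetical and are not taken from Xianyu’s implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def area(self) -> float:
        # A box with non-positive width or height has zero area.
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes, in the range [0, 1]."""
    inter = Box(max(a.x1, b.x1), max(a.y1, b.y1),
                min(a.x2, b.x2), min(a.y2, b.y2)).area()
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


def is_same_item(previous: Box, current: Box, threshold: float = 0.5) -> bool:
    """Assume the tracked item is unchanged if the boxes overlap enough."""
    return iou(previous, current) >= threshold
```

A real tracker would also use appearance features, but overlap between frames is often enough to notice that the camera has moved to a different object.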
2. Intelligent Recognition and Guidance
The item can be recognized in real time while the user is shooting the video. In addition, an algorithm can be used to guide users to capture the core information of the item more efficiently. The purpose of the guidance, of course, is to improve the overall effectiveness of the algorithm because, as the saying goes, one cannot make bricks without straw. If the content shot by the user lacks the necessary image information about the target item, the algorithm won’t be able to make a precise prediction.
3. Result Feedback and User Confirmation
When the user finishes taking the video, they are shown a few candidate identical commodities, so that they can simply choose the item rather than list it manually. Once the user selects an option, the structuralized association is complete. The user can also choose none of the options if none are accurate. Through this process, the complex problem of commodity structuralization is reduced to a few simple steps that cost the user little to no time.
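The confirmation step can be sketched as follows. This is only an illustration: the `Candidate` and `Listing` shapes and the `confirm_match` function are hypothetical names, chosen to show how a confirmed match could copy structured attributes onto the seller’s post while a "none of these" answer leaves the post unstructured.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Candidate:
    """A catalog commodity proposed as a possible match (hypothetical shape)."""
    commodity_id: str
    title: str
    attributes: dict


@dataclass
class Listing:
    """The seller's post; attributes stay empty if no match is confirmed."""
    title: str
    price: float
    linked_commodity_id: Optional[str] = None
    attributes: dict = field(default_factory=dict)


def confirm_match(listing: Listing, candidates: List[Candidate],
                  choice: Optional[int]) -> Listing:
    """Apply the user's choice; choice=None means 'none of these'."""
    if choice is None:
        return listing
    chosen = candidates[choice]
    listing.linked_commodity_id = chosen.commodity_id
    listing.attributes = dict(chosen.attributes)  # copy to avoid shared state
    return listing
```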
According to the above description, the core of the smart publishing solution is to transform the problem of commodity structuralization into a simple process of matching identical commodities with each other.
For this, the core technical challenges are as follows:
- Ensuring real-time commodity recognition at the posting stage.
- Using algorithms to maximize the success rate for matching identical commodities.
Is this technically feasible? From the perspective of Xianyu, we have three technical advantages to address this challenge:
- Mobile AI solutions such as AliNN (Alibaba neural networks) enable AI computing on the client side.
- We have possibly the largest commodity information library (that of Alibaba’s Taobao and Tmall e-commerce platforms) on this planet.
- Alibaba DAMO Academy provides powerful AI capabilities.
We can move some AI capabilities to the front end on the client side to greatly improve the speed, or timeliness, of the processing chain. In addition, we can combine the AI recognition algorithms with the commodity information library of Taobao and Tmall to match identical commodities.
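The article does not describe how candidates are retrieved from the commodity library. One common approach, shown here purely as an assumption, is nearest-neighbor search over feature vectors: the recognition model turns the item into a vector, and the library entries with the most similar vectors become the candidates. The function names and the tiny in-memory `library` dictionary are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_matches(query, library, k=3):
    """Return the k (commodity_id, score) pairs most similar to the query vector."""
    scored = [(cid, cosine_similarity(query, vec)) for cid, vec in library.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy library of two-dimensional feature vectors (assumed data).
library = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
candidates = top_matches([1.0, 0.1], library, k=2)
```

At the scale of a catalog like Taobao’s, a linear scan like this would be replaced by an approximate nearest-neighbor index; the sketch only shows the ranking idea.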
To achieve the preceding objectives, we construct a complete technical architecture for smart publishing.
First, let us take a look at our logical architecture.
The overall design consists of three layers:
- The User Interface (UI) layer for the presentation and interaction of commodities. The input and result feedback are processed at this layer.
- The logic processing layer. At this layer, the operation logic of intelligent recognition pipelines is controlled, and the processing results of sub-modules are distributed.
- The framework layer. Various core processing sub-modules reside at this layer.
The following aspects are considered in the design:
1. Maximizing R&D efficiency by using a variety of available technologies.
Flutter makes it easy to maintain consistency across multiple devices and platforms, so we use it to develop the capabilities at the UI layer. In addition, we push common algorithms down to the C++ layer, which improves the reuse rate and keeps the logic consistent on both Android and iOS.
2. Taking advantage of the computing capabilities on the client side.
- Algorithms for fuzzy detection, similarity detection, subject recognition, and tracking are all implemented on the client side. This way, the computing capabilities on the client side are fully utilized, camera input is processed more efficiently, and dependency on network requests is minimized.
- Each uploaded image is compressed to approximately 10 KB with an improved compression algorithm. In this way, the total size of four requests is less than 40 KB, so the user’s mobile data usage stays low.
3. Using a pipeline orchestration system.
Because the processing logic of sub-modules will inevitably be adjusted in the future as the system is continuously optimized, we have designed a flexible pipeline to manage all processing logic. This pipeline can flexibly combine Java/Objective-C and C++ capabilities, and allows you to conveniently adjust the sequence of sub-features and add or delete features. The following figure shows the architecture design using Android as an example.
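As a rough illustration of the orchestration idea (not the architecture shown in the figure), the sketch below keeps named stages in a registry and runs them in a configurable order. The `Pipeline` class, the stage names, and the `stop` flag are all hypothetical; the real system combines native and C++ modules rather than Python callables.

```python
from typing import Callable, Dict, List

Stage = Callable[[dict], dict]  # each stage takes and returns a context dict


class Pipeline:
    """Runs named stages in order; stages can be added, removed, or reordered."""

    def __init__(self) -> None:
        self._stages: Dict[str, Stage] = {}
        self._order: List[str] = []

    def register(self, name: str, stage: Stage) -> None:
        self._stages[name] = stage
        if name not in self._order:
            self._order.append(name)

    def remove(self, name: str) -> None:
        self._stages.pop(name, None)
        self._order = [n for n in self._order if n != name]

    def reorder(self, names: List[str]) -> None:
        # Stages left out of `names` stay registered but are skipped by run().
        missing = [n for n in names if n not in self._stages]
        if missing:
            raise KeyError(f"unknown stages: {missing}")
        self._order = list(names)

    def run(self, context: dict) -> dict:
        for name in self._order:
            context = self._stages[name](context)
            if context.get("stop"):  # a stage may end the run early
                break
        return context


def traced(name: str) -> Stage:
    """Build a demo stage that records its own name in the context."""
    return lambda ctx: {**ctx, "trace": ctx.get("trace", []) + [name]}


pipeline = Pipeline()
pipeline.register("blur_check", traced("blur_check"))
pipeline.register("detect", traced("detect"))
```

Because the order lives in data rather than in code, changing the sequence of sub-features or dropping one does not require touching the stages themselves.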
Pictures used for recognition are encrypted to minimize risks to the user’s privacy. Picture URLs that appear in public fields cannot be directly accessed.
We have also made a lot of optimizations to the algorithm side.
The core algorithm for smart publishing matches identical commodities. We have upgraded the single-frame prediction algorithm to a multi-frame prediction algorithm, and tightly integrated the algorithm with user interaction to push the algorithm to its limits. The following figure shows the process.
If the algorithm finds that the current frame is insufficient to make an accurate prediction, it passes the image information on to the next iteration. During this hand-off, on-screen text prompts guide the user to photograph the information that the algorithm still needs. The algorithm iterates until the commodity information is fully predicted. The following figure shows the processing logic of the algorithm.
According to our tests, the real-time processing performance is excellent: the recognition process is virtually transparent to users, except when the guidance messages appear. In addition, no performance problems, such as dropped frames, occur during normal shooting.
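The iterate-and-guide loop can be sketched as below. The article does not say how evidence from several frames is combined, so this sketch assumes a simple rule: each frame gives every candidate a probability, and a candidate’s combined confidence is one minus the product of the per-frame miss probabilities. The function name, the prompt text, and the 0.8 threshold are hypothetical.

```python
from typing import Dict, Iterable, List, Optional, Tuple

GUIDANCE = "Please show the brand or label on the item"  # assumed prompt text


def predict_over_frames(
    frame_scores: Iterable[Dict[str, float]],
    threshold: float = 0.8,
) -> Tuple[Optional[str], List[str]]:
    """Fuse per-frame scores until one candidate is confident enough.

    Returns (best_candidate_or_None, prompts_shown_to_the_user).
    """
    miss: Dict[str, float] = {}  # candidate -> P(every frame so far missed it)
    prompts: List[str] = []
    for scores in frame_scores:
        for candidate, p in scores.items():
            miss[candidate] = miss.get(candidate, 1.0) * (1.0 - p)
        if miss:
            best = min(miss, key=miss.get)
            if 1.0 - miss[best] >= threshold:
                return best, prompts
        # Not confident yet: guide the user and wait for the next frame.
        prompts.append(GUIDANCE)
    return None, prompts
```

With three frames that each score a candidate at 0.5, the combined confidence rises from 0.5 to 0.75 to 0.875, so the prediction succeeds on the third frame even though no single frame would have been enough.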
For the recognition of identical commodities, test results are as follows:
Generally, multi-frame-based recognition is approximately 20% more accurate than single-frame-based recognition.
Due to disclosure restrictions, we cannot list the test data by category. However, we can draw these definite conclusions:
According to the tests, we find that the recognition success rate is relatively high in the large categories of cosmetics, perfume, makeup tools, beauty instruments, and toys. The core information of these commodities usually can be easily found and recognized from their packaging alone.
By contrast, the recognition success rate for products like children’s wear and sneakers is relatively low, because their appearance alone usually does not convey complete information about them. For these cases, the algorithm needs to work with core information captured by the user, such as the brand, which makes recognition more difficult.
Visions of the Future
Smart posting will be available in the September release of the Xianyu App. After that, we welcome you to try it out and share your feedback with us. Video posting will be available first, and the release of other scenarios such as picture posting and activities will follow. Through the intelligent recognition program, we believe that the commodity structuralization rate can be continuously improved in Xianyu.
This program not only builds a complete real-time commodity recognition capability, but also enhances a number of core computing algorithms on the client side, such as the image preprocessing and tracking algorithms. This also allows us to implement AI capabilities with better timeliness in more scenarios, such as scanning a specific commodity or logo to participate in a specific event.
The future release we have envisioned is highly intelligent. With a deep understanding of commodities from the camera, the system can directly provide publishing elements, such as the commodity information, structured tags, recommended prices, and even depreciation rate. All that you need to do is to confirm. Today’s intelligent posting is only the first step in a journey. We will keep working towards even bigger goals.