By Fang Xing.
At Apsara Conference 2019, held in Hangzhou, the technical team for AMAP, Alibaba’s mobile map and real-time navigation app and the Chinese equivalent of Google Maps, presented a wide range of topics related to mobile technologies. The talks covered computer vision and machine intelligence, route planning, scenario-based and fine-grained positioning, spatio-temporal data applications, and the evolution of architectures that handle tens of billions of visits. The presentations and lectures from the team received an overwhelming response from participants. These lectures have now been compiled for online publication. This article is one of them.
The speaker for the content presented in this article is Fang Xing, a senior map technology expert at Alibaba. His presentation, titled The Evolution to Scenario-based and Fine-grained Positioning Technologies, mainly discussed AMAP’s explorations and practices in improving the precision of positioning. This article is compiled based on the content he delivered. If you are interested in the implementation of positioning and related technologies at Alibaba, check out the other articles in this series.
The following is a brief transcript of Fang Xing’s speech:
Today I am going to discuss what can be categorized as scenario-based and fine-grained positioning technologies. AMAP’s positioning system is an important system at Alibaba, as it not only serves the AMAP app, but also provides positioning services accessible to all application developers and mobile phone manufacturers. Currently, more than 300,000 apps are using AMAP’s positioning service.
People use positioning services multiple times every day, for example, when watching the news, hailing taxis, ordering food delivery, or even shopping. People provide location information through positioning services, and then enjoy various services based on this location information. Naturally, user experience improves as the precision of the location information improves.
AMAP has more than 100 million daily active users. Hundreds of millions of people use AMAP’s positioning service, initiating 100 billion positioning requests per day. Despite such a huge volume of data, AMAP’s positioning service can still respond within milliseconds. None of this would be possible without our hard work. In addition, we provide full-scenario-based positioning, which offers location-based services for mobile phones, on-board navigation equipment, and manufacturers of mobile phones or on-board navigation equipment.
Today, I will describe the evolution of positioning services by looking at four aspects:
- Challenges facing positioning technologies
- AMAP’s full-scenario-based positioning
- Scenario-based improvements to positioning precision
- Future opportunities
Challenges Facing Positioning Technologies
As you may know, although the Global Positioning System (GPS) provides excellent precision in most cases, its precision is low in some scenarios. For example, when a user is driving, GPS-based positioning is accurate only to 10 meters. As a result, GPS alone cannot even identify which side of the road the car is on.
For another example, when GPS signals cannot be received indoors, how can we implement precise positioning? In addition, how can we overcome the tradeoff between precision and costs? After all, it is impossible to invest unlimited funds in the pursuit of high precision. The only solution is to improve the quality of algorithms and data by mining a massive volume of big data to continuously optimize the results. Only in this way can we implement precise positioning in all scenarios.
Currently, a wealth of positioning technologies are available. In addition to GPS, a location can be determined from networks, based on the known locations of Wi-Fi hotspots and base stations. In principle, we scan nearby networks to obtain a list of Wi-Fi hotspots and a list of base stations, measure their signal strengths, look up these hotspots and base stations in our database, and then determine the location of the device.
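To make the lookup step concrete, the following is a minimal sketch of one textbook way to turn a scan into a location: a signal-strength-weighted centroid of the transmitters found in the database. It is not AMAP's production algorithm. The Observation type, its field names, and the weighting scheme are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """One scanned transmitter whose location was found in the database (hypothetical type)."""
    lat: float       # known latitude of the Wi-Fi hotspot or base station
    lon: float       # known longitude of the Wi-Fi hotspot or base station
    rssi_dbm: float  # signal strength measured by the device, in dBm


def weighted_centroid(observations):
    """Estimate the device location as a signal-weighted mean of transmitter locations.

    Averaging latitude and longitude directly is only a reasonable approximation
    over the small area covered by a single scan.
    """
    if not observations:
        raise ValueError("no known transmitters in scan")
    # Convert dBm to linear milliwatts so that stronger signals get larger weights.
    weights = [10 ** (o.rssi_dbm / 10.0) for o in observations]
    total = sum(weights)
    lat = sum(w * o.lat for w, o in zip(weights, observations)) / total
    lon = sum(w * o.lon for w, o in zip(weights, observations)) / total
    return lat, lon
```

With two equally strong transmitters the estimate falls halfway between them. When one signal is much stronger, the estimate moves close to that transmitter.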
Moreover, positioning can be implemented based on inertial navigation, a relative positioning technique. Specifically, the drifts between the current location and the previous location are continuously calculated. Then, the final location can be obtained by continuously calculating the drifts based on the initial location.
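The accumulation of drifts can be sketched as simple dead reckoning. This is a simplified illustration in a flat local coordinate system (meters east and north). The function name and the (heading, distance) step format are assumptions, not part of AMAP's system.

```python
import math


def dead_reckon(start_xy, steps):
    """Accumulate relative displacements from a known starting point.

    start_xy: (east_m, north_m) of the initial location.
    steps: iterable of (heading_rad, distance_m), heading measured clockwise from north.
    Returns the list of positions, starting with the initial location.
    """
    x, y = start_xy
    track = [(x, y)]
    for heading, distance in steps:
        x += distance * math.sin(heading)  # eastward component of this step
        y += distance * math.cos(heading)  # northward component of this step
        track.append((x, y))
    return track
```

Because each position is built on the previous one, any error in a step is carried into all later positions. This is why inertial navigation is usually combined with an absolute source such as GPS or map matching.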
Furthermore, positioning can be implemented based on map matching. For example, if a GPS point is located in a lake, the positioning is obviously incorrect. In this case, the system can locate the nearest road through map matching, which improves precision.
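The simplest form of map matching, snapping a point to the nearest road, can be sketched as follows. Roads are represented as straight segments in local meter coordinates, which is an assumption made for illustration. A real road network would use polylines and a spatial index.

```python
import math


def project_to_segment(p, a, b):
    """Return the point on segment a-b that is closest to p."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        return a  # degenerate segment: both endpoints coincide
    # Parameter t of the orthogonal projection, clamped to stay on the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len2
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)


def snap_to_road(p, segments):
    """Return (snapped_point, distance_m) for the nearest of the given road segments."""
    best, best_d = None, math.inf
    for a, b in segments:
        q = project_to_segment(p, a, b)
        d = math.hypot(p[0] - q[0], p[1] - q[1])
        if d < best_d:
            best, best_d = q, d
    return best, best_d
```

For example, with two parallel roads at y = 0 and y = 5, the point (3, 1) snaps to (3, 0) at a distance of 1 meter.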
Some other positioning techniques have also become prevalent in recent years. Concepts derived from the vision, radar, laser, and autonomous driving fields have driven the growth of these technologies, which achieve varying levels of positioning precision at varying costs. For example, in practice, vision-based positioning usually requires substantial computing and storage overhead.
In most cases, we need to obtain an initial location based on the location of a Wi-Fi hotspot, and then continuously optimize the location by using different data sources in different scenarios to improve the positioning precision.
How Does AMAP Implement Full-Scenario-Based Positioning?
AMAP is mainly applied in two service scenarios: mobile phones and on-board navigation equipment. On mobile phones, positioning is primarily based on GPS and networks. For vehicles, we also support positioning on specific roads by matching the determined location with the map.
In the past, many users reported positioning and navigation failures due to weak GPS signals. Approximately 60% of such failures occur in underground parking garages or tunnels, and 30% occur in places with overhead obstructions, such as under elevated roads or near high-rise buildings. This is due to the fact that GPS signals are obstructed in these scenarios.
What’s more, when we are making calls, the signal may occasionally be interrupted even though the base station we connect to may be less than a kilometer away. A GPS signal, by comparison, must travel more than 20,000 kilometers from the satellite to the ground. Therefore, GPS must be supplemented by other techniques, such as map matching or inertial navigation.
In short, to improve the precision of indoor positioning, we need to figure out how to mine data to determine the locations of Wi-Fi hotspots and base stations.
To improve the precision for the positioning of on-board navigation equipment, we will integrate more vehicle data.
Basic Positioning Capabilities
Network positioning is essentially a closed loop of data. During the positioning operation, the request sent by the user contains a list of base stations and a list of Wi-Fi hotspots that serve the user. This data can be used for data training as well as positioning. After data training, two major types of data are output. One is the location of Wi-Fi hotspots and base stations. Through data mining, we can obtain an approximate location, which we call the initial location. However, the precision is relatively low. The other type of output data is a detailed spatial distribution diagram of signal strength. With this diagram, we can obtain a more precise location. Specifically, we can determine the distance from a given Wi-Fi hotspot or base station based on signal strength and then improve the precision of positioning accordingly.
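To illustrate how signal strength relates to distance, the sketch below uses the textbook log-distance path loss model. This is a stand-in chosen for illustration. The talk describes a signal-strength distribution learned from data, not this formula. The default reference power and path loss exponent are assumed values that vary by environment.

```python
def rssi_to_distance(rssi_dbm, ref_power_dbm=-40.0, path_loss_exponent=3.0):
    """Estimate distance in meters from a received signal strength.

    Log-distance path loss model: rssi = ref_power - 10 * n * log10(d),
    where ref_power is the signal strength expected at 1 meter and n is the
    path loss exponent (roughly 2 in free space, higher indoors).
    Both defaults are assumptions for illustration only.
    """
    return 10 ** ((ref_power_dbm - rssi_dbm) / (10.0 * path_loss_exponent))
```

Under these assumed parameters, a reading of -40 dBm maps to 1 meter and a reading of -70 dBm maps to 10 meters. The steep growth shows why weak signals give only a coarse distance estimate.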
After the closed loop is completed, positive feedback is obtained. More input data leads to more training results and a more accurate positioning result, which will attract more users and generate more data. This closed loop continuously improves the precision through data mining.
We also make efforts to continuously improve our algorithms. First, we apply the typical clustering model to the algorithm. Specifically, a list of base stations and a list of Wi-Fi hotspots are scanned and clustered, and then one of them is selected as the target location. This method is efficient because results can be obtained quickly. However, the precision is very low.
Second, we divide the space into fine-grained grids and collect statistics on some basic features in each grid, such as the number of historically determined locations, the number of positioning times, and the number of Wi-Fi hotspots, to calculate the score of each grid. Then, we sort the grids and select the highest-scoring grid as the one that contains your location. This method increases the percentage of positioning results that are accurate to 30 meters by 15%.
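The grid-scoring idea can be sketched as follows. The grid size, the feature names, and the linear weighting are hypothetical. The talk does not specify how the per-grid statistics are combined into a score.

```python
GRID_SIZE_M = 25.0  # hypothetical grid edge length in meters


def grid_key(x_m, y_m, size=GRID_SIZE_M):
    """Map local coordinates in meters to the index of the containing grid cell."""
    return (int(x_m // size), int(y_m // size))


def score_grids(candidates, stats, weights):
    """Score candidate grid cells and return (score, cell) pairs, best first.

    candidates: iterable of grid cell indices.
    stats: dict mapping a cell index to its per-cell feature dict.
    weights: dict mapping a feature name to a manually chosen weight.
    """
    scored = []
    for cell in candidates:
        features = stats.get(cell, {})
        score = sum(w * features.get(name, 0.0) for name, w in weights.items())
        scored.append((score, cell))
    scored.sort(reverse=True)
    return scored
```

A call such as score_grids(cells, stats, {"historical_fixes": 0.01, "wifi_count": 0.1}) returns the cells ranked by score, and the first entry is taken as the cell that contains the user.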
However, with this method the parameters are adjusted manually, and the gains from manual adjustment are limited. Therefore, the third step is to introduce a machine learning algorithm to this positioning process. Specifically, the model and parameters are optimized through supervised machine learning, so that the positioning precision can be significantly improved in specific scenarios. This is most commonly used in scenarios where major errors need to be overcome.
A typical problem is that the detected information may only include the details about one Wi-Fi hotspot and one base station, and the two are far apart. In this case, there is a 50% probability that the result is incorrect, regardless of whether the Wi-Fi hotspot or the base station is selected. Supervised machine learning allows you to mine a massive volume of distribution data to figure out a subtle difference, and then achieve optimal global performance by selecting the base station in some cases and the Wi-Fi hotspot in other cases. This reduces the error rate by 50%.
The preceding figure shows our online neural network model. The neural network is widely used in online services. Here, we use the signal strength and hybrid features of the base stations and the Wi-Fi hotspots as feature input and add historical location features as a sequence to a recurrent neural network (RNN) model, to predict the current location. Further prediction is made based on this prediction result and the features of the base stations and the Wi-Fi hotspots. Lastly, the grids are scored. The grid where the user is most likely located is output.
The biggest challenge facing this method is not the technologies used in the algorithm, but the optimization of the performance and engineering feasibility of the algorithm. AMAP is called hundreds of billions of times a day. Therefore, to ensure a superior user experience, the latency must be less than 10 milliseconds.
In addition, the data volume is huge, and each piece of data contains many features. However, the online storage supports only dozens of TB of data, so it is impossible to store data of this magnitude online. Therefore, we need to make optimizations accordingly.
We have made optimizations in three aspects. First, we implemented hierarchical sorting. Specifically, we transformed the positioning process into steps similar to microscope adjustments. We implemented rough positioning first and then gradually narrowed down to a very precise location. During rough positioning, vast grids and few features are used to quickly filter out impossible locations.
Then, a larger number of fine-grained grids are sorted based on more features. This method can greatly improve computing efficiency and eliminate unnecessary computing.
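The coarse-to-fine idea can be sketched as a two-stage ranking. The function and parameter names are assumptions. In this sketch, coarse_score stands for a cheap scoring function that uses few features, and fine_score stands for the expensive model that uses more features.

```python
def coarse_to_fine(fine_cells, coarse_score, fine_score, coarse_size=4, keep=2):
    """Rank large cells cheaply first, then score only the fine cells inside the best ones.

    fine_cells: list of (ix, iy) fine grid indices.
    coarse_score: cheap function of a coarse cell index.
    fine_score: expensive function of a fine cell index.
    Returns (best_fine_cell, number_of_fine_cells_scored).
    """
    # Stage 1: group fine cells into coarse cells of coarse_size x coarse_size.
    groups = {}
    for cell in fine_cells:
        coarse = (cell[0] // coarse_size, cell[1] // coarse_size)
        groups.setdefault(coarse, []).append(cell)
    # Keep only the highest-scoring coarse cells, which filters out impossible areas.
    top = sorted(groups, key=coarse_score, reverse=True)[:keep]
    # Stage 2: run the expensive scoring only on the surviving fine cells.
    survivors = [cell for coarse in top for cell in groups[coarse]]
    return max(survivors, key=fine_score), len(survivors)
```

On an 8 x 8 fine grid with coarse_size=4 and keep=1, the expensive function is evaluated on 16 cells instead of 64.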
Second, we simplified the model. Although deep learning achieves good results, it is impossible to use a complex model online. We reduced the number of layers and nodes and lowered the floating-point precision.
Third, we compressed the features based on the model. In the past, numerous features were input. After we added an encoding layer, the input features were encoded by the encoding layer to output only two bytes of feature data. In other words, only two bytes are stored online after all the online data and offline data are processed. As such, the data volume of online features is reduced by a factor of 10 to be less than 1 TB. These are the major issues we have resolved.
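To show what storing two bytes per feature record can look like, the sketch below quantizes a two-value encoding into two unsigned bytes. In the talk, the encoding layer is learned as part of the model. The assumptions here are that it outputs two values in the range [-1, 1] and that uniform 8-bit quantization is used.

```python
import struct


def pack_embedding(e0, e1):
    """Quantize two values in [-1, 1] to one unsigned byte each (2 bytes in total)."""
    def quantize(v):
        v = max(-1.0, min(1.0, v))        # clamp to the assumed output range
        return round((v + 1.0) * 127.5)   # map [-1, 1] onto the integers 0..255
    return struct.pack("BB", quantize(e0), quantize(e1))


def unpack_embedding(blob):
    """Recover approximate values from the 2-byte record."""
    b0, b1 = struct.unpack("BB", blob)
    return b0 / 127.5 - 1.0, b1 / 127.5 - 1.0
```

Each value is recovered with an error of at most 1/255. The trade-off is a small loss of precision in exchange for a fixed record size of two bytes.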
Precision Improvements in Different Scenarios
In indoor scenarios, the final location often incorrectly falls outdoors. This is related to the training process we have just described. Data is more likely to be collected outdoors and the Wi-Fi hotspot locations obtained after training are also located on roads. Therefore, the determined location is likely to be located on a road, which degrades user experience. For example, when you want to take a taxi, you may send a request from indoors. However, the system determines that you are located across the street, which may be incorrect as well. In reality, the system needs to determine the specific building where you are actually located and the street closest to the building.
How can we resolve this problem? One method is to collect data. In this case, we manually collect data indoors to make the distribution of the training data consistent with that of the actual prediction data. This method certainly ensures high precision. However, the costs are high, which is its major weakness. Currently, such data collection is only performed in busy shopping malls and transportation hubs, so this is definitely not a scalable method.
Instead, we introduce more data to optimize the positioning process. If the relationship between Wi-Fi hotspots and points of interest (POIs) can be determined by mining the map data, we can associate the data to improve the precision of positioning operations. For example, if you detected a Wi-Fi hotspot named KFC, the system determines that you may be located in a KFC. This method is relatively simple. However, the method we actually use is more complex.
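The simple name-based association mentioned above can be sketched as follows. This covers only the KFC-style case and is not the more complex method actually used. The function names and the normalization rule are assumptions.

```python
import re


def normalize(text):
    """Lowercase and drop every character except ASCII letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def match_ssid_to_poi(ssid, poi_names):
    """Return the POI whose normalized name appears in the SSID, or None.

    If several names match, the longest one is preferred as the most specific.
    """
    key = normalize(ssid)
    hits = [name for name in poi_names if normalize(name) and normalize(name) in key]
    return max(hits, key=lambda name: len(normalize(name)), default=None)
```

For example, the SSID "KFC-Free-WiFi" is associated with the POI "KFC", while a generic router name such as "TP-LINK_5G" matches nothing. Name matching cannot help with such generic names, which is one reason a signal-distribution-based method is needed.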
We determine locations by performing inverse data mining based on the distribution of Wi-Fi signals. In the first figure, the blue boxes indicate the locations of building complexes, the red points indicate the actual locations of Wi-Fi hotspots, and the green points indicate the collected locations of the Wi-Fi hotspots. A darker green color indicates a stronger signal. We can input this image for image learning, for example, to find the exact location through the convolutional neural network (CNN). Then, we can set up an association between the Wi-Fi hotspots and the building complexes or POIs. In this way, 30% of all the Wi-Fi hotspots can be associated with corresponding POIs or building complexes.
In addition, during online positioning, the system needs to know when the user is indoors or outdoors. We use an algorithm that distinguishes between indoor locations and outdoor locations based on signal strength. The Wi-Fi hotspots and strength detected indoors are different from those detected outdoors. Therefore, we obtained a model by training the data based on these differences. In the second figure, the green points are predicted as indoor locations and blue points as outdoor locations. This method improves positioning precision by 15%.
In driving scenarios, some common problems occur during navigation. The first common problem is positioning failure in an underground parking garage or an obstructed location. The second common problem is drifting, which occurs when positioning precision decreases because GPS signals are blocked by buildings or other obstructions. The third common problem is failure to distinguish main roads from side roads, which can cause drivers to miss turnoffs.
To resolve these problems, we implemented a fusion-based positioning solution that integrates software and hardware. Specifically, the software provides movement-based positioning and map matching. After integrating the software and hardware, we can ensure that more than 90% of GPS-based positioning results are accurate to 10 meters, and implement continuous navigation on main roads under an elevated road and in underground parking garages.
As such, the key concern is how to implement fusion-based positioning. Fortunately, our sensor modules for on-board navigation equipment cost less than 100 CNY (or 15 USD), making them cost-effective. In contrast, other similar products may cost several thousand CNY (or hundreds of USD), which is expensive. Cost-effective devices are easier to popularize. However, both the positioning precision and accuracy are low. To overcome this hardware defect, we need to leverage software.
Our solution consists of three steps. First, we fuse sensor data to obtain the direction of travel. The gyroscope can calculate the relative angle, and the accelerometer can calculate the direction of gravity. By combining these results, we can establish a filter equation to continuously output the actual direction of travel. Second, we integrate the three-dimensional direction and the GPS results to calculate a corrected location. Third, we compare the corrected location with the map. We can know whether the user is going uphill or downhill and whether the user is on or under an elevated road, provided that we know the user’s location and direction of travel. Then, we can compare the matched location and the original GPS location. If the difference is very significant, the GPS result may have suffered from drifting. When this is the case, we discard the GPS data and use inertial navigation to deduce the location.
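Two pieces of this pipeline can be sketched in a few lines. The first is a complementary filter, which is one common way to combine a gyroscope with the gravity direction from an accelerometer. The talk does not say which filter equation is used, so this choice is an assumption. The second is the drift check that falls back to inertial navigation. The axis convention, the blending factor, and the 30-meter threshold are also assumptions.

```python
import math


def complementary_pitch(prev_pitch, gyro_rate, accel_x, accel_z, dt, alpha=0.98):
    """Fuse gyroscope and accelerometer readings into a pitch angle (radians).

    The integrated gyroscope rate is smooth but drifts over time. The gravity
    direction from the accelerometer is noisy but does not drift. The filter
    blends the two, trusting the gyroscope in the short term.
    """
    gyro_pitch = prev_pitch + gyro_rate * dt      # integrate the angular rate
    accel_pitch = math.atan2(accel_x, accel_z)    # tilt implied by gravity
    return alpha * gyro_pitch + (1.0 - alpha) * accel_pitch


def choose_position(gps_xy, matched_xy, inertial_xy, max_gap_m=30.0):
    """Discard a GPS fix that is missing or too far from the map-matched location.

    Returns ("gps", gps_xy) when the fix is plausible and
    ("inertial", inertial_xy) otherwise.
    """
    if gps_xy is None:
        return "inertial", inertial_xy
    gap = math.hypot(gps_xy[0] - matched_xy[0], gps_xy[1] - matched_xy[1])
    if gap > max_gap_m:
        return "inertial", inertial_xy  # likely drift: fall back to inertial navigation
    return "gps", gps_xy
```

With a stationary, level sensor, an initial pitch error decays toward zero because every update pulls the estimate slightly toward the gravity-based angle.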
This solution provides several advantages. First, the parameters are dynamically calibrated, so no initial calibration is required for devices. We calculate the three-dimensional direction and report the user’s location through map matching. The key to map matching is matching the location with the map by using a hidden Markov model (HMM) and deducing the route that covers all the points. In this process, the emission probability and the transition probability (the movement probability) are essential.
Second, we take the angle into account because an angle change also affects the movement probability. Different from the location movement probability, this feature introduces speed as a variable. The turning probability varies with speed. However, a car may slow down or speed up when turning. Therefore, we draw a curve for each speed.
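The following is a minimal sketch of HMM-based map matching with the Viterbi algorithm. It is heavily simplified. The candidate roads are the hidden states, the emission score is a Gaussian of the distance between a GPS point and a road, and the movement probability is a single constant chance of staying on the same road. The speed-dependent turning curves described above are not modeled, and all names and default values are assumptions.

```python
import math


def viterbi_match(points, roads, dist_fn, sigma=10.0, p_stay=0.9):
    """Return the most likely road for each GPS point.

    points: list of GPS fixes.
    roads: list of road identifiers (the hidden states).
    dist_fn(point, road): distance in meters from a fix to a road.
    sigma: assumed GPS noise in meters. p_stay: assumed chance of staying on a road.
    """
    n = len(roads)
    log_stay = math.log(p_stay)
    log_switch = math.log((1.0 - p_stay) / (n - 1)) if n > 1 else -math.inf

    def log_emit(point, road):
        # Log of an unnormalized Gaussian: closer roads are more likely.
        return -0.5 * (dist_fn(point, road) / sigma) ** 2

    def log_move(i, j):
        return log_stay if i == j else log_switch

    score = [log_emit(points[0], road) for road in roads]
    back = []
    for point in points[1:]:
        new_score, pointers = [], []
        for j, road in enumerate(roads):
            # Best previous state for arriving at road j.
            i = max(range(n), key=lambda k: score[k] + log_move(k, j))
            new_score.append(score[i] + log_move(i, j) + log_emit(point, road))
            pointers.append(i)
        score = new_score
        back.append(pointers)

    # Trace the best path backwards from the best final state.
    j = max(range(n), key=lambda k: score[k])
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [roads[k] for k in path]
```

Consider a main road at y = 0 and a side road at y = 15, with GPS points at y = 2, 3, 9, 1, and 2. The third point is closer to the side road, so nearest-road matching would jump to it. The HMM keeps the whole trace on the main road because two road switches are less likely than one noisy fix.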
In the top left corner, the upper image shows positioning results, with the red points indicating corrected positioning points and the blue points indicating original GPS points. The lower image shows a scenario where a car runs under an elevated road. As you can see, the original GPS points obtained when the car is under the elevated road are scattered everywhere. After correction, the positioning points almost overlap with the green points. In the bottom left corner, the image shows positioning results for a car running in a parking garage. The blue points disappear when the car enters the parking garage. However, the red points can perfectly restore the user’s continuous trace in the parking garage.
AMAP implements high-precision positioning based on two primary positioning capabilities: image-based positioning and fusion-based positioning. Image-based positioning ensures decimeter-level precision based on images alone. Fusion-based positioning introduces two major new positioning technologies: visual simultaneous localization and mapping (VSLAM) and differential GPS. These two positioning capabilities are used to ensure high precision in scenarios with and without GPS respectively. In particular, VSLAM-based positioning results have a smaller deviation because they can be corrected through image-based positioning.
In addition, autonomous driving is a future trend. To achieve autonomous driving, we must transition from aided driving. However, before a complete change in how we drive occurs, more gradual changes may present themselves. Specifically, the navigation services may be fine-grained to provide lane-level navigation, which requires that maps are accurate at least to decimeters.
Now, let’s discuss the future development of positioning technologies. First, we believe that 5G will open up new possibilities for basic positioning capabilities because the higher frequency and shorter wavelength of 5G will allow it to be used for ranging. In contrast, positioning technologies based on base stations and Wi-Fi hotspots were actually based on signal strength. 5G’s support for ranging will ensure a higher degree of precision. So new 5G-based positioning technologies will achieve results similar to GPS.
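To show why ranging matters, the sketch below computes a position from measured distances to transmitters at known locations, using standard linearized least-squares trilateration. This is a generic textbook method, not a description of any 5G positioning standard or of AMAP's plans. Coordinates are in a flat local system in meters.

```python
def trilaterate(anchors, ranges):
    """Estimate (x, y) from distances to three or more known anchor points.

    Subtracting the first circle equation from the others removes the squared
    unknowns and leaves linear equations a*x + b*y = c, which are solved in the
    least-squares sense through the 2x2 normal equations.
    """
    (x0, y0), r0 = anchors[0], ranges[0]
    rows = []
    for (xi, yi), ri in zip(anchors[1:], ranges[1:]):
        a = 2.0 * (xi - x0)
        b = 2.0 * (yi - y0)
        c = r0 ** 2 - ri ** 2 + xi ** 2 - x0 ** 2 + yi ** 2 - y0 ** 2
        rows.append((a, b, c))
    saa = sum(a * a for a, _, _ in rows)
    sab = sum(a * b for a, b, _ in rows)
    sbb = sum(b * b for _, b, _ in rows)
    sac = sum(a * c for a, _, c in rows)
    sbc = sum(b * c for _, b, c in rows)
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear or too few")
    x = (sac * sbb - sab * sbc) / det
    y = (saa * sbc - sab * sac) / det
    return x, y
```

With exact ranges from three non-collinear anchors, the true position is recovered. Signal strength alone gives only a rough distance, so the same computation based on strength would be far less precise.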
Second, as new data sources continue to emerge and new algorithms are developed to make full use of them, fusion-based positioning will achieve a much better overall performance. For driving, vision-based positioning and differential GPS technologies will become the most popular choices. Indoor positioning will be improved through ultra-wideband-based positioning and precise positioning based on Bluetooth and Wi-Fi. In addition, according to the latest technical standards, ranging and angle measurement technologies are also supported. In other words, new Bluetooth- or Wi-Fi-based apps may provide some positioning capabilities in the future.
Therefore, in the next 10 years, we may see that these methods are integrated with each other to substantially improve positioning precision. That’s all. Thank you!