Designing a Cloud-based Architecture for Internet of Vehicles: IoV Series (II)
After analyzing the traditional IDC application architecture, we found that many of our pain points were caused by the system architecture itself. To solve them, we decided to migrate to the cloud, and we started working out how cloud services could address each pain point we had encountered. For example:
- To address the poor reliability of the underlying infrastructure in our self-built local IDC, we turn to cloud computing services for infrastructure reliability, remote disaster tolerance, and data backup, which also resolves data security issues.
- To solve storage performance bottlenecks and user access experience issues, we turn to the Object Storage Service (OSS) and CDN on the cloud.
- To solve performance expansion bottlenecks of a single database, we turn to the Distributed Relational Database Service (DRDS) on the cloud.
- To solve the data writing latency caused by large-scale OBU (on-board unit) reporting, we turn to the IoT suite and HiTSDB on the cloud.
- To handle daily and holiday traffic peaks at minimum cost, we turn to the Elastic Compute Service (ECS) and the Auto Scaling service in Pay-As-You-Go mode.
- To solve the big data storage bottleneck and reduce big data development and analysis difficulties, we turn to the MaxCompute and HBase cloud services.
- To solve O&M automation issues and improve O&M efficiency, we turn to the CodePipeline, CloudMonitor, Log Service, and Container Service.
- To solve the security defense bottlenecks, we turn to the Alibaba Cloud Security products Anti-DDoS Pro, WAF, and Bastion Host.
- To solve the load balancing and network expansion bottlenecks, we turn to the SLB cloud service.
- To reduce the complexity of migration to the cloud, we turn to the VPC cloud service, which lets us keep the original IP addresses.
- To ensure data migration stability and convenience, we use Alibaba Cloud's data migration tool, Data Transmission Service.
Architecture Alignment on the Cloud
Our new application architecture on the cloud remains compatible with some features of the previous application architecture, while using new cloud technologies and services to solve our pain points and bottlenecks. In addition, the new architecture satisfies our business development plan for the next two to three years and supports an application system at the ten-million-user level. The following figure shows our application architecture on the cloud:
Security
The security defenses of our previous IDC were relatively weak. To solve the security defense bottlenecks, we turn to the Alibaba Cloud Security products Anti-DDoS Pro, WAF, and Bastion Host.
Anti-DDoS can be configured to direct attack traffic to a protected IP address, which keeps the origin site stable and reliable. We also chose Alibaba Cloud Bastion Host to replace our open-source bastion host. Compared with an open-source bastion host, Alibaba Cloud Bastion Host offers auditing compliance, support for multiple protocols, and session tracking and replay, and it is efficient and easy to use.
SLB Cluster
To solve the load balancing and network expansion bottlenecks, we turn to the SLB cloud service. For a service volume of 5 million users and 500,000 connections in peak hours, the Guaranteed Performance Type V (slb.s3.medium) specification is recommended. It supports a maximum of 500,000 connections, 50,000 new connections per second, and 30,000 QPS, which satisfies the company's needs at present. If the business and user volume increase in the future, the instance can be upgraded online to a higher specification. If the user volume reaches the ten-million level, you can contact your Alibaba Cloud account manager to activate an SLB instance with the 1-million-connection specification.
Application Server Cluster
Alibaba Cloud ECS instances are used as the application servers in the application environment deployment. As mentioned earlier, the runtime environments are mainly Java and PHP, with a small share of Node.js.
Java environment: CentOS 7 + JDK 1.7 + Tomcat 7
PHP environment: CentOS 7 + PHP 5.6.11
Node.js environment: CentOS 7 + Node.js 8.9.3
Based on the IoV industry characteristics, ecs.c5.xlarge (quad core, 8 GB) is recommended for frontend Web applications, and ecs.g5.xlarge (quad core, 16 GB) is recommended for backend applications.
Distributed Service Cluster
The distributed service cluster uses the Dubbo + ZooKeeper distributed service framework. Seven ecs.c5.2xlarge (octa-core, 16 GB memory, 200 GB SSD) ECS instances are used to build the ZooKeeper cluster. A ZooKeeper cluster should have an odd number of nodes, because the cluster as a whole stays available only while more than half of its nodes are functioning properly, and an even-sized cluster tolerates no more failures than the next smaller odd-sized one.
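The majority rule described above can be checked with a few lines of arithmetic. The sketch below is only an illustration of the "more than half" rule, not ZooKeeper code; the function names are made up for this example.

```python
def quorum_size(nodes: int) -> int:
    """Smallest number of nodes that is strictly more than half."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes can fail while a majority is still running."""
    return nodes - quorum_size(nodes)


if __name__ == "__main__":
    # Compare odd and even cluster sizes around the seven-node cluster.
    for n in (5, 6, 7, 8):
        print(f"{n} nodes: quorum={quorum_size(n)}, "
              f"tolerated failures={tolerated_failures(n)}")
```

With seven nodes the quorum is four, so up to three nodes can fail. Adding an eighth node raises the quorum to five without raising the number of tolerated failures, which is why odd sizes are preferred.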
Cache Cluster
The cache cluster is based on Alibaba Cloud ApsaraDB for Redis. Self-built Redis deployments are usually difficult to manage and maintain, and expanding their cluster nodes is complex.
Message Queue Cluster
Alibaba Cloud Message Queue Kafka is used for the message queue services. Alibaba Cloud Message Queue service solves various problems that we encountered when using open source Kafka message queue services.
Streaming Computing Cluster
Alibaba Cloud StreamCompute is used for streaming computing on the cloud, allowing us to perform real-time analysis on big data.
Data Storage Clusters
We use Alibaba Cloud ApsaraDB for RDS for MySQL.
HBase Cluster
We use Alibaba Cloud ApsaraDB for HBase.
In the traditional architecture, MongoDB stored the original data reported by vehicles, which supported data tracing in special cases and data compensation after data loss. For this workload, the write volume is generally larger than the read volume. Because we lacked a deep understanding of MongoDB and used it incorrectly, the performance of our MongoDB cluster dropped sharply once it reached a certain scale. We have now replaced the MongoDB clusters with ApsaraDB for HBase. With its support for highly concurrent access to large data volumes, ApsaraDB for HBase is well suited to massive data storage, business dashboards, security risk control, and search.
Elasticsearch Cluster
We use Alibaba Cloud Elasticsearch.
A self-built Elasticsearch cluster is typically complex to scale up and difficult to maintain. Now, we use the Alibaba Cloud Elasticsearch service. This service comes with a rich set of preinstalled plugins and integrates the X-Pack plugin, providing enterprise-level permission control, real-time monitoring, and other powerful functions.
File Storage Cluster
We use Alibaba Cloud OSS for file storage.
As the number of files increased, our previous self-built NFS file system became slower to expand and slower to access. Now, we use the Alibaba Cloud OSS + CDN solution, which required only minor modifications to our applications.
Big Data Computing Platform
We use Alibaba Cloud MaxCompute as a big data computing platform.
The intelligent IoV platform collects massive vehicle driving data every day, such as engine state, driving behavior, fuel consumption, mileage, and travel track, and we need to process and analyze it. For example, we need to prepare daily travel mileage statistics, fuel consumption statistics, and monthly driving behavior reports. The data volume was small at the early stage, so Kettle was used to extract data and perform related work. Most of the ETL work was completed in a MySQL-based data warehouse, and Presto (in cluster mode) served as the query middleware for data analysis across multiple data sources. However, as the business grew rapidly, the data became more and more complex, with the size of a single data table reaching several terabytes and the disk size reaching several hundred gigabytes. At that point, the MySQL-based data warehouse no longer met our requirements: queries usually took a long time to respond or even failed because memory was exhausted, which greatly reduced work efficiency. Therefore, we use MaxCompute to build a big data development and analysis platform.
O&M and Control Clusters
In the past, O&M relied mainly on manual work and scripts, with an extremely low level of automation, frequent failures, and difficult fault locating. Our O&M engineers spent a lot of time on repeated bug fixing and troubleshooting. Alibaba Cloud services such as CloudMonitor help us solve these pain points.
Log Management
We use Alibaba Cloud Log Service for log collection, log analysis, and log search.
Elastic Scaling
We use Alibaba Cloud Auto Scaling to process daily and holiday traffic peaks at a low cost.
A defining feature of the IoV industry is that data traffic during the morning and evening peak hours is more than 3 times that of normal hours, so handling the peak requires more than 3 times the resources. In the traditional IDC architecture, we generally provisioned enough servers to handle 1.2 times the peak traffic (as a buffer for special cases) in order to ensure system stability and a good user experience, but most of those servers sat idle in normal hours, with resource utilization below 30%. For example, 100 servers may be enough for the 18 normal hours of a day, but 360 servers (100 x 3 x 1.2) must be provided to handle the traffic in the 6 peak hours.
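The arithmetic behind these figures can be written out as a short calculation. The sketch below uses only the numbers from the paragraph above; the comparison with an elastic fleet is an idealized assumption (it scales instantly between 100 and 360 servers and ignores scaling delays and pricing details), and the function names are invented for the example.

```python
NORMAL_SERVERS = 100   # servers sufficient in normal hours
PEAK_MULTIPLIER = 3    # peak traffic relative to normal traffic
BUFFER = 1.2           # safety margin on top of peak traffic
NORMAL_HOURS = 18
PEAK_HOURS = 6


def peak_servers(normal: int, multiplier: int, buffer: float) -> int:
    """Servers needed to handle buffered peak traffic."""
    return round(normal * multiplier * buffer)


def fixed_server_hours(servers: int) -> int:
    """Daily server-hours when the peak fleet runs around the clock."""
    return servers * (NORMAL_HOURS + PEAK_HOURS)


def elastic_server_hours(normal: int, peak: int) -> int:
    """Daily server-hours if the fleet shrinks outside peak hours (idealized)."""
    return normal * NORMAL_HOURS + peak * PEAK_HOURS


if __name__ == "__main__":
    peak = peak_servers(NORMAL_SERVERS, PEAK_MULTIPLIER, BUFFER)
    fixed = fixed_server_hours(peak)
    elastic = elastic_server_hours(NORMAL_SERVERS, peak)
    print(f"servers provisioned for peak: {peak}")
    print(f"utilization in normal hours: {NORMAL_SERVERS / peak:.1%}")
    print(f"server-hours per day, fixed fleet: {fixed}")
    print(f"server-hours per day, elastic fleet: {elastic}")
    print(f"reduction in server-hours: {1 - elastic / fixed:.1%}")
```

Under these assumptions the fixed fleet is used at roughly 28% in normal hours, which matches the "below 30%" figure, and an ideally elastic fleet would consume less than half the server-hours of the fixed one.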
Domain Name Management
We use Alibaba Cloud DNS, an all-in-one service for domain name purchasing, management, and filing.
Continuous Integration
Application upgrades used to rely mainly on manual or script-based procedures. Later, we built a simple application release system using open-source Jenkins + Docker. We want to retain this release mode in the cloud, so we use CodePipeline.
Container Management
We use Alibaba Cloud Container Service as an all-in-one solution for container lifecycle management and cluster management.
Centralized Configuration
We use Alibaba Cloud Application Configuration Management (ACM) for centralized configuration. In the traditional IDC architecture, the configurations of all applications were managed centrally to meet the requirements of the microservice architecture. Configurations were stored in ZooKeeper and managed through a web frontend, and each application requested its configuration from the server via a local client. In this way, application configurations were stored and configured centrally and were easy to manage. However, our web configuration management center provided only simple functions and did not even support permission management, configuration snapshots, backup, or restoration. Now, we use Alibaba Cloud ACM for centralized configuration.
Monitoring System
We use Alibaba Cloud CloudMonitor as the monitoring system. In the traditional IDC architecture, we ran a self-built Zabbix monitoring system. As the business developed rapidly, the number of metrics increased from 500 to 30,000, and monitoring demands became more diverse and customized. As a result, the database performance became insufficient, leading to slow queries, frequent alarm delays, and more false alarms. The traditional monitoring system could not keep pace with the business. Therefore, we use CloudMonitor, a service that monitors Alibaba Cloud resources and Internet applications.
Data Visualization
We use DataV to solve UI design issues regarding O&M and monitoring dashboards.
Database O&M
We use Alibaba Cloud Data Management Service (DMS) for database O&M.
Addressing Design Issues with Cloud Technologies
Issue 1: Connecting a massive number of vehicle devices leads to high network latency, difficult device management, and poor security.
Solution: Use Alibaba Cloud IoT Hub to manage and report a large amount of vehicle data. IoT Hub is an all-in-one device management platform launched by Alibaba Cloud for developers in the IoT field. It enables users to implement full-stack services, including data collection, computing, and storage, simply by using the rule engine to configure rules on the Web.
Issue 2: Most IoV application scenarios place high demands on real-time data. However, because of insufficient database write performance, write latency often occurs when massive data is collected.
Solution: Use Alibaba Cloud High-Performance Time Series Database (HiTSDB) to solve the write latency of massive data. According to tests by relevant authorities, one connected vehicle can collect 25 GB of data per hour. Conventional databases are not designed to process data at this scale, and relational databases perform poorly on big data sets. NoSQL databases can process large volumes of data well, but they are not as good as databases that are fine-tuned for time series data. By comparison, time series databases are optimized for this purpose.
Issue 3: A self-built big data platform is costly and difficult to maintain, and we lack big data engineers.
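To make "time series data" concrete, the sketch below shapes one vehicle report into a batch of data points, each made up of a metric name, a timestamp, a value, and tags. It assumes an OpenTSDB-style point layout; the metric names, the `vehicle_id` tag, and the helper functions are hypothetical, and the exact write format should be checked against the HiTSDB documentation. Nothing is sent over the network.

```python
import json
import time


def make_point(metric, value, vehicle_id, timestamp=None):
    """Build one data point: metric + timestamp + value + tags (assumed layout)."""
    return {
        "metric": metric,
        "timestamp": int(timestamp if timestamp is not None else time.time()),
        "value": value,
        "tags": {"vehicle_id": vehicle_id},  # hypothetical tag name
    }


def make_batch(vehicle_id, readings, timestamp):
    """Turn one OBU report (metric -> value) into a batch of points."""
    return [
        make_point(metric, value, vehicle_id, timestamp)
        for metric, value in sorted(readings.items())
    ]


if __name__ == "__main__":
    # Hypothetical metrics reported by one vehicle at one instant.
    report = {"engine.rpm": 2150, "fuel.level": 0.62}
    batch = make_batch("demo-vehicle-001", report, timestamp=1700000000)
    print(json.dumps(batch, indent=2))
```

Because every point is append-only and keyed by time, a store built around this shape can be optimized for sequential writes, which is the property the comparison above relies on.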
Solution: MaxCompute + DataWorks + ApsaraDB for HBase
MaxCompute provides improved data import solutions and a variety of typical distributed computing models. Furthermore, DataWorks works well with MaxCompute and provides MaxCompute with an all-in-one toolkit for data synchronization, task development, data workflow development, data management, and data O&M.
Issue 4: A single-host MySQL database is difficult to expand in the face of I/O performance and capacity bottlenecks when the service and user volume keep increasing.
Solution: Alibaba Cloud DRDS. DRDS splits data horizontally across multiple database instances, following the way the services themselves are split up, so that operations are spread over the instances instead of being limited by a single host. This meets the demand of the online services for relational databases.
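The general idea of spreading one logical table over several databases can be sketched as follows. This is a generic illustration of hash-based horizontal sharding, not DRDS's actual routing rules, which this article does not describe; the shard count, the choice of `vehicle_id` as the shard key, and the function names are all assumptions.

```python
import zlib

SHARD_COUNT = 8  # hypothetical number of physical databases


def shard_for(vehicle_id: str, shard_count: int = SHARD_COUNT) -> int:
    """Map a shard key to a shard index with a stable hash."""
    return zlib.crc32(vehicle_id.encode("utf-8")) % shard_count


def route(rows, shard_count: int = SHARD_COUNT):
    """Group rows by the shard that should store them."""
    shards = {index: [] for index in range(shard_count)}
    for row in rows:
        shards[shard_for(row["vehicle_id"], shard_count)].append(row)
    return shards


if __name__ == "__main__":
    rows = [{"vehicle_id": f"demo-{n:04d}", "mileage_km": n * 10} for n in range(20)]
    for index, bucket in route(rows).items():
        print(index, [row["vehicle_id"] for row in bucket])
```

All rows for one vehicle land on the same shard, so single-vehicle queries touch one database, while writes from different vehicles spread across all of them.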
Data Migration Strategy
Database Migration Strategy
Database migration is the most important and most difficult stage of cloud migration. During database migration, we need to minimize the impact on the business; ideally, the migration involves no service interruption or downtime. This requires a detailed plan and migration strategy.
- Migration tool: Alibaba Cloud Data Transmission Service is recommended.
- Migration time: The service traffic trough period is recommended, such as 0:00 to 5:00 AM.
- Migration method: Generally, service databases run in master/slave mode. You can synchronize from a slave database to the cloud database to reduce the migration impact on the master database. If conditions allow, you can choose a single slave database for full synchronization to the cloud database and then switch to the master database for incremental data synchronization. In this way, the offline database and the online database stay consistent. For details about the migration steps, visit https://www.alibabacloud.com/help/product/35732.htm
File System Migration Strategy
A self-built NFS file system was previously used to store images and files. As the number of files continued to grow, image access became slower and slower. After migration to the cloud, with Alibaba Cloud OSS and CDN available, files are uploaded from the web client directly to OSS using the following architecture:
User request logic:
- The user obtains the upload policy and callback settings from the application server.
- The application server returns the upload policy and callback settings.
- The user sends a file upload request directly to the OSS.
- After the file data is uploaded, the OSS sends a callback request to the application server based on the user's callback settings before it sends a response to the user.
- If the server returns 'success', the OSS returns 'success' to the user. If the server returns 'failed', the OSS returns 'failed' to the user. This ensures that the application server is notified of every image that the user has successfully uploaded.
- The application server returns information to the OSS.
- The OSS returns the information returned by the application server to the user.
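The first two steps of the flow above, in which the application server hands the client an upload policy and callback settings, might look like the sketch below. It assumes the commonly documented OSS form-upload scheme, where the policy is Base64-encoded JSON and the signature is an HMAC-SHA1 of that Base64 policy under the AccessKey secret. The form field names, the callback body template, the credentials, and the URL are illustrative assumptions and should be verified against the current OSS documentation.

```python
import base64
import hashlib
import hmac
import json
from datetime import datetime, timedelta, timezone


def _b64_json(obj) -> str:
    """Serialize to JSON and Base64-encode the result."""
    return base64.b64encode(json.dumps(obj).encode("utf-8")).decode("ascii")


def build_upload_form(access_key_id, access_key_secret, key_prefix,
                      callback_url, now, ttl_seconds=300,
                      max_bytes=10 * 1024 * 1024):
    """Return the fields a web client would post directly to OSS (assumed names)."""
    expires = now.astimezone(timezone.utc) + timedelta(seconds=ttl_seconds)
    policy = {
        "expiration": expires.strftime("%Y-%m-%dT%H:%M:%S.000Z"),
        "conditions": [
            ["content-length-range", 0, max_bytes],  # limit the file size
            ["starts-with", "$key", key_prefix],     # limit the object path
        ],
    }
    policy_b64 = _b64_json(policy)
    # Sign the Base64 policy string with the secret (HMAC-SHA1, then Base64).
    digest = hmac.new(access_key_secret.encode("utf-8"),
                      policy_b64.encode("ascii"), hashlib.sha1).digest()
    callback = {
        "callbackUrl": callback_url,  # where OSS notifies the application server
        "callbackBody": "filename=${object}&size=${size}",
        "callbackBodyType": "application/x-www-form-urlencoded",
    }
    return {
        "OSSAccessKeyId": access_key_id,
        "policy": policy_b64,
        "Signature": base64.b64encode(digest).decode("ascii"),
        "callback": _b64_json(callback),
        "key_prefix": key_prefix,
    }


if __name__ == "__main__":
    form = build_upload_form(
        "EXAMPLE_KEY_ID", "EXAMPLE_KEY_SECRET",        # placeholder credentials
        "uploads/", "https://app.example.com/oss/cb",  # hypothetical callback URL
        now=datetime.now(timezone.utc),
    )
    print(json.dumps(form, indent=2))
```

The AccessKey secret never leaves the application server: the client receives only a short-lived, signed policy, and the callback settings ensure that the server is told about each completed upload.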
The combined use of OSS and CDN services speeds up file storage and access and improves user access experience.
The process for handling HTTP requests after the deployment of CDN is as follows: