Internet of Vehicles — Window Querying with Alibaba Cloud RDS for PostgreSQL


Internet of Vehicles (IoV) is one of the hottest research topics in the field of Internet of Things (IoT). A typical IoV application collects the travel tracks of vehicles, but these tracks are usually not reported in real time. Instead, several track records may be accumulated and reported together, or reported at intervals.

A typical track record contains a vehicle ID, a position, and a report time.
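As a minimal sketch, such a record can be modeled with three fields. The names `vehicle_id`, `pos`, and `crt_time` are assumptions chosen for illustration, not a confirmed schema.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Tuple


@dataclass(frozen=True)
class TrackRecord:
    """One reported track point (field names are illustrative assumptions)."""
    vehicle_id: int            # which vehicle reported the point
    pos: Tuple[float, float]   # position as an (x, y) pair
    crt_time: datetime         # when the point was recorded


record = TrackRecord(vehicle_id=1, pos=(3.0, 4.0), crt_time=datetime(2017, 1, 1, 12, 0, 0))
```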

Vehicles often encounter heavy traffic and traffic signals while traveling. As a result, multiple records for the same position may be uploaded while a vehicle is waiting.

Therefore, unnecessary wait records need to be cleaned from the database. At most two records are kept for each stay at a given point: one for arrival and one for departure.
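Independently of any database, the rule itself can be sketched in a few lines of Python. The sample track and the `clean_track` helper are hypothetical; the sketch assumes the records of one vehicle are already sorted by time.

```python
from itertools import groupby

# Hypothetical track of one vehicle, already sorted by time: (time, position).
track = [
    (0, "A"), (5, "A"), (10, "A"), (15, "A"),  # waiting at A
    (20, "B"),                                  # passing through B
    (25, "C"), (30, "C"), (35, "C"),            # waiting at C
]


def clean_track(records):
    """Keep only the first and last record of each consecutive stay."""
    cleaned = []
    for _, stay in groupby(records, key=lambda rec: rec[1]):
        stay = list(stay)
        cleaned.append(stay[0])          # arrival
        if len(stay) > 1:
            cleaned.append(stay[-1])     # departure
    return cleaned


cleaned = clean_track(track)
```

A stay with a single record, such as position B here, yields one record that serves as both arrival and departure.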

This operation can be performed with window functions.

Of course, for the best efficiency, it is more reasonable to clean the tracks on the vehicle terminal before they are reported, so that only two records (the start and the end) are kept for each position.

Implementation Example

1. Design a table structure
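A possible table layout is sketched below. To keep the example runnable anywhere, it uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL. The table name `car_trace` and the column names are assumptions; in PostgreSQL the position could use the built-in `point` type and the time a `timestamp`.

```python
import sqlite3

# SQLite stands in for PostgreSQL; table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE car_trace (
        sid      INTEGER,   -- vehicle ID
        pos      TEXT,      -- position, stored as 'x,y' text in this sketch
        crt_time INTEGER    -- report time as a Unix timestamp
    )
    """
)
columns = [row[1] for row in conn.execute("PRAGMA table_info(car_trace)")]
```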

2. Generate 10 million rows of test data, assuming there are 1,000 vehicles. To make repeated positions easy to produce and the effect easy to test, only 25 distinct positions are used.
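The following sketch generates such data at a much smaller scale (10,000 rows instead of 10 million) so that it runs in memory within a second. The 5 x 5 grid of positions, the fixed seed, and the `car_trace` layout are assumptions of this example; in PostgreSQL itself, `generate_series` would be the usual way to produce the rows.

```python
import random
import sqlite3

random.seed(42)  # fixed seed so the run is repeatable

N_ROWS = 10_000      # scaled down from the 10 million rows in the text
N_VEHICLES = 1_000
GRID = 5             # a 5 x 5 grid gives 25 distinct positions

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car_trace (sid INTEGER, pos TEXT, crt_time INTEGER)")

rows = [
    (
        random.randrange(N_VEHICLES),                          # random vehicle
        f"{random.randrange(GRID)},{random.randrange(GRID)}",  # random grid cell
        t,                                                     # one report per tick
    )
    for t in range(N_ROWS)
]
conn.executemany("INSERT INTO car_trace VALUES (?, ?, ?)", rows)

total, vehicles, positions = conn.execute(
    "SELECT count(*), count(DISTINCT sid), count(DISTINCT pos) FROM car_trace"
).fetchone()
```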

3. Create an index
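The text does not say which columns are indexed. Since the track of one vehicle is read in time order, a composite index on vehicle ID and time is a reasonable assumption, sketched here with `sqlite3` as a stand-in for PostgreSQL. The index name is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car_trace (sid INTEGER, pos TEXT, crt_time INTEGER)")

# Composite index: vehicle ID first, then time (an assumed column choice).
conn.execute("CREATE INDEX idx_car_trace_sid_time ON car_trace (sid, crt_time)")

# Ask the planner how it would fetch one vehicle's track in time order.
plan = [
    row[3]  # the fourth column holds the human-readable plan step
    for row in conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT sid, pos, crt_time FROM car_trace WHERE sid = 1 ORDER BY crt_time"
    )
]
print(plan)
```

With this index, both the filter on the vehicle ID and the ordering by time can be served from the index instead of a full table scan.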

4. Query the data layout

The result contains many repeated records for the same position.
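A small sketch of such a query follows, again using `sqlite3` in place of PostgreSQL. The six sample rows are hypothetical and were picked so that vehicle 1 reports the same position three times in a row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car_trace (sid INTEGER, pos TEXT, crt_time INTEGER)")
# Hypothetical reports: vehicle 1 waits at '2,3', vehicle 2 is interleaved.
conn.executemany(
    "INSERT INTO car_trace VALUES (?, ?, ?)",
    [
        (1, "2,3", 100), (2, "0,0", 101), (1, "2,3", 105),
        (1, "2,3", 110), (2, "0,1", 111), (1, "2,4", 115),
    ],
)

# Track of a single vehicle in time order.
layout = conn.execute(
    "SELECT sid, pos, crt_time FROM car_trace "
    "WHERE sid = ? ORDER BY crt_time LIMIT 1000",
    (1,),
).fetchall()

for row in layout:
    print(row)
```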

5. Filter the records at each position with window functions. At most two records are kept per position: arrival and departure.

Two window functions are used here: lag and lead. Lag returns the record before the current record, and lead returns the record after the current record.

The method of determining arrival and departure points is as follows:

• The current position is not equal to the previous position, indicating that the record is the current position's arrival point.

• The current position is not equal to the next position, indicating that the record is the current position's departure point.

• The previous position is empty, indicating that the current record is the first record.

• The next position is empty, indicating that the current record is the last record.
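The four conditions above translate into a single query. The sketch below runs it through `sqlite3` (window functions need SQLite 3.25 or newer); the same SQL should also be accepted by PostgreSQL. The table layout, the aliases `prev_pos` and `next_pos`, and the sample rows are assumptions of this example.

```python
import sqlite3  # lag/lead require SQLite 3.25 or newer

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car_trace (sid INTEGER, pos TEXT, crt_time INTEGER)")
conn.executemany(
    "INSERT INTO car_trace VALUES (?, ?, ?)",
    [
        (1, "A", 0), (1, "A", 5), (1, "A", 10), (1, "A", 15),   # waiting at A
        (1, "B", 20),                                           # passing B
        (1, "C", 25), (1, "C", 30), (1, "C", 35),               # waiting at C
    ],
)

CLEAN_SQL = """
SELECT sid, pos, crt_time
FROM (
    SELECT sid, pos, crt_time,
           -- position of the previous record of the same vehicle
           lag(pos)  OVER (PARTITION BY sid ORDER BY crt_time) AS prev_pos,
           -- position of the next record of the same vehicle
           lead(pos) OVER (PARTITION BY sid ORDER BY crt_time) AS next_pos
    FROM car_trace
    WHERE sid = ?
) AS t
WHERE pos <> prev_pos        -- arrival at a new position
   OR pos <> next_pos        -- departure from the current position
   OR prev_pos IS NULL       -- first record of the track
   OR next_pos IS NULL       -- last record of the track
ORDER BY crt_time
"""

cleaned = conn.execute(CLEAN_SQL, (1,)).fetchall()
```

The explicit `IS NULL` checks are needed because a comparison with NULL is neither true nor false, so the first and last records would otherwise be dropped.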

Without track cleaning, the query returns every record, including the repeated ones reported during a stay. With lag and lead, the records in the middle of each stay are removed.

Database Optimization for IoV

In a typical scenario, the business involves a large number of vehicle IDs, and the data gathered from different vehicles is written into the same database. Without optimization, the data of different vehicles is interleaved once it is stored. That is, records of many different vehicles may share one data block.

As a result, many data blocks are scanned (scan I/O amplification) when the track of a single vehicle is queried. There are two optimization methods to speed up such queries.

1. Group and sort the data on the business side before writing it into the database.

For example, after receiving the data submitted by the vehicle terminals, the program groups the records by vehicle ID, sorts them by time, and writes them into the database in batches (using insert into tbl values (),(),…();). In this way, the data of the same vehicle falls into the same data block as much as possible.
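A minimal sketch of this write path is shown below, with `sqlite3` standing in for PostgreSQL. The incoming batch and the `car_trace` layout are hypothetical; the point is only the sort by vehicle ID and time before a single multi-row insert.

```python
import sqlite3

# Hypothetical batch as it arrives from the terminals: vehicles interleaved.
incoming = [
    (2, "0,0", 101), (1, "2,3", 105), (3, "4,4", 102),
    (1, "2,3", 100), (2, "0,1", 111), (1, "2,4", 110),
]

# Sort by vehicle ID, then by time, so each vehicle's rows are contiguous.
batch = sorted(incoming, key=lambda rec: (rec[0], rec[2]))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE car_trace (sid INTEGER, pos TEXT, crt_time INTEGER)")

# One multi-row statement: INSERT INTO car_trace VALUES (?,?,?),(?,?,?),...
placeholders = ",".join(["(?, ?, ?)"] * len(batch))
params = [value for rec in batch for value in rec]
conn.execute(f"INSERT INTO car_trace VALUES {placeholders}", params)

# rowid follows insertion order here, so it shows how the rows were laid out.
stored_order = [row[0] for row in conn.execute("SELECT sid FROM car_trace ORDER BY rowid")]
```

After the sorted insert, the rows of each vehicle sit next to each other instead of being interleaved with other vehicles.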

2. Use partitioning to reorganize the data in the database.

For example, the data can be partitioned by vehicle ID, either with one partition per vehicle or with hash partitions on the vehicle ID.
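The idea of hash partitioning can be illustrated with a simple modulus on the vehicle ID. The sketch below routes rows by hand into separate `sqlite3` tables. The partition count, the table names, and the `partition_of` helper are assumptions; in PostgreSQL 11 or later, declarative partitioning (`PARTITION BY HASH`) would do this routing inside the database.

```python
import sqlite3

N_PARTITIONS = 4  # assumed partition count for this sketch


def partition_of(sid: int) -> str:
    """Map a vehicle ID to its partition table by a simple modulus hash."""
    return f"car_trace_p{sid % N_PARTITIONS}"


conn = sqlite3.connect(":memory:")
for i in range(N_PARTITIONS):
    conn.execute(f"CREATE TABLE car_trace_p{i} (sid INTEGER, pos TEXT, crt_time INTEGER)")

# Hypothetical rows; vehicles 1 and 5 hash to the same partition.
rows = [(1, "2,3", 100), (2, "0,0", 101), (5, "1,1", 102), (1, "2,4", 110)]
for rec in rows:
    conn.execute(f"INSERT INTO {partition_of(rec[0])} VALUES (?, ?, ?)", rec)

# A single-vehicle query only has to touch that vehicle's partition.
track = conn.execute(
    f"SELECT sid, pos, crt_time FROM {partition_of(1)} WHERE sid = ? ORDER BY crt_time",
    (1,),
).fetchall()
```

Several vehicles can share one partition, so the query still filters on the vehicle ID, but it scans only a fraction of the total data.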

Both methods reorganize the data according to the query requirements in order to reduce scan I/O. The approach is similar to the one described in PostgreSQL Securities Industry Database Requirement Analysis and Application (article in Chinese).

