By Grace Amondi, Alibaba Cloud Community Blog author.
In this part of this two-part tutorial series, you’re going to learn about how to create your own PostgreSQL spatial database that will be able to take advantage of all the features of PostGIS for improved spatial data management. By the end of this tutorial, you will have a database that you will use to manage spatial queries, which is something you’ll learn about in the second part of this two-part series.
A Bit about PostGIS
Before we get our hands dirty let’s make sure that we understand the terms that appear in this tutorial:
PostGIS is a powerful open-source tool that allows us to develop robust spatial databases. PostGIS can be considered an extension of database management system (DBMS) of PostgreSQL that can manage spatial data. PostGIS enables you to store geographic objects as part of our data tables.
Geographic Information Systems (GIS) are systems that are designed to manage and analyze spatial data. These systems show this data in an intuitive, visual manner.
Why Use PostGIS? What Are Its Advantages?
PostGIS is one of the most reliable open-source DBMS available currently, having been on the market for some time already. It hosts several advantages over other alternatives:
- Its ability to build indexes in any kind of data, having a generic index structure.
- It has many useful spatial functions to search, analyze, convert and manage spatial data.
- It has both vector and raster data support.
Vector data represents phenomena in the world in terms of points, lines and polygons.
Raster data represents the world in terms of cells of predefined, grid-shaped tessellations.
- It’s based on open standards as defined by the Open Geospatial Consortium.
- It’s supported by other well-proven open-source projects, such as Proj4, Geometry Engine — Open Source (GEOS), and Geospatial Data Abstraction Library (GDAL).
- It is compatible with almost all major open- and closed-source GIS Software.
For you to complete this tutorial, you’ll need to have the following:
- A valid Alibaba Cloud account. If you don’t have one already, sign up to the Free Trial to enjoy up to $300 worth in Alibaba Cloud products.
- An ECS instance that is installed with Ubuntu 16.04. You can select your preferred region and configurations; this will not affect the outcome of the server setup.
- A root password for your server.
- An ApsaraDB RDS for PostgreSQL instance connected to your Ubuntu 16.04 instance.
For now, you will need to first begin with the PostgreSQL command line interface, which you will install PosegreSQL on our Alibaba Cloud ECS instance. You will later learn how to connect to your Apsara RDS for PostgreSQL.
Then, when connected to your ECS instance using SSH tunneling on your terminal, enter the following commands to install PostgreSQL.
$ sudo apt-get update$ sudo apt-get install -y postgresql postgresql-contrib
Connecting to ApsaraDB RDS for PostgreSQL
When you connect to an RDS instance through a client, choose to use an intranet or Internet address as follows:
- Use the intranet IP address associated with the ECS instance on which your client is installed and that is located in the same region and the same network type as the RDS instance to be connected to it.
- Use an Internet IP address for the other situations.
Also, include the new ApsaraDB credentials to database configurations:
- Host name/address: Refers to the connection address of the RDS instance. If your application accesses the RDS instance through the intranet, enter the intranet IP address of the RDS instance. If your application accesses the RDS instance through the Internet, enter the Internet IP address of the RDS instance.
- Port: Refers to the port number of the RDS instance. If your application accesses the RDS instance through the intranet, enter the intranet port number of the RDS instance. If your application accesses the RDS instance through the Internet, enter the Internet port number of the RDS instance.
- User: Refers to the initial account name of the RDS instance.
- Password: Refers to the password of the initial account name of the RDS instance.
All these configurations can be found at the RDS Console. You will need to select the region where your instance is located, and click the ID of the instance to visit the Basic Information page. In the Basic Information area of the console, you can find the connection addresses and port numbers of the RDS instance.
Now install PostGIS in the database server. Run the following command:
$ sudo apt-get install -y postgis postgresql-9.5-postgis-2.2
Install Geospatial Libraries
We will be installing a special tool that is part of PostGIS’S infrastructure, a library known as Geospatial Data Abstraction Library (GDAL). It will allow us to covert spatial data from one form to another and to transform and re-project raster data files. You can use this command to install the tool:
$ sudo apt-get install binutils libproj-dev gdal-bin`
Next, you can confirm the installation and version with the
gdalinfo --version command. Then, if everything was doen correct, you'll get the following result:
GDAL 1.11.3, released 2015/09/16
Create a Spatial Database
Now that we have installed PostgreSQL with the PostGIS extension, we are ready to develop a spatial database. A spatial database allows you to be able to create geometric fields to store the geographic data (such as points, lines or polygons).
- A point is represented by a single x- and y-coordinate pair.
- A line is represented by a set of x- and y-coordinate pairs that define a connected path through space.
- A polygon is represented by a boundary consisting of several lines that form a closed loop, or more precisely a shape.
Begin by the entering the
sudo -u postgres createdb mydatabase command. This should create a database named mydatabase. Next log on to a Postgresql database using the following command:
psql -h <your_host> -d mydatabase -U <your_user>
For creating the spatial extension, we will turn our database into a spatial one with spatial functions and analyze options on your data by executing the following command in the psql prompt:
mydatabase=# CREATE EXTENSION postgis;
After that you will see that the process was successful by the following output:
mydatabase=# CREATE EXTENSION postgis;
Install PostGIS Preview
With everything out of the way, we are left with installing PostGIS Preview which is a lightweight node API and front end for quickly previewing PostGIS queries created by NYC Planning. You can check out the repository on Github .
The express.js app has a single endpoint:
/sql that is passed a SQL query q as a url parameter. That query is passed to PostGIS using the pg-promise module. The resulting data are transformed into topojson using a modified dbgeo module (modified to include parsing WKB using the WKX module), and the response is sent to the front end.
The front end is a simple Bootstrap layout with a Leaflet map, CartoDB basemaps, a table, and a SQL pane. The TopoJSON from the API is parsed using omnivore by Mapbox, and added to the map as an L.geoJson layer with generic styling.
1. Go ahead and install PostGIS preview by typing in the following commands:
2. Create a
.env file in the root of the repo based on
.env-sample, setting DATABASE_URL with a connection string for your database, and make sure to use ApsaraDB for PostgreSQL credentials:
3. Install dependencies with the
yarn install command.
4. Run the express app yarn start.
5. Load the frontend at
You should finally have something like this:
We will be using this window later in our tutorial.
Loading GIS Data
We will begin with downloading the data we will be using downloading the data from Open Street Map.
Enter the following command:
And then unzip the file:
$ unzip kenya-latest-free.shp.zip. If you check the files that have been extracted you will notice that they include the following extensions:
.shp: Holds the vector data for the world borders geometries.
.shx: Spatial index file for geometries stored in the .shp.
.dbf: Database file for holding non-geometric attribute data (such as integer and character fields).
.prj: Contains the spatial reference information for the geographic data stored in the shapefile.
We will use ogrinfo to examine spatial data. The GDAL ogrinfo utility allows examining the metadata of shapefiles or other vector data sources. We will specifically be examining that of places shapefile:
$ ogrinfo -so gis_osm_places_a_free_1.shp gis_osm_places_a_free_1
It should display something similar to the code below:
INFO: Open of `gis_osm_places_a_free_1.shp'
using driver `ESRI Shapefile' successful.Layer name: gis_osm_places_a_free_1
Feature Count: 839104
Extent: (33.933116, -4.665723) - (41.877231, 4.544926)
Layer SRS WKT:
osm_id: String (10.0)
code: Integer (4.0)
fclass: String (28.0)
name: String (100.0)
type: String (20.0)
Note that if it were the case of examining a raster data, we would use gdalinfo instead of ogrinfo .
Using shp2pgsql Data Loader
shp2pgsql Data Loader simply converts ESRI Shape files into SQL suitable for insertion into a PostGIS/PostgreSQL database. Let’s see how it works.
Type the following command.
shp2pgsql -g geom gis_osm_places_a_free_1.shp places > places.sql
If everything goes well then a places.sql file should be now be present in the same location as the shapefiles.
Now we can upload the data to the database using the following command:
psql -d mydatabase -f places.sql
If you get a warning such as role does not exist, then it will mean that you have to switch users to postgres using the
sudo su postgres command, then repeat the psql command. Last, you can exit with the
Converting from Shapefile to GeoJSON
We had earlier mentioned that we were going to use PostGIS Preview to view the final output of the sql queries we perform. It only supports data that is in GeoJSON format (format for encoding a variety of geographic data structures).
This therefore means that we will need to convert our shapefile into geojson. We will be using ogr2ogr which is a ogr command line utility that converts one Ogr defined data source to another Ogr data source.
Using this simple command, beginning with the geojson file which we will name it places:
$ ogr2ogr -f geoJSON places.geojson gis_osm_places_a_free_1.shp
Load GeoJSON Data to Database Using ogr2ogr
We can also use ogr2ogr to load the geojson file to the database directly. All we need to do is input the correct datase name, the geojson file and also the table we will be creating.
Since we already have a table named places let’s name this places and load the data by running the following command:
ogr2ogr -append -f "PostgreSQL" PG:"dbname=mydatabase" places.geojson -nln places
If you have followed every step correctly in this file, then you probably have a PostgreSQL database populated with spatial data. You have learned how to populated your database with either a shapefile, sql file or a GeoJSON file. You have also known how to use GDAL to convert a shapefile into a geoJSON as well as check the metadata for a shapefile.
If everything went well you should now proceed to the second part of this tutorial series where you will perform both simple and complex non-spatial and spatial function queries.