By Afzaal Ahmad Zeeshan, Alibaba Cloud Tech Share Author. Tech Share is Alibaba Cloud’s incentive program to encourage the sharing of technical knowledge and best practices within the cloud community.
Since the early days of NoSQL, many database engines have been developed, but MongoDB is without doubt one of the pioneers among scalable, fault-tolerant NoSQL databases. MongoDB is document-oriented, meaning you store documents rather than separate objects and entities. This is the core concept of NoSQL document databases: you don’t manage the relations between entities separately; rather, you store related data together in one document. It’s like, instead of assembling Lego pieces as needed and returning them to their own boxes afterwards, you prepare a complete object and then store it in the box as it is, which saves time each time you have to show what you built. This is what happens when we store objects (the documents!) and data inside the MongoDB server. MongoDB internally takes care of several concerns for us, such as (but not limited to) sharding, replication, indexing, and similar operational management services.
Recently, I stumbled upon Alibaba Cloud’s database services. Their ApsaraDB for MongoDB is a valid candidate here, so I went ahead and gave it a try. In this post, I want to show you how you can deploy your clusters on Alibaba Cloud using ApsaraDB for MongoDB, and how you can manage certain operations on it using the GUI control panel.
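To make the Lego analogy concrete, here is a minimal sketch that uses plain Python dictionaries in place of a real database. The order data, the field names, and the `assemble_order` helper are all made up for illustration; the point is only that the relational layout has to be reassembled on every read, while the document layout stores the finished object.

```python
# Illustrative only: plain dicts stand in for stored records.

# Relational style: the order and its items live in separate "tables"
# and have to be joined again on every read.
orders = [{"order_id": 1, "customer": "Ada"}]
order_items = [
    {"order_id": 1, "sku": "brick-2x4", "qty": 10},
    {"order_id": 1, "sku": "plate-1x2", "qty": 4},
]


def assemble_order(order_id):
    """Rebuild the complete order by joining the two tables."""
    order = next(o for o in orders if o["order_id"] == order_id)
    items = [
        {"sku": item["sku"], "qty": item["qty"]}
        for item in order_items
        if item["order_id"] == order_id
    ]
    return {**order, "items": items}


# Document style: the finished object is stored as a single document,
# so a read returns it without any assembly step.
order_document = {
    "order_id": 1,
    "customer": "Ada",
    "items": [
        {"sku": "brick-2x4", "qty": 10},
        {"sku": "plate-1x2", "qty": 4},
    ],
}
```

Both layouts describe the same order; the difference is where the assembly work happens.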
Pre-Planning and Concepts
Since this post targets operations teams more than the development side, we will talk about how to optimize performance at the infrastructure level rather than how to build better queries, although the two are related: an optimized query also results in better performance. Because we are going to use a hosted solution, the major concern is deciding what our infrastructure has to support. Thus, let us first talk about the basic strategies for capacity planning. Certain parts of the architecture design depend on whether you want the server to serve requests that are read-intensive or write-intensive. MongoDB has built-in features to support write-intensive workloads, as well as functionality to build read-intensive servers by loading the data into memory and then gaining the performance improvements of an in-memory data store.
Read or Write Intensive Workloads
A general audience considers MongoDB a database where you write, read, and perform all sorts of actions as needed. But in most cases the reason to provision MongoDB is either to use it as a front end for high-traffic, high-volume data entry (write-intensive), or to provide a data store for users to load data from (read-intensive). In both of these modes, the structure of MongoDB is flexible enough to act as a cache-like store or as a high-throughput ingestion service. We will take a look at the sharding feature of MongoDB when we check out write-intensive operations, and at replica sets when we talk about read-intensive requirements.
For a quick overview of what I am talking about, and other in-depth details, please watch this session, in which a Solution Architect at MongoDB shares insights on fine-tuning the database and the infrastructure. The title of the talk is Hardware Provisioning. I am sharing it because knowing how much hardware you require will help you choose a suitable ApsaraDB for MongoDB option.
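Why sharding helps with write-intensive workloads can be sketched in a few lines: if each write is routed by a hash of its key, the load spreads across the shards instead of landing on one server. The sketch below is a simplification with hypothetical names (`shard_for`, `distribute`, the `user-N` keys); MongoDB itself assigns ranges of shard key values to chunks and balances those chunks, rather than taking a simple modulo.

```python
import hashlib
from collections import Counter


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index using a stable hash (simplified model)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def distribute(keys, shard_count):
    """Count how many writes each shard would receive."""
    return Counter(shard_for(key, shard_count) for key in keys)


# 10,000 hypothetical writes spread over 3 shards.
writes = [f"user-{n}" for n in range(10_000)]
per_shard = distribute(writes, shard_count=3)

for shard, count in sorted(per_shard.items()):
    print(f"shard {shard}: {count} writes")
```

With a reasonably uniform hash, each shard ends up with roughly a third of the writes, which is the effect a sharded cluster relies on for ingestion throughput.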
Architecture and Service Purchase
Almost everywhere you go, on any platform where you see the terms distributed, sharding, or replication, you will read people recommending a cluster size of 3. The reason is that three nodes give the cluster enough fault tolerance to keep working after losing one node, while the remaining two can still elect a new master node. I read the same thing for DC/OS, I read the same thing in a thread on Stack Exchange, and it is the same thing that Alibaba Cloud has done internally with the ApsaraDB for MongoDB architecture. Although the documentation lacks in-depth details, a good demonstration of the structure can be found here, on the website.
To better understand how the services work and how to get started, let’s check out the portal and create the services ourselves. As already discussed, there are two separate service modes for MongoDB on the Alibaba Cloud platform: one that is better for the read-intensive approach, and another that is good for the write-intensive approach. For the same amount of resources, the sharding approach is expensive (yes, because it is write-intensive!) and the replica set approach is really cheap. Alibaba Cloud automatically creates a 3-node replica set internally: one primary, one secondary, and one hidden node that the platform can use if a node goes down or for load balancing. Either way, the approach is to provide high availability for the service on the platform.
I assume that you have an account with Alibaba Cloud; if not, you can always sign up for a free account and try out these services. The following is the portal page that shows how we can create the services. You can see that the mode is not a configuration option that we set on a service at creation time or afterwards: the two services are separate, and we have to choose whether our server needs to be read-intensive (replicated) or write-intensive (sharded). I am going to use the replica set option; once it is created, what we get is a MongoDB instance of the kind we all know.
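The arithmetic behind the "size of 3" advice is short enough to write down. An election needs a strict majority of the voting members, so the number of nodes a cluster can lose is whatever is left over once that majority is set aside. The function names below are only illustrative.

```python
def majority(voting_members: int) -> int:
    """Smallest number of votes that is more than half of the members."""
    return voting_members // 2 + 1


def fault_tolerance(voting_members: int) -> int:
    """How many members can fail while a majority can still be formed."""
    return voting_members - majority(voting_members)


for n in range(1, 6):
    print(f"{n} members: majority={majority(n)}, can lose {fault_tolerance(n)}")
```

Three is the smallest size that survives the loss of one node, and going from three to four members adds cost without adding any extra tolerance, which is why odd sizes are the usual recommendation.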
Figure 1: ApsaraDB for MongoDB instances, replica or sharding.
Since we are talking about the operations side, it is important to know the difference between these options. Once you click the Create Instance button, you will be redirected to another page where you pay for the service instance. The page shows the instance creation form, which you use to create the service. It also shows the amount you have to pay, as well as the resources for the server, such as CPU/RAM, storage, and other services.
Figure 2: ApsaraDB for MongoDB service purchase, as a Pay-as-you-Go offer.
On this page, from the operational standpoint, we can see everything that will be made available to us. Just set the password and set up the values for the server, such as the amount of data the server has to serve, the hardware resources you require, and whether you want to hide the service behind a private network on the cloud. If all is good, you can go ahead and buy the service. You will be shown the overall order that you are about to place.
You will then be forwarded to another page, where the portal shows that the order has been placed; the instance will be up in a minute or two. After this, you can go back to the portal and manage the service you have just bought. For me, at this time, the service shows the Creating status, so I need to wait a minute or two before I can move ahead and show other controls from the panel.
Figure 3: Replica set showing a newly created instance on the portal.
But this does not prevent us from checking out the service and finding the configuration that has been set up for our server.
Figure 4: ApsaraDB for MongoDB instance overview in the portal.
One thing that I really love about Alibaba Cloud is the concise, grid-like portal. I am more used to Microsoft Azure, and I like having everything right in front of me, unlike in AWS, where I need to visit several pages. This portal gives me a brief overview of everything that I have to check and configure, and it explains everything I need. Now let’s take a look around the portal and configure a few things.
The operational side of a service requires a heavy amount of logging, auditing, and alerting in order to keep track of service health and usage. Alibaba Cloud provides the CloudMonitor service, which is used to set up alerts on the services. In the image above, under Basic Configuration, you will find Alarm Rules; clicking on it takes you to the CloudMonitor service, where you can create the rules for the service.
Another major service provided on the same page is monitoring, which tracks resource usage including, but not limited to, CPU, RAM, network, and IOPS.
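To show what an alarm rule boils down to, here is a small sketch of the usual logic: fire only when a metric stays above a threshold for several consecutive samples, so that a single spike does not page anyone. This is not the CloudMonitor API; `AlarmRule`, `should_alert`, the metric name, and the numbers are hypothetical and only model the idea.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class AlarmRule:
    """Hypothetical rule: alert after N consecutive samples above a threshold."""
    metric: str
    threshold: float
    consecutive_periods: int


def should_alert(rule: AlarmRule, samples: Sequence[float]) -> bool:
    """Return True when the most recent N samples all exceed the threshold."""
    n = rule.consecutive_periods
    if n <= 0 or len(samples) < n:
        return False
    return all(value > rule.threshold for value in samples[-n:])


cpu_rule = AlarmRule(metric="cpu_percent", threshold=80.0, consecutive_periods=3)

print(should_alert(cpu_rule, [50, 85, 90, 95]))  # sustained load
print(should_alert(cpu_rule, [85, 90, 70]))      # load dropped again
```

Requiring consecutive breaches is a common design choice; the trade-off is that the alert arrives a few sampling periods later than the first breach.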
Upgrading the Service Resources
Another thing that the operational side of the database requires is the capability to upgrade the resources or decrease them. So far, I have not discovered an easy way to automate this; you always have to hardcode it. Here is a service API reference that you can check to upgrade and downgrade the services manually; CloudMonitor can help you build this automation yourself. Until then, it’s the portal that is going to help us out, and there we are basically given the same form we used to create the service.
Figure 5: ApsaraDB for MongoDB upgrade process.
The page again shows you everything that you need to see and provides the controls you need. This suggests that the service scales the virtual machine instance that is hosting your MongoDB server.
The last thing from the operational standpoint is backups, which are taken from time to time. We can get into the full details of that part some other time; until then, backups can be taken from the portal by clicking the “Backup Instance” button in the top right corner.
Connecting to the Database
I promised this won’t be a development guide, and it won’t be. But from a simple, typical management point of view, it is worth seeing how Alibaba Cloud provides access to the resources. Like any other cloud vendor, Alibaba Cloud supports multiple ways to expose services, or hide them, in the cloud. Some services are made available on the public internet by default, while others are hidden behind a private virtual network that either requires a virtual private network (VPN) tunnel to communicate with the services, or does not allow any access from outside that cloud network.
Oh, and by the way, at the time of writing this post the MongoDB service is only available on the private virtual network, via ECS or any other service that can connect to it; it is not available on the internet. I have my instance hosted at “dds-gs5ab6eed1f3fa041.mongodb.singapore.rds.aliyuncs.com”, and this command will always result in a timeout.
$ ping dds-gs5ab6eed1f3fa041.mongodb.singapore.rds.aliyuncs.com
Although you never have to do this, you can always check the Alibaba Cloud documentation to see the current connectivity status of the service. For example, it is mentioned over here:
Currently, ApsaraDB for MongoDB is only accessible through ECS intranet. If you must locally access ApsaraDB for MongoDB by the public network, you can install rinetd on the ECS Linux server for port forwarding.
Thus, testing the connection from a local machine is useless; you have to create an ECS instance and then test the connectivity from inside it.
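If you do have a host that can reach the instance, for example an ECS machine on the same network or a forward set up as in the quote above, a TCP connection attempt is a more telling check than ping, because ping relies on ICMP and says nothing about the database port. The sketch below uses only the Python standard library. The `can_connect` helper is illustrative, and the port `3717` in the usage comment is an assumption; use whatever port the instance’s connection information shows.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures.
        return False


# Example usage (run from a machine inside the same network; the port is assumed):
# can_connect("dds-gs5ab6eed1f3fa041.mongodb.singapore.rds.aliyuncs.com", 3717)
```

A `False` result from outside the cloud network is expected here; the same call from an allowed host inside it tells you whether the network path and whitelist are in order.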
Setting the Whitelist IPs
Services that have to connect (yes, even from within the cloud network) need to be covered by a whitelist of IPs; once that is set up, they will be able to connect easily using the DNS names. This makes the service available outside its own pod and the virtual machine it is currently running in, and provides access to the server from multiple IP addresses, or from the IP addresses that your company owns. For example, take a look here:
Figure 6: ApsaraDB for MongoDB network configuration to allow some IP addresses to access the server.
You can read the warning label in the image; the input I provided will enable every service we have on the network, inside the subnet or outside, to access the service and perform database-related operations. This can be useful in many cases where you want almost every virtual machine on the cloud to be able to communicate with the service. The scaling and sharding features of this service make it a good fit for the high-end requirements of your enterprise.
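How a whitelist entry decides who gets in can be sketched with Python’s standard `ipaddress` module. This only models the general CIDR matching idea, not Alibaba Cloud’s actual implementation, and the `is_whitelisted` helper and the sample addresses are made up for illustration.

```python
import ipaddress
from typing import Iterable


def is_whitelisted(client_ip: str, whitelist: Iterable[str]) -> bool:
    """Return True if client_ip falls inside any whitelist entry.

    Entries may be single addresses ("10.0.0.5") or CIDR blocks ("192.168.1.0/24").
    """
    address = ipaddress.ip_address(client_ip)
    return any(
        address in ipaddress.ip_network(entry, strict=False)
        for entry in whitelist
    )


# A narrow whitelist admits only the listed subnet and host.
narrow = ["192.168.1.0/24", "10.0.0.5"]
print(is_whitelisted("192.168.1.20", narrow))
print(is_whitelisted("172.16.0.1", narrow))

# A catch-all block matches every IPv4 address.
print(is_whitelisted("172.16.0.1", ["0.0.0.0/0"]))
```

The catch-all entry is convenient while experimenting, but a narrow list of subnets is the safer choice once the service holds real data.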
Although I am not planning to write a developer’s guide to this hosted MongoDB instance, because MongoDB development in every runtime has already been covered, I want to cover the operational side of the database engine.
So that pretty much sums up the creation, overview, and basic management of the database service on Alibaba Cloud. Among the most important capabilities that this database service provides are scalability, control, and security. The service is a good fit as an internal database engine for your virtual machines, and you can even migrate your existing databases to ApsaraDB for MongoDB.