First, we need to understand the items that take up RDS storage. There are five storage types in the RDS console: total disk storage, data storage, log storage, temporary file storage, and system file storage. In the following sections, we will look at the four of these storage types that typically cause confusion for RDS users.
Data files are the files in which the database stores its data. In the database, they correspond to tables, which are mainly composed of data and indexes. Therefore, when you find that your data files are occupying a great deal of instance storage, you should check which tables are most responsible. You can see which tables are using the most storage through the data dictionary:
select TABLE_SCHEMA, TABLE_NAME, INDEX_LENGTH/1024/1024 as index_M, DATA_LENGTH/1024/1024 as data_M from information_schema.TABLES order by (INDEX_LENGTH+DATA_LENGTH) desc limit 10;
Preparation prevents poor performance, so we should take certain precautions against excessive storage use starting from the application design phase:
- The size of disk space depends on potential data growth trends;
- Whether to delete or archive data depends on the data retention period;
- When designing tables, select reasonable data types, field sizes, and storage engines, and plan for database sharding and table sharding.
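The retention point above can be sketched as a batched delete. This is a minimal sketch, not a prescribed procedure: the table `app_events`, its indexed `created_at` column, and the 90-day retention period are all hypothetical.

```sql
-- Assumption: app_events is a hypothetical table with an index on created_at,
-- and rows older than 90 days are no longer needed.
-- Deleting in small batches keeps each transaction short.
DELETE FROM app_events
WHERE created_at < NOW() - INTERVAL 90 DAY
ORDER BY created_at
LIMIT 1000;
-- Repeat the statement until it reports 0 rows affected.
```

Deleting rows does not necessarily shrink the table's file on disk right away; the freed space is reused by later inserts.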
ApsaraDB for RDS MySQL uses a master/slave M-M high availability architecture. Master/slave data synchronization relies on binlog. RDS regularly backs up the logs to OSS and then clears the local binlog to reduce the amount of space it takes up. When log storage encounters an error like the one in the figure below, the rate at which binlog grows may exceed the upload speed from RDS to OSS.
In that case, binlog accumulates on local storage faster than it can be cleared. Under such circumstances, you must optimize the database to reduce the rate at which data is changed.
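To see how much local space binlog currently occupies, standard MySQL statements like the following may help. This is a sketch only; on a managed instance, the account privileges needed for these statements may be restricted.

```sql
-- List the binary log files currently kept on the instance, with sizes in bytes.
SHOW BINARY LOGS;

-- Check the logging format; with ROW format, every changed row is logged,
-- so bulk updates produce correspondingly large binlog volumes.
SHOW VARIABLES LIKE 'binlog_format';
```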
Temporary files are files written from memory to disk when the database performs large operations and memory proves insufficient. Significant database operations (order by, group by, distinct) on large data sets may therefore create substantial temporary files.
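A rough way to check whether queries are spilling to disk is to look at the server's temporary-table counters and the related memory limits. The statements below are standard MySQL; how to interpret the numbers depends on your workload.

```sql
-- Counters for temporary tables and files created since the server started.
-- A fast-growing Created_tmp_disk_tables suggests queries are spilling to disk.
SHOW GLOBAL STATUS LIKE 'Created_tmp%';

-- Memory limits that influence when an in-memory temporary table moves to disk.
SHOW VARIABLES LIKE 'tmp_table_size';
SHOW VARIABLES LIKE 'max_heap_table_size';
```

Running EXPLAIN on a suspect query and looking for "Using temporary" or "Using filesort" in the Extra column can point to the statements responsible.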
System files are the files created during the installation of the database. These files are crucial to its normal operation. For MySQL, they include ibdata1 and ib_logfile0. The graphic below shows “other files” taking up a large amount of space. In this situation, you can refer to how to locate the problem causing ibdata1 to grow continuously.
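One possible starting point, offered here as an assumption rather than something stated above, is to look for long-running transactions, since InnoDB must keep undo information for them in the system tablespace on many configurations. A minimal sketch:

```sql
-- List the oldest open InnoDB transactions.
-- Assumption: long-lived transactions are a candidate cause worth ruling out;
-- this text does not say what is actually causing the growth.
SELECT trx_id, trx_state, trx_started, trx_mysql_thread_id
FROM information_schema.INNODB_TRX
ORDER BY trx_started
LIMIT 10;
```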
For common scenarios for optimizing MySQL database storage, see this article.
Related Blog Posts
You can deploy fast, secure, and trusted MySQL database instances from Alibaba Cloud. Alibaba Cloud has an advanced network of cloud-based technologies, and its performance and flexible billing have enabled cloud without borders for its more than one million paying customers.
Alibaba Cloud has continued to contribute substantially to the open source communities and has empowered developers worldwide. Alibaba Cloud was the winner of the prestigious 2018 MySQL Corporate Contributor Award and is also a platinum sponsor of the MariaDB Foundation.
In this guide, we will take you through the steps of optimizing SQL queries and databases on your Alibaba Cloud Elastic Compute Service (ECS) instance. This will help improve the stability, scalability, reliability, and speed of applications and websites running on your Alibaba Cloud instance.
Tip 1: Index All Columns Used in ‘where’, ‘order by’ and ‘group by’ Clauses
Apart from guaranteeing uniquely identifiable records, an index allows the MySQL server to fetch results from a database faster. An index is also very useful when it comes to sorting records.
MySQL indexes take up additional space and can decrease performance on inserts, deletes, and updates. However, on all but the smallest tables, they can considerably reduce select query execution time.
It is always advisable to test MySQL queries with a ‘worst case scenario’ sample amount of data to get a clearer picture of how the query will behave on production.
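As an illustration of this tip, here is a minimal sketch using a hypothetical `students` table; the table, column, and index names are assumptions made for the example.

```sql
-- Hypothetical table used only for illustration.
CREATE TABLE students (
  student_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  first_name VARCHAR(50) NOT NULL,
  last_name  VARCHAR(50) NOT NULL,
  class_id   INT UNSIGNED NOT NULL
);

-- Composite index covering the column filtered on (class_id)
-- followed by the column sorted on (last_name).
CREATE INDEX idx_students_class_last ON students (class_id, last_name);

-- EXPLAIN shows which index, if any, the optimizer chooses for the query.
EXPLAIN
SELECT first_name, last_name
FROM students
WHERE class_id = 7
ORDER BY last_name;
```

Comparing the EXPLAIN output before and after creating the index, on a realistic volume of data, is a simple way to confirm that the index is actually used.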
Tip 2: Optimize Like Statements with Union Clause
Sometimes, you may want to run queries that use the ‘or’ operator on different fields or columns in a particular table. When the ‘or’ keyword is used too much in a where clause, it might cause the MySQL optimizer to choose a full table scan to retrieve a record.
A union clause can make the query run faster, especially if you have one index that can optimize one side of the query and a different index that can optimize the other side.
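The rewrite can be sketched as follows, assuming a hypothetical `students` table with separate indexes on `first_name` and `last_name`. Whether the union form is actually faster depends on the MySQL version and the data, so it is worth comparing both with EXPLAIN.

```sql
-- Assumption: students(student_id, first_name, last_name) exists, with one
-- index on first_name and another on last_name.

-- Original form: a single query with 'or' across two columns.
SELECT student_id, first_name, last_name
FROM students
WHERE first_name LIKE 'Ade%' OR last_name LIKE 'Ade%';

-- Rewritten form: each branch can use its own index.
-- UNION (without ALL) removes rows matched by both branches.
SELECT student_id, first_name, last_name
FROM students
WHERE first_name LIKE 'Ade%'
UNION
SELECT student_id, first_name, last_name
FROM students
WHERE last_name LIKE 'Ade%';
```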
Tip 3: Avoid Like Expressions with Leading Wildcards
MySQL is not able to utilize indexes when there is a leading wildcard in a query. For example, on a students table, a search whose pattern begins with ‘%’ will cause MySQL to perform a full table scan even if you have indexed the ‘first_name’ field on that table.
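The difference can be observed with EXPLAIN. This sketch assumes a hypothetical `students` table with an index on `first_name`; the search value is made up.

```sql
-- Leading wildcard: the index on first_name cannot be used to narrow the search.
EXPLAIN
SELECT student_id, first_name
FROM students
WHERE first_name LIKE '%Ade';

-- Prefix pattern: the same index can be used for a range scan.
EXPLAIN
SELECT student_id, first_name
FROM students
WHERE first_name LIKE 'Ade%';
```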
Tip 4: Take Advantage of MySQL Full-text Searches
If you need to search data using wildcards and you don’t want your database to underperform, you should consider using MySQL full-text search (FTS), because it is much faster than queries using wildcard characters.
Furthermore, FTS can also return better and more relevant results when you are searching a huge database.
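A minimal full-text sketch, again assuming a hypothetical `students` table with `first_name` and `last_name` columns. Note that full-text search matches whole words rather than arbitrary substrings, so it is not a drop-in replacement for every wildcard query.

```sql
-- Add a full-text index over the columns to be searched
-- (the index name is an arbitrary choice for this example).
ALTER TABLE students
  ADD FULLTEXT INDEX ft_students_name (first_name, last_name);

-- The column list in MATCH must be the same as the indexed column list.
SELECT student_id, first_name, last_name
FROM students
WHERE MATCH(first_name, last_name) AGAINST('Ade' IN NATURAL LANGUAGE MODE);
```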
Tip 5: Optimize Your Database Schema
Even if you optimize your MySQL queries, a poorly designed database structure can still bring performance to a halt as your data grows.
Tip 6: MySQL Query Caching
If your website or application performs a lot of select queries (e.g. WordPress), you should take advantage of the MySQL query caching feature. This will speed up performance when read operations are conducted. Note that the query cache is available in MySQL 5.7 and earlier; it was removed in MySQL 8.0.
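On a server version that still includes the query cache (MySQL 5.7 and earlier), its configuration and effectiveness can be inspected with the statements below. This is a sketch for checking the current state, not a tuning recommendation.

```sql
-- Configuration: query_cache_type and query_cache_size control whether
-- the cache is enabled and how much memory it may use.
SHOW VARIABLES LIKE 'query_cache%';

-- Runtime counters: compare Qcache_hits with Qcache_inserts to judge
-- whether cached results are actually being reused.
SHOW STATUS LIKE 'Qcache%';
```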
MySQL Cluster is a distributed database management system that implements a multi-master, shared-nothing architecture. In this design, data is replicated over a group of servers to avoid a single point of failure. Any data node can update the database, and the changes are propagated to the rest of the nodes in real time.
The clustered MySQL database approach is tolerant of failures thanks to hardware and software redundancy across multiple servers. Data is mirrored and replicated, and transaction logs are copied to all servers to ensure high availability.
The MySQL server uses the NDB (Network DataBase) storage engine to access data from the cluster. The NDB or NDBCLUSTER engine is transaction safe (ACID compliant) and has a high degree of uptime and availability.
NDB Cluster provides auto-sharding (partitioning of data across nodes to scale out SQL requests without requiring frontend application changes). The database architecture is quite fast due to in-memory tables and indexes that ensure low latency and high responsiveness.
Step 1. Defining MySQL NDB Cluster Management Node (ndb_mgmd)
- Download NDB Management Server
- Installing the NDB Management Server
- Configuring the MySQL NDB Management Node
Step 2. Configuring MySQL Cluster Data Nodes on Alibaba Cloud
Step 3. Configuring SQL Cluster Node on Alibaba Cloud
Step 4. Testing MySQL NDB Cluster on Alibaba Cloud
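The testing step could, for instance, be approached from an SQL node as sketched below. The table name and data are hypothetical, and the sketch assumes the management, data, and SQL nodes from the earlier steps are already running.

```sql
-- Create a table stored in the cluster by selecting the NDB engine.
CREATE TABLE cluster_test (
  id   INT NOT NULL PRIMARY KEY,
  note VARCHAR(100)
) ENGINE=NDBCLUSTER;

INSERT INTO cluster_test (id, note) VALUES (1, 'written through one SQL node');

-- Confirm the engine actually used for the table.
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_NAME = 'cluster_test';

-- If more than one SQL node is attached, the row should also be
-- visible when the same query is run from another SQL node.
SELECT id, note FROM cluster_test;
```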
For MySQL instances, you can modify some parameters using the console. Setting inappropriate values for key parameters may reduce instance performance and cause application errors. This article details recommended optimal values for key parameters to help reduce risks when setting parameters.
Data Lake Analytics (DLA) serves as the hub for in-cloud data processing. It allows you to query and analyze the data in a single ApsaraDB for RDS (RDS), Table Store, OSS, ApsaraDB for MongoDB (MongoDB), or ApsaraDB for Redis (Redis) instance through standard Java Database Connectivity (JDBC). To support business growth or data sharding, you may need to create multiple instances of RDS, Table Store, OSS, or other types to store data. In such multi-data source scenarios, you can use DLA to perform joint queries in multiple data sources of the same type.
This topic takes two ApsaraDB RDS for MySQL (MySQL) instances as an example to demonstrate how to use DLA to perform joint queries in multiple database instances.
Data Lake Analytics does not require any ETL tools. This service allows you to use standard SQL syntax and business intelligence (BI) tools to efficiently analyze your data stored in the cloud at extremely low cost.
Cloud native federation analytics across multiple data sources: OSS, PostgreSQL, MySQL (RDS), NoSQL (Table Store), etc.
Alibaba Cloud offers fully managed database services. We monitor, back up, and recover your database automatically so that you can fully focus on your business development. To provide more stable and scalable database services, Alibaba Cloud has optimized the source code of the open-source database engines. Our database services, such as ApsaraDB RDS for MySQL and ApsaraDB RDS for PPAS, carry lower risk compared to other services that use the community edition.