Docker Container Resource Management: CPU, RAM and IO: Part 1

By Alwyn Botha, Alibaba Cloud Tech Share Author. Tech Share is Alibaba Cloud’s incentive program to encourage the sharing of technical knowledge and best practices within the cloud community.

This tutorial aims to give you practical experience of using Docker container resource limitation functionalities on an Alibaba Cloud Elastic Compute Service (ECS) instance, including:

  • CPU quotas
  • RAM quotas
  • IO bandwidth quotas


You need access to an ECS server with a recent version of Docker already installed. If you don’t have one already, you can follow the steps in this tutorial.

These resource limit tests use 20–30 MB of RAM, so even a server with only 512 MB of total RAM will do.

The CPU tests are done on a server with only 2 cores. You will get more interesting results — for one of the tests — if your server has 4 cores or more.

Some of the CPU tests hog all CPUs for 15 seconds. It would be great for your teammates if you did this tutorial directly on your computer and not on the shared development server.

I am writing this tutorial using CentOS. You can use Debian / Ubuntu. 99% of this tutorial will work on any Linux distro since it mostly uses Docker commands.

You need a very basic understanding of Docker, images, containers and using docker run and docker ps -a.

Clean Up Preparation

It will really help if you have only a few ( preferably no ) containers running. That way you can easily find your tutorial container in docker ps -a output lists.

So stop and prune all the containers you do not need running.

You can quickly do that ( in your DEVELOPMENT environment ) using:

To now remove all containers, run

--memory-reservation


Allows you to specify a soft limit smaller than --memory which is activated when Docker detects contention or low memory on the host machine. If you use --memory-reservation, it must be set lower than --memory for it to take precedence. Because it is a soft limit, it does not guarantee that the container doesn’t exceed the limit.

I am running this on a 1 GB RAM server.

Let’s run 5 containers each reserving 250 MB of RAM.

All 5 containers are running even though I over-reserved RAM by 250 MB. So as a reservation this is hopeless: it does not actually reserve RAM, and it does not prevent over-reservation.

If you run top you will see no virtual RAM allocated. This setting is internal to Docker.

docker stats does not show RAM reservations.

Shows all 5 containers running successfully.

We are finished with these containers. We can stop and then prune them.

--memory and --memory-swap (No Swapping Allowed)


  • -m or --memory=: The maximum amount of memory the container can use. If you set this option, the minimum allowed value is 4m (4 megabytes).
  • --memory-swap: The amount of memory this container is allowed to swap to disk. When both flags are set, this value is the combined total of RAM plus swap, not the swap on its own.
  • If --memory-swap is set to the same value as --memory, and --memory is set to a positive integer, the container does not have access to swap.
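
As a rough illustration of how the two flags relate, the sketch below computes how much swap a container could use for a given pair of limits. It assumes both values are positive whole megabytes and that the swap limit is the combined RAM plus swap total. The function name swap_allowance_mb is made up for this example and is not part of Docker.

```python
def swap_allowance_mb(memory_mb, memory_swap_mb):
    """Return how many MB of swap a container may use.

    Assumes both limits are positive and that memory_swap_mb is the
    combined RAM + swap total, as with --memory and --memory-swap.
    """
    if memory_mb <= 0 or memory_swap_mb <= 0:
        raise ValueError("this sketch only handles positive limits")
    if memory_swap_mb < memory_mb:
        # Docker refuses to start a container configured like this.
        raise ValueError("--memory-swap must not be smaller than --memory")
    return memory_swap_mb - memory_mb


print(swap_allowance_mb(20, 20))  # equal limits: no swap available
print(swap_allowance_mb(20, 30))  # 10 MB of swap on top of 20 MB of RAM
```

The special cases (an unset swap limit, or -1 for unlimited swap) are deliberately left out to keep the arithmetic visible.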

We are now testing no swapping allowed.

We need a tool that allocates RAM one MB at a time, so that we can step just over our defined RAM limits. I decided on Python. ( You do not need to know Python to understand the 4 lines of code used here. )

In the second part of this tutorial we will use actual benchmark tools.

Download Python Docker image if you do not already have it:

Run our container, limiting RAM: --memory=20m --memory-swap=20m

At the shell prompt, enter python3 to start the interactive Python interpreter. Copy and paste the code below. In Python, leading spaces have syntactic meaning, so be careful not to add any extra spaces or tabs to the code.
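
A minimal sketch of such an allocation loop follows. It assumes a list named longstring (the variable name mentioned in this section) that grows by roughly 1 MB of '1' characters per iteration. The iteration count of 17 is an assumption, chosen so the loop steps just past a 20 MB limit when Python itself uses about 5 MB.

```python
longstring = []
for x in range(17):
    # Show how many 1 MB chunks are held so far, then add one more.
    print(len(longstring))
    longstring.append('1' * 10**6)
```

Inside a container limited to 20 MB with no swap, a loop like this is expected to be killed before it completes. Without a limit it simply finishes with 17 chunks in the list.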

Press ENTER to exit the for statement block. This will run the code.

Expected output :

We allocated 20 MB RAM to this container. Python uses 5 MB. The for loop gets killed when it tries to append 16 MB of ‘1’ characters to the longstring variable.

Three things of note:

  • RAM allocations within limit of 20 MB worked
  • RAM allocation that exceeded limit got killed
  • No swap used: allocations did not quietly continue to work by using swap

Summary: --memory and --memory-swap ( No swapping allowed ) works when both are set to the same value. Based on your knowledge of the applications running in your containers you should set those values appropriately.

We are finished with this container. You can stop and prune it.

--memory and --memory-swap (Swapping Allowed)

By specifying --memory=20m and --memory-swap=30m we allow 10 MB of swap.

Let’s see how that works:

At the shell prompt, enter python3 to start the interactive Python interpreter. Copy and paste the code below. In Python, leading spaces have syntactic meaning, so be careful not to add any extra spaces or tabs to the code.

Press ENTER to exit the for statement block. This will run the code.

Expected output :

5 MB RAM used by Python. 25 MB RAM allocated above with no errors.

We specified: --memory=20m --memory-swap=30m

We just used 30 MB, meaning 10 MB is swapped. Let’s confirm by running top in another shell.

As expected: 10 MB swap used. ( You will have to show the SWAP field in top. )

Let’s carefully try to use 2 MB more RAM. The container should run out of RAM.

Cut and paste this in Python editor. Press ENTER to run.

Expected output :

We are finished with this container. You can stop and prune it.

Summary: --memory and --memory-swap ( swapping allowed ) works when --memory-swap is larger than --memory.

Limits enforced perfectly.

You need to specify appropriate limits for your containers in your production environment.

Investigate current prod system RAM usage. Define limits according to those, adding a large margin for error, but still preventing runaway containers from crashing the prod server.

--oom-kill-disable

So far the automatically enabled out-of-memory functionality killed our runaway Python program.

Let’s see what happens if we disable it.

Note the --oom-kill-disable below:

Enter our unsuspecting container:

Enter python3 editor, paste that code, press ENTER to run it.

The container hangs.

Run top in another shell console:

Our container is in state D : uninterruptible sleep

In another shell:

It hangs too.

Let’s use another shell to get our hanging container’s PID so that we can kill it:

Get the PID.

Use top or kill -9 your-PID to kill it.


Do not use --oom-kill-disable

Your hung shells now have a Linux prompt back. You can exit those.

--cpu-shares


--cpu-shares: Set this flag to a value greater or less than the default of 1024 to increase or reduce the container’s weight, and give it access to a greater or lesser proportion of the host machine’s CPU cycles.
This is only enforced when CPU cycles are constrained. When plenty of CPU cycles are available, all containers use as much CPU as they need. In that way, this is a soft limit. --cpu-shares does not prevent containers from being scheduled in swarm mode.
It prioritizes container CPU resources for the available CPU cycles. It does not guarantee or reserve any specific CPU access.
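
To make the idea of relative weights concrete, here is a small sketch that turns share values into expected CPU fractions. It assumes every container is CPU-bound at the same moment and that nothing else competes for the CPU, so each container's fraction is simply its weight divided by the sum of all weights. The function name cpu_proportions is invented for this example; the container names match the ones used later in this section.

```python
def cpu_proportions(shares):
    """Map container name -> fraction of CPU time under full contention."""
    total = sum(shares.values())
    return {name: weight / total for name, weight in shares.items()}


# Share values used in this section, keyed by container name.
shares = {"mycpu100": 100, "mycpu500": 500, "mycpu1024": 1024}
for name, fraction in cpu_proportions(shares).items():
    print(f"{name}: {fraction:.1%}")
```

With these weights the fractions come out at roughly 6%, 31% and 63%. As soon as one container exits, the remaining weights are divided among fewer containers, so the fractions change.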

The plan: run 3 containers providing them with 100, 500 and 1024 CPU-shares.

The following is a terrible test. Carefully read above descriptions again, then read the next 3 commands and see if you can determine why this will not clearly show those CPU proportions allocated correctly.

Please note these CPU tests assume you are running this on your own computer and not on a shared development server. 3 tests hog 100% CPU for 20 seconds.

Later in this tutorial series we will do these tests using our own bench container using actual Linux benchmark tools. We will specifically focus on running these CPU hogs for very short runtimes and still get accurate results.

However please read and follow these CPU tests so that you can learn to get a feeling of how wrong and slow this quick hack testing is.

Note that dd, urandom and md5sum are not bench tools either.

The problem is not the dd or its timing.

Our CPU stress application: time dd if=/dev/urandom bs=1M count=2 | md5sum

Benchmark explanation:

  • time … measures elapsed time: shows those 3 timer lines
  • dd if=/dev/urandom bs=1M count=2 … copies 2 blocks ( count=2 ) of 1 MB each ( bs=1M ) of random data
  • md5sum … calculates an MD5 checksum of that data ( gives the CPU a load )
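
For readers more comfortable with Python than with shell pipelines, the following is a rough equivalent of that pipeline. It is not the command used in the tests. It assumes os.urandom is an acceptable stand-in for reading /dev/urandom, and the function name hash_random_megabytes is made up for this sketch.

```python
import hashlib
import os
import time


def hash_random_megabytes(count, block_size=1024 * 1024):
    """Hash `count` blocks of random bytes, like dd ... | md5sum."""
    digest = hashlib.md5()
    for _ in range(count):
        # Each iteration plays the role of one dd block (bs=1M).
        digest.update(os.urandom(block_size))
    return digest.hexdigest()


start = time.perf_counter()
checksum = hash_random_megabytes(2)  # count=2 in the dd command
elapsed = time.perf_counter() - start
print(f"{checksum}  elapsed: {elapsed:.3f} s")
```

Like the shell version, this only reports elapsed time for a tiny workload, so it is no more of a real benchmark than dd and md5sum are.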

Let’s run it and investigate the results:

Let’s investigate the logs to determine runtimes:

Expected output :

Note all containers used about the same sys cpu time — understandable since they all did the exact same work.

--cpu-shares=100 clearly takes longer, but --cpu-shares=500 is only slightly slower than --cpu-shares=1024.

The problem is that --cpu-shares=1024 runs very fast, then exits.

Then --cpu-shares=500 and --cpu-shares=100 have full access to the CPU.

Then --cpu-shares=500 finishes quickly since it has the most CPU shares.

Then --cpu-shares=100 finishes quickly since it has the most CPU shares: NOTHING else is running.

Consider this problem and how you could solve it.

Figure it out before reading further.

You are welcome to test your solution.

My solution:

All 3 of these containers must run in parallel the whole time. CPU-shares only take effect when the CPU is under stress.

mycpu1024: count must be set to 10 times that of mycpu100
mycpu500: count must be set to 5 times that of mycpu100

This way all 3 containers will probably run for roughly the same time: each one gets a workload proportional to its CPU-shares.

Then divide the mycpu1024 runtime by 10, since it got 10 times the workload of mycpu100
Then divide the mycpu500 runtime by 5, since it got 5 times the workload of mycpu100

It should be very obvious that Docker divided the CPU-shares appropriately.
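
The division step can be sketched in a few lines. The elapsed times below are hypothetical placeholders, NOT measured results. They only assume that all three containers finish in about the same time, as the scaled workloads are meant to achieve. The function name time_per_unit is invented for this example.

```python
def time_per_unit(elapsed_seconds, workload_multiplier):
    """Elapsed time divided by how many times the base workload was run."""
    return elapsed_seconds / workload_multiplier


# (elapsed seconds, workload multiplier) per container.
# The elapsed values are made up for illustration only.
runs = {
    "mycpu100": (30.0, 1),
    "mycpu500": (30.0, 5),
    "mycpu1024": (30.0, 10),
}
base = time_per_unit(*runs["mycpu100"])
for name, (elapsed, multiplier) in runs.items():
    per_unit = time_per_unit(elapsed, multiplier)
    speedup = base / per_unit
    print(f"{name}: {per_unit:.1f} s per unit of work, {speedup:.0f}x mycpu100")
```

If the per-unit times you actually measure stand in a ratio close to 10 : 5 : 1 in favour of the higher share values, the CPU-shares were honoured.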

Busy Docker administrator shortcut / quick method:

Submit all the above containers to run again.

Have the following ready to run as well.

--cpu-shares=250 and --cpu-shares=200 containers

Then in another shell run docker stats and press ctrl c to freeze display.

It should be obvious the CPU-shares got allocated correctly.

Clean up containers:

--cpu-shares Identically Allocated

--cpu-shares: Set this flag to increase or reduce the container’s weight, and give it access to a greater or lesser proportion of the host machine’s CPU cycles.

This means that containers with equal --cpu-shares settings should get equal shares of CPU time.

Let’s have 3 containers running, all with CPU-shares = 1024.


As expected, all 3 containers get the same percentage of CPU time.

Just to confirm that they all ran the same elapsed times

Prune containers, we are done with them.

