Docker Container Resource Management: CPU, RAM and IO: Part 3

13 min readMar 25, 2019

By Alwyn Botha, Alibaba Cloud Tech Share Author. Tech Share is Alibaba Cloud’s incentive program to encourage the sharing of technical knowledge and best practices within the cloud community.

This tutorial aims to give you practical experience of using Docker container resource limitation functionalities on an Alibaba Cloud Elastic Compute Service (ECS) instance.

— cpuset-cpus Using Sysbench

Syntax:

CPUs in which to allow execution (0–3, 0,1)

Do the bench test using both CPUs:

docker run -it --rm --name mybench  --cpuset-cpus 0,1 centos:bench /bin/sh -c 'time sysbench --threads=2 --events=4 --cpu-max-prime=800500 --verbosity=0 cpu run'real    0m1.957s
user    0m3.813s
sys     0m0.025s

Do the bench test using CPU 0:

docker run -it --rm --name mybench  --cpuset-cpus 0 centos:bench /bin/sh -c 'time sysbench --threads=2 --events=4 --cpu-max-prime=800500 --verbosity=0 cpu run'real    0m3.789s
user    0m3.740s
sys     0m0.029s

Do the bench test using CPU 1:

docker run -it --rm --name mybench  --cpuset-cpus 1 centos:bench /bin/sh -c 'time sysbench --threads=2 --events=4 --cpu-max-prime=800500 --verbosity=0 cpu run'real    0m3.809s
user    0m3.761s
sys     0m0.025s

Result: 2 CPUs are faster than one.

If your server has more than 2 CPUs you can run more tests using combinations of cpuset-CPUs.

Just note that your — threads= setting must be equal to number of CPUs tested: each CPU must have its separate thread workload.

If you want it to be really fast, you may consider reducing the max-prime number a hundred fold. However, as you can see below, such short runtimes have so much startup CPU processing overhead in it, the elapsed real wall-clock times do not clearly show 2 CPUs twice as fast as one. Therefore do not make tests runs too short.

docker run -it --rm --name mybench  --cpuset-cpus 0,1 centos:bench /bin/sh -c 'time sysbench --threads=2 --events=4 --cpu-max-prime=8005 --verbosity=0 cpu run'real    0m0.053s
user    0m0.023s
sys     0m0.017s

2 CPUs above versus 1 below:

docker run -it --rm --name mybench  --cpuset-cpus 0 centos:bench /bin/sh -c 'time sysbench --threads=2 --events=4 --cpu-max-prime=8005 --verbosity=0 cpu run' real    0m0.066s
user    0m0.025s
sys     0m0.023s

Docker — cpuset-cpus training complete.

sysbench training just started: exec into centos:bench container, and run sysbench — help to continue your training.

— cpu-shares Using Sysbench

Earlier we determined this runs for .277 seconds using 2 threads.

docker run -it — rm — name mybench — cpus 1 centos:bench /bin/sh -c ‘time sysbench — threads=2 — events=4 — cpu-max-prime=100500 — verbosity=0 cpu run’

We only want to test relative CPU time shared ratios, so 1 thread is enough.

We only want to test relative CPU time shared ratios, so 1 cpu is enough.

Run 1: Start another shell and run:

for var in `seq 1 20`; do docker stats --no-stream ; done

Original shell:

docker run -d --name mybench100  --cpus 1 --cpu-shares=100  centos:bench /bin/sh -c 'for var in `seq 1 10`; do time sysbench --threads=1 --events=4 --cpu-max-prime=50500 --verbosity=0 cpu run ; done'
docker run -d --name mybench500  --cpus 1 --cpu-shares=500  centos:bench /bin/sh -c 'for var in `seq 1 10`; do time sysbench --threads=1 --events=4 --cpu-max-prime=50500 --verbosity=0 cpu run ; done'
docker run -d --name mybench1024  --cpus 1 --cpu-shares=1024  centos:bench /bin/sh -c 'for var in `seq 1 10`; do time sysbench --threads=1 --events=4 --cpu-max-prime=50500 --verbosity=0 cpu run ; done'

It does not work — all tests ran too fast. docker stats does not catch running of even one container.

More importantly, shares=100 finished before shares=500 could even start. The idea is to have all 3 containers run concurrently. -cpu-shares — we want to see CPUs being shared proportionately based on each container setting.

New approach: let each container sleep until the same time and then they all run their benchmarks simultaneously. Note sleep commands below:

Run 2: Start another shell and run:

for var in `seq 1 20`; do docker stats --no-stream ; done

Original shell immediately afterwards:

docker run -d --name mybench100  --cpu-shares=100  centos:bench /bin/sh -c 'sleep 4; for var in `seq 1 3`; do time sysbench --threads=1 --events=4 --cpu-max-prime=500500 --verbosity=0 cpu run ; done'
docker run -d --name mybench200  --cpu-shares=200  centos:bench /bin/sh -c 'sleep 3; for var in `seq 1 3`; do time sysbench --threads=1 --events=4 --cpu-max-prime=500500 --verbosity=0 cpu run ; done'
docker run -d --name mybench300  --cpu-shares=300  centos:bench /bin/sh -c 'sleep 2; for var in `seq 1 3`; do time sysbench --threads=1 --events=4 --cpu-max-prime=500500 --verbosity=0 cpu run ; done'
docker run -d --name mybench600  --cpu-shares=600  centos:bench /bin/sh -c 'for var in `seq 1 4`; do time sysbench --threads=1 --events=4 --cpu-max-prime=500500 --verbosity=0 cpu run ; done'

This only provides 2 seconds of useful output:

CONTAINER ID        NAME                CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
d3744f5a52ee        mybench600          94.14%              1.23MiB / 985.2MiB    0.12%               508B / 0B           13.1MB / 0B         0
5c694381ef2f        mybench300          50.96%              1.34MiB / 985.2MiB    0.14%               508B / 0B           12.4MB / 0B         0
8894176a9b72        mybench200          34.10%              1.215MiB / 985.2MiB   0.12%               508B / 0B           12.4MB / 0B         0
8b783befba46        mybench100          15.71%              1.223MiB / 985.2MiB   0.12%               578B / 0B           12.4MB / 0B         0CONTAINER ID        NAME                CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
d3744f5a52ee        mybench600          93.73%              1.348MiB / 985.2MiB   0.14%               578B / 0B           13.1MB / 0B         0
5c694381ef2f        mybench300          55.25%              1.219MiB / 985.2MiB   0.12%               578B / 0B           13.1MB / 0B         0
8894176a9b72        mybench200          33.25%              1.223MiB / 985.2MiB   0.12%               578B / 0B           12.4MB / 0B         0
8b783befba46        mybench100          16.21%              1.227MiB / 985.2MiB   0.12%

The CPU % matches the — cpu-shares we specified.

— cpu-shares prioritizes container CPU resources for ALL the available CPU cycles. Our tests above used 200% = ALL CPU resources during their test runs. If you have an 8 or 16 core server you do not want to do that.

Purpose of run 3: limit our test run to only ONE CPU. Pick any one CPU number based on your server configuration.

Run 3: Start another shell and run:

for var in `seq 1 20`; do docker stats --no-stream ; done

Original shell immediately afterwards:

docker run -d --name mybench100 --cpuset-cpus 0 --cpu-shares=100  centos:bench /bin/sh -c 'sleep 4; for var in `seq 1 3`; do time sysbench --threads=1 --events=4 --cpu-max-prime=500500 --verbosity=0 cpu run ; done'
docker run -d --name mybench200 --cpuset-cpus 0 --cpu-shares=200  centos:bench /bin/sh -c 'sleep 3; for var in `seq 1 3`; do time sysbench --threads=1 --events=4 --cpu-max-prime=500500 --verbosity=0 cpu run ; done'
docker run -d --name mybench300 --cpuset-cpus 0 --cpu-shares=300  centos:bench /bin/sh -c 'sleep 2; for var in `seq 1 3`; do time sysbench --threads=1 --events=4 --cpu-max-prime=500500 --verbosity=0 cpu run ; done'
docker run -d --name mybench600 --cpuset-cpus 0 --cpu-shares=600  centos:bench /bin/sh -c 'for var in `seq 1 4`; do time sysbench --threads=1 --events=4 --cpu-max-prime=500500 --verbosity=0 cpu run ; done'

Condensed expected output :

CONTAINER ID        NAME                CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
61a4f3d31614        mybench600          49.88%              1.215MiB / 985.2MiB   0.12%               648B / 0B           13.1MB / 0B         0
199ee60e6e1d        mybench300          24.92%              1.348MiB / 985.2MiB   0.14%               648B / 0B           13.1MB / 0B         0
ee9b719a6d88        mybench200          16.77%              1.23MiB / 985.2MiB    0.12%               648B / 0B           12.4MB / 0B         0
016fc5d86d68        mybench100          8.26%               1.227MiB / 985.2MiB   0.12%               648B / 0B           12.4MB / 0B         0

Success. We used overall 100% of CPU power ( one CPU out of 2 = 200% ).

Run 4:

Different approach: I want to restrict container to 10% of CPU power, and within that let CPU-shares be destributed as specified. See output below.

Unfortunately this does not work:

docker run -d --name mybench100  --cpus .1 --cpu-shares=100  centos:bench /bin/sh -c 'sleep 2; for var in `seq 1 4`; do time sysbench --threads=1 --events=4 --cpu-max-prime=500500 --verbosity=0 cpu run ; done'
docker run -d --name mybench200  --cpus .1 --cpu-shares=200  centos:bench /bin/sh -c 'sleep 2; for var in `seq 1 3`; do time sysbench --threads=1 --events=4 --cpu-max-prime=500500 --verbosity=0 cpu run ; done'
docker run -d --name mybench300  --cpus .1 --cpu-shares=300  centos:bench /bin/sh -c 'sleep 1; for var in `seq 1 3`; do time sysbench --threads=1 --events=4 --cpu-max-prime=500500 --verbosity=0 cpu run ; done'
docker run -d --name mybench600  --cpus .1 --cpu-shares=600  centos:bench /bin/sh -c 'for var in `seq 1 3`; do time sysbench --threads=1 --events=4 --cpu-max-prime=500500 --verbosity=0 cpu run ; done'

— cpus .1 overrides — CPU-shares setting. Each container gets to use 10% CPU.

docker stats --no-stream outputCONTAINER ID        NAME                CPU %               MEM USAGE / LIMIT     MEM %               NET I/O             BLOCK I/O           PIDS
d1efffa6b398        mybench600          9.98%               1.223MiB / 985.2MiB   0.12%               648B / 0B           13.1MB / 0B         0
e5761b1fd9ae        mybench300          9.94%               1.227MiB / 985.2MiB   0.12%               648B / 0B           13.1MB / 0B         0
d25746948b3d        mybench200          9.95%               1.348MiB / 985.2MiB   0.14%               648B / 0B           13.1MB / 0B         0
9003482281bd        mybench100          9.94%               1.227MiB / 985.2MiB   0.12%

— cpuset-mems

Only works on servers with NUMA architecture, for example Opterons and Nelahem Xeons.

Specifies the memory nodes (MEMs) in which to allow execution (0–3, 0,1)

I do not have a high-end server, so I was not able to use these options. Basically cpuset can be used as follows:

— cpuset-cpus are used for non-NUMA servers
— cpuset-mems are used for NUMA servers

— memory and — memory-swap (No Swapping Allowed) Using Stress Bench Tool

Run our container, limiting RAM: — memory=20m — memory-swap=20m

docker container run -ti — rm — memory=20m — memory-swap=20m — name memtest centos:bench /bin/sh

Enter commands as shown at sh-4.2# shell prompt.

Syntax:

stress — vm 1 — vm-bytes 18M

Allocates 1 vm worker
Allocates 18 MB of RAM

sh-4.2# stress --vm 1 --vm-bytes 18M
stress: info: [50] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd

This works since its within the — memory=20m limit specified.

Press CTRL-c to exit the stress test.

sh-4.2# stress --vm 1 --vm-bytes 19M
stress: info: [53] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
stress: FAIL: [53](415) <-- worker 54 got signal 9
stress: WARN: [53](417) now reaping child worker processes
stress: FAIL: [53](451) failed run completed in 0s

Allocating 19 MB RAM failed since container is out of RAM. ( container itself has some startup overhead RAM used )

Read stress help.

sh-4.2# stress

--vm-hang N sleep N secs before free (default none, 0 is inf)

Right now if we allocate RAM it gets allocated in a fast loop eating 100% CPU time. We could use this vm-hang option to let stress only allocate RAM every xxx seconds.

Unfortunately our earlier crash uses some RAM, so testing 18M now crashes as well.

Therefore use 17 MB RAM allocation.

sh-4.2# stress --vm 1 --vm-bytes 17M
stress: info: [64] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd

Open another shell and run top Sort by RES or SHR and you should find stress near top of list using 100% CPU.

Now run:

sh-4.2# stress --vm 1 --vm-bytes 17M --vm-hang 3
stress: info: [70] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd

In top you will see much less CPU usage.

Final test:

sh-4.2# stress --vm 1 --vm-bytes 30M --vm-hang 3
stress: info: [72] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
stress: FAIL: [72](415) <-- worker 73 got signal 9
stress: WARN: [72](417) now reaping child worker processes
stress: FAIL: [72](451) failed run completed in 0ssh-4.2# exit

stress fail since we only have 20 MB of RAM limit for this container.

— memory and — memory-swap are equal meaning no swapping allowed so stress fails.

Summary : set — memory = — memory-swap to restrict container : No swapping allowed

Note how very easy and quick it is to do these tests once you have a Docker image with bench tools inside.

— memory and — memory-swap (Swapping Allowed) Using Stress Bench Tool

Run our container, limiting RAM: — memory=20m — memory-swap=30m

The options above limit our container to 10 MB swap.

docker container run -ti — rm — memory=20m — memory-swap=30m — name memtest centos:bench /bin/sh

Enter commands as shown at sh-4.2# shell prompt.

sh-4.2# stress --vm 1 --vm-bytes 25M --vm-hang 3
stress: info: [9] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd

25 MB allocation works — 5+ MB swap used.

Use top in another shell. Show the SWAP column. Note stress using swap.

Tasks: 125 total,   2 running, 123 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.3 us,  4.9 sy,  0.0 ni, 83.5 id, 11.1 wa,  0.0 hi,  0.2 si,  0.0 st
MiB Mem :  985.219 total,  460.227 free,  166.090 used,  358.902 buff/cache
MiB Swap: 1499.996 total, 1492.684 free,    7.312 used.  636.930 avail Mem  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND             SWAP
 2565 root      20   0   32.3m  18.0m        S   9.0  1.8   0:24.97 stress              7.1m

CTRL-c to exit the stress run.

sh-4.2# stress --vm 1 --vm-bytes 35M --vm-hang 3
stress: info: [11] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
stress: FAIL: [11](415) <-- worker 12 got signal 9
stress: WARN: [11](417) now reaping child worker processes
stress: FAIL: [11](451) failed run completed in 0ssh-4.2# exit

Trying to overallocate RAM fails as expected.

— memory=20m — memory-swap=30m

As expected allocating RAM within 10 MB swap limit works.

— memory-swap Is Not Set

If — memory-swap is unset, and — memory is set, the container can use twice as much swap as the — memory setting, if the host container has swap memory configured. For instance, if — memory=”300m” and — memory-swap is not set, the container can use 300m of memory and 600m of swap.

Let’s see this in action:

Run our container, limiting RAM: — memory=20m — memory-swap UNSET

Based on the docs our container should have access tp The options above limit our container to 10 MB swap.

docker container run -ti --rm  --memory=20m --memory-swap=30m --name memtest centos:bench /bin/sh

Enter commands as shown at sh-4.2# shell prompt.

sh-4.2# stress --vm 1 --vm-bytes 35M --vm-hang 3
stress: info: [9] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
^C

Above: allocating less than twice memory setting works.

sh-4.2# stress --vm 1 --vm-bytes 38M --vm-hang 3
stress: info: [11] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
stress: FAIL: [11](415) <-- worker 12 got signal 9
stress: WARN: [11](417) now reaping child worker processes
stress: FAIL: [11](451) failed run completed in 1s

Above: allocating more than twice memory setting fails. Important: consider container other RAM overheads.

sh-4.2# stress --vm 1 --vm-bytes 37M --vm-hang 3
stress: info: [13] dispatching hogs: 0 cpu, 0 io, 1 vm, 0 hdd
^C

Above: allocating twice memory setting works — if we consider other RAM already allocated ( 3 MB in this case).

sh-4.2# exit

Read/Write : IO Rate Limiting

These settings allow you to limit the IO rates of your containers:

— device-read-bps Limit read rate (bytes per second) from a device
— device-read-iops Limit read rate (IO per second) from a device
— device-write-bps Limit write rate (bytes per second) to a device
— device-write-iops Limit write rate (IO per second) to a device
— io-maxbandwidth Maximum IO bandwidth limit for the system drive (Windows only)
— io-maxiops Maximum IOps limit for the system drive (Windows only)

As you can see the last 2 are Windows-only. I will not be covering them since you can easily test it using the same principles applied here for device-write-bps

Syntax:

— device-write-bps /dev/your-device:IO-rate-limit ( for example: 1mb )

This tutorial will not use named volumes ( a separate Docker topic all by itself ). It would have made it a bit easier — by having a fixed and pre-allocated device name ready for use.

Unfortunately we need to determine our device mapper logical device number for our container before we can set a rate limit. But we only get this number after the container starts.

Docker allocates device mapper logical device number sequentially to anon volumes.

So we have to start up our container, peek at that number, then exit the container.

Then we can start container again — this time using this number in our — device-write-bps setting.

It is easier understandable when doing this: so here goes:

Run:

df -h

Observe bottom of listing, hopefully you have few containers running, with no /dev/dm-xxx numbers.

Run:

docker run -it --rm alpine:3.8 /bin/sh

We now have running container with a new dm number allocated.

In a different shell run:

df -h

Can you find the new dm number allocated — at bottom of list? That is the number you need.

Back at previous shell — enter exit to exit that container. The — rm means the container will auto self destruct.

Replace the 9999 with your number.

I use CentOS 7 as image. You can probably use any Debian / Ubuntu image you have already downloaded.

You cannot use Alpine — its dd command does not support the oflag option.

docker run -it — rm — device-write-bps /dev/dm-9999:1mb centos:7 /bin/sh

Enter dd command as shown:

sh-4.2# time dd if=/dev/zero of=test.out bs=1M count=10 oflag=direct
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 10.0027 s, 1.0 MB/sreal    0m10.009s
user    0m0.001s
sys     0m0.015s
sh-4.2# exit
exit

The dd command copies 10 MB of zeros to test.out ( if = input file, of = output file ).

Real is wall clock time: 1 MB/s times 10 seconds = 10 MB copied.

— device-write-bps works. Data write rate is limited exactly as we specified.

Compose File Resource Limits

This tutorial only gave you experience with CPU and RAM resource limits using the easy RUN command settings.

Please refer to the docs to limit resources in compose files.

https://docs.docker.com/compose/compose-file/#resources

https://docs.docker.com/compose/compose-file/compose-file-v2/#cpu-and-other-resources

— cap-add=sys_nice

Syntax:

renice [-n] priority process-pid
renice alters the scheduling priority of one or more running processes.
The first argument is the priority value to be used.
The other arguments are interpreted as process IDs

Processes inside containers by default are not allowed to change their priorities. You get an error message:

sh-4.2#  renice -n -10 1
renice: failed to set priority for 1 (process ID): Permission denied

If you run the container with — cap-add=sys_nice you allow it to change scheduling priorities.

docker run -it --rm --name mybench --cap-add=sys_nice centos:bench /bin/shsh-4.2# renice -n -10 1
1 (process ID) old priority 0, new priority -10
sh-4.2# renice -n -19 1
1 (process ID) old priority -10, new priority -19
sh-4.2# renice -n 19 1
1 (process ID) old priority -19, new priority 19
sh-4.2# exit
exit

Based on the syntax above we are changing the scheduling priority of the process with pid = 1.

Use top in another shell after every renice. Sort by priority column. You should be able to easily find your sh process with the changed priority.

Fortunately, our container is not doing CPU-intensive work: playing with its priority has no effect on other people using your development server.

Conclusion

In this tutorial series, we’ve explored and experimented with various Docker resource limitation commands. You should use actual Linux benchmarking tools as sysbench and stress as these tools were specifically designed to give you repeatable results.

It is now up to you to determine the correct CPU, RAM and IO limitations you want to define for your containerized production applications. Well-behaved applications do not need to be restricted. Focus your effort on limiting the proven resource hogs.

Reference:https://www.alibabacloud.com/blog/docker-container-resource-management-cpu-ram-and-io-part-3_594579?spm=a2c41.12692137.0.0

Docker Container Resource Management: CPU, RAM and IO: Part 3

— cpuset-cpus Using Sysbench

— cpu-shares Using Sysbench

— cpuset-mems

— memory and — memory-swap (No Swapping Allowed) Using Stress Bench Tool

— memory and — memory-swap (Swapping Allowed) Using Stress Bench Tool

— memory-swap Is Not Set

Read/Write : IO Rate Limiting

Compose File Resource Limits

— cap-add=sys_nice

Conclusion

Written by Alibaba Cloud

No responses yet