
Building heavy-duty containers

restart flag, init, and supervisor processes

Photo by Pat Whelen on Unsplash

There are cases where software fails in rare conditions that are temporary in nature. Although it’s important to be made aware when these conditions arise, it’s usually at least as important to restore the service as quickly as possible.

When all the processes in a container have exited, that container enters the exited state. Docker reports several container states; the four that matter most here are:

  • Running
  • Paused
  • Restarting
  • Exited

Current Docker versions report a container that has been created but never started as Created rather than Exited.

A basic strategy for recovering from temporary failures is to restart a process automatically when it exits or fails. Docker provides some options for monitoring and restarting containers.


Restarting containers automatically

Docker provides support for this feature through a restart policy. Using the --restart flag when the container is created, you can tell Docker to do any of the following:

  • Never restart (default)
  • Attempt to restart when a failure is detected
  • Attempt to restart a predetermined number of times when a failure is detected
  • Always restart the container regardless of the condition
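
As a rough sketch, the options above correspond to values of the --restart flag. The helper name run_with_restart is hypothetical and exists only for illustration; the busybox image and date command are borrowed from the example later in this article. Docker also accepts an unless-stopped value, which is not covered here.

```shell
# Hypothetical helper: start a container that prints the date, using the
# restart policy passed as the first argument.
run_with_restart() {
  docker run -d --restart "$1" busybox date
}

# Usage (each call starts a new container):
#   run_with_restart no             # never restart (the default)
#   run_with_restart on-failure     # restart only after a non-zero exit code
#   run_with_restart on-failure:5   # same, but give up after 5 attempts
#   run_with_restart always         # restart regardless of the exit code
```

The retry count after on-failure is optional; without it, Docker keeps retrying.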

Docker does not always try to restart the container immediately. If it did, it would cause more problems than it solved. Imagine a container that does nothing but print the time and then exit. If that container were configured to always restart, and Docker always restarted it immediately, the system would do nothing but restart that container. Instead, Docker uses an exponential backoff strategy to time restart attempts.

The backoff strategy determines how much time should elapse between successive restart attempts. An exponential backoff strategy doubles the wait before each successive attempt. For example, if Docker waits 1 second before the first restart attempt, it will wait 2 seconds before the second, 4 seconds before the third, 8 seconds before the fourth, and so on. Exponential backoff with a short initial wait is a common service recovery technique. You can watch Docker apply this strategy by creating a container that always restarts and only prints the time:

docker run -d --name backoff-sample --restart always busybox date

Then, after a few seconds, check the logs to watch it back off and restart:

docker logs -f backoff-sample
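
The doubling schedule described earlier can be sketched as a small shell function. This is only an illustration of exponential backoff, not Docker's implementation: the 1-second starting value follows the example in the text, the 60-second cap is an assumption added here, and Docker's real timings may differ.

```shell
# backoff_delay ATTEMPT [CAP]: print the wait in seconds before restart ATTEMPT.
# The wait starts at 1 second and doubles with each attempt.
# CAP (default 60) is an assumed upper bound, added for illustration.
backoff_delay() {
  local attempt=$1 cap=${2:-60}
  local delay=1 i=1
  while [ "$i" -lt "$attempt" ]; do
    delay=$((delay * 2))
    if [ "$delay" -ge "$cap" ]; then
      delay=$cap
      break
    fi
    i=$((i + 1))
  done
  echo "$delay"
}
```

Calling backoff_delay with 1, 2, 3, and 4 prints 1, 2, 4, and 8, which matches the sequence in the example above.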

The only reason you might not want to adopt this feature directly is that the container is not running during the backoff period. Containers waiting to be restarted are in the restarting state. To demonstrate, try running another process in the backoff-sample container:

docker exec backoff-sample echo testing

Running that command should result in an error message:

Error response from daemon: Container 4affc02f445dd426c0b7daf79efc31f83b1e6e3c7397323bd235a8cd80453bb1 is restarting, wait until the container is running

This means you cannot perform any operations that require the container to be running, such as executing other commands in the container. If you need to run diagnostics in a damaged container, that might be a problem. A more complete strategy is to use containers running init or supervisor processes.

Supervisor and startup processes

A supervisor process, or init process, is a program used to launch and maintain the state of other programs. On a Linux system, PID 1 is an init process. It starts all the other system processes and restarts them if they fail unexpectedly. It is common practice to use a similar pattern to start and manage processes inside a container.

Using a supervisor process inside a container keeps the container running if the target process (a web server, for example) fails and is restarted. Several programs can fill this role inside a container. The most popular include init, systemd, runit, upstart, and supervisord.
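
The core idea can be sketched as a loop that starts a command again each time it exits. The function name supervise and its run limit are hypothetical and exist only to keep the example finite; this is a toy stand-in, and real supervisors such as supervisord do far more (logging, multiple programs, start-up checks).

```shell
# supervise MAX CMD [ARGS...]: run CMD in the foreground and start it again
# each time it exits, for at most MAX runs.
supervise() {
  local max=$1 runs=0 status
  shift
  while [ "$runs" -lt "$max" ]; do
    runs=$((runs + 1))
    echo "spawned: $1 (run $runs of $max)"
    status=0
    "$@" || status=$?   # record the exit status without aborting the loop
    echo "exited: $1 (exit status $status)"
  done
}
```

Because the loop itself never exits while the command keeps failing, the process that Docker watches stays alive, which is what keeps the container in the running state.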

There are many Docker images on Docker Hub that provide a full LAMP (Linux, Apache, MySQL, PHP) stack inside a single container. Containers created this way use supervisord to make sure that all the related processes are kept running. Start an example container:

docker run -d -p 80:80 --name lamp-test mattrayner/lamp

You can see which processes are running inside this container by using the docker top command:

docker top lamp-test
PID                 USER                TIME                COMMAND
5833                root                0:00                {supervisord} /usr/bin/python3 /usr/local/bin/supervisord -n
6462                root                0:00                apache2 -D FOREGROUND
6463                root                0:00                {pidproxy} /usr/bin/python3 /usr/local/bin/pidproxy /var/run/mysqld/mysqld.pid /usr/bin/mysqld_safe
6464                1000                0:00                apache2 -D FOREGROUND
6465                1000                0:00                apache2 -D FOREGROUND
6466                1000                0:00                apache2 -D FOREGROUND
6467                1000                0:00                apache2 -D FOREGROUND
6468                1000                0:00                apache2 -D FOREGROUND
6469                root                0:00                {mysqld_safe} /bin/sh /usr/bin/mysqld_safe
6874                1000                0:00                /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --plugin-dir=/usr/lib/mysql/plugin --user=www-data --log-error=/var/log/mysql/error.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/run/mysqld/mysqld.sock --port=3306 --log-syslog=1 --log-syslog-facility=daemon --log-syslog-tag=

The top subcommand will display the host PID of each process in the container. You will see supervisord, mysql and apache in the list of running programs. Now that the container is running, you can test the supervisord restart function by manually stopping one of the processes inside the container.

First, let’s get the PID of each process as seen from inside the container:

docker exec lamp-test ps
PID TTY          TIME CMD
1 ?        00:00:00 supervisord
503 ?        00:00:00 apache2
504 ?        00:00:00 pidproxy
510 ?        00:00:00 mysqld_safe
943 ?        00:00:00 ps

Let’s kill the apache2 process:

docker exec lamp-test kill 503

Running this command will run the Linux kill program inside the lamp-test container and tell the apache2 process to shut down. When apache2 stops, the supervisord process will log the event and restart the process. The container logs will clearly show these events:

docker logs lamp-test
Updating for PHP 7.4
---------
2021-02-16 12:58:42,878 INFO supervisord started with pid 1
2021-02-16 12:58:43,882 INFO spawned: 'apache2' with pid 503
2021-02-16 12:58:43,885 INFO spawned: 'mysqld' with pid 504
2021-02-16 12:58:45,332 INFO success: apache2 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2021-02-16 12:58:45,332 INFO success: mysqld entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2021-02-16 13:08:42,334 INFO exited: apache2 (exit status 0; expected)
2021-02-16 13:08:43,345 INFO spawned: 'apache2' with pid 956
2021-02-16 13:08:44,385 INFO success: apache2 entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)

A common alternative to using init or supervisor programs is a startup script that at least checks the preconditions for successfully starting the contained software. These scripts are sometimes used as the default command for the container. Docker containers run something called an entrypoint before executing the command. Entrypoints are perfect places to put code that validates the preconditions of a container.
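
A minimal sketch of such precondition checks might look like the following. The function names require_env and require_dir, and the DB_HOST variable and /data path in the usage comment, are all hypothetical; a real entrypoint would check whatever the contained software actually needs.

```shell
# require_env NAME: succeed only if the environment variable NAME is non-empty.
require_env() {
  local value
  eval "value=\${$1:-}"
  if [ -z "$value" ]; then
    echo "entrypoint: required variable $1 is not set" >&2
    return 1
  fi
}

# require_dir PATH: succeed only if PATH is an existing directory.
require_dir() {
  if [ ! -d "$1" ]; then
    echo "entrypoint: required directory $1 is missing" >&2
    return 1
  fi
}

# An entrypoint script could then end with something like:
#   require_env DB_HOST && require_dir /data && exec "$@"
# so the container's command starts only once every check has passed.
```

Failing fast with a clear message makes a misconfigured container easier to diagnose than one that starts and then crashes partway through.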

Startup scripts are an essential part of building durable containers and can always be combined with Docker restart policies to take advantage of the strengths of each.

Final cleanup

Ease of cleanup is one of the most important reasons to use containers and Docker. The isolation provided by the container simplifies all the steps you need to perform to stop the process and delete files. With Docker, the entire cleaning process is reduced to one of a few simple commands. In any cleanup task, you must first determine which container to stop and/or delete.

Let’s list all the containers first:

docker ps -a

Since the containers created for the examples in this article will no longer be used, you should be able to safely stop and delete all of them. If you have also created containers for your own work, check the list carefully and clean up only the ones you no longer need.

All containers use hard drive space to store logs, container metadata, and files that have been written to the container file system. All containers also consume resources in the global namespace, such as container name and host port mapping. In most cases, containers that are no longer used should be deleted.

Let’s delete the lamp-test container:

docker rm -f lamp-test

Because this container was running, the previous command would fail with an error without the -f flag. A Docker container must be stopped before it can be removed. You can either run the docker stop command first, or skip that step by adding the force flag to the remove command, as was done here.

