Supervisor configurations to ensure that my Python Flask application releases bound port(s) during a Supervisor restart

We use Supervisor to help keep our Python-based applications running. One of our applications was built on the Python Flask framework to provide a RESTful API to connecting clients.

With continuous integration in place, we need to restart all our Supervisor-managed applications whenever a change is merged into the master branch of our Git repository.

This post documents the Supervisor configurations that ensure my Python Flask application releases any port that it has bound when Jenkins sends the command to restart Supervisor and the processes that it manages.

The initial Supervisor configurations

My initial Supervisor configurations were as follows:

[program:techcoil-flask-app]
directory=/opt/techcoil/flask-app
command=/bin/bash -E -c ./start.sh
autostart=true
autorestart=true

When Supervisor reads my configuration file, it creates a subprocess with the current working directory set to /opt/techcoil/flask-app. It then runs start.sh, which in turn runs the python command on app.py, located in the same directory.

I also told Supervisor to start my Python Flask application automatically and to restart it if it crashes.

The first run of my Supervisor script turned out fine; my application ran successfully and I was able to send HTTP GET requests to my Python Flask application.

However, when I restarted Supervisor, I noticed the following error in the error log:

Traceback (most recent call last):
  File "./app.py", line 123, in <module>
    app.run(host='0.0.0.0', debug=True, threaded=True)
  File "/usr/local/lib/python3.4/dist-packages/flask/app.py", line 772, in run
    run_simple(host, port, self, **options)
  File "/usr/local/lib/python3.4/dist-packages/werkzeug/serving.py", line 675, in run_simple
    s.bind((hostname, port))
OSError: [Errno 98] Address already in use
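
The error above can be reproduced outside of Flask and Supervisor. The following is a minimal sketch, not the application's code: the function name second_bind_error is hypothetical, and the first socket merely stands in for a leftover process that still holds the port while the second stands in for the restarted application.

```python
import errno
import socket


def second_bind_error(host="127.0.0.1"):
    """Bind two sockets to the same port; return the errno of the second bind."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind((host, 0))  # port 0 lets the OS pick a free port
        first.listen(1)  # stands in for the leftover process
        port = first.getsockname()[1]
        try:
            second.bind((host, port))  # stands in for the restarted app
        except OSError as exc:
            return exc.errno
        return None  # the second bind unexpectedly succeeded
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    err = second_bind_error()
    print(err, errno.errorcode.get(err))
```

As long as the first socket stays open, the second bind is refused, which is the situation the restarted application ran into.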

Why didn't my Python Flask application release the port that it had bound earlier when it was told to restart?

To investigate why my Flask application did not release the port that it had bound in an earlier run when it was told to restart, I first killed off the remnant Python Flask process that was holding on to the port. Once I was sure that the port was released, I tried to emulate a restart through a shell terminal.

To do so, I first ran the same script that my supervisor process had tried to run and then pressed Ctrl+C. After that, I tried to run the script again. This time, the error did not surface; I was able to restart my Python Flask application by manual execution via the shell terminal.

After some reading, I learned that pressing Ctrl+C sends a SIGINT to my Python Flask program, and that Python Flask terminates properly in response to a SIGINT. However, the default stop signal that Supervisor sends to a program is SIGTERM, not SIGINT.
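
The difference between the two signals can be seen with a plain Python process. This is a rough sketch for POSIX systems, assuming a hypothetical child script (CHILD) that stands in for the application; it is not Flask itself. Python turns SIGINT into a KeyboardInterrupt that the program can catch, whereas an unhandled SIGTERM ends the process without running that code path.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for the application: it sleeps and reports
# whether its KeyboardInterrupt handler got a chance to run.
CHILD = r"""
import time
try:
    print("ready", flush=True)
    time.sleep(30)
except KeyboardInterrupt:
    print("cleanup ran", flush=True)
"""


def stop_child_with(sig):
    """Start the child, stop it with `sig`, return (returncode, remaining output)."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD], stdout=subprocess.PIPE, text=True
    )
    proc.stdout.readline()  # wait until the child has printed "ready"
    proc.send_signal(sig)
    out, _ = proc.communicate(timeout=10)
    return proc.returncode, out.strip()


if __name__ == "__main__":
    print("SIGINT :", stop_child_with(signal.SIGINT))
    print("SIGTERM:", stop_child_with(signal.SIGTERM))
```

A negative return code from subprocess means that the child was terminated by that signal number rather than exiting on its own.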

In addition, Supervisor does not, by default, send the stop signal and the kill signal to a program's processes as a group. As such, if the running process fails to propagate the stop signal to its children, those child processes keep running until they are killed manually or the server gets restarted.
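
The effect of signalling a single process versus its whole process group can be sketched as follows. This is an illustration for POSIX systems and not Supervisor's actual implementation: WRAPPER is a hypothetical stand-in for start.sh, its sleeping child stands in for the Flask application, and orphan_survives is a made-up helper name.

```python
import os
import select
import signal
import subprocess
import sys

# Hypothetical stand-in for start.sh: it launches a long-running child
# and waits for it, without forwarding any signals to that child.
WRAPPER = r"""
import subprocess, sys
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
print(child.pid, flush=True)
child.wait()
"""


def orphan_survives(as_group, wait_seconds=1.0):
    """Stop the wrapper and report whether its child is still running."""
    proc = subprocess.Popen(
        [sys.executable, "-c", WRAPPER],
        stdout=subprocess.PIPE,
        start_new_session=True,  # wrapper leads its own process group
    )
    grandchild_pid = int(proc.stdout.readline())
    if as_group:
        os.killpg(os.getpgid(proc.pid), signal.SIGTERM)  # whole group
    else:
        proc.send_signal(signal.SIGTERM)  # wrapper only
    proc.wait()
    # The grandchild inherited the write end of the pipe, so the pipe only
    # reaches EOF (and becomes readable) once the grandchild has exited too.
    readable, _, _ = select.select([proc.stdout], [], [], wait_seconds)
    survived = not readable
    if survived:
        os.kill(grandchild_pid, signal.SIGKILL)  # clean up the orphan
    proc.stdout.close()
    return survived


if __name__ == "__main__":
    print("signal wrapper only, child survives:", orphan_survives(as_group=False))
    print("signal whole group, child survives:", orphan_survives(as_group=True))
```

When only the wrapper is signalled, the child is left behind, much like the remnant process that kept holding on to the port.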

The Supervisor configurations to ensure proper process termination

With the information that I had gathered, I edited my Supervisor configuration files:

[program:techcoil-flask-app]
directory=/opt/techcoil/flask-app
command=/bin/bash -E -c ./start.sh
autostart=true
autorestart=true
stopsignal=INT
stopasgroup=true
killasgroup=true

This time round, I explicitly set the stop signal to INT so that Supervisor sends a SIGINT instead of a SIGTERM to my Python Flask application when a restart is requested. Additionally, I instructed Supervisor to stop my Python Flask application as a group and, where necessary, to kill it as a group, so that the signals also reach any child processes.

With the new Supervisor configurations in place, I am able to restart my Python Flask application via Supervisor without facing port binding issues. This enables my Jenkins server to trigger a restart of the Python Flask application after it pulls the latest changes from the remote Git repository.

About Clivant

Clivant a.k.a. Chai Heng enjoys composing software and building systems to serve people. He owns techcoil.com and hopes that whatever he has written and built so far has benefited people. All views expressed belong to him and are not representative of the company that he works/worked for.