3 Tips for Building a Lasting Jupyter Server

Hank Chan
Published in Cubesole
Feb 23, 2020

There are many tutorials on building your own Jupyter server, but few talk about building a lasting one — that is, how to ensure Jupyter keeps running even after the ssh connection drops or the system reboots.

ML engineers have, on more than one occasion, lost hours of training to an unstable ssh connection. Here we summarize three solutions for an always-on Jupyter server (if you can afford one), tested on Ubuntu (let me know if any CentOS user runs into problems). This post assumes you have experience spinning up a Jupyter server.

Tmux

The easiest solution comes to the rescue. We get that not all machine learning researchers have experience with server provisioning — that is why I recommend tmux as the first pick. Both Ubuntu and CentOS ship with tmux out of the box, so there's no need for sudo to install it. What I love about tmux is that it requires neither writing a configuration file nor managing deep learning dependencies — something both AWS and GCP provide prebuilt images for, with DLAMI and GCP DLVM respectively.

Using tmux is as easy as it gets — start a tmux session and run a Jupyter server inside it. As long as you don't exit the tmux session, the server keeps running even after your ssh connection drops.

# Start a tmux session
$ tmux
# Run a jupyter server
$ jupyter notebook
# Detach from the tmux session without stopping the running jupyter server
Press Ctrl+b, then d
# Check if jupyter server is running
$ ps aux | grep jupyter
# Reenter tmux session
$ tmux attach
# Exit tmux session while you are in it
$ exit

Downside: tmux doesn't survive a system reboot. But the fact that it survives an unstable ssh connection should be enough for most use cases that would otherwise risk losing work in progress.
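One small addition worth knowing: tmux sessions can be named, which makes reattaching much easier when several are running. A quick sketch — the session name jupyter here is just an example:

```
# Start a named tmux session
$ tmux new -s jupyter
# List running sessions
$ tmux ls
# Reattach to a session by name
$ tmux attach -t jupyter
```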

Supervisor

supervisor is slightly more involved, but it beats tmux on one point — it restarts the specified program after a system reboot, so the program can seemingly run forever.

# Install supervisor
$ sudo apt install supervisor
$ sudo service supervisor start

To task supervisor with running your Jupyter server, we first need to create a configuration file and place it in /etc/supervisor/conf.d. Below is an example of that configuration file (which must have a .conf extension):

/etc/supervisor/conf.d/jupyter.conf

[program:jupyter]
command = jupyter notebook --no-browser --config=/path/to/config
directory = /path/to/working/directory
; run as this user (or whoever)
user = ubuntu
autostart = true
autorestart = true
stdout_logfile = /var/log/your_log_file.log
redirect_stderr = true

After saving the configuration file to the directory /etc/supervisor/conf.d, don’t forget to ask supervisor to read in the configuration file and kickstart the program:

$ sudo supervisorctl reread 
$ sudo supervisorctl update
$ sudo supervisorctl status # Check if Jupyter is running
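Once the program is registered, supervisorctl also covers day-to-day management; the program name jupyter below assumes the [program:jupyter] section shown above:

```
# Restart the Jupyter server
$ sudo supervisorctl restart jupyter
# Stop it without removing the config
$ sudo supervisorctl stop jupyter
# Follow its log output
$ sudo supervisorctl tail -f jupyter
```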

Downside: the need to write a damn config file, lol — though supervisor can also be super helpful for other automation tasks.

p.s. The supervisor section of this post draws on Albert Yang's post; I do not claim credit for it. Albert's post has a more detailed walk-through of setting up supervisor, as well as using nginx as a reverse proxy, which makes routing and SSL connections easier.

Docker

Enter DevOps' favorite choice — Docker. Docker's advantage may not show on AWS's DLAMI or GCP's DLVM, since those images already take care of the dependencies. That said, Docker comes in handy when you have to start from a plain server environment. Almost all vendor-provided Linux images today come with Docker pre-installed, including DLAMI and DLVM; even if yours doesn't, there are many tutorials for installing Docker (see here to install Docker on Ubuntu 18.04).

There are many approaches for using Docker in deep learning, but here we are only concerned with running a lasting Jupyter server.

First we pull a pre-built deep learning image from Docker Hub, and run it at port 8888. For the sake of simplicity, we are pulling the Tensorflow image from the official Jupyter repository on Docker Hub.

$ docker pull jupyter/tensorflow-notebook
$ docker run -d -p 8888:8888 -e JUPYTER_ENABLE_LAB=yes -v "$PWD":/home/ubuntu jupyter/tensorflow-notebook

Most docker images have their own layout, this one included. Since the container's working directory doesn't contain our Jupyter notebook folder, we need to symlink the folder into it.

# Find your container name
$ docker ps
# Enter into the container
$ docker exec -it your_container_name bash
# Inside the docker container, symlink the folder to the working directory
$ ln -s /your/jupyter/notebook/folder /home/jovyan/whatever

Voila! A Jupyter server is now serving your folder at port 8888.
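Since the container runs detached (-d), the login token that Jupyter prints at startup ends up in the container logs rather than in your terminal; you can recover it with docker logs, using whatever name docker ps reported:

```
# Print the container's startup output, including the token login URL
$ docker logs your_container_name
```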

Downside:

  1. By default, a Docker container won't survive a system reboot either.
  2. You are likely to need different docker images for different dependencies, and each image can be huge (~5GB).
  3. Third-party images come at a cost — the more they customize, the less applicable they are to general use. Of course you could build your own docker image, or even one image per project (check out repo2docker), but that's outside the scope of this tutorial.

Summary

So here it goes — tmux > supervisor > Docker in ease of use, tmux < supervisor < Docker in ability to customize. Even though we presented these tools as separate approaches, they can sometimes be used together — in fact, my personal favorite is Docker in conjunction with supervisor.

/etc/supervisor/conf.d/docker.conf

[program:docker]
command = docker container start your_container_name
directory = /home/ubuntu
user = ubuntu
autostart = true
startsecs = 1
startretries = 0
exitcodes = 0
stdout_logfile = /var/log/your_log.log
redirect_stderr = true

The caveat here is that supervisor is designed to manage a long-running process, not a one-shot command — and docker container start returns as soon as the container is started. Therefore, we specify startretries = 0 and exitcodes = 0 to tell supervisor not to issue retries after the command has exited, though this feels more like a workaround to me.

The three solutions above are by no means the only ways to spin up a lasting Jupyter server. If you know other approaches or tools that are just as convenient, please do share them with us.
