Jupyter and CUDA on NSF Jetstream
NSF ACCESS is a fantastic high-performance computing program. If you’re affiliated with a university or other institution, you can upload a CV, write a paragraph describing your project, and get approved for several thousand CPU-hours within just a few days. If you don’t have existing grants, it’s totally free. It even works if you’re a grad student, although then you’ll need your advisor to provide a form letter saying that you’re legit; they even have suggested language for that.
Once you’re in, you can pick from over thirty systems. The one I’m using right now, Jetstream2, was built for ACCESS, and it’s a really interesting hybrid of traditional HPC and modern “serverless” cloud philosophy. You create “instances,” virtual machines that you can access through a web shell or a browser-based remote desktop, and run them on whatever hardware you prefer. If you need more power or less, scaling an instance up or down only takes a minute or so. They’re like high-end DigitalOcean droplets. The end result is that you get to start a compute session whenever you want, leave it running for days or weeks, and then shut it right down as soon as you’re done to avoid paying for unused time – all without the hassle of packaging and submitting jobs via a scheduler.
Jetstream also has the advantage of persistence. You can set up your environment once, even installing software or carrying out other operations as root, and you never have to do it again. This post is to share my particular setup, because I imagine that my use case is fairly common for applied mathematicians, and getting there is a little unintuitive.
Specifically, I want to run Jupyter with a Python backend, using a bunch of different GPU calculation packages.
Stage 1: Jupyter
Jetstream2 provides Jupyter out of the box, as part of a Conda install. They even have a script that lets you access Jupyter notebooks directly from your own browser, rather than having to use a remote desktop. That script launches the classic Jupyter notebook interface rather than my preferred JupyterLab, and it complains if it can’t find the `anaconda` module. Since I only ever loaded that module right before launching Jupyter anyway, and since I wanted JupyterLab, I rewrote the script as follows:
```bash
#!/bin/bash
module load anaconda
JUPYTER=$(which jupyter)
# Ask the instance metadata service for our public IP if it isn't already set.
if [ -z "$JS2_PUBLIC_IP" ]; then
    JS2_PUBLIC_IP=$(curl -s http://169.254.169.254/latest/meta-data/public-ipv4)
fi
# Rewrite the localhost URL Jupyter prints into one reachable from outside.
$JUPYTER lab --no-browser --ip=0.0.0.0 2>&1 | sed "s/127.0.0.1/${JS2_PUBLIC_IP}/g"
```
(If you’re curious, `169.254.169.254` is an IP address where cluster managers often host an internal API that individual servers can query for information about themselves and the cluster. Here it’s being used to ask for the instance’s public IP address.)
Now all I need to do is launch a web shell, run this script, and copy the resulting URL into another tab in my browser. Copying does mean pressing CTRL+C in the terminal, but Jupyter only exits on multiple CTRL+Cs in quick succession, so a single one is harmless. And since `module load` is idempotent, it’s fine to run the script twice.
Importantly, if you do this, you do not need a web desktop on your instances. Turn it off in the settings when you create them. All you need is the web shell, and that only to start Jupyter. Once that’s done, you can do everything you need, including running terminals, through the Jupyter web interface.
Unless you really want to, you can forget about the fact that you’re running in a Conda environment. The default `pip` works just fine for installing packages, and it even respects the ones Conda installed by default.
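If you want to double-check that before installing anything, here’s a quick sanity check (my own sketch, not part of Jetstream2’s setup) that `pip` resolves to the same environment as the Python interpreter you’re running:

```python
# Sketch: verify that pip and the current interpreter share an environment
# prefix, so packages installed with `pip install` are visible to this kernel.
import shutil
import sys

def pip_matches_interpreter() -> bool:
    """Return True if the `pip` on PATH lives under this interpreter's prefix."""
    pip_path = shutil.which("pip")
    if pip_path is None:
        return False  # no pip on PATH at all
    return pip_path.startswith(sys.prefix)

print("python:", sys.executable)
print("pip:   ", shutil.which("pip"))
print("match: ", pip_matches_interpreter())
```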
Stage 2: CUDA
CUDA is also available as a module, but modules are actually stored on a network drive somewhere in the cluster, and `module load` just updates your environment to point at them, so it’s hard to use with anything that expects to find CUDA files in the default locations. I’d recommend the CUDA module for anyone who wants to run `nvcc` or similar on their own source code. For everyone else, there are two options:
- If you just want to use PyTorch, you can install it with `pip install torch`, and it will bring in its own CUDA binaries automatically. It works out of the box.
- If you want to use a library that doesn’t do this, like the quite fabulous PyKeOps, you can install CUDA yourself. Follow the network installation instructions from NVIDIA’s website directly; I’ve linked the ones for Ubuntu, since that’s Jetstream2’s default. As of September 2024, use the `ubuntu2204/x86_64` version.
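Whichever route you take, it’s worth a quick check that the GPU is actually visible from Python. A minimal sketch, assuming the PyTorch route (it degrades gracefully if `torch` isn’t installed or no GPU is present):

```python
# Sketch: report whether PyTorch is installed and whether it can see a CUDA
# device. Safe to run even on a machine without torch or without a GPU.
import importlib.util

def cuda_status() -> str:
    """Return a one-line description of PyTorch/CUDA availability."""
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch  # imported lazily so the check works without torch
    if torch.cuda.is_available():
        return f"cuda available: {torch.cuda.get_device_name(0)}"
    return "torch installed, but no cuda device visible"

print(cuda_status())
```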
Each of these options will bring in a new CUDA install, which is 10-15GB, so doing both will use up almost half of your default allocated disk space. That still leaves you with 30GB, though, and you can add network storage too, so it’s not the end of the world. I’ve looked into it but I don’t think there’s a great way to get around the doubling-up: PyTorch is going to bring its own CUDA whatever you do, and PyKeOps is going to need you to provide it.
One of the sysadmins explicitly warned me to install the `cuda-toolkit` Ubuntu package and not the `nvidia-cuda-toolkit` package. You’ll get this right if you follow NVIDIA’s current instructions, but it was repeated to me with such vigor that it seems worth repeating here.
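For reference, the network install boils down to adding NVIDIA’s apt repository and installing the toolkit. The commands below are a sketch of those instructions for Ubuntu 22.04 on x86_64; the keyring version and URL drift over time, so defer to NVIDIA’s current page rather than running these verbatim:

```shell
# Sketch of NVIDIA's network install for Ubuntu 22.04 / x86_64; check
# NVIDIA's current instructions first (the keyring version changes).
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt-get update
# Install cuda-toolkit, NOT nvidia-cuda-toolkit (see the warning above).
sudo apt-get install -y cuda-toolkit
```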
Stage 3: Science!
That was about all I needed to get set up. Other pure-Python packages installed just fine through `pip`, and my GPU computations are running just as well as they ever did in Colab. They even seem a little faster on Jetstream, although that’s a very anecdotal result and I don’t want to say anything firm one way or the other.
Update: I recently got annoyed about having to keep two tabs – the web shell and the Jupyter notebook – open at once. If I closed the web shell, it would log me out, ending the Jupyter session. The solution was to launch the Jupyter process in the background, thus:
```bash
$ ./jupyter-launch-script.sh & disown
```
The script will still print the connection information to the web shell, which you can then close any time. To avoid trying to run more than one Jupyter instance at a time, I shelve my VM when I’m finished with it, which (I believe) shuts down all its processes gracefully.
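A small variant of this (my own habit, not from the Jetstream docs) is to redirect the script’s output to a log file, so the connection URL is still retrievable after the web shell is gone:

```shell
# Variant: log the launcher's output so the URL survives the web shell.
# The script name matches the launcher defined earlier in this post.
./jupyter-launch-script.sh > ~/jupyter.log 2>&1 & disown
# Later, recover the connection URL from the log:
grep -m 1 'http' ~/jupyter.log
```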
I’d like to thank the Jetstream office hours folks for their help in getting me set up. They’re great.