Jupyter notebooks

The Jupyter notebook is a web-based notebook environment for interactive computing. You can run a Jupyter notebook server on Viking and connect to it from your local web browser. A few steps are needed to ensure that the notebook runs on a compute node (not a login node) and that you can connect to it remotely.

The connection would look something like this:

Your computer ➡️ Viking login node ➡️ Viking compute node ➡️ Jupyter notebook

One way to do this is to start an interactive session on a compute node and, from that session, run the Jupyter notebook server:

you can adjust the ‘cpus-per-task’ and ‘time’ to suit your needs
$ srun --nodes=1 --ntasks=1 --cpus-per-task=5 --time=08:00:00 --pty /bin/bash
srun: job 76415 queued and waiting for resources
srun: job 76415 has been allocated resources
Creating user dir for 'Local Scratch'
flight start: Flight Direct environment is already active.
[abc123@node112[viking2] ~]$

Here we have an interactive bash session on the node112 compute node, with five CPU cores, for eight hours. Next we load the Jupyter module and run the server:

Attention

In this example we’re using the Jupyter module on Viking. If you have installed Jupyter alongside other packages in Conda or another virtual environment, do not run the module load JupyterLab/3.1.6-GCCcore-11.2.0 command below. Instead, activate the virtual environment where Jupyter is installed; the rest of the steps should still be valid.

[abc123@node112[viking2] ~]$ module load JupyterLab/3.1.6-GCCcore-11.2.0
[abc123@node112[viking2] ~]$ jupyter notebook --no-browser
[I 15:01:51.756 NotebookApp] Writing notebook server cookie secret to /users/abc123/.local/share/jupyter/runtime/notebook_cookie_secret
[I 2023-11-03 15:01:52.190 LabApp] JupyterLab extension loaded from /opt/apps/eb/software/JupyterLab/3.1.6-GCCcore-11.2.0/lib/python3.9/site-packages/jupyterlab
[I 2023-11-03 15:01:52.190 LabApp] JupyterLab application directory is /opt/apps/eb/software/JupyterLab/3.1.6-GCCcore-11.2.0/share/jupyter/lab
[I 15:01:52.194 NotebookApp] Serving notebooks from local directory: /users/abc123
[I 15:01:52.194 NotebookApp] Jupyter Notebook 6.4.0 is running at:
[I 15:01:52.194 NotebookApp] http://localhost:8888/?token=9b0f6d6918f238c0e8543257a842b65cd4671ee1b55a4e3c
[I 15:01:52.194 NotebookApp]  or http://127.0.0.1:8888/?token=9b0f6d6918f238c0e8543257a842b65cd4671ee1b55a4e3c
[I 15:01:52.194 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 15:01:52.198 NotebookApp]

    To access the notebook, open this file in a browser:
        file:///users/abc123/.local/share/jupyter/runtime/nbserver-2295028-open.html
    Or copy and paste one of these URLs:
        http://localhost:8888/?token=9b0f6d6918f238c0e8543257a842b65cd4671ee1b55a4e3c
     or http://127.0.0.1:8888/?token=9b0f6d6918f238c0e8543257a842b65cd4671ee1b55a4e3c

The server is running and listening on port 8888. We cannot access that port directly from our local computer, so we set up an ssh tunnel to the login node and then another from there to the compute node, where the Jupyter notebook server is actually running. You can do this with one command locally on your computer:

remember to change ‘node112’ to the node you are running Jupyter from
$ ssh -L 8888:localhost:8889 viking.york.ac.uk ssh -N -L 8889:localhost:8888 node112

The above command opens one ssh tunnel, forwarding your local port 8888 to port 8889 on the Viking login node. It then opens a second ssh tunnel from the login node’s port 8889 to port 8888 on the compute node node112, where the Jupyter server is listening.
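Because the node name and ports change from session to session, it can help to generate the command rather than edit it by hand. The following is a minimal sketch, not part of the Viking tooling: the function name build_tunnel_cmd and its argument order are hypothetical. It only prints the command so you can check it before running it.

```shell
#!/bin/sh
# Sketch: print the two-hop tunnel command for a given compute node and ports.
# build_tunnel_cmd is a hypothetical helper name chosen for illustration.

build_tunnel_cmd() {
    node="$1"            # compute node running Jupyter, e.g. node112
    local_port="$2"      # port on your own computer
    login_port="$3"      # intermediate port on the Viking login node
    jupyter_port="$4"    # port the Jupyter server listens on
    login_host="${5:-viking.york.ac.uk}"   # optionally abc123@viking.york.ac.uk

    printf 'ssh -L %s:localhost:%s %s ssh -N -L %s:localhost:%s %s\n' \
        "$local_port" "$login_port" "$login_host" \
        "$login_port" "$jupyter_port" "$node"
}

# Reproduces the command from the example above.
build_tunnel_cmd node112 8888 8889 8888
```

Passing a fifth argument such as abc123@viking.york.ac.uk adds a username to the login host, which is useful on a personal computer.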

Finally, Ctrl + left mouse click on one of the links shown in the first terminal session on node112: either the http://localhost:8888/?token=... link or the http://127.0.0.1:8888/?token=... link. Your browser should open and connect to the Jupyter server running on Viking.

Tip

If you’re using a personal computer, you’ll need to tell Viking your username in the ssh command above, for example:

ssh -L 8888:localhost:8889 abc123@viking.york.ac.uk ssh -N -L 8889:localhost:8888 node112

Tip

To help find an open port you can try running this command on Viking:

for p in {8000..9000}; do m=$(netstat -l|grep -c localhost:${p}); if [[ $m == 0 ]]; then echo "try $p"; break; fi; done

Thanks to Felix Ulrich-Oltean for this suggestion
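The same idea can be written as a small function that is easier to read and reuse. This is a sketch under stated assumptions: first_free_port is a hypothetical name, and it assumes the ss command (or netstat -ltn, which has the local address in the same column) is available. Unlike the one-liner above, it treats a port as busy if anything is listening on it on any address, not only on localhost.

```shell
#!/bin/sh
# Sketch: pick the first port in a range that no listening socket is using.
# The socket listing (from `ss -ltn` or `netstat -ltn`) is read from stdin.

first_free_port() {
    # $1 = first port to try, $2 = last port to try
    awk -v first="${1:-8000}" -v last="${2:-9000}" '
        {
            # Column 4 is the local address, e.g. 127.0.0.1:8888 or [::1]:8888.
            # The port is whatever follows the last colon.
            n = split($4, parts, ":")
            if (n > 1 && parts[n] ~ /^[0-9]+$/) used[parts[n]] = 1
        }
        END {
            for (p = first; p <= last; p++)
                if (!(p in used)) { print p; exit 0 }
            exit 1   # every port in the range is in use
        }'
}

# Only run the live check where ss is installed.
if command -v ss >/dev/null 2>&1; then
    port="$(ss -ltn | first_free_port 8000 9000)" && echo "try ${port}"
fi
```

A port that is free now can still be taken by another user a moment later, so be ready to try the next one if ssh reports that the address is already in use.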

Tidying up

The above command is great for getting a lot done in one go and simplifies setting up two ssh tunnels. However, it also logs into Viking and leaves the second command running in the background (in the above example that’s this part: ssh -N -L 8889:localhost:8888 node112). We don’t want to leave these processes running, so once you have finished using Jupyter Notebooks it’s a good idea to kill them.

You can do this by looking at your running processes, with either the ps command or perhaps top, noting the Process ID or PID, and then issuing the kill command followed by the PID.

To quickly find any of your running processes with the characters ssh -N -L in the command, on Viking run:

ps -fu $USER | grep "ssh -N -L" | grep -v grep

If there are any to be found, you should see a list, for example:

the second column is the PID or Process ID
[abc123@login2[viking2] ~]$ ps -fu $USER | grep "ssh -N -L" | grep -v grep
abc123    3937363       1  0 13:40 ?        00:00:00 ssh -N -L 8889:localhost:8888 node112
abc123    3938699       1  0 13:40 ?        00:00:00 ssh -N -L 8000:localhost:8888 node020
abc123    3947158       1  0 13:45 ?        00:00:00 ssh -N -L 8000:localhost:8888 node112

You can kill them with the kill command, for example kill 3937363 3938699 3947158 or you can try the following command to kill any it finds:

kill $(ps -fu $USER | grep "ssh -N -L" | grep -v grep | awk '{print $2}')
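If you would rather review the list before anything is killed, the extraction step can be split out. The following is a sketch only: list_tunnel_pids is a hypothetical helper name, and the kill line is left commented out so that nothing is terminated until you have checked the output.

```shell
#!/bin/sh
# Sketch: list the PIDs of leftover "ssh -N -L" tunnels without killing them.

list_tunnel_pids() {
    # Reads `ps -f` style output on stdin and prints column 2 (the PID)
    # of every line containing "ssh -N -L".
    # [-] is a one-character class that still matches "-", but it stops this
    # awk process's own command line (which contains "[-]L") from matching.
    awk '/ssh -N [-]L/ { print $2 }'
}

pids="$(ps -fu "$(id -un)" 2>/dev/null | list_tunnel_pids)"
if [ -n "$pids" ]; then
    echo "Matching ssh tunnel PIDs:"
    echo "$pids"
    # kill $pids   # uncomment after checking the list above
else
    echo "No matching ssh tunnels found"
fi
```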

As Viking has two login nodes you may need to log into both to kill any unused ssh processes. To log into a specific login node you can specify that with the following:

ssh abc123@viking-login1.york.ac.uk
ssh abc123@viking-login2.york.ac.uk

Jupyter notebooks using VSCode

VSCode locally

Using the above guide as a reference, another way to do this is with VSCode, where everything is done in VSCode and its built-in terminals. This method is similar to the above in many ways:

  1. Install the Jupyter extension in VSCode

  2. Connect to Viking over ssh from VSCode’s terminal

  3. Start an interactive session with srun eg srun --nodes=1 --cpus-per-task=8 --time=04:00:00 --pty /bin/bash in the terminal of VSCode

  4. Once the interactive session is running, load the Jupyter module and run the notebook, like above

  5. In a new remote terminal on Viking, in VSCode, set up the ssh forwarding, like above (noting the node number from step 4.)

  6. In VSCode, open a new Jupyter notebook: (Ctrl+Shift+P) and type Jupyter: Create New Jupyter Notebook.

  7. In VSCode, press select kernel in the top right then select Existing Jupyter server

  8. Paste in the URL of the notebook, just like the guide above, follow the prompts in VSCode to name the notebook and select the available kernel

VSCode remote ssh connection to Viking

Yet another way to use VSCode is to have VSCode remotely connect to Viking (so you can open and save files to Viking in VSCode), request some resources on a compute node to run the Jupyter Notebook server and then create a notebook and connect to the server which is running on the compute node.

Note

This is a little complex. If you’re happy to give it a go, treat the following as a starter guide: you will need to try different ports and be comfortable with a little trial and error.

It’s worth explicitly mentioning where things are running as we’ll need to forward a port later so this may help visualise things. In this example we’ll also use the listed ports (but those will likely be different for you):

Login node      Compute node
VSCode          Jupyter Notebook
Port: 8202      Port: 8001

Tip

The ports above are the ones chosen for this example; you will likely have to pick different ones.

  1. Connect VSCode to Viking over ssh

  2. Install the Jupyter extension in VSCode, making sure it is installed remotely on the ssh host (Viking)

  3. Start an interactive session with srun eg srun --nodes=1 --cpus-per-task=8 --time=04:00:00 --pty /bin/bash in the terminal of VSCode and make a note of the node (in this example we’ll say it’s node123)

  4. Load the Jupyter module: module load JupyterLab/3.1.6-GCCcore-11.2.0

  5. Start a server on a port, make a note of the port: jupyter notebook --no-browser --port 8001 (see finding port tip for help picking a port)

  6. In a terminal on the login node, set up port forwarding from login node -> compute node eg: ssh -N -L 8202:localhost:8001 node123 and leave it running (again you’ll need to pick an open port on the login node, in this case I chose 8202)

  7. In VSCode create a new Notebook: (Ctrl+Shift+P) and type Jupyter: Create New Jupyter Notebook or open an existing Notebook

  8. In VSCode, select the kernel by clicking the button in the top right, click Select another kernel... then Existing Jupyter server... and paste in the link that was printed when you started the Notebook server on the compute node. Make sure the link uses the port you are forwarding on the login node (8202 in this example) rather than the Jupyter port, so that it looks like: http://127.0.0.1:8202/?token=991782e43816c044d3e0eeecca5258c1b105344fc5ddb990
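Swapping the port in the link by hand is easy to get wrong, so the substitution in step 8 can be scripted. This is a minimal sketch: rewrite_port is a hypothetical helper name, and the input URL with port 8001 is assumed to be what the server from step 5 printed, with the same token as the link shown in step 8.

```shell
#!/bin/sh
# Sketch: replace the port in a Jupyter URL with the forwarded login-node port.

rewrite_port() {
    # $1 = URL printed by the Jupyter server, $2 = port forwarded on the login node
    # The pattern keeps the scheme and host and swaps only the port number.
    printf '%s\n' "$1" | sed -E "s#^(https?://[^:/]+):[0-9]+#\1:$2#"
}

rewrite_port 'http://127.0.0.1:8001/?token=991782e43816c044d3e0eeecca5258c1b105344fc5ddb990' 8202
```

The token is left untouched; only the port between the host and the first slash changes.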