Jupyter notebooks
The Jupyter notebook is a web-based notebook environment for interactive computing. Running a Jupyter notebook
server remotely on Viking and connecting to it from your local web browser is certainly possible.
There are a few steps to ensure that your notebook is running on a compute node (not a login node) and you can remotely connect to it.
The connection would look something like this:
Your computer ➡️ Viking login node ➡️ Viking compute node ➡️ Jupyter notebook
One way to do this is to start an interactive session on a compute node and, from that session, run the Jupyter notebook server:
$ srun --nodes=1 --ntasks=1 --cpus-per-task=5 --time=08:00:00 --pty /bin/bash
srun: job 76415 queued and waiting for resources
srun: job 76415 has been allocated resources
Creating user dir for 'Local Scratch'
flight start: Flight Direct environment is already active.
[abc123@node112[viking2] ~]$
Here we have an interactive bash session on the node112 compute node, with five CPU cores, for eight hours. Next we load the Jupyter module and run the server:
Attention
In this example we’re using the Jupyter module on Viking. If you installed Jupyter alongside other packages in Conda or another virtual environment, skip the module load JupyterLab/3.1.6-GCCcore-11.2.0 command below and instead activate the environment where Jupyter is installed. The rest of the steps should still be valid.
[abc123@node112[viking2] ~]$ module load JupyterLab/3.1.6-GCCcore-11.2.0
[abc123@node112[viking2] ~]$ jupyter notebook --no-browser
[I 15:01:51.756 NotebookApp] Writing notebook server cookie secret to /users/abc123/.local/share/jupyter/runtime/notebook_cookie_secret
[I 2023-11-03 15:01:52.190 LabApp] JupyterLab extension loaded from /opt/apps/eb/software/JupyterLab/3.1.6-GCCcore-11.2.0/lib/python3.9/site-packages/jupyterlab
[I 2023-11-03 15:01:52.190 LabApp] JupyterLab application directory is /opt/apps/eb/software/JupyterLab/3.1.6-GCCcore-11.2.0/share/jupyter/lab
[I 15:01:52.194 NotebookApp] Serving notebooks from local directory: /users/abc123
[I 15:01:52.194 NotebookApp] Jupyter Notebook 6.4.0 is running at:
[I 15:01:52.194 NotebookApp] http://localhost:8888/?token=9b0f6d6918f238c0e8543257a842b65cd4671ee1b55a4e3c
[I 15:01:52.194 NotebookApp] or http://127.0.0.1:8888/?token=9b0f6d6918f238c0e8543257a842b65cd4671ee1b55a4e3c
[I 15:01:52.194 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 15:01:52.198 NotebookApp]
To access the notebook, open this file in a browser:
file:///users/abc123/.local/share/jupyter/runtime/nbserver-2295028-open.html
Or copy and paste one of these URLs:
http://localhost:8888/?token=9b0f6d6918f238c0e8543257a842b65cd4671ee1b55a4e3c
or http://127.0.0.1:8888/?token=9b0f6d6918f238c0e8543257a842b65cd4671ee1b55a4e3c
The server is running and listening on port 8888. We cannot access that port directly from our local computer, so we set up an ssh tunnel to the login node and then from there to the compute node, where the Jupyter notebook server is actually running. You can do this with one command locally on your computer:
$ ssh -L 8888:localhost:8889 viking.york.ac.uk ssh -N -L 8889:localhost:8888 node112
The above command opens up one ssh tunnel, forwarding your local port 8888 to the Viking login node port 8889. Then it opens up another ssh tunnel from the login node’s port 8889 to port 8888 on the compute node node112, where the Jupyter server is listening.
Finally, Ctrl + left mouse click on one of the links printed in the first terminal session on node112, shown above: either the http://localhost:8888/?token=... link or the http://127.0.0.1:8888/?token=... link. Your browser should open and connect to the Jupyter server running on Viking.
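The two-hop tunnel command above has four parts that change from session to session: the compute node, the Jupyter port, the login node port and your local port. As a minimal sketch, the helper below only prints the command so you can check it before running it. The function name tunnel_cmd and its argument order are hypothetical; only the layout of the ssh command comes from this guide.

```shell
# Print (do not run) the two-hop tunnel command for the given settings.
# tunnel_cmd and its argument order are illustrative, not part of Viking.
tunnel_cmd() (
    node=$1                              # compute node running Jupyter, e.g. node112
    jupyter_port=$2                      # port the Jupyter server listens on
    login_port=$3                        # free port on the login node
    local_port=$4                        # free port on your own computer
    login_host=${5:-viking.york.ac.uk}   # optionally username@host
    printf 'ssh -L %s:localhost:%s %s ssh -N -L %s:localhost:%s %s\n' \
        "$local_port" "$login_port" "$login_host" \
        "$login_port" "$jupyter_port" "$node"
)

# Reproduces the command used in this guide.
tunnel_cmd node112 8888 8889 8888
```

If the local port you choose differs from the port Jupyter printed, change the port in the URL to your local port before opening it in the browser.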
Tip
If you’re using a personal computer, you’ll need to tell Viking your username in the above command, for example:
ssh -L 8888:localhost:8889 abc123@viking.york.ac.uk ssh -N -L 8889:localhost:8888 node112
Tip
To help find an open port you can try running this command on Viking:
for p in {8000..9000}; do m=$(netstat -l|grep -c localhost:${p}); if [[ $m == 0 ]]; then echo "try $p"; break; fi; done
Thanks to Felix Ulrich-Oltean for this suggestion
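The one-liner above can also be written as a small function, which may be easier to read and adapt. This is a sketch under a few assumptions: the names first_free_port and listening_ports are hypothetical, and the ss command is assumed to be installed on the node where you run it. Unlike the one-liner, it treats a port as taken if anything is listening on it, not only services bound to localhost.

```shell
# Print the first port in [start, end] that does not appear in the
# newline-separated list of in-use ports read from standard input.
first_free_port() (
    start=$1
    end=$2
    used=" $(tr '\n' ' ') "            # e.g. " 8000 8001 "
    p=$start
    while [ "$p" -le "$end" ]; do
        case "$used" in
            *" $p "*) ;;               # port is taken, keep looking
            *) echo "$p"; exit 0 ;;    # first free port found
        esac
        p=$((p + 1))
    done
    exit 1                             # nothing free in the range
)

# Assumption: ss is available; column 4 of "ss -Hltn" is "address:port".
listening_ports() {
    ss -Hltn | awk '{ n = split($4, parts, ":"); print parts[n] }'
}

# Usage on Viking (not run here):
#   listening_ports | first_free_port 8000 9000
```

Keeping the port list on standard input means the search logic can be checked with a hand-written list, independently of what is listening on the machine.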
Tidying up
The above command is great for getting a lot done in one go and simplifies setting up two ssh tunnels. However, it also logs into Viking and leaves the second command running in the background (in the above example that’s this part: ssh -N -L 8889:localhost:8888 node112). We don’t want to leave these processes running, so after you have finished using Jupyter Notebooks it’s a good idea to kill them.
You can do this by looking at your running processes, with either the ps command or perhaps top, noting the Process ID or PID, and then issuing the kill command followed by the PID.
To quickly find any of your running processes with the characters ssh -N -L in the command, on Viking run:
ps -fu $USER | grep "ssh -N -L" | grep -v grep
If there are any to be found, you should see a list, for example:
[abc123@login2[viking2] ~]$ ps -fu $USER | grep "ssh -N -L" | grep -v grep
abc123 3937363 1 0 13:40 ? 00:00:00 ssh -N -L 8889:localhost:8888 node112
abc123 3938699 1 0 13:40 ? 00:00:00 ssh -N -L 8000:localhost:8888 node020
abc123 3947158 1 0 13:45 ? 00:00:00 ssh -N -L 8000:localhost:8888 node112
You can kill them with the kill command, for example kill 3937363 3938699 3947158, or you can try the following command to kill any it finds:
kill $(ps -fu $USER | grep "ssh -N -L" | grep -v grep | awk '{print $2}')
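The command above kills every matching tunnel at once. If you have tunnels to more than one compute node and only want to close those for a finished session, a filter on the node name may help. The sketch below assumes the output has the same columns as the ps -fu listing above, with the PID in column 2 and the node name as the last field. The function name tunnel_pids_for_node is hypothetical.

```shell
# Read "ps -fu $USER" style lines on standard input and print the PID
# (column 2) of every "ssh -N -L ..." tunnel whose last field is the given node.
tunnel_pids_for_node() {
    awk -v node="$1" '/ssh -N -L/ && $NF == node { print $2 }'
}

# Sample lines copied from the listing above.
sample='abc123 3937363 1 0 13:40 ? 00:00:00 ssh -N -L 8889:localhost:8888 node112
abc123 3938699 1 0 13:40 ? 00:00:00 ssh -N -L 8000:localhost:8888 node020
abc123 3947158 1 0 13:45 ? 00:00:00 ssh -N -L 8000:localhost:8888 node112'

# Prints the PIDs of the two tunnels that point at node112.
printf '%s\n' "$sample" | tunnel_pids_for_node node112

# Usage on Viking (not run here): check the list first, then kill.
#   ps -fu $USER | tunnel_pids_for_node node112
#   kill $(ps -fu $USER | tunnel_pids_for_node node112)
```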
As Viking has two login nodes you may need to log into both to kill any unused ssh processes. To log into a specific login node, use one of the following:
ssh abc123@viking-login1.york.ac.uk
ssh abc123@viking-login2.york.ac.uk
Jupyter notebooks using VSCode
VSCode locally
Using the above guide as a reference, another way to do this is with VSCode, doing everything in VSCode and its inbuilt terminals. If you’re interested in this method, it’s similar to the above in many ways:
1. Install the Jupyter extension in VSCode
2. Remote ssh connect to Viking from VSCode’s terminal
3. Start an interactive session with srun in the terminal of VSCode, e.g.
   srun --nodes=1 --cpus-per-task=8 --time=04:00:00 --pty /bin/bash
4. Once the interactive session is running, load the Jupyter module and run the notebook, like above
5. In a new remote terminal on Viking, in VSCode, set up the ssh forwarding, like above (noting the node number from step 4)
6. In VSCode, open a new Jupyter notebook: press Ctrl+Shift+P and type Jupyter: Create New Jupyter Notebook
7. In VSCode, press select kernel in the top right then select Existing Jupyter server
8. Paste in the URL of the notebook, just like the guide above, and follow the prompts in VSCode to name the notebook and select the available kernel
VSCode remote ssh connection to Viking
Yet another way to use VSCode is to have VSCode remotely connect to Viking (so you can open and save files to Viking in VSCode), request some resources on a compute node to run the Jupyter Notebook server and then create a notebook and connect to the server which is running on the compute node.
Note
This is a little complex but if you’re happy to give it a go then the following should be considered a starter guide as you will need to try different ports and be happy with a little trial and error.
It’s worth explicitly mentioning where things are running as we’ll need to forward a port later so this may help visualise things. In this example we’ll also use the listed ports (but those will likely be different for you):
Login node | Compute node |
---|---|
VSCode | Jupyter Notebook |
Port: 8202 | Port: 8001 |
Tip
The above are ports I chose for this example. You will likely have to pick different ports.
1. Connect VSCode to Viking over ssh
2. Install the Jupyter extension in VSCode, on Viking. Ensure this is installed remotely on the ssh host (Viking)
3. Start an interactive session with srun in the terminal of VSCode and make a note of the node (in this example we’ll say it’s node123), e.g.
   srun --nodes=1 --cpus-per-task=8 --time=04:00:00 --pty /bin/bash
4. Load the Jupyter module:
   module load JupyterLab/3.1.6-GCCcore-11.2.0
5. Start a server on a port and make a note of the port (see the tip on finding an open port above for help picking a port):
   jupyter notebook --no-browser --port 8001
6. In a terminal on the login node, set up a port forward from login node -> compute node and leave it running (again you’ll need to pick an open port on the login node, in this case I chose 8202), e.g.
   ssh -N -L 8202:localhost:8001 node123
7. In VSCode create a new Notebook (press Ctrl+Shift+P and type Jupyter: Create New Jupyter Notebook) or open an existing Notebook
8. In VSCode select the kernel by clicking the button in the top right, click Select another kernel... then Existing Jupyter server... and paste in the link which was given when you ran the Notebook server on the compute node, BUT ensure it uses the port you are forwarding on the login node, which in this example was 8202. The link here looks like:
   http://127.0.0.1:8202/?token=991782e43816c044d3e0eeecca5258c1b105344fc5ddb990
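The last step is easy to get wrong, because the link printed by the server carries the compute node port (8001 in this example) while VSCode needs the forwarded login node port (8202). As a small sketch, the helper below swaps the port in such a link and leaves the host and token untouched. The function name with_port and the shortened token are hypothetical, for illustration only.

```shell
# Replace the port in a Jupyter server URL, keeping the host and token as they are.
with_port() (
    url=$1
    port=$2
    printf '%s\n' "$url" | sed -E "s#^(https?://[^/:]+):[0-9]+#\1:${port}#"
)

# URL in the form printed by a server started with --port 8001
# (the token here is a shortened, made-up placeholder).
with_port 'http://127.0.0.1:8001/?token=abc123' 8202
```

The printed link is the one to paste into the Existing Jupyter server... prompt.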