Set up a Jupyter Notebook for Julia, R, and Python.

It’s time to set up a development environment, and the cool new way to work is through “notebooks”, which are web-based interactive programming environments. I wouldn’t go so far as to say that notebooks replace the traditional IDE, but they bring with them a number of benefits and neat scenarios. The substance of a notebook is its layout: the code, the documentation, and the results all sit in one long pedagogical feed that fosters collaboration and sharing. Though there are several notebooks out there, including Beaker and Zeppelin, I will focus on the most popular and functional one right now, Jupyter, and get it to work with Julia, R, and Python. Jupyter can even be extended to support dozens of other languages such as Scala, Haskell, Go, F#, and Matlab.
Jupyter
As with most things open source, this one is not “turnkey” either. I am the type that likes to keep the system tidy and install as few packages as possible, but in data science one must learn to live with a mishmash of many tools. In the future we can keep the environment somewhat cleaner by distributing things through Docker containers. Let us begin then:

sudo dnf install gcc gcc-c++ redhat-rpm-config python3-devel
sudo pip3 install --upgrade pip
sudo pip3 install notebook

Jupyter comes with a Python 3 “kernel” pre-installed, so only Julia and R need to be installed manually and made available to all users on the system. In addition, we will install JupyterHub so that users can log on to the system using their directory credentials and start a notebook.

sudo dnf install julia R czmq-devel npm
sudo R
> install.packages(c('rzmq','repr','IRkernel','IRdisplay'),
                   repos = c('http://irkernel.github.io/', getOption('repos')),
                   type = 'source')
> IRkernel::installspec(user = FALSE)
> q()
julia
> Pkg.add("IJulia")
> quit()
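
To confirm that the R and Julia kernels actually landed where Jupyter can see them, you can look for their kernel.json files. What follows is only a minimal sketch using the Python standard library: the function name find_kernels and the list of directories are my own assumptions, not part of Jupyter, so adjust the paths to your installation.

```python
import json
import os
from pathlib import Path

# Assumption: these are common kernelspec locations on a Linux system.
# They are illustrative defaults, not an authoritative search path.
DEFAULT_KERNEL_DIRS = [
    Path("/usr/local/share/jupyter/kernels"),
    Path("/usr/share/jupyter/kernels"),
    Path(os.path.expanduser("~/.local/share/jupyter/kernels")),
]


def find_kernels(kernel_dirs=DEFAULT_KERNEL_DIRS):
    """Return {kernel_name: display_name} for every readable kernel.json."""
    found = {}
    for base in kernel_dirs:
        base = Path(base)
        if not base.is_dir():
            continue
        for spec in sorted(base.glob("*/kernel.json")):
            try:
                data = json.loads(spec.read_text(encoding="utf-8"))
            except (OSError, ValueError):
                continue  # unreadable or malformed spec: skip it
            name = spec.parent.name
            # The first directory in the list wins if a name repeats.
            found.setdefault(name, data.get("display_name", name))
    return found


if __name__ == "__main__":
    for name, display in sorted(find_kernels().items()):
        print(f"{name}: {display}")
```

If a kernel only shows up under a home directory, other users will not see it, which is why the Julia kernel gets moved to /usr/local/share/jupyter/kernels further down.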

Before anything else, make sure you have SSL certificates ready, since JupyterHub includes authentication and allows arbitrary code execution. If you don’t already have your own cert, an easy way to get one is the letsencrypt package or, in a pinch, you can sign your own certificate as I do below.

sudo mv .local/share/jupyter/kernels/julia-0.4 /usr/local/share/jupyter/kernels/
sudo pip3 install jupyterhub
sudo pip3 install ipywidgets
sudo npm install -g inherits
sudo npm install -g configurable-http-proxy
sudo mkdir /etc/jupyterhub
cd /etc/jupyterhub
sudo mkdir -m0700 keys
sudo touch keys/ssl.key
sudo chmod 0600 keys/ssl.key
sudo openssl genpkey -algorithm RSA -out keys/ssl.key -pkeyopt rsa_keygen_bits:4096
sudo openssl req -key keys/ssl.key -x509 -new -days 3650 -out keys/ssl.pem
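
Since the private key is the sensitive part, it can be worth double-checking that the mkdir -m0700 and chmod 0600 steps really took effect. Here is a small sketch of such a check; the function name permission_problems is hypothetical, and it has to run as root or as the owner of the directory, otherwise the 0700 directory will (correctly) refuse access.

```python
import os
import stat

# Permission bits that grant any access to group or others.
GROUP_OTHER_BITS = stat.S_IRWXG | stat.S_IRWXO


def permission_problems(key_dir, key_file):
    """Return a list of problems found; an empty list means all is well."""
    problems = []
    for path in (key_dir, key_file):
        mode = stat.S_IMODE(os.stat(path).st_mode)
        if mode & GROUP_OTHER_BITS:
            problems.append(
                f"{path} is accessible to group/others (mode {mode:04o})"
            )
    return problems


if __name__ == "__main__":
    # Paths taken from the commands above.
    issues = permission_problems(
        "/etc/jupyterhub/keys", "/etc/jupyterhub/keys/ssl.key"
    )
    print("\n".join(issues) if issues else "key permissions look fine")
```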

Here comes a tricky part, for two reasons. First, we don’t want to run a web-facing service as root; that’s just not very security conscious. Second, Fedora makes use of SELinux, which would consider JupyterHub spawning lots of new processes with different UIDs too suspicious to allow. So we will go a bit out of our way to do things the right way and give JupyterHub a new unprivileged user. We will also create a group, jupyter (written %jupyter in sudoers), that people can join to have permission to spawn Jupyter servers.

sudo pip3 install git+https://github.com/jupyter/sudospawner
sudo groupadd jupyter
sudo usermod -a -G jupyter r3tex #ADD USERS (like r3tex) TO THIS GROUP
sudo useradd jupyterhub
sudo chown -R jupyterhub /etc/jupyterhub
sudo visudo
Cmnd_Alias JUPYTER_CMD = /bin/sudospawner
jupyterhub ALL=(%jupyter) NOPASSWD:JUPYTER_CMD #TAKES ALIAS OR GROUP
sudo groupadd shadow
sudo chgrp shadow /etc/shadow
sudo chmod 0640 /etc/shadow
sudo usermod -a -G shadow jupyterhub
sudo vi /etc/jupyterhub/jupyterhub_config.py
c.JupyterHub.spawner_class = 'sudospawner.SudoSpawner'
c.JupyterHub.port = 443
c.JupyterHub.ip = '10.0.0.4' #YOUR AZURE SUBNET IP
c.JupyterHub.cookie_secret_file = u'/etc/jupyterhub/jupyterhub_cookie_secret'
c.JupyterHub.db_url = u'/etc/jupyterhub/jupyterhub.sqlite'
c.JupyterHub.ssl_key = u'/etc/jupyterhub/keys/ssl.key'
c.JupyterHub.ssl_cert = u'/etc/jupyterhub/keys/ssl.pem'
sudo setcap 'cap_net_bind_service=+ep' /usr/bin/node
sudo firewall-cmd --permanent --add-service=https
sudo vi /lib/systemd/system/jupyterhub.service
[Unit]
Description=Jupyterhub
[Service]
User=jupyterhub
ExecStart=/bin/jupyterhub -f /etc/jupyterhub/jupyterhub_config.py
[Install]
WantedBy=multi-user.target
sudo chcon -u system_u /lib/systemd/system/jupyterhub.service
sudo systemctl daemon-reload
sudo systemctl enable jupyterhub
sudo systemctl start jupyterhub
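
Once the service has started, a quick way to see whether the hub is answering is to attempt a TLS handshake against it. The sketch below is an illustration only: hub_is_listening is a name I made up, and certificate verification is deliberately switched off on the assumption that you are using the self-signed certificate generated earlier. Do not reuse that setting where you actually need to trust the server.

```python
import socket
import ssl


def hub_is_listening(host, port=443, timeout=3.0):
    """Return True if a TLS handshake with host:port succeeds."""
    context = ssl.create_default_context()
    # Assumption: self-signed certificate, so verification is disabled.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                return tls.version() is not None
    except OSError:
        # Covers refused connections, timeouts, and TLS errors alike.
        return False


if __name__ == "__main__":
    # 10.0.0.4 is the example subnet IP from the config above.
    print(hub_is_listening("10.0.0.4"))
```

If this comes back False, sudo systemctl status jupyterhub and the firewall rule are the first places I would look.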

I would advise against opening up port 443 to the external web. Instead you should set up a VPN connection to your Azure resource group. Check out Part 4 of this series to find out how.
Jupyter uses PAM to authenticate by default, so you can log in with your UNIX credentials.
Now start coding some R, Julia, and Python!