Set up a Jupyter Notebook for Julia, R, and Python.
It’s time to set up a development environment, and the cool new way to work is through “notebooks”, which are web-based interactive programming environments. I wouldn’t go so far as to say that notebooks replace the traditional IDE, but they bring with them a number of benefits and neat scenarios. The substance of a notebook is its layout. The code, the documentation, and the results are all in one long pedagogical feed that fosters collaboration and sharing. Though there are several notebooks out there, including Beaker and Zeppelin, I will focus on the most popular and functional one right now, Jupyter, and get it to work with Julia, R, and Python. Jupyter can even be extended to support dozens of other languages like Scala, Haskell, Go, F#, MATLAB, and many more.
As with most things open source, this one is not “turnkey” either. I am the type that wishes to keep the system tidy and install as few packages as possible, but in the case of data science one must learn to live with a mishmash of many tools. In the future we can keep the environment marginally cleaner by distributing things through Docker containers. Let us begin then:
sudo dnf install gcc gcc-c++ redhat-rpm-config python3-devel
sudo pip3 install --upgrade pip
sudo pip3 install notebook
Jupyter comes with a Python3 “kernel” pre-installed, but we need to install the Julia and R kernels manually and make them available to all users on the system. In addition, we will install JupyterHub so that users can log on to the system using their directory credentials and start a notebook.
sudo dnf install julia R czmq-devel npm
sudo R
> install.packages(c('rzmq','repr','IRkernel','IRdisplay'), repos = c('http://irkernel.github.io/', getOption('repos')), type = 'source')
> IRkernel::installspec(user = FALSE)
> q()
julia
> Pkg.add("IJulia")
> quit()
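Once the kernels are registered, it may be worth confirming that Jupyter can see all three before moving on. The sketch below is one way to do that, not part of the original setup: it parses the JSON printed by `jupyter kernelspec list --json` and reports any expected kernel that is missing. The function names and the expected name prefixes (`python3`, `ir`, `julia-`) are assumptions for illustration; the exact Julia kernel name depends on the installed version.

```python
import json
import subprocess

# Assumed kernel name prefixes; adjust them to match your installation.
EXPECTED_PREFIXES = ("python3", "ir", "julia-")


def read_kernelspecs():
    """Return the JSON text printed by `jupyter kernelspec list --json`."""
    result = subprocess.run(
        ["jupyter", "kernelspec", "list", "--json"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout


def missing_kernels(kernelspec_json, expected=EXPECTED_PREFIXES):
    """Return the expected prefixes that match no installed kernel name."""
    installed = json.loads(kernelspec_json).get("kernelspecs", {})
    return [
        prefix for prefix in expected
        if not any(name.startswith(prefix) for name in installed)
    ]
```

Running `missing_kernels(read_kernelspecs())` on the server should return an empty list if the Python, R, and Julia kernels are all visible.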
Before anything else, you should make sure you have SSL certificates ready, since JupyterHub includes authentication and allows arbitrary code execution. If you don’t already have your own cert, an easy way to get one is using the letsencrypt package or, in a pinch, you can sign your own certificates as I do below.
sudo mv .local/share/jupyter/kernels/julia-0.4 /usr/local/share/jupyter/kernels/
sudo pip3 install jupyterhub
sudo pip3 install ipywidgets
sudo npm install -g inherits
sudo npm install -g configurable-http-proxy
sudo mkdir /etc/jupyterhub
cd /etc/jupyterhub
sudo mkdir -m0700 keys
sudo touch keys/ssl.key
sudo chmod 0600 keys/ssl.key
sudo openssl genpkey -algorithm RSA -out keys/ssl.key -pkeyopt rsa_keygen_bits:4096
sudo openssl req -key keys/ssl.key -x509 -new -days 3650 -out keys/ssl.pem
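The `chmod 0600` step matters because the private key should be readable by its owner only. As a minimal sketch (the helper name `key_is_private` is made up for illustration), the check below returns True when neither the group nor other users have any permission bits set on a file:

```python
import os
import stat


def key_is_private(path):
    """Return True if only the file's owner has any permissions on it."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # 0o077 masks the group and "other" permission bits.
    return mode & 0o077 == 0
```

For the layout above you would call it as `key_is_private('/etc/jupyterhub/keys/ssl.key')`, run as a user that is allowed to stat the file.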
Here comes a tricky part, for two reasons. First, we don’t want to run a web-facing service as root; that’s just not very security conscious. Second, Fedora makes use of SELinux, which would consider JupyterHub spawning lots of new processes with different UIDs too suspicious to allow. So we will go a bit out of our way to do things the right way and give JupyterHub a new unprivileged user. We will also create a group %jupyter that people can join to have permission to spawn Jupyter servers.
sudo pip3 install git+https://github.com/jupyter/sudospawner
sudo groupadd jupyter
sudo usermod -a -G jupyter r3tex #ADD USERS (like r3tex) TO THIS GROUP
sudo useradd jupyterhub
sudo chown -R jupyterhub /etc/jupyterhub
sudo visudo
# add these two lines to the sudoers file:
Cmnd_Alias JUPYTER_CMD = /bin/sudospawner
jupyterhub ALL=(%jupyter) NOPASSWD:JUPYTER_CMD #TAKES ALIAS OR GROUP
sudo groupadd shadow
sudo chgrp shadow /etc/shadow
sudo chmod 0640 /etc/shadow
sudo usermod -a -G shadow jupyterhub
sudo vi /etc/jupyterhub/jupyterhub_config.py
# contents of jupyterhub_config.py:
c.JupyterHub.spawner_class = 'sudospawner.SudoSpawner'
c.JupyterHub.port = 443
c.JupyterHub.ip = '10.0.0.4' #YOUR AZURE SUBNET IP
c.JupyterHub.cookie_secret_file = u'/etc/jupyterhub/jupyterhub_cookie_secret'
c.JupyterHub.db_url = u'/etc/jupyterhub/jupyterhub.sqlite'
c.JupyterHub.ssl_key = u'/etc/jupyterhub/keys/ssl.key'
c.JupyterHub.ssl_cert = u'/etc/jupyterhub/keys/ssl.pem'
sudo setcap 'cap_net_bind_service=+ep' /usr/bin/node
sudo firewall-cmd --permanent --add-service=https
sudo vi /lib/systemd/system/jupyterhub.service
# contents of jupyterhub.service (runs as the unprivileged jupyterhub user created above, not root):
[Unit]
Description=Jupyterhub
[Service]
User=jupyterhub
ExecStart=/bin/jupyterhub -f /etc/jupyterhub/jupyterhub_config.py
[Install]
WantedBy=multi-user.target
sudo chcon -u system_u /lib/systemd/system/jupyterhub.service
sudo systemctl daemon-reload
sudo systemctl enable jupyterhub
sudo systemctl start jupyterhub
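After starting the service, a quick way to see whether anything is listening on the configured address is to attempt a TCP connection from another machine on the same subnet. The sketch below assumes nothing beyond the standard library; the helper name `port_is_open` is hypothetical, and a successful connection only shows that the port accepts connections, not that TLS or logins work.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

With the example configuration above, the call would be `port_is_open('10.0.0.4', 443)`. If it returns False, `sudo systemctl status jupyterhub` is a reasonable next place to look.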
I would advise against opening up port 443 to the external web. Instead you should set up a VPN connection to your Azure resource group. Check out Part 4 of this series to find out how.
JupyterHub uses PAM to authenticate by default, so you can log in with your UNIX credentials.
Now start coding some R, Julia, and Python!