Data Innovation Summit 2016

Large enterprises in every industry have caught wind of the trends in big data and machine learning but are unable to readily capitalize on them for a number of reasons. One reason is how hard it is to secure competent resources. People with a background in SQL, SAS,...

Goodbye Data Warehouse, We Hardly Knew Ye.

I recently took the main stage at Data Innovation Summit and spoke about the emerging discipline of data science. It was an outstanding event with delegates from many countries spanning multiple industries. Hadoop and ideas of a “data lake” were given...

GPU Accelerated Machine Learning

If you’re serious about big data and machine learning, you’re already taking advantage of GPU-, MIC-, and FPGA-powered analytics tools. This new breed of software can allow a single workstation to outperform a 100-node compute cluster in tasks like machine...

Breaking down the silos

What is the answer to this question? Caroline and Ron have been happily married for 19 years. Caroline is 45 and Ron is 44. They have four children: Emily, Douglas, Emma and Oliver. Emily is twice as old as Oliver and Emma is their youngest child. Douglas is 4 years older...

A data scientist is a data scientist – not a sales guy

Data Scientists are normally not the best sellers and communicators. They are simply – data scientists. If you are serious about transforming your business using Big Data, machine learning, advanced analytics, and sophisticated automated data-driven workflows...

Building Your Lab – Part 4 – VPN Access

Setting up a point-to-site VPN connection to your Azure network. We need to set up a way of accessing all the browser-based management tools without exposing them needlessly to the internet. The most convenient way is the point-to-site VPN service in Azure, which...

Why Deep Learning?

Your competitors are probably not using it yet. Software from well-established analytics vendors has always been 5 to 10 years behind leading machine learning technology. Consider the following example: 2001 – The Random Forest algorithm was formalized by Leo...

Building Your Lab – Part 3 – JupyterHub

Set up a Jupyter Notebook for Julia, R, and Python. It’s time to set up a development environment and the cool new way to work is through “notebooks”, which are web-based interactive programming environments. I wouldn’t go so far as to say that...

Big Data Solves Small Problems

Big Data has been on the Gartner Hype Cycle for quite some time already and there is not a single CxO or business decision maker who has not attended a conference or read an article about the Big Opportunity with Big Data. Things are certainly happening in the field...

Building Your Lab – Part 2 – Azure Infrastructure

Upload your Fedora Server VM and run it on Azure. So you’ve prepped your VHD file and want to give it some Xeon processors and some generous RAM? Let’s deploy it on Azure, and set up a development environment. The first step is to use PowerShell to upload...