Scala creating an executable jar with sbt-assembly
Packaging a Scala program for distribution is not a straightforward affair if you rely on the standard library alone. You will need to pull in your Scala and Java dependencies and have a script that adds these to your classpath before launching the program. There is fortunately an easier way…
Install latest version of Python on Ubuntu
When using Linux, developers and data scientists often end up using the default Python version included in the package repositories. This can mean waiting a long time before you can try out Python’s new features! The following post describes how to compile and install an extra Python version without interfering with the system Python, and how to create a virtual environment that uses the new Python version.
Install Jupyter extensions
Jupyter extensions are a great way of increasing your productivity when using notebooks. In this post I will show how to install them along with a configuration tool, and share some of my favourites and my personal configuration file.
Install TiddlyWiki server mode (Ubuntu)
Install TiddlyWiki, an awesome non-linear notebook, on your own server so that you can access it from anywhere.
Run Jupyter notebook server in Docker
Running your Jupyter notebooks from Docker means you can easily recreate your environment on another computer.
Introduction to Docker
Docker is a helpful tool that allows you to recreate an environment very quickly. It can replace virtual machines in many of the use cases they were previously used for, and Docker does it better: it requires fewer resources and can process instructions natively at full speed.
PySpark - create DataFrame from scratch
These snippets show how to make a DataFrame from scratch, using a list of values. This is mainly useful when creating small DataFrames for unit tests. Imagine we would like to have a table with an id column describing a user and then two columns for the number of cats and dogs she has.
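As a rough illustration of the table described above, here is a minimal sketch that builds it from a list of tuples. The sample values, the column names `cats` and `dogs`, and the helper names `make_pets_df` and `main` are assumptions made for this example and are not taken from the post itself; it also assumes PySpark is installed when `main` is called.

```python
from typing import List, Tuple

# Hypothetical sample data: one (id, cats, dogs) tuple per user.
ROWS: List[Tuple[int, int, int]] = [(1, 2, 0), (2, 0, 1), (3, 1, 3)]

# Column names, in the same order as the tuple fields.
COLUMNS = ["id", "cats", "dogs"]


def make_pets_df(spark):
    """Build a small DataFrame from the in-memory rows above.

    `spark` is expected to be a SparkSession; column types are
    inferred by Spark from the Python values.
    """
    return spark.createDataFrame(ROWS, schema=COLUMNS)


def main() -> None:
    # Imported here so the module can be loaded without PySpark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("pets-example")
        .getOrCreate()
    )
    try:
        make_pets_df(spark).show()
    finally:
        spark.stop()
```

Calling `main()` starts a local Spark session and prints the three-row table. In a unit test you would more likely pass an existing session to `make_pets_df` and compare the result against the expected rows.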