Using Docker for Deep Learning

Struggling to recreate your results? So was I


Deep Learning projects are always pretty big. They involve huge datasets and huge amounts of processing power, but annoyingly, they also require a huge amount of investigation because, more often than not, they just don’t work in the beginning. It’s only after a bunch of feature selection, model fitting and parameter tuning that you get something that looks half decent.

In saying that though, running all of this in a container can be a bit daunting, because not only do you need to deploy a server somewhere, you also need tests in place to make sure that what’s happening inside your cordoned-off container is what you expect. You need to make sure that your machine learning algorithm is converging in the right way, and you need to be sure that any errors are picked up.

Despite these (and many more) difficulties, using containers for Deep Learning comes with great rewards. Below are a few of the main difficulties that containers overcome.


Reason 1: Isolation

Just as multiple physical containers can coexist on a ship, software containers can exist in harmony on a server. The benefit of this is that resources are predefined, so machine learning algorithms can run in parallel on the same machine without competing for them.

The only thing more demanding than one machine learning project is two, so if these are set up correctly, isolated projects can co-exist pretty well in a highly distributed environment.
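As a minimal sketch, assuming Docker 19.03+ and (for the GPU flag) the NVIDIA container toolkit, you can cap each container’s resources up front so two training jobs share one machine without starving each other. The image name my-training-image is a hypothetical stand-in for your own:

    docker run --cpus=4 --memory=8g --gpus '"device=0"' my-training-image
    docker run --cpus=4 --memory=8g --gpus '"device=1"' my-training-image

The --cpus, --memory and --gpus options are standard docker run flags; each command pins its job to a different GPU.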

Reason 2: Avoiding Dependency Conflicts

Machine Learning projects are often large and draw on a wide variety of underlying tech. The problem comes when your classification algorithm depends on an older version of NumPy while your regression algorithm needs the latest one. Which version should exist on your system?

In reality, these libraries can exist in parallel, but only on isolated systems. Since your deep learning projects may well rely on conflicting libraries, the isolation containers provide means that, over time, you’re far less likely to run into these really annoying problems.
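As an illustrative sketch, each project simply pins its own NumPy version in its own Dockerfile, and the two images never see each other. The file names and versions below are hypothetical:

    # Dockerfile for the classification project
    FROM python:3.9-slim
    RUN pip install numpy==1.19.5
    COPY classify.py /app/classify.py
    CMD ["python", "/app/classify.py"]

    # Dockerfile for the regression project
    FROM python:3.9-slim
    RUN pip install numpy==1.24.4
    COPY regress.py /app/regress.py
    CMD ["python", "/app/regress.py"]

Each image carries its own copy of NumPy, so both versions happily coexist on the same host.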

Reason 3: Portability

In the process of creating a container, you generally predefine everything that goes into it (code and all). Given that, you can save the ‘image’ of the container and share it easily. This is so useful that whole companies exist in this space (Docker Hub being the best-known example).
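A minimal sketch of that workflow with the standard Docker CLI (the image name my-dl-model and the account myaccount are hypothetical):

    # Build an image from the Dockerfile in the current directory
    docker build -t my-dl-model:v1 .

    # Option 1: save the image to a file you can hand to a colleague
    docker save my-dl-model:v1 -o my-dl-model.tar

    # Option 2: push it to a registry such as Docker Hub
    docker tag my-dl-model:v1 myaccount/my-dl-model:v1
    docker push myaccount/my-dl-model:v1

On another machine, docker load -i my-dl-model.tar (or docker pull myaccount/my-dl-model:v1) recreates exactly the same environment, code and all.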

Reason 4: Microservices

The awesome part of containers is that you’re able to expose only certain parts of them. This means that if you want to create something like an API, you can expose just the interface you need and run it in a more controlled way. That’s especially valuable if your API wraps a large computational process or something with a lot of moving parts: by separating workflows, you can identify where systems are failing much more quickly.

Check the following reference to learn a bit more: ‘3 reasons to always use containers for microservices-based applications’ (techbeacon.com).
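As a rough sketch of the idea, you might wrap a model in a tiny Flask app and expose only the port it listens on; everything else inside the container stays sealed off. The app.py and its model logic below are hypothetical stand-ins:

    # app.py: a toy prediction endpoint
    from flask import Flask, request, jsonify

    app = Flask(__name__)

    @app.route('/predict', methods=['POST'])
    def predict():
        features = request.get_json()['features']
        # stand-in for a real model call
        return jsonify({'prediction': sum(features)})

    if __name__ == '__main__':
        app.run(host='0.0.0.0', port=5000)

The matching Dockerfile installs Flask and exposes only port 5000:

    FROM python:3.9-slim
    RUN pip install flask
    COPY app.py /app/app.py
    EXPOSE 5000
    CMD ["python", "/app/app.py"]

Running docker run -p 5000:5000 my-api-image then maps that single port to the host.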

Reason 5: Reproducibility

I think the ultimate reason why containers are useful for Deep Learning is that they aid reproducibility. It’s pretty common for researchers to make what looks like a ground-breaking discovery while in research mode. However, once the algorithm is re-trained, they may find that a bug had caused the amazing results, or that they simply can’t recreate them!

This is obviously an issue: you don’t know what gave you the great results, so it makes sense to have some form of framework that allows you to robustly reproduce them. Code-tracking tools (like GitHub) obviously help, but it’s worth going that one step further and freezing the environment too.
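A minimal sketch of that extra step: pin the exact base image and library versions in the Dockerfile, so that re-training months later happens in the very same environment. The versions and train.py below are hypothetical:

    FROM python:3.9.18-slim
    # Pin exact library versions so every rebuild gets the same environment
    RUN pip install numpy==1.24.4 torch==2.0.1
    COPY train.py /app/train.py
    CMD ["python", "/app/train.py"]

For full reproducibility it’s also worth fixing the random seeds inside train.py itself (e.g. torch.manual_seed(42)); the container freezes the environment, while the seed freezes the randomness.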


Code for Dockerising Deep Learning

If you want some simple hands-on code to implement a Dockerised Deep Learning project, then the link here is great for just that.

If you find it a bit confusing, then the following project (using an example from the world of fashion) may be a little easier to digest.
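As a quick taste of what those projects walk through, a Dockerised training run usually boils down to a Dockerfile plus two commands. Everything named below (fashion-classifier, train.py) is a hypothetical stand-in:

    FROM python:3.9-slim
    RUN pip install torch==2.0.1
    COPY train.py /app/train.py
    CMD ["python", "/app/train.py"]

Then build the image and run it, mounting a host directory so the trained model outlives the container:

    docker build -t fashion-classifier .
    docker run -v "$(pwd)/output:/app/output" fashion-classifier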


The above were 5 quick reasons why containers are great for Deep Learning, along with two references to help you learn how to implement it.

Docker is pretty powerful, and despite being a good few years old now (with a number of great iterations and improvements along the way), it still delivers exactly the functionality one desires: isolation and reproducibility.

Thanks for reading! If you have any questions, please let me know!

Keep up to date with my latest articles here!
