What and Why?
Docker is an open-source platform that allows you to automate the deployment and management of applications using containerisation. Containers are lightweight, isolated environments that package everything needed to run an application, including the code, runtime, libraries, and dependencies.
As an ML engineer or data engineer, you get several benefits from Docker. The most important one is a consistent, reproducible environment: your application runs the same way across different machines and operating systems, which eliminates the "it works on my machine" problem.
How to install Docker?
First, check whether Docker is already installed with this command:
docker --version
If Docker is not installed:
Download the Docker Desktop application from the official Docker website.
Run the installer you downloaded (.exe on Windows, .dmg on Mac, etc.).
In a terminal, run the command again:
docker --version
Can't see a version? Troubleshoot it yourself (Stack Overflow is a good place to start).
Then run this command to check whether any containers are running:
docker ps
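The two checks above can be combined into one small script. This is only a minimal sketch: the function name docker_status and the three status strings are made up for illustration, and it assumes that a failing docker ps means the Docker daemon is not reachable.

```shell
# Hypothetical helper: report whether the docker CLI is installed
# and whether the Docker daemon answers.
docker_status() {
  if ! command -v docker >/dev/null 2>&1; then
    # The docker command is not on the PATH at all
    echo "not-installed"
  elif ! docker ps >/dev/null 2>&1; then
    # The CLI exists, but it could not talk to the daemon
    echo "installed-but-daemon-not-running"
  else
    echo "ready"
  fi
}

status="$(docker_status)"
echo "Docker status: ${status}"
```

If the status is installed-but-daemon-not-running, starting the Docker Desktop application is usually the first thing to try.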
Docker Images, Containers, Volumes
Docker Images: Docker images are read-only templates that you use to create Docker containers. An image includes everything needed to run an application, including the code, a runtime, libraries, environment variables, and config files.
Docker Containers: A Docker container is a runnable instance of a Docker image. You can start, stop, move, or delete a container using the Docker API or CLI.
Docker Volumes: While Docker lets you store data inside the container itself, that data disappears when the container is deleted. A volume stores the data outside the container, which allows you to share data between containers and protects your data from being lost when the container is stopped or deleted.
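To see how images, containers, and volumes fit together, here is a rough sketch of a volume outliving the container that wrote to it. The volume name demo-data, the mount path /data, and the function name volume_demo are placeholders picked for this example, and it uses the python:3.9 image from Docker Hub.

```shell
# Hypothetical demo: show that data in a named volume outlives the container.
volume_demo() {
  # Create a named volume managed by Docker
  docker volume create demo-data
  # The first container writes a file into the volume and is then removed (--rm)
  docker run --rm -v demo-data:/data python:3.9 sh -c 'echo hello > /data/greeting.txt'
  # A second, brand-new container mounts the same volume and reads the file back
  docker run --rm -v demo-data:/data python:3.9 cat /data/greeting.txt
  # Clean up the volume when it is no longer needed
  docker volume rm demo-data
}
```

Call volume_demo from a terminal where Docker is running to try it out.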
Pulling a Docker Image from Docker Hub
There are lots of images available to pull from Docker Hub. To download a Docker image, run this simple command:
docker pull <image-name>:<version>
For example:
docker pull python:3.9
This will pull the image from Docker Hub and store it on your local machine.
The image contains the Python runtime, the libraries needed to run Python, and the environment variables; in short, it is a self-sufficient Python environment.
After the pull completes, you can list the images stored locally:
docker images
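If you pull images from a script, you may want to skip the download when the image is already on your machine. One possible way to do that is sketched below; ensure_image is a made-up helper name, and the sketch assumes that docker image inspect fails when the image is not stored locally.

```shell
# Hypothetical helper: pull an image only if it is not already stored locally.
ensure_image() {
  image="$1"
  if docker image inspect "$image" >/dev/null 2>&1; then
    # inspect succeeded, so the image is already on this machine
    echo "$image is already available locally"
  else
    # Not found locally: download it from Docker Hub
    docker pull "$image"
  fi
}
```

For example, ensure_image python:3.9 pulls the image the first time and only prints a message afterwards.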
Run a Docker container
You can start a container from an image, using the image's name and tag, with the command below:
docker run -it <image_name>:<tag>
Here, -it runs the container in interactive mode (-i keeps standard input open and -t attaches a terminal). You can also run it with -d, or detached mode, so that it runs in the background (ideal for running databases or backend APIs inside Docker).
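Here is a rough sketch of what working with a detached container can look like. The container name demo-bg and the function name detached_demo are placeholders, and Python's built-in http.server stands in for a real backend service.

```shell
# Hypothetical demo of detached mode using the python:3.9 image.
detached_demo() {
  # -d starts the container in the background and prints its ID
  docker run -d --name demo-bg python:3.9 python -m http.server 8000
  # List the running containers
  docker ps
  # Show what the background process has printed so far
  docker logs demo-bg
  # Stop and remove the container when you are done
  docker stop demo-bg
  docker rm demo-bg
}
```

Giving the container a name with --name means you can refer to it in later commands without copying its ID.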
Create a Dockerfile
Rather than pulling an image, running the container, and installing all the dependencies it needs one by one, we can define all of these steps inside a Dockerfile.
In that way we can build our own custom images from a base image. Here is a quick template for how to do that.
Dockerfile
FROM <base_image>:<tag>
WORKDIR /app
# Replace <command> with the commands to run at build time
RUN <command>
# Or you can use any other entry point
ENTRYPOINT ["bash"]
To build this image, pass the directory that contains the Dockerfile (use . for the current directory):
docker build -t <name>:<tag> <path_to_dockerfile_directory>
Then run the Docker container:
docker run -it <name>:<tag>
Interact with it, and explore more.
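To make the template concrete, here is a minimal sketch of the whole flow. Everything specific in it is an assumption chosen for illustration: the directory name docker-demo, the image name my-python-env:0.1, the requests package, and the helper name build_and_run.

```shell
# Create a small build directory and write a Dockerfile into it
mkdir -p docker-demo
cat > docker-demo/Dockerfile <<'EOF'
FROM python:3.9
WORKDIR /app
# Install a dependency at build time (requests is only an example package)
RUN pip install --no-cache-dir requests
ENTRYPOINT ["bash"]
EOF

# Hypothetical helper: build the image, then open an interactive shell in it
build_and_run() {
  # The last argument is the directory that contains the Dockerfile
  docker build -t my-python-env:0.1 docker-demo
  # --rm removes the container again once you exit the shell
  docker run -it --rm my-python-env:0.1
}
```

Run build_and_run in a terminal where Docker is running; because the entry point is bash, you land in a shell inside /app.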
If you would like to read more about Docker, CI/CD, DevOps, and MLOps, you can follow me for more articles like this.