Introduction
This article is Part 1 of a multi-part series on Container Orchestration.
Part 1 is a conceptual introduction for beginners with little or no prior experience with these topics. It lays a conceptual foundation on core topics including portability, Docker, containers, and images.
Part 2 will be a tutorial that introduces some more advanced Docker topics (volumes, networking, and Docker Compose). If you want to get your hands dirty, Part 2 is a good starting point.
In Part 3, we’ll extend the Part 2 tutorial and refactor it from a single-host project deployed through Docker Compose to a multi-host project deployable through a Kubernetes cluster. We’ll explain how and why the conceptual model changes when you go from assuming a single host machine to assuming a cluster of coordinating hosts.
What’s the big deal with Docker?
Docker has become an industry-standard solution for software portability. Portability means your application’s behavior shouldn’t depend on the environment (except in the ways you know and require).
Why is portability important? Because it removes guesswork. It helps you ship. None of your application’s features matter if you can’t ship.
Publishing an update today? Don’t be this guy. Use Docker. (Source: https://imgflip.com)
Motivating Example: Before Docker
The usual use case for Docker is something like deploying server code to a cloud environment. But in principle, it’s a general-purpose way to make software more portable.
For example, let’s say you’ve got a process (a database, or a web server). Let’s say you need to replicate this process on three other machines: Larry, Mandy and Walter. Larry uses Linux, Mandy uses Mac, and Walter uses Windows. (In this analogy, these names could refer to coworkers’ workstations just as well as to production cloud servers.)
How should you document these instructions? What would they look like?
Well, without Docker (or some other type of containerization/virtualization) you’re forced to worry about the environment. You’d need to write install docs and user guides with a section for each different operating system. (The code below is for illustration only; there is no need to run it.)
- Linux user Larry needs to run the following commands:
sudo apt-get update && sudo apt-get install -y nginx
sudo service nginx start
- Mac user Mandy needs to install homebrew, and then run these commands:
brew install nginx
ln -sfv /usr/local/opt/nginx/*.plist ~/Library/LaunchAgents
launchctl load ~/Library/LaunchAgents/homebrew.mxcl.nginx.plist
- Windows user Walter needs to install WgetForWindows and then run the following commands:
wget https://nginx.org/download/nginx-1.17.3.zip
unzip nginx-1.17.3.zip
cd nginx-1.17.3
start nginx
This should be a simple and rote procedure, but already it’s turning into a mess. All this just to install and start a webserver. (We haven’t even configured it to do anything yet!)
Motivating Example: After Docker
Docker is about portability. The documentation shouldn’t depend on the environment (the operating system). The documentation should only involve two instructions – one to “download” and the other to “run”:
docker image pull nginx
docker run --name my-web --rm -p 80:80 -d nginx
How is this more portable?
- First, these commands standardize the procedure – instead of documenting three procedures, we only document one.
- Second, these commands will produce identical results, regardless of the environment.
Next, let’s unpack these two lines and see how they work.
The first line (“docker image pull nginx”) handles the download and installation. Specifically, it tells Docker to go out to a registry (by default, Docker Hub) and pull down a Docker image. Think of an image as a snapshot of a guest environment (in this case, the “nginx” image is a snapshot of an environment with Nginx installed).
The second line (“docker run --name my-web --rm -p 80:80 -d nginx”) handles starting the web server. Specifically, it creates a container based on the nginx image and starts it. A Docker container is defined by a process running inside a virtual guest environment. (The container’s initial state is controlled by the Docker image.)
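To make the flags in the run command easier to read, here is a minimal sketch that assembles the same command one argument at a time, with a comment on each piece. The helper name web_cmd and the DRY_RUN variable are made up for this illustration; they are not part of Docker. By default the sketch only prints the command, so it is safe to try on a machine without Docker.

```shell
# Hypothetical helper: builds the run command from the text piece by piece.
web_cmd() {
  set -- docker run
  set -- "$@" --name my-web   # name the container so "docker stop my-web" can find it
  set -- "$@" --rm            # remove the container automatically once it stops
  set -- "$@" -p 80:80        # publish host port 80 to container port 80 (long form: --publish)
  set -- "$@" -d              # detached: run in the background (long form: --detach)
  set -- "$@" nginx           # the image to create the container from
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"                 # dry run (the default): only print the command
  else
    "$@"                      # DRY_RUN=0: actually run it (requires Docker)
  fi
}

web_cmd
```

Running the sketch with DRY_RUN=0 set in the environment executes the command instead of printing it.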
World’s Shortest Docker Tutorial
1. Before moving on, let’s get our hands dirty. Install Docker from one of the following links:
Linux (ubuntu) https://docs.docker.com/install/linux/docker-ce/ubuntu/
Mac OSX https://docs.docker.com/docker-for-mac/install/
Windows 10 https://docs.docker.com/docker-for-windows/install/
2. Next, download the official public Nginx image from Docker Hub:
docker image pull nginx
3. Next, we run the “nginx” image in a new Docker container named “my-web”:
docker run --name my-web --rm -p 80:80 -d nginx
4. Next, test that it’s working. (Open http://localhost in your browser.)
5. Finally, stop the “my-web” container:
docker stop my-web
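If you would rather check step 4 from a terminal than from a browser, something like the following sketch may help. It assumes curl is installed, and the function name check_web is made up for this example. The container publishes plain HTTP on port 80, so the default URL uses http rather than https.

```shell
# Hypothetical helper: succeeds only if the URL answers with HTTP status 200.
check_web() {
  url="${1:-http://localhost/}"
  # -s: silent, -o /dev/null: discard the body, --max-time: give up after 5 seconds,
  # -w '%{http_code}': print only the numeric status code
  status="$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "$url")" || return 1
  [ "$status" = "200" ]
}

if check_web; then
  echo "my-web is answering on http://localhost"
else
  echo "nothing answered on http://localhost (is the container running?)"
fi
```

Run it between steps 3 and 5. After “docker stop my-web” the check should fail again, because the --rm flag removes the container as soon as it stops.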
Creating Images: Pulling vs Building
Let’s summarize what we’ve covered so far.
- Docker is a tool for making code portable.
- It does this using two central concepts: containers and images.
- When you run code in Docker, that code is running inside a container.
- Containers are somewhat like virtual machines, and to create containers, you use images.
- Images are somewhat like a virtual machine snapshot.
But where do images come from?
The answer is images come from Dockerfiles. We’ll go into Dockerfiles shortly, but first we need to finish explaining images.
There are two ways to get your hands on a Docker image: Pulling and Building. There’s a trade-off between these options – pulling someone else’s image is less work, but building your own image is more flexible.
Pulling an image means using something someone else made. Docker Hub has a ton of public images for containerized system components (things like message queues, databases, or web servers). You can download and run these out of the box, although configuration can be a bit more work. (We’ll get into best practices for configuration in Part 2 of this series.)
The alternative is when the public image isn’t configured the way you want, or where the idea of a public image doesn’t even apply (like if you want to containerize your own application code). For these use cases, you’ll need to build your own image.
Building your own image means writing your own Dockerfile. Pulling someone else’s image means downloading the result of building their Dockerfile. Simple as that.
So what’s a Dockerfile?
A Dockerfile is basically a step-by-step installation script for setting up the environment. Every Dockerfile begins by referring to a parent (base) image using a statement like “FROM <base image>”.
For example, if your application code is written in Python 3.7, then you need to pick a base image that suits it. You *could* use “ubuntu:18.04” as a base image, but it wouldn’t necessarily include Python 3.7. A better choice would be to start your Dockerfile with “FROM python:3.7-buster”.
The idea of the base image is it’s a starting point – an environment that’s fairly close to what your software requires. From there, the rest of your Dockerfile is used to differentiate your specific image from the general base image.
For application code, your Dockerfile statements can be broken up into a few typical stages.
- First stage, you’ll install system dependencies (“apt-get install…”).
- Second stage, you’ll copy files (from the host filesystem into the guest filesystem).
- Third stage, you’ll install language-specific dependencies (“pip install -r requirements.txt”).
- Final stage, you’ll execute your project’s build tool (something like Maven or Gradle, if your code’s a Java project). Not all languages or projects would have a stage like this.
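The stages above can be sketched as a small Dockerfile for a Python project. The snippet below writes one into a directory called my-app. Everything project-specific here is a placeholder chosen for illustration: the directory name, the libpq-dev package, requirements.txt, and app.py. The base image tag is the one named in the text. It is an old tag, and its package repositories may no longer be available, so you would likely swap in a current one.

```shell
# Write a sketch Dockerfile into a hypothetical project directory "my-app".
mkdir -p my-app
cat > my-app/Dockerfile <<'EOF'
# Base image: the starting environment (tag taken from the article text).
FROM python:3.7-buster

# Stage 1: system dependencies (libpq-dev is only a placeholder package).
RUN apt-get update \
 && apt-get install -y --no-install-recommends libpq-dev \
 && rm -rf /var/lib/apt/lists/*

# Stage 2: copy files from the host filesystem into the guest filesystem.
WORKDIR /app
COPY . /app

# Stage 3: language-specific dependencies.
RUN pip install -r requirements.txt

# Stage 4 (build tool) is omitted: a plain Python project often has none.

# Default process for containers created from this image (app.py is hypothetical).
CMD ["python", "app.py"]
EOF

echo "Wrote my-app/Dockerfile"
```

With real application files in place, “docker build -t my-app my-app” would build an image from this Dockerfile, and “docker run --rm my-app” would start a container from it.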
Whatever your Dockerfile looks like, its responsibilities remain the same. The Dockerfile is solely responsible for defining your image. And your image is responsible for defining the initial state of your containers. Docker combines these concepts of container and image to ensure portability.
Limitations of Docker
By this point, you should understand what I mean when I say Docker solves portability.
But you might be wondering something else: There’s a ton of hype around “Docker” and “containerization.” Aren’t these concepts supposed to be magic solutions to all your problems?
(Source: https://imgur.com/a/o5TDtXk)
Here’s the thing: Docker is only useful for ensuring portability. (And portability is just one thing that can go wrong in an application.) The reason I repeat the mantra “Docker is for portability” is that it implies “Docker isn’t for everything.”
Docker’s limitation is it can’t coordinate multiple containers. Coordination needs to be solved at a higher level. By layering additional tools (orchestrators) on top of Docker, you can define systems of related containers. These orchestrators then let you perform operations on the entire system of containers.
The reason for all the hype around Docker and containerization is that it’s a necessary step along the way to orchestration, and orchestration is important because it’s how you tackle very complex problems (like scalability, distributed computation, and rules for automatic failure recovery).
Next Steps
So we’ve learned what Docker’s good for. We’ve learned some basic concepts. And we’ve seen the limitations – what Docker (and containerization) can’t do on their own.
In Part 2, we’ll learn how to coordinate multiple containers on the same host machine. We’ll cover some more advanced Docker concepts that deal with coordination and inter-container dependencies (things like volumes, networks, and Docker Compose).
For any further questions, or if you’d like to schedule a free consultation, please reach out to us at info@syntelli.com.