Docker from Scratch, Part 3: Entrypoints and Ports

 

In the last post we used the RUN statement in our Dockerfile to set up and install software in our container. This saves us the trouble of creating build scripts that we need to run each time we start the container. We still have to specify the command to run when we start the container, though. What we want to do is eliminate that step so we have less to remember each time we use the container.

Background vs. Foreground Processes

When a container runs, it runs a single main process, and it lives only as long as that process does. Containers are less like VMs in this way and more like sandboxed applications. The problem is, RUN only operates during the build phase of the container, not the run phase. Each time Docker encounters a RUN statement in our Dockerfile, it starts a new container from the most recent image, runs the command, and if successful, creates a new image layered over the previous one. To actually run the container, we used the Docker “run” command:

$ docker run -i -t aa0 /bin/bash

Here “aa0” is the first three characters of the image ID, and “/bin/bash” is the command to execute when running the container. We had used the RUN statement in the Dockerfile to install the Apache web server. What we want to do, however, is run the web server, not a bash shell. On the Debian Linux distribution we can run Apache using the “apachectl” command. Typically it takes the following form:

$ apachectl start

The command completes after a moment, and Apache is then running in the background. Now let’s try to run it with the image we built previously:

$ docker run -i -t aa0 apachectl start
apache2: Could not reliably determine the server's fully qualified domain name, using 172.17.0.7 for ServerName

That’s only a warning from Apache, not a fatal error, so it should be up and running.

$ docker ps -a   
CONTAINER ID    IMAGE      COMMAND              CREATED             STATUS                    
9d0eab0934fe    aa0        "apachectl start"    2 minutes ago       Exited (0) 2 minutes ago

Why isn’t it running? When we told Docker to run the “apachectl” command, the Apache server did successfully start in the container. Then, apachectl exited and returned “0”. Since the command we specified is no longer running, Docker stopped the container. Remember, a container is “just a process.”
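The same effect can be reproduced without Docker at all. The sketch below is only an illustration: “sleep” is a hypothetical stand-in for the Apache server, and “start_daemon” is a made-up name playing the role of “apachectl start”. The launcher finishes with a success status while the process it spawned carries on in the background, which is exactly the point at which Docker considers the container’s command done.

```shell
# "sleep" is a hypothetical stand-in for the Apache server process.

# Behaves like "apachectl start": launch the server in the background
# and return immediately.
start_daemon() {
    sleep 30 >/dev/null 2>&1 &
}

start_daemon
launcher_status=$?   # exit status of the launcher itself
server_pid=$!        # PID of the backgrounded stand-in server

# The launcher has finished, yet the process it spawned is still alive.
if kill -0 "$server_pid" 2>/dev/null; then
    server_alive=yes
else
    server_alive=no
fi
echo "launcher exit status: $launcher_status, server still running: $server_alive"

# Clean up the stand-in server.
kill "$server_pid" 2>/dev/null
```

Docker only watches the launcher, so once it returns, the container is stopped regardless of what was left running behind it.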

So how do we keep the container running? The solution is to run Apache the way we would a bash script -- in the foreground. While this isn’t something a sysadmin would normally do, you can run Apache in the foreground easily with the “-D FOREGROUND” switch:

$ docker run -i -t aa0 apachectl -D FOREGROUND

This time, the command will not return immediately. If we open a new terminal prompt and list all running containers, we’ll find that we indeed have one container running now:

$ docker ps
CONTAINER ID     IMAGE    COMMAND                CREATED             STATUS              
059950c67e5b     aa0      "apachectl -D FOREGR   About a minute ago   Up About a minute

This teaches us an important difference between building containers and building VMs. Since we’re only running one process rather than a whole operating system, we need to run just the process we need, not a command to start a daemon that runs in the background.

Running Containers in the Background

Right now our container is running in a terminal session on the host OS. We don’t want that. We’d rather just start the container in the background. Note, we want to start the container in the background, not the process running in the container! To do that we need to change the way we run our Docker command. So let’s Control-C out of the running container like we would any foreground CLI application and try again.

We originally used the “-i” and “-t” switches on the Docker “run” command to run the container interactively. This time, we need to run the command with “-d”, or “detach”. This will run the container in the background:

$ docker run -d aa0 apachectl -D FOREGROUND
86414f5548dcecb83d80dec4ec9351a7697b23d11664738f8668167a127ec70e

The “run” command spits out the running container ID for us. We can verify that with the “ps” command:

$ docker ps 
CONTAINER ID        IMAGE         COMMAND                CREATED             STATUS
86414f5548dc        aa0           "apachectl -D FOREGR   36 seconds ago      Up 34 seconds

Nifty! Now we have the container in the background, and we have a free terminal session. But… how do we stop the container? As with processes, there’s the Docker “kill” command used to stop a running container:

$ docker kill 864
864

Docker repeats the container (partial) ID for every container we stop using the “kill” command.
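If you find yourself typing these two commands often, they can be wrapped in a pair of small shell functions. This is just a sketch: the function names and the DOCKER variable are our own invention, not part of Docker. The variable exists so the functions can be pointed at something harmless (such as “echo”) for a dry run.

```shell
# DOCKER is a hypothetical override hook; it defaults to the real CLI.
DOCKER=${DOCKER:-docker}

# Start the web server container detached; Docker prints the new container ID.
# Argument: image ID (full or partial).
start_web() {
    "$DOCKER" run -d "$1" apachectl -D FOREGROUND
}

# Stop a running container.
# Argument: container ID (full or partial).
stop_web() {
    "$DOCKER" kill "$1"
}

# Example session (needs a running Docker daemon and the image from this post):
#   id=$(start_web aa0)
#   docker ps
#   stop_web "$id"
```

Setting DOCKER=echo before calling either function prints the arguments that would be passed to Docker instead of running anything.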

Entrypoints

We still have to specify which command we want to start in the container on the CLI each time we start it up. That’s annoying! More importantly, it makes our container less repeatable. We want to just give our team a Dockerfile, and tell them to build and run it. We don’t want to include instructions or READMEs. The solution, of course, is to add more stuff to the Dockerfile.

In addition to FROM, MAINTAINER, and RUN, there’s also CMD and ENTRYPOINT. The two are closely related: both describe what to execute when the container starts. You can write more than one of each in a Dockerfile, but only the last one takes effect, so in practice you want exactly one. CMD has a few different forms: it can specify an executable along with its arguments, or just supply default parameters for the executable named in ENTRYPOINT. ENTRYPOINT always specifies the executable as its first argument.

CMD ["apachectl", "-D", "FOREGROUND"]

or

CMD ["-D", "FOREGROUND"]
ENTRYPOINT ["apachectl"]

As you can see, each token in the full command is specified as a different argument to CMD and ENTRYPOINT and wrapped in double quotes. All arguments are enclosed within brackets.
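One practical caveat: the bracketed form is parsed as a JSON array, so the quotes have to be plain ASCII double quotes. Typographic quotes pasted in from a web page or word processor are not valid JSON, and Docker will typically end up treating the whole line as a plain string instead. A quick check along the following lines can catch that before a build; “check_quotes” is a hypothetical helper name, not a Docker feature.

```shell
# Report lines in a Dockerfile that contain typographic double quotes.
# Returns 0 when the file is clean, 1 when any are found.
# Argument: path to the Dockerfile.
check_quotes() {
    if grep -n -e '“' -e '”' "$1"; then
        echo "typographic quotes found in $1" >&2
        return 1
    fi
    return 0
}

# Example:
#   check_quotes Dockerfile && docker build .
```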

So what’s the difference between a lone CMD and a CMD paired with an ENTRYPOINT? It turns out Docker maintains a default entrypoint of sorts, /bin/sh -c. It comes into play when a command is written in shell form, as a plain string rather than a bracketed list: Docker starts a shell process for us, then has the shell run the command. So a Dockerfile statement like this…

CMD apachectl -D FOREGROUND

…actually does this in the container:

$ /bin/sh -c "apachectl -D FOREGROUND"

The bracketed form, like the arguments we passed to the Docker “run” command earlier, is executed directly with no shell in between.

We don’t need to run the default entrypoint if we don’t want to. We can replace it with our own using the ENTRYPOINT statement.
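To make the relationship between the two statements concrete, here is a rough model of how the final command gets assembled in the bracketed form. It is a simplification written for this post, and “resolve_command” is a made-up name: the entrypoint always comes first, and CMD supplies default arguments that are replaced by anything typed after the image name on the “run” command line.

```shell
# Simplified model of how the container's command line is assembled.
# Arguments: entrypoint, default CMD string, then any arguments that were
# typed after the image name on "docker run".
resolve_command() {
    entrypoint=$1
    default_cmd=$2
    shift 2
    if [ $# -gt 0 ]; then
        # Arguments given at run time replace CMD entirely.
        echo "$entrypoint $*"
    else
        # Otherwise CMD provides the defaults.
        echo "$entrypoint $default_cmd"
    fi
}

# Equivalent of "docker run <image>":
resolve_command apachectl "-D FOREGROUND"

# Equivalent of "docker run <image> -v":
resolve_command apachectl "-D FOREGROUND" -v
```

This is why splitting the command across ENTRYPOINT and CMD is handy: the executable stays fixed, while its arguments can still be swapped out for a one-off run.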

Running a Container with an Entrypoint

At this point, our Dockerfile should specify what to execute when running the container:

FROM debian:wheezy
MAINTAINER your_email@example.com

RUN apt-get update
RUN apt-get install -y apache2

CMD ["-D", "FOREGROUND"]
ENTRYPOINT ["apachectl"]

We can easily rebuild our image using the Docker “build” command:

$ docker build .

Sending build context to Docker daemon 2.048 kB
Sending build context to Docker daemon 
Step 0 : FROM debian:wheezy
 ---> 60c52dbe9d91
Step 1 : MAINTAINER your_email@example.com
 ---> Using cache
 ---> 64b0d7e8eef9
Step 2 : RUN apt-get update
 ---> Using cache
 ---> eefa2fcb15e7
Step 3 : RUN apt-get install -y apache2
 ---> Using cache
 ---> aa084388ac30
Step 4 : CMD -D FOREGROUND
 ---> Running in 32b9f66cf1e3
 ---> 239ae06b7d68
Removing intermediate container 32b9f66cf1e3
Step 5 : ENTRYPOINT apachectl
 ---> Running in 6ac785739695
 ---> 72b3cac68438
Removing intermediate container 6ac785739695
Successfully built 72b3cac68438

And now we can run it without specifying a command:

$ docker run -d 72b  
988bf0c8cdfdf88c42e3673a65597f78693c07b14070cd535962b13b36884560

$ docker ps 
CONTAINER ID        IMAGE         COMMAND                CREATED             STATUS              
988bf0c8cdfd        72b           "apachectl -D FOREGR   4 seconds ago       Up 3 seconds

So much easier!

Finding the Container IP

There’s still one more problem. If we try to visit the web server running in the container, what address do we navigate to? While the container -- the “guest” -- is running on our host machine, the guest doesn’t know that. It thinks it’s running on its own unique hardware. With a VM, we would point the web browser on the host OS to the IP address of the VM. When Docker is running on a Linux host, there is no VM.

The answer is actually pretty simple: it’s localhost. Remember, Docker containers are more like sandboxed applications than VMs. So if a process in the container needs to be network accessible, you address the host system, on whichever port Docker has been told to make available. Since we’re running this on our own workstation, localhost is the correct answer.

What if you’re not running Docker on Linux? What if your host OS is Mac OS X or Windows? In that case, when you installed Docker you also installed a virtual machine -- boot2docker. The boot2docker VM is a lightweight Linux VM that runs all your Docker containers. That makes the answer straightforward: use the boot2docker VM’s IP, just as you would with any VM.

How do you find the boot2docker IP? On installation, a DOCKER_HOST environment variable was defined for you, pointing to the IP of the boot2docker VM. This environment variable can become outdated, so it’s best to simply ask boot2docker for its IP:

$ boot2docker ip
192.168.59.103

Typing in the IP address all the time is annoying, though. On Mac OS and Linux, you can edit the /etc/hosts file (as root) to create an alias for that address:

192.168.59.103	docker.dev

Windows has the same hosts file, but buries it much more deeply in the operating system, in the “C:\Windows\System32\Drivers\etc” folder. The format is the same. Once the alias is defined, you can use http://docker.dev as the address of the web server running in your container.
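On Mac OS and Linux the alias can also be added from a script. The function below is a minimal sketch, with “add_host_alias” being a name we made up: it appends the entry only if the name isn’t already listed, so it is safe to run more than once. It assumes the usual hosts format of an address followed by one or more names, and writing to the real /etc/hosts requires root.

```shell
# Append "IP<TAB>NAME" to a hosts file unless NAME is already listed.
# Arguments: ip, name, hosts file path (defaults to /etc/hosts).
add_host_alias() {
    ip=$1
    name=$2
    file=${3:-/etc/hosts}
    # Look for NAME as an exact hostname field on any non-comment line.
    if awk -v n="$name" '
            $1 !~ /^#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$file" 2>/dev/null; then
        echo "$name already present in $file"
    else
        printf '%s\t%s\n' "$ip" "$name" >> "$file"
    fi
}

# Example (run as root to touch the real file):
#   add_host_alias "$(boot2docker ip)" docker.dev
```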

Exposing Ports of a Container

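When we navigate to that address, however, the browser can’t connect. What’s happening? Yes, our container is running, but we haven’t instructed Docker to make the HTTP port available to the host OS. The first step is to add an EXPOSE statement to our Dockerfile, which declares that the container listens on port 80:

EXPOSE 80

EXPOSE only records the port in the image; the port still has to be published when the container starts, by adding the “-p 80:80” switch to the “run” command (or “-P” to publish every exposed port on a randomly chosen host port). After running “build” and then “run” with that switch, we can visit http://docker.dev and we finally get the “It works!” message from a default Apache installation. Awesome!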
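For reference, a port is made reachable from the host by publishing it with the “-p host:container” switch on the “run” command. Here is a small sketch of a helper that does so; “run_published” and the DOCKER variable are our own hypothetical names, and the variable is only there so the function can be dry-run with “echo”.

```shell
# DOCKER is a hypothetical override hook; it defaults to the real CLI.
DOCKER=${DOCKER:-docker}

# Start an image detached with one container port published on the host.
# Arguments: image, host port, container port.
run_published() {
    "$DOCKER" run -d -p "$2:$3" "$1"
}

# Example (needs a running Docker daemon; IMAGE is the ID printed by "docker build"):
#   run_published IMAGE 80 80
#   curl -I http://docker.dev/
```

Mapping host port 80 to container port 80 keeps the URL free of a port number; any other free host port works too, as long as it is included in the address.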

Summary

We came a long way in this post. We went from a minimal Dockerfile to a complete, ready-to-use container with a handful of extra statements. We learned a little bit about Docker’s networking infrastructure. Next time, we’ll introduce a new layer of Docker technology that will make managing containers even easier, and use it to mount some files on our host OS in the process.

Read Part 4.