Inspecting filesystem changes per docker build step

While creating the required Docker images for a CI pipeline, I needed to figure out why my images were so large. The following describes the method I used to inspect the filesystem changes made by each build step.

Let's take the following Dockerfile and build it:

# base 
FROM debian:jessie

# labels
LABEL version="1.0.0-SNAPSHOT" id="tests-debian-jessie" description="docker image for testing purposes"

# author 
MAINTAINER tupadr3

# deps
RUN echo "===> setting up basics ..." && \
    apt-get update -y && \
    apt-get install --no-install-recommends -y openssh-client curl ca-certificates && \
    echo "alias ll='ls $LS_OPTIONS -lha'" >> /root/.bashrc

# user & group
RUN echo "===> setting up app user ..." && \
    mkdir /data && \
    groupadd -r app -g 1000 && \
    useradd -u 1000 -r -g app -m -d /data/app -s /sbin/nologin -c "app user" app && \
    chmod 755 /data/app

# cleanup
RUN echo "===> housekeeping ..." && \
    apt-get autoclean -y && \
    apt-get autoremove -y && \
    rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

# user & working dir 
WORKDIR /data/app  
USER app

# command.
CMD ["bash"]  

The build should output something like this:

$ docker build --no-cache -t tests-debian-jessie .
Sending build context to Docker daemon 60.93 kB  
Step 1/9 : FROM debian:jessie  
 ---> 73e72bf822ca
Step 2/9 : LABEL version "1.0.0-SNAPSHOT" id "tests-debian-jessie" description "docker image for testing purposes"  
 ---> Running in 6437dfd26bc0
 ---> 9d429548d05e
Removing intermediate container 6437dfd26bc0  
Step 3/9 : MAINTAINER tupadr3  
 ---> Running in 6adc2b66bbc6
 ---> bae759107bb8
Removing intermediate container 6adc2b66bbc6  
Step 4/9 : RUN echo "===> setting up basics ..." &&     apt-get update -y  &&     apt-get install --no-install-recommends -y openssh-client curl ca-certificates &&     echo "alias ll='ls $LS_OPTIONS -lha'" >> /root/.bashrc  
 ---> Running in 9bb1b56aa3d5
===> setting up basics ...
Get:1 http://security.debian.org jessie/updates InRelease [63.1 kB]  
....

During the build, Docker creates a new container for each step and executes the instruction in it. The resulting changes are then saved as image layers. The layers of an image can be viewed using docker history.

$ docker history tests-debian-jessie
IMAGE               CREATED                  CREATED BY                                      SIZE  
220dffb4937b        Less than a second ago   /bin/sh -c #(nop)  CMD ["bash"]                 0 B  
cd8030a834e3        Less than a second ago   /bin/sh -c #(nop)  USER [app]                   0 B  
dd6065ca4fab        Less than a second ago   /bin/sh -c #(nop) WORKDIR /data/app             0 B  
38b89d402b65        Less than a second ago   /bin/sh -c echo "===> housekeeping ..." &&...   0 B  
7dd44155f99f        Less than a second ago   /bin/sh -c echo "===> setting up app user ...   335 kB  
41027c98a7a6        Less than a second ago   /bin/sh -c echo "===> setting up basics .....   28.1 MB  
bae759107bb8        Less than a second ago   /bin/sh -c #(nop)  MAINTAINER tupadr3           0 B  
9d429548d05e        Less than a second ago   /bin/sh -c #(nop)  LABEL version=1.0.0-SNA...   0 B  
73e72bf822ca        4 weeks ago              /bin/sh -c #(nop)  CMD ["/bin/bash"]            0 B  
<missing>           4 weeks ago              /bin/sh -c #(nop) ADD file:41ea5187c501168...   123 MB  
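
To focus on the steps that actually add data, the zero-size rows can be filtered out of the history. The following is only a sketch: it assumes a Docker version where docker history accepts --format, and nonzero_layers is a helper name made up for this example.

```shell
# Hypothetical helper: keep only history rows whose size is not zero.
# Expects tab-separated "SIZE<TAB>CREATED BY" lines on stdin.
# Matches both "0B" (newer Docker) and "0 B" (older Docker) as zero.
nonzero_layers() {
    awk -F '\t' '$1 !~ /^0 ?B$/ { print }'
}

# Usage (requires Docker):
#   docker history --no-trunc --format '{{.Size}}\t{{.CreatedBy}}' tests-debian-jessie | nonzero_layers
```

For the image above this should leave only the two RUN steps and the base image layer.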

docker history shows how large each layer is, but not which files it contains. The docker diff command lists the changed files, but it only works on a container, not on an image layer. To inspect the changes per build step, we need to instruct Docker to keep the intermediate containers around:

# build the image again but keep intermediate containers
$ docker build --rm=false --no-cache -t tests-debian-jessie .

# list all containers after the build, excluding the no-op steps
$ docker ps -as --filter 'status=exited'  --format "table {{.ID}}\t{{.Size}}\t{{.Command}}"
CONTAINER ID        SIZE                       COMMAND  
4226d7f6c178        0 B (virtual 151 MB)       "/bin/sh -c 'echo ..."  
4320cbd07b63        335 kB (virtual 151 MB)    "/bin/sh -c 'echo ..."  
f60b3c549361        28.1 MB (virtual 151 MB)   "/bin/sh -c 'echo ..."

# let's see what files changed during each step
$ docker diff 4226d7f6c178
C /tmp  
C /var/cache/apt/archives/lock  
C /var/lib/apt/lists  
D /var/lib/apt/lists/deb.debian.org_debian_dists_jessie-updates_InRelease  
D /var/lib/apt/lists/deb.debian.org_debian_dists_jessie-updates_main_binary-amd64_Packages.gz  
D /var/lib/apt/lists/deb.debian.org_debian_dists_jessie_Release  
D /var/lib/apt/lists/deb.debian.org_debian_dists_jessie_Release.gpg  
D /var/lib/apt/lists/deb.debian.org_debian_dists_jessie_main_binary-amd64_Packages.gz  
D /var/lib/apt/lists/lock  
D /var/lib/apt/lists/partial  
D /var/lib/apt/lists/security.debian.org_dists_jessie_updates_InRelease  
D /var/lib/apt/lists/security.debian.org_dists_jessie_updates_main_binary-amd64_Packages.gz  
C /var/lib/dpkg/lock  
....
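
With more than a few build steps, running docker diff by hand gets tedious. A minimal sketch of a loop over container IDs follows. diff_each is a hypothetical helper, and the usage line assumes that every exited container on the host belongs to this build, which may not hold on a shared machine.

```shell
# Hypothetical helper: run "docker diff" for every container ID read from stdin.
diff_each() {
    while read -r id; do
        # skip empty lines
        [ -n "$id" ] || continue
        printf '== %s ==\n' "$id"
        # keep docker from touching the loop's stdin
        docker diff "$id" </dev/null
    done
}

# Usage (requires Docker and the intermediate containers kept via --rm=false):
#   docker ps -aq --filter 'status=exited' | diff_each
```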

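With the docker diff output for each step, it is possible to determine which files have changed and which can be removed to trim down the size of an image. Note that the housekeeping step adds 0 B rather than shrinking the image: files deleted in a later layer still occupy space in the layer that created them, so the cleanup has to happen in the same RUN instruction.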
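
Long diff listings are easier to read when grouped. As a rough sketch, the hypothetical helper below counts added and changed paths per directory (at most two levels deep) from docker diff output on stdin. Deleted entries are ignored because they add no file data to the layer.

```shell
# Hypothetical helper: count added (A) and changed (C) paths per directory.
# Reads "docker diff" lines such as "A /usr/bin/curl" on stdin and prints
# "COUNT DIRECTORY", largest count first.
top_dirs() {
    awk '
        /^[AC] / {
            path = substr($0, 3)               # drop the "A " / "C " prefix
            n = split(path, parts, "/")        # parts[1] is empty (leading slash)
            key = "/" parts[2]
            if (n > 2) key = key "/" parts[3]  # keep at most two levels
            count[key]++
        }
        END { for (k in count) printf "%d %s\n", count[k], k }
    ' | LC_ALL=C sort -k1,1nr -k2
}

# Usage (requires Docker; the container ID is the one from the listing above):
#   docker diff f60b3c549361 | top_dirs
```

Directories with the most entries are a reasonable place to start looking for files that can be cleaned up within the same step.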