Inspecting filesystem changes per docker build step


While creating the Docker images for a CI pipeline, I needed to figure out why my images were so large. The following describes the method I used to inspect the filesystem changes made by each build step.

Let's take the following Dockerfile and build it:

# base 
FROM debian:jessie

# labels
LABEL version="1.0.0-SNAPSHOT" id="tests-debian-jessie" description="docker image for testing purposes"

# author
MAINTAINER tupadr3

# deps
RUN echo "===> setting up basics ..." && \
    apt-get update -y && \
    apt-get install --no-install-recommends -y openssh-client curl ca-certificates && \
    echo "alias ll='ls $LS_OPTIONS -lha'" >> /root/.bashrc

# user & group
RUN echo "===> setting up app user ..." && \
    mkdir /data && \
    groupadd -r app -g 1000 && \
    useradd -u 1000 -r -g app -m -d /data/app -s /sbin/nologin -c "app user" app && \
    chmod 755 /data/app

# cleanup
RUN echo "===> housekeeping ..." && \  
    apt-get autoclean -y && \
    apt-get autoremove -y && \
    rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

# user & working dir 
WORKDIR /data/app  
USER app

# command.
CMD ["bash"]  

The build should output something like this:

$ docker build --no-cache -t tests-debian-jessie .
Sending build context to Docker daemon 60.93 kB  
Step 1/9 : FROM debian:jessie  
 ---> 73e72bf822ca
Step 2/9 : LABEL version "1.0.0-SNAPSHOT" id "tests-debian-jessie" description "docker image for testing purposes"  
 ---> Running in 6437dfd26bc0
 ---> 9d429548d05e
Removing intermediate container 6437dfd26bc0  
Step 3/9 : MAINTAINER tupadr3  
 ---> Running in 6adc2b66bbc6
 ---> bae759107bb8
Removing intermediate container 6adc2b66bbc6  
Step 4/9 : RUN echo "===> setting up basics ..." &&     apt-get update -y  &&     apt-get install --no-install-recommends -y openssh-client curl ca-certificates &&     echo "alias ll='ls $LS_OPTIONS -lha'" >> /root/.bashrc  
 ---> Running in 9bb1b56aa3d5
===> setting up basics ...
Get:1 jessie/updates InRelease [63.1 kB]  

During the build, Docker creates a new container for each step and executes the instruction in it. The resulting changes are then saved as image layers. The layers of an image can be viewed using docker history:

$ docker history tests-debian-jessie
IMAGE               CREATED                  CREATED BY                                      SIZE  
220dffb4937b        Less than a second ago   /bin/sh -c #(nop)  CMD ["bash"]                 0 B  
cd8030a834e3        Less than a second ago   /bin/sh -c #(nop)  USER [app]                   0 B  
dd6065ca4fab        Less than a second ago   /bin/sh -c #(nop) WORKDIR /data/app             0 B  
38b89d402b65        Less than a second ago   /bin/sh -c echo "===> housekeeping ..." &&...   0 B  
7dd44155f99f        Less than a second ago   /bin/sh -c echo "===> setting up app user ...   335 kB  
41027c98a7a6        Less than a second ago   /bin/sh -c echo "===> setting up basics .....   28.1 MB  
bae759107bb8        Less than a second ago   /bin/sh -c #(nop)  MAINTAINER tupadr3           0 B  
9d429548d05e        Less than a second ago   /bin/sh -c #(nop)  LABEL version=1.0.0-SNA...   0 B  
73e72bf822ca        4 weeks ago              /bin/sh -c #(nop)  CMD ["/bin/bash"]            0 B  
<missing>           4 weeks ago              /bin/sh -c #(nop) ADD file:41ea5187c501168...   123 MB  

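On an image with many layers it can help to hide the metadata-only steps and look only at the layers that add data. The following is a minimal sketch, assuming a Docker release recent enough that docker history accepts --format with the .Size and .CreatedBy fields. The helper name nonempty_layers is made up for this example and is not part of Docker.

```shell
#!/bin/sh
# nonempty_layers (hypothetical helper): read "SIZE<TAB>CREATED BY" lines on
# stdin and print only the layers whose size is not zero.
# Depending on the Docker version, zero is printed as "0 B" or "0B".
nonempty_layers() {
    awk -F '\t' '$1 != "0 B" && $1 != "0B"'
}

# Usage (needs a running Docker daemon, so it is left commented out here):
#   docker history --no-trunc --format '{{.Size}}\t{{.CreatedBy}}' tests-debian-jessie | nonempty_layers
```

For the image above this would leave only the two RUN steps that install packages and create the user, plus the base layer.
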
docker history shows how much each layer adds, but not which files changed. The command docker diff lists exactly that, but it works on containers, not on images. To inspect the changes per build step, we therefore need to instruct Docker to keep the intermediate containers around:

# build the image again but keep intermediate containers
$ docker build --rm=false --no-cache -t tests-debian-jessie .

# list all containers after the build, excluding all no-ops
$ docker ps -as --filter 'status=exited'  --format "table {{.ID}}\t{{.Size}}\t{{.Command}}"
CONTAINER ID        SIZE                       COMMAND  
4226d7f6c178        0 B (virtual 151 MB)       "/bin/sh -c 'echo ..."  
4320cbd07b63        335 kB (virtual 151 MB)    "/bin/sh -c 'echo ..."  
f60b3c549361        28.1 MB (virtual 151 MB)   "/bin/sh -c 'echo ..."

# let's see what files changed during each step
$ docker diff 4226d7f6c178
C /tmp  
C /var/cache/apt/archives/lock  
C /var/lib/apt/lists  
D /var/lib/apt/lists/deb.debian.org_debian_dists_jessie-updates_InRelease  
D /var/lib/apt/lists/deb.debian.org_debian_dists_jessie-updates_main_binary-amd64_Packages.gz  
D /var/lib/apt/lists/deb.debian.org_debian_dists_jessie_Release  
D /var/lib/apt/lists/deb.debian.org_debian_dists_jessie_Release.gpg  
D /var/lib/apt/lists/deb.debian.org_debian_dists_jessie_main_binary-amd64_Packages.gz  
D /var/lib/apt/lists/lock  
D /var/lib/apt/lists/partial  
D /var/lib/apt/lists/security.debian.org_dists_jessie_updates_InRelease  
D /var/lib/apt/lists/security.debian.org_dists_jessie_updates_main_binary-amd64_Packages.gz  
C /var/lib/dpkg/lock  

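With the output above it is possible to determine which files changed in each step and which files can be removed to trim down the size of the image. Note that the housekeeping step only records deletions (D) and adds 0 B: the files it removes are already stored in the 28.1 MB layer of the install step, so the cleanup only saves space when it runs in the same RUN instruction that created the files.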
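Running docker diff by hand for every intermediate container gets tedious. The sketch below loops over the exited containers and prints how many paths were added (A), changed (C) and deleted (D) in each. The function names summarize_diff and diff_exited_containers are made up for this example. It assumes that the only exited containers on the host are the ones left behind by the --rm=false build, and that the classic builder is in use, since BuildKit does not leave intermediate containers behind.

```shell
#!/bin/sh
# summarize_diff (hypothetical helper): read `docker diff` output on stdin
# and count the entries per change type (A = added, C = changed, D = deleted).
summarize_diff() {
    awk '
        $1 == "A" || $1 == "C" || $1 == "D" { count[$1]++ }
        END { printf "A=%d C=%d D=%d\n", count["A"], count["C"], count["D"] }
    '
}

# diff_exited_containers (hypothetical helper): print one summary line per
# exited container, prefixed with the container ID.
diff_exited_containers() {
    for id in $(docker ps -aq --filter 'status=exited'); do
        printf '%s ' "$id"
        docker diff "$id" | summarize_diff
    done
}
```

Calling diff_exited_containers right after the build prints one line per step, which makes it easier to decide which container is worth a full docker diff.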