
The Fallacy of Complacent Distroless Containers

Making containers smaller is the most popular practice when reducing your attack surface. But how real is this sense of security?

Image generated with Leonardo AI

Building Docker images is easy and accessible, but perfecting them is still an art that is hard to master. In pursuit of the smallest, most secure, yet still functional container images, developers turn to distroless practices that usually involve complex tooling, deep distro knowledge and error-prone trimming strategies. Such practices often bypass package managers altogether, which opens a security blind spot: most vulnerability scanners rely on package manager metadata to detect the software components within a container image.

Building container images

When you build a container image, you’re packaging your application, together with its dependencies, in a portable software unit that can later be deployed in isolation, without the need to virtualize an entire operating system.

Building container images is actually a very accessible practice nowadays. There’s an abundance of tools (e.g. Docker, Rockcraft, Buildah…) specifically for that purpose.

But, in the process of packing your application and everything it needs in order to run, could you possibly be adding more than what’s needed?

Most of the time, the answer is yes!

Here’s a very simple Dockerfile:

FROM ubuntu:24.04

RUN apt update && apt install -y --no-install-recommends nginx \
  && rm -rf /var/lib/apt/lists/*

ENTRYPOINT ["nginx"]
CMD ["-g", "daemon off;"]

In this example, we’re packing Nginx on top of an Ubuntu 24.04 image. But,

  • ubuntu:24.04 will be in our final image. Do we actually need it? Most likely not. With it, a bunch of unnecessary software (e.g. utilities like apt) will be kept and thus increase the image’s attack surface;
  • even though we were careful not to install recommendations and clean the apt lists, we still installed the whole Nginx package and all its dependencies. Do we need all that? This is a trickier one to answer as it depends a lot on the use case, but we surely know we don’t want things like Nginx’s man pages, for example.

Distroless containers

"Distroless" images contain only your application and its runtime dependencies.

They do not contain the typical additional libraries or utilities from a Linux distribution.

This has been the most advocated practice in the space of container security for the past 7 years. And although it is conceptually sound, what’s the cost of building these smaller and "more secure(?)" distroless containers?

  • Easy to build? Not really. It can be a hard craft to master, as you may need specialized tooling and deep distro knowledge to effectively "remove the distro".
  • Error-prone? Yes. Some of the most advocated strategies for building distroless images involve following a "top-down" approach – i.e. bloating a base container with your application, and then manually cherry-picking the desired contents into a "scratch" environment.

Correlation is not causation

A smaller container is not necessarily a more secure one! In fact, the process of making distroless containers is prone to creating blind spots.

A 2022 Rezilion report by Yotam Perkal tested the reliability and consistency of different vulnerability scanners by scanning 20 of the most popular container images and comparing the resulting vulnerability reports. Besides the abundance of HIGH and CRITICAL misidentifications, the report also shows an 82% average precision from these tools, with a significant portion of the results comprising both False Positives and False Negatives.

To be honest, I’m ok with False Positives – it’s like being told you’re sick, when in reality it was just an examination error – it’s scary, but not truly dangerous.

False Negatives, on the other hand, are much worse! It’s like having a problem you’re not aware of – a blind spot!

The main cause for security blind spots

One of the main reasons why vulnerability scanners fail to detect certain vulnerabilities is that most of them rely on package metadata, and are thus unable to detect software components not managed by package managers.

Don’t believe me? Let me show you.

For demonstration purposes, let’s just take a popular and vulnerable Docker image from Docker Hub, and a popular vulnerability scanner.

Let’s say:

  • Trivy as the scanner, and
  • [ubuntu:lunar](https://hub.docker.com/layers/library/ubuntu/lunar/images/sha256-ea1285dffce8a938ef356908d1be741da594310c8dced79b870d66808cb12b0f) as the Docker image.

At the time of writing this, the chosen Docker image is already EOL, and vulnerable. According to Trivy, this image has a total of 11 CVEs:

$ trivy image ubuntu:lunar
...
ubuntu:lunar (ubuntu 23.04)

Total: 11 (UNKNOWN: 0, LOW: 2, MEDIUM: 9, HIGH: 0, CRITICAL: 0)

BUT, this is a Debian-based container image, so what does Trivy say if we delete the image’s package metadata? Let’s see…

$ echo '''
FROM ubuntu:lunar

# Whiteout the dpkg status file
RUN rm /var/lib/dpkg/status
''' | docker build -t ubuntu:lunar-tampered -

Drumroll please… 🥁

$ trivy image ubuntu:lunar-tampered
...
ubuntu:lunar-tampered (ubuntu 23.04)

Total: 0 (UNKNOWN: 0, LOW: 0, MEDIUM: 0, HIGH: 0, CRITICAL: 0)

Zero, zip, zilch, nada…no vulnerabilities! Or so it looks. But we know there are still 11 CVEs. We just deleted the package metadata Trivy relies on to perform the scan.
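
To see why the count drops to zero, it helps to picture the first thing a metadata-driven scanner has to do: enumerate the installed packages, so that each name and version can be looked up in a vulnerability feed. The sketch below imitates only that enumeration step against an unpacked root filesystem. The `list_dpkg_packages` helper and the two sample package entries are hypothetical, and this is not Trivy's actual implementation.

```shell
#!/bin/sh
# Hypothetical helper: print "name version" for each package recorded in
# the dpkg status file under the given root filesystem directory.
list_dpkg_packages() {
    status_file="$1/var/lib/dpkg/status"
    # No status file means there is no package inventory to match against.
    [ -f "$status_file" ] || return 0
    awk '
        /^Package: / { name = $2 }
        /^Version: / { print name, $2 }
    ' "$status_file"
}

# Build a tiny fake rootfs with two made-up package entries.
rootfs="$(mktemp -d)"
mkdir -p "$rootfs/var/lib/dpkg"
cat > "$rootfs/var/lib/dpkg/status" <<'EOF'
Package: libexample1
Status: install ok installed
Version: 1.2.3-1

Package: exampletool
Status: install ok installed
Version: 0.9-2ubuntu1
EOF

echo "before: $(list_dpkg_packages "$rootfs" | wc -l | tr -d ' ') packages"

# Any binaries would still be on disk, but the inventory is gone.
rm "$rootfs/var/lib/dpkg/status"
echo "after: $(list_dpkg_packages "$rootfs" | wc -l | tr -d ' ') packages"
rm -rf "$rootfs"
```

With nothing to enumerate, there is nothing to match against known CVEs, which is the same effect the tampered image has on the scan.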


What now?

Vulnerability scanners behave differently and may rely on information within the container image itself in order to produce accurate reports!

So here’s a checklist you can use to ensure the containers you build and consume are not carrying hidden vulnerabilities:

  1. a container is not secure just because it’s small and distroless.
  2. look beyond the vulnerability scanner. As we saw in the example above, a single missing file can cause the scanners to fail to identify CVEs. So don’t turn a blind eye to this! Yes, use scanners! But also try looking around for hints that blind spots may exist. How?

    • some scanners, like Trivy, will actually issue a warning when the files they rely on (like dpkg/status above) are missing. E.g. Trivy will say:

        WARN No OS package is detected. Make sure you haven’t deleted any files that contain information about the installed packages.
        WARN e.g. files under "/lib/apk/db/", "/var/lib/dpkg/" and "/var/lib/rpm"

    • some of these tools, Trivy included, can also produce SBOMs. This is a more user-friendly way of double-checking the image’s software components. So try to produce an SBOM (e.g. trivy image --format spdx-json --output result.json <yourImage>). Is this SBOM empty? Is it missing components that you’d expect to see in that image? If so, then the vulnerability scanner will very likely also fail to produce an accurate report.
  3. vulnerability scanners vary, so don’t rely just on one scanner. Try choosing the ones with better support for the type of software ecosystem that is packed inside the image you want to use.
  4. avoid "dead drops" when building your container. I.e. cherry-picking the minimum set of files you need to make your application work might sound appealing, but you may unintentionally be leaving out the necessary metadata for the scanners to work properly.
  5. related to the above, favor the use of package managers. Yes, some of them aren’t really well adjusted to work with minimal containers, but as we saw above, some of the metadata they produce is critical for a proper security scan.
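
As a rough companion to this checklist, you can also look for the package database yourself before trusting a clean scan. The following is a minimal sketch, assuming the image's root filesystem has already been unpacked into a directory. `find_pkg_dbs` is a hypothetical helper, and apart from the dpkg status file shown earlier, the exact database file names are my assumption about what typically lives under the directories named in the Trivy warning.

```shell
#!/bin/sh
# Hypothetical helper: report which package database files exist (and are
# non-empty) under an unpacked image root filesystem.
# Returns 0 if at least one is found, 1 otherwise.
find_pkg_dbs() {
    root="$1"
    found=1   # shell convention: non-zero means "nothing found"
    # Assumed file names under the directories Trivy warns about.
    for db in var/lib/dpkg/status lib/apk/db/installed \
              var/lib/rpm/Packages var/lib/rpm/rpmdb.sqlite; do
        if [ -s "$root/$db" ]; then
            echo "found: /$db"
            found=0
        fi
    done
    [ "$found" -eq 0 ] || echo "no package database found: scanners may report nothing"
    return "$found"
}

# Usage sketch (needs Docker; the image name is a placeholder):
#   cid="$(docker create <yourImage>)"
#   mkdir rootfs && docker export "$cid" | tar -x -C rootfs
#   docker rm "$cid"
#   find_pkg_dbs rootfs
```

A hit only tells you that the metadata exists, not that it is complete, so treat it as one more hint alongside the scanner warnings and the SBOM rather than as a verdict.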

In a coming article, I’ll explore a few techniques for building minimal container images, from the typical multi-stage approach to the distroless-friendly tools like Chisel.

