Container Tools, Tips, and Tricks - Issue #3


Hi friends!

Let's talk about desktop container environments today.

When we say "a container", most of the time, we mean a Linux container in the form that was popularized (and later even standardized) by Docker. Everyone knows that Linux containers are made of namespaces and cgroups, so a Linux kernel is a must to run such a container. But how come these containers also seem to work on macOS and Windows?

Desktop container environment architecture

The answer is simple (unlike the implementation) - they run in a Linux virtual machine. Most, if not all, container runtimes in this amazing list by Bret Fisher that are marked as supporting Windows and/or macOS use a very similar architecture:

The above VM-based architecture became a de facto standard for Desktop container environments. And it makes total sense. It allows the inner piece of software like Docker Engine, containerd, or Podman to run (almost?) unmodified compared to the server-side setup. Yes, desktop container environments bring a whole bunch of extra software to spin up the VM and procure good-enough disk- and network connectivity with the host system (and the outside world). But thanks to the VM at the heart, even on Windows and macOS, the actual containers are executed by pretty much the same set of lower-level components as on your production servers (read - containerd and runc).


​Support this newsletter on Patreon and get advice on technical storytelling and drawing no-boring​ diagrams 😎


Most popular implementations

Let’s take a quick look at how different container runtimes implement this architecture.

Docker Desktop uses a lightweight LinuxKit VM that is run by WSL2 on Windows and, since relatively recently (March 2022), by the Apple Virtualization Framework on macOS (it used to be QEMU before that). While the LinuxKit project is open source, the plumbing of the Docker Desktop is (at least partially) proprietary, so we cannot know all the implementation details. But the bird-eye picture is pretty clear. More details on Docker Desktop architecture (Docker blog).

​Lima packs together containerd, BuildKit, and nerdctl to create something that can be used as a replacement for Docker (rather Engine than Desktop, IMO). It uses good old QEMU to spin up a custom Linux VM where these components will run, and the choice of the virtualization technology (QEMU) stays unchanged between the supported host platforms (Linux, macOS, amd64, arm64).

UPD: Just learned that a couple of weeks ago, the initial support of the Apple Virtualization framework has also landed in Lima, so we may see non-QEMU Lima virtual machines on macOS soon.

​Finch is essentially Lima plus a handy installer (plus AWS-specific integrations if needed). So, no difference from the architectural standpoint, I guess.

​Colima is another Lima-based project. Compared to Finch, Colima follows a different strategy of extending Lima - it adds support for different container runtimes (Docker Engine, containerd, and even Kubernetes via K3s). But from the host system point of view, there is not much difference with Lima's original architecture - all the supported runtimes are fully encapsulated in a Linux virtual machine maintained by Lima.

​Rancher Desktop, at first, may look more similar to Docker Desktop than to Finch or Colima (especially from the GUI/UX point of view). But under the hood, it's yet another Lima application. So, again, pretty much the same architecture. However, compared to Lima's own offer, Rancher Desktop supports running Docker Engine (Moby) and/or K3s in the VM, so it's superior to Lima. But in any case, pretty much the same architecture again.

Running cross-platform containers

All the above Desktop container environments start a virtual machine of the same architecture the host system uses. On my Apple Silicon macbook πšπš˜πšŒπš”πšŽπš› πš›πšžπš— πšžπš‹πšžπš—πšπšž πšžπš—πšŠπš–πšŽ -πš– says πšŠπšŠπš›πšŒπš‘πŸΌπŸΊ, and on the Intel macbook, it says 𝚑𝟾𝟼_𝟼𝟺.

This behavior makes total sense because such a choice allows using hardware virtualization so that the performance of a virtual machine is on par with the host system.

At the same time, it’s possible to run amd64 containers on arm64 desktops and vice versa:

$ docker run -d --platform linux/amd64 nginx
$ docker run -d --platform linux/arm64 nginx

​

But if there is just one virtual machine, how does the cross-platform support work?

TL;DR it's a combination of two technologies: binfmt_misc and CPU emulation in user space. The first one allows registering custom executable formats so that the kernel would know what user space application to invoke when a certain file is about to be executed (similarly to the shebang trick like #!πš™πš’πšπš‘πš˜πš—πŸΉ that we use to run python programs like normal shell scripts). And the second one means there is a special (often QEMU but sometimes Rosetta 2 - proof & proof) helper program that can run arm64 binaries on amd64 or the other way around.

Here is what the process tree looks like for the above two nginx containers from inside of a LinuxKit VM (Docker Desktop on Intel):

$ docker run -it --pid host ubuntu ps auxf

​

And here is the process tree produced by Finch on an Apple Silicon macbook (from inside of the VM, of course):

$ finch run -it --pid host ubuntu ps auxf

​

Short security note

Regardless of my host OS choice, I'd use a separate (often vagrant) virtual machine for every project (or a tightly coupled group of projects). Of course, most of the time, this VM would include a Docker Engine, so if needed, I could run a container from inside of the VM. However, at times, I still need to run a container directly from the host system. Not a big problem on macOS, but on a Linux host, I'd need to run a Docker Engine right on the host system. And that would make me extremely nervous because I generally try to avoid running random stuff from the Internet on my host systems, and containers aren't security devices. Luckily, since May, Docker Desktop can be used on Linux hosts too, so I can benefit from the higher isolation provided by its virtual machine. No more Docker Engine running directly on the host for me ❀️‍πŸ”₯

---

Interesting fact: Did you know that there are Windows(-native) containers too? This type of container has nothing to do with WSL2 (aka lightweight Linux VM on Windows), but nevertheless, it’s an OCI-standard form of containers. Although, I’ve no idea how widespread it is.

​

Hope this was an informative one!

Cheers

Ivan

Ivan Velichko

Building labs.iximiuz.com - a place to help you learn Containers and Kubernetes the fun way πŸš€

Read more from Ivan Velichko

Hey there πŸ‘‹ I spent a few weeks deep diving into cgroup v2, and I'm happy to share my findings with you! Everyone knows that Docker and Kubernetes use cgroups to limit the resources of containers and Pods. But did you know that it's very easy to run an arbitrary Linux process in a cgroup using much more basic tools? The only kernel's interface for cgroups is the virtual filesystem called cgroupfs typically mounted at /sys/fs/cgroup. Creating folders there and writing to files in them is...

Hello friends! Ivan's here with the June roundup of all things Linux, Containers, Kubernetes, and Server-Side craft πŸ§™ What I was working on The first two lessons (and a few challenges) of my "Alternative Introduction to Dagger" course have not sparked much interest among my students, so I had to put this work on pause. With a heavy heart, though, because I do like Dagger, and I was enjoying working on the content about it. But no interest means fewer iximiuz Labs Premium subscribers, and I...

Hello friends! It's time for my traditional monthly roundup of all things Linux, Containers, Kubernetes, and Server-Side craft πŸ§™ Before we get started, I want you to know that this newsletter's previous issue (dispatched mid-May) was delivered to only about 1/5th of my usual email audience due to an unfortunate DNS misconfiguration. The good news is that you can still find it and all previous issues on newsletter.iximiuz.com. Also, if you reply to this email, it'd help to restore the domain's...