Ivan Velichko

Ivan on Containers, Kubernetes, and Backend Development

Published about 2 years ago • 6 min read

Hey, hey!

It's Ivan Velichko, a software engineer and a technical storyteller. I brought you a monthly roundup of all things Containers, Kubernetes, and Backend Development from

The main theme of this (rather productive) month is Kubernetes API - I've posted a few long-form write-ups and started one promising Go repository. But first, I'm glad to announce that the newsletter got its very first sponsor - Teleport. And I'm super happy about it - not just it makes the newsletter pay for itself (ConvertKit is no cheap!), but also because of the things I'm asked to include in my emails. Teleport folks have a surprisingly good technical blog, so I don't have to feel guilty about the sponsored content - it's something I could have shared here anyway:

SPONSORED Check out this article where Teleport explores SSH best practices to boost the security of your infrastructure: from changing SSH default options and using bastion hosts to avoiding password-based authentication using short-living SSH certificates. And the best part about it - Teleport makes it simple to implement.

What I Was Working On

Last year I spent a substantial amount of time writing Kubernetes controllers. It was simultaneously a fun and challenging activity. The idea of a controller is simple and powerful - read some objects (desired state), check statuses of others (observed state), and make changes to the world bringing it closer to the desired state; then repeat. However, the devil is in the details.

The Kubernetes API is the basis of any controller, but it comes with its own quirks: Resource Types and Groups, Objects and Kinds, resource versions and optimistic locking, etc. Combined with the statically typed language like Go, it makes the learning curve quite steep. Why there are three different clients in the client-go project? When to use typed client and when to stick with a dynamic client and work with Kubernetes objects as with Unstructured structs (no pun intended). Wtf is RESTMapper and runtime.Scheme?

When you master the API access basics, a whole lot of more advanced questions arise. A naive control loop implementation that literally GETs resources from the API on every iteration is inefficient and prone to all sorts of concurrency issues. To address this problem, the Kubernetes most advanced API client, client-go, brings a bunch of higher-level controller-tailored abstractions: Informers to watch for resource changes, Cache to reduce the API access, Work Queue to line up the changes in one processing flow, etc. But that's not it yet!

Historically, writing Kubernetes controllers involved quite some boilerplate code. So, many repetitive tasks were codified in the controller-runtime package that extends the capabilities of the already advanced API client. Bootstrapping of controllers, including CRD and webhook creation was automated by the kubebuilder project. But Red Hat (or was it CoreOS?) thought it's not enough and introduced the Operator SDK solving more or less the same problem but adding extra capabilities on top. And neither kubebuilder nor Operator SDK (or maybe the last one does?) actually has a runtime footprint! It's still the same controller-runtime in the end, which in turn is just a fancy wrapper around client-go. But how would you know it?

When I dove into this Zoo of concepts, libraries, and projects, I almost sank 🙈 As it turned out, nothing was really complicated. But everything was so entangled! So, the time has come! I'm starting a series of articles (or two?) with an aim to share my Kubernetes API and controllers learning path.

The idea is to start from the Kubernetes API itself and then move to the client, explore its basic and then advanced capabilities and how they are used for writing controllers, and finally touch upon the controller-runtime and kubebuilder projects. Hopefully, with a lot of practical examples on the way. Here's what I've got for now:

It was definitely a good start, and I'm getting a lot of positive and constructive feedback. Let's see what February results into 😉

What I Was Reading

A lot of stuff!

🔥 Tracing the path of network traffic in Kubernetes - a massive one, but given the vastness of the Kubernetes networking topic, the post does really great on condensing it into a single read. Reminded me about a twitter thread I posted some time ago sharing my way of tackling Kubernetes networking. Hopefully, one day I'll turn it into a full-fledged blog post too.

Two reasons Kubernetes is so complex - sounds like it boils down not to the actual complexity of Kubernetes but to the wrong developers' expectations of it. Kubernetes is not a platform to simplify your deployments - it's a full-blown cluster Operating System. And operating a group of potentially heterogeneous servers is a much broader task than launching a bunch of containers. Everything you expect from an OS doing for you to utilize the hardware resources of your laptop, Kubernetes does to groups of servers. On top of that, Kubernetes' design choice to implement everything as "declare the desired state and wait until control loops reconcile it" makes it harder to reason about the behavior of the system. But I've got a feeling that an imperative implementation of a distributed OS would be even a bigger mess :)

Using Admission Controllers to Detect Container Drift at Runtime - consider it as a continuation of the above rant. When we combine the declarative approach with manual ad-hoc changes, the end state of the system becomes much harder to predict. GitOps says the VCS is the only source of truth - you make a change to your code or configs, push it to Git, and wait until a CI/CD pipeline applies it to production. However, when something goes wrong, folks probably turn off their pipelines for the period of troubleshooting and start a good old manual debugging. But how can we make sure the end state of the system is reflected in Git? Do we need a custom control loop making sure all the manual changes are eventually reverted back to the latest state in Git? And if it breaks production one more time, we'll eventually remember to backport the manual adjustments back to our repos.

The Rise of ClickOps - and while we're thinking of how to befriend GitOps and control loops, Corey Quinn already lives in the future. I spent some time with last months, and its UX is probably what is supposed to be called ClickOps - you configure stuff using a fancy UI, but then export the pipelines configs into (unreadable mess of) YAML files and commit them somewhere. Can't say I enjoyed it.

The ROAD to SRE - a set of (IMO, reasonable) principles folks came up with while introducing SRE to their organization. One of the main goals was to avoid creating yet another Ops team. Reminded me of my rant on DevOps, SRE, and Platform Engineering and what makes them different.

And now back to the roots.

The HTTP QUERY Method - surprisingly, a draft of a new HTTP method definition. Think GET with a body. The lack of a body makes GET requests ill-suitable in situations when the query parameters are lengthy. People often use POST lengthy in such cases, but it's not RESTful. So, adding a GET with a body to the HTTP spec makes perfect sense, actually.

Some ways DNS can break - Julia Evans used her immense audience for good one more time and collected a dozen of real-world examples of how DNS can break your stuff. And believe it or not, I was involved in a DNS-related incident this week too. Of course, it didn't look like DNS at all until someone noticed the period of Kube DNS errors matched perfectly with the time when services had troubles. So, it's always DNS. Or should we say Kube DNS these days? 🙈

Tech News I've Come Across

🔥 - this is nuts! Get a disposable Linux box (a container, actually) with systemd and play with it right from your browser.

Dev corrupts NPM libs 'colors' and 'faker' breaking thousands of apps - oops, someone did it again. The way we manage our dependencies is clearly broken, and some ecosystems are broken more than others. Here are some thoughts by Russ Cox about potential improvements that resonated with me.

Introducing Ephemeral Containers - it shouldn't be hard to add the support of ephemeral containers given the way Kubernetes implements Pods. But speaking of the containers drift (see above), I find this feature pretty valuable.

Stay Tuned

Ok, I should probably stop adding stuff to this issue, it's getting much bigger than I expected. Should I start sending the newsletter twice a month, maybe? Let me know in replies, folks!


Ivan Velichko

P.S. If you find this newsletter helpful, please spread the word - forward this email to your friend :)

Ivan Velichko

Software Engineer at day. Tech Storyteller at night. Helping people master Containers.

Read more from Ivan Velichko

Hello there! 👋 Debugging containerized applications is... challenging. Debugging apps that use slim variants of container images is double challenging. And debugging slim containers in hardened production environments is often close to impossible. Before jumping to the DevOps problems that I prepared for you this week, let's review a few tricks that can be used to troubleshoot containers. If the container has a shell inside, running commands in it with docker exec (or kubectl exec) is...

10 days ago • 1 min read

Hey hey! Are you ready for your next DevOps challenge? Last week, we all witnessed yet another terrifying cyber-security event, and this time, it was a direct hit - researchers from Snyk discovered a way to break out of containers! 🤯 The vulnerability was found in the fundamental component of the containerization ecosystem - the most popular implementation of the (low-level) OCI container runtime - runc. Notice how, on the diagram above, most high-level container runtimes actually rely on the...

29 days ago • 1 min read

Hello friends! Ivan's here - with my traditional monthly roundup of all things Linux, Containers, Kubernetes, and Server-Side craft 🧙 What I was working on After my announcement of the iximiuz Labs GA earlier this month, the platform's usage has more than doubled, so I had to focus on the system's stability and UX. As a result, I increased observability and test coverage, added one more bare-metal server, streamlined a bunch of use cases, and fixed a few bugs. The most notable user-facing...

about 1 month ago • 3 min read
Share this post