UPDATE 2022 - controller-runtime obviates the need to build informers manually...
| Aspect | Informers | Controller Runtime |
|---|---|---|
| Ease of Use | Requires understanding of the client-go library; more complex setup | Simplified setup, e.g., easier client initialization by vendoring the runtime |
| Flexibility | Direct access to watch mechanics, e.g., handling cache.ResourceEventHandler | Predefined structure, e.g., using the Reconcile method for events |
| Event Handling | Manual event management, e.g., setting up event handlers for each watch | Automated event handling, e.g., built-in event filtering and queuing |
| Code Verbosity | More verbose; manual creation and management of watches and queues | Less code, e.g., controller setup often requires fewer lines |
| Control Loop | Explicit control loop implementation, e.g., writing custom sync logic | Control loop abstracted, e.g., controller.New abstracts loop details |
| Customization | High, e.g., creating specific list-watchers for tailored behavior | Standardized, though extendable, e.g., using predefined runtime hooks |
| Learning Curve | Steeper, due to manual management of Kubernetes objects and events | Easier, especially for beginners, due to abstracted Kubernetes interactions |

Original post:
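To make the "Predefined structure" row in the table concrete, here is a minimal sketch of the Reconcile pattern controller-runtime is built around. The `Request`, `Result`, and `Reconciler` types below are simplified stand-ins I wrote for illustration, not the real controller-runtime types: the point is just that every add, update, and delete funnels into one declarative `Reconcile` call.

```go
package main

import "fmt"

// Request and Result are toy stand-ins for the shapes controller-runtime
// passes to a Reconciler (simplified for illustration).
type Request struct{ Namespace, Name string }
type Result struct{ Requeue bool }

// Reconciler is the single hook the runtime asks you to implement:
// all event types funnel into one Reconcile call.
type Reconciler interface {
	Reconcile(req Request) (Result, error)
}

// NoopReconciler just records which objects it was asked to reconcile.
type NoopReconciler struct{ Seen []string }

func (r *NoopReconciler) Reconcile(req Request) (Result, error) {
	r.Seen = append(r.Seen, req.Namespace+"/"+req.Name)
	return Result{}, nil
}

func main() {
	r := &NoopReconciler{}
	res, err := r.Reconcile(Request{Namespace: "default", Name: "my-pod"})
	fmt.Println(res, err, r.Seen)
}
```

Contrast this with the informer-based code later in the post, where you wire up the queue and the event branching yourself.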
As part of this work, I realized there isn't much intuitive documentation around the Informer framework and how it works. So I'm putting some notes here.
First, I took a look at the unit tests for the controller package. In particular, after wandering around, I started at the top, in controller_test.go.
An intuitive 1000-foot explanation
Sometimes the best way to start understanding a complex system is to read an opinion on it. So... here's my opinion, for pedagogical purposes only. It's not my "real" opinion, because I don't have a strong opinion on whether or not a watch architecture is the best way to implement a distributed system. My initial non-opinion is this: when lots of workers continuously monitor a consensus state, and continually evolve to better match that state, you have a system which can theoretically scale very naturally. Specifically, you just add API support and storage support for a Golang struct, and create a watch to respond to its changes. There doesn't necessarily need to be a single master system controlling and scheduling all events, no need for pushing messages onto a bus, and so on. The disadvantage is a high communication overhead, where many workers are continuously having to update and re-check their state.
In any case, before reading on, you need to know that in Kubernetes, a controller is something which implements a control loop, continuously resynchronizing a system. The remainder of this post explains how these controllers work, at the interface level, concluding with a snippet for the replication controller.
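The control loop idea above can be sketched in a few lines. This is a toy model I wrote, not Kubernetes code: `desired` stands in for the state stored in etcd, `actual` for the running system, and the loop keeps resynchronizing until they match.

```go
package main

import "fmt"

// syncOnce drives actual one step toward desired: "create" what is
// missing and "delete" what should not exist. It returns true once
// actual matches desired.
func syncOnce(desired, actual map[string]bool) bool {
	inSync := true
	for name := range desired {
		if !actual[name] {
			actual[name] = true // create the missing object
			inSync = false
		}
	}
	for name := range actual {
		if !desired[name] {
			delete(actual, name) // delete the stray object
			inSync = false
		}
	}
	return inSync
}

func main() {
	desired := map[string]bool{"pod-a": true, "pod-b": true}
	actual := map[string]bool{"pod-stale": true}
	// the control loop: resynchronize until converged
	for !syncOnce(desired, actual) {
	}
	fmt.Println(actual)
}
```

A real controller never truly terminates; it keeps watching for the desired state to change and re-runs this convergence step forever.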
A simple controller that deletes everything.
Now, if you look carefully, you will see that there is a "deletionCounter <-" clause. This is used by the unit test to confirm that a delete has occurred.
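The original snippet for this example was an image; as a hedged reconstruction of its mechanics, here is a self-contained toy version. The `Delta`/`Deltas` types below are simplified stand-ins for the real ones in `k8s.io/client-go/tools/cache`, and `process` mirrors the hand-written Process func the post describes, reporting deletions on a `deletionCounter` channel so a test can confirm them.

```go
package main

import "fmt"

// Toy stand-ins for client-go's cache.Delta types (illustrative only).
type DeltaType string

const (
	Added   DeltaType = "Added"
	Deleted DeltaType = "Deleted"
)

type Delta struct {
	Type   DeltaType
	Object string
}

type Deltas []Delta

// Newest returns the most recent delta, mirroring cache.Deltas.Newest().
func (d Deltas) Newest() *Delta {
	if len(d) == 0 {
		return nil
	}
	return &d[len(d)-1]
}

// process is the hand-written control flow: pop the newest delta off the
// queue and branch on its type, reporting deletions on a channel so the
// unit test can confirm they occurred.
func process(obj interface{}, deletionCounter chan string) {
	newest := obj.(Deltas).Newest()
	if newest.Type == Deleted {
		deletionCounter <- newest.Object
	} else {
		// the "delete everything" controller would issue a delete here
		fmt.Println("would delete", newest.Object)
	}
}

func main() {
	deletionCounter := make(chan string, 1)
	process(Deltas{{Type: Deleted, Object: "pod-1"}}, deletionCounter)
	fmt.Println("confirmed deletion of", <-deletionCounter)
}
```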
However, looking closer, maybe we can improve on this?
There are some issues here.
- We are manually reading from a queue, i.e. `newest := obj.(cache.Deltas).Newest()`.
- Rather than declaring behaviour, we are writing control flow: "if it's a delete, do this; otherwise, do something else".
- It's generally verbose. In order to delete incoming events and test the deletions, we spent about 80 lines of code.
The Informer framework makes the implementation of a control loop much tighter and cleaner. We can replace the 30 lines of code above with the following much more declarative, functional, event-based declaration.
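The declarative version was also an image in the original post. To show the shape of the idea without depending on client-go, here is a toy model: `ResourceEventHandlerFuncs` mimics the client-go type of the same name, and `dispatch` plays the role of the informer's internal wrapper that routes each event to the handler you declared, instead of you branching on delta types yourself.

```go
package main

import "fmt"

// ResourceEventHandlerFuncs mimics the client-go type of the same name:
// you declare what to do per event kind and let the informer machinery
// do the dispatch (simplified stand-in, not the real API).
type ResourceEventHandlerFuncs struct {
	AddFunc    func(obj string)
	UpdateFunc func(oldObj, newObj string)
	DeleteFunc func(obj string)
}

// dispatch plays the role of the informer's internal Process() wrapper,
// routing each event to the matching declared handler.
func dispatch(h ResourceEventHandlerFuncs, event, obj string) {
	switch event {
	case "add":
		if h.AddFunc != nil {
			h.AddFunc(obj)
		}
	case "delete":
		if h.DeleteFunc != nil {
			h.DeleteFunc(obj)
		}
	}
}

func main() {
	var deleted []string
	handlers := ResourceEventHandlerFuncs{
		DeleteFunc: func(obj string) { deleted = append(deleted, obj) },
	}
	dispatch(handlers, "delete", "pod-1")
	fmt.Println("deleted:", deleted)
}
```

Compare this with the manual queue-reading version above: the control flow has moved out of our code and into the framework, leaving only the declared behaviour.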
[Figure: The Informer implementation provides a declarative wrapper around the Process() function which is implemented by the controller.]

[Figure: The Process() function is the basis for a controller. It is wrapped by the Informer implementation.]
Ah, ok... So the informer framework is really just a wrapper for setting up a controller configuration, which calls Process(...) and does the cyclomatic logic for us.
Now, let's go deeper into the architecture.
Once we look into the controller, we see it is composed of the configuration object (i.e. what was created in the first example) plus a reflector.
The reflector is the backbone for persistence. Here's essentially how it works.
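As a rough model of what the reflector does, consider the sketch below. It is not the real `cache.Reflector` (which handles resource versions, relisting, and error recovery); it just shows the core rhythm: seed a local store from List(), then keep it current from Watch(). The `ListerWatcher` interface and `fakeSource` here are simplified stand-ins I wrote for illustration.

```go
package main

import "fmt"

// ListerWatcher is a toy version of the client-go interface of the same
// name: something we can List() for a snapshot and Watch() for changes.
type ListerWatcher interface {
	List() []string
	Watch() <-chan string
}

// fakeSource is an in-memory data source, standing in for the API server.
type fakeSource struct {
	items  []string
	events chan string
}

func (f *fakeSource) List() []string       { return f.items }
func (f *fakeSource) Watch() <-chan string { return f.events }

// runReflector seeds the store from List, then applies n watch events.
// A real reflector loops forever and tracks resource versions.
func runReflector(lw ListerWatcher, store map[string]bool, n int) {
	for _, item := range lw.List() {
		store[item] = true
	}
	events := lw.Watch()
	for i := 0; i < n; i++ {
		store[<-events] = true
	}
}

func main() {
	src := &fakeSource{items: []string{"rc-1"}, events: make(chan string, 1)}
	src.events <- "rc-2"
	store := map[string]bool{}
	runReflector(src, store, 1)
	fmt.Println(store)
}
```

The local store is why controllers can make fast decisions without hammering the API server on every event.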
This is the "guts" of how kubernetes control flow works, and how things like "pods" are created and balanced by replication controllers.
So, how does "storage" work? How do events in the database propagate up to Kubernetes controllers?
We have thus far glossed over the concept of "storage", and taken the "ListerWatcher" abstraction for granted. How do "ListerWatcher" implementations work? You can think of them (sort of) as database views. Let's go back into the code. We will use the replicationController as an example.
Remember earlier, in the ExampleInformer? We created a "NewFakeControllerSource". This is essentially a mock implementation of a data source that can be used for in-memory unit tests. Here is what a "real" informer declaration looks like.
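The "real" declaration was shown as an image in the original post. As a hedged sketch of its shape: client-go's `cache.ListWatch` is a ListerWatcher built from two functions, where the ListFunc and WatchFunc call the API server through the kubeClient. The version below is a self-contained toy mirroring that construction; the kubeClient calls mentioned in the comments are what the real code would use, stubbed out here.

```go
package main

import "fmt"

// ListWatch mirrors the shape of client-go's cache.ListWatch: a
// ListerWatcher assembled from two functions. In a real informer these
// would call the API server via the kubeClient; here they are stubs.
type ListWatch struct {
	ListFunc  func() []string
	WatchFunc func() <-chan string
}

func (lw ListWatch) List() []string       { return lw.ListFunc() }
func (lw ListWatch) Watch() <-chan string { return lw.WatchFunc() }

func main() {
	events := make(chan string, 1)
	lw := ListWatch{
		// stand-in for listing all running ReplicationControllers
		// via the kubeClient
		ListFunc: func() []string { return []string{"frontend-rc"} },
		// stand-in for opening a watch against the API server
		WatchFunc: func() <-chan string { return events },
	}
	fmt.Println("initial list:", lw.List())
}
```

This is why a ListerWatcher feels like a database view: it is a named, queryable, watchable slice of the cluster's state for one resource type.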
[Figure: A real informer needs a real data source. In this case, it's a wrapper to the kubeClient, which can query the API server's registered data types to list all running ReplicationControllers.]
So... that's pretty much it! There are more details I can add another time. This should be sufficient to give you deeper knowledge of how Informers, Reflectors, Queues, and Watches all interact with etcd to maintain, define, and rescue the internal state of Kubernetes.