If you don't understand how routing works, or have never used dig or scapy, consider watching https://www.youtube.com/watch?v=hcPWAyxjd6E before reading further :). A lot of new Kubernetes users don't realize that, when you're setting up a kube cluster, you need to understand a little bit about the networking model in order to have the intuition necessary to maintain your cluster (i.e. you need at least a basic understanding of HTTPS, TCP, kube-proxy, flanneld, CNI, and docker-assigned IP addresses for any of this to make sense).
anyways...
I've been debugging a flannel connectivity issue in a vagrant recipe this week. I noticed there aren't a whole lot of docs around flannel, so here are some screenshots that might help folks.
How flannel works
There aren't a lot of great recipes to help you get started with flannel. I use Eric Paris' https://github.com/eparis/kubernetes-ansible playbooks as a reference. From these, we can see how it's really meant to work.
ETCD + Two layers of routing
Flannel is actually pretty easy to understand at a high level. There are really just two major points.
1) Flannel uses a distributed k/v store to write out a subnet for each machine in your cluster. This means you don't need some complex networking tool from the dark ages to store data about how you're carving up your network. Why? Because flannel makes its OWN network on top of your existing network and stores the metadata about it in the k/v store. In this case, the k/v store is etcd, which is basically a distributed hashmap with strong consistency, watches, and directory semantics.
2) Flannel tells the docker daemon running on each machine to look up its subnet before assigning IPs. This means no more 172.x IPs that need to be port forwarded and so on. EACH container has an IP that can be routed first based on the MACHINE it's running on, and then based on its exact container IP, and this is what flannel does for you.
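To make those two points concrete, here is a minimal Python sketch of the idea. It is not flannel's code. The 80.0.0.0/8 overlay range and the /16 per-machine size are inferred from the log sample later in this post, a plain dict stands in for etcd, and the host names and function names are made up for illustration. It also just hands out the lowest free subnet, which is not necessarily how flannel chooses one.

```python
import ipaddress

# Assumptions: overlay range and per-host prefix are inferred from the log
# sample in this post; a plain dict stands in for etcd.
OVERLAY = ipaddress.ip_network("80.0.0.0/8")
HOST_PREFIX = 16


def lease_subnets(hosts, store):
    """Give every host without a lease the lowest free /16 and record it."""
    taken = set(store.values())
    free = (
        s for s in OVERLAY.subnets(new_prefix=HOST_PREFIX)
        # Skip 80.0.0.0/16 so the range starts at 80.1.0.0, as in the log.
        if s not in taken and s.network_address != OVERLAY.network_address
    )
    for host in hosts:
        if host not in store:
            store[host] = next(free)
    return store


def host_for(container_ip, store):
    """First routing layer: which machine owns the subnet this IP falls in?"""
    ip = ipaddress.ip_address(container_ip)
    for host, subnet in store.items():
        if ip in subnet:
            return host
    return None


if __name__ == "__main__":
    leases = lease_subnets(["kube1", "kube2"], {})
    print(leases)
    print(host_for("80.2.0.3", leases))
```

Once the machine is known, the second layer is just local delivery to the container that holds that exact IP.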
Network interfaces
I like to do EVERYTHING in vagrant ALL THE TIME! Well, guess what, sometimes you get in trouble. For example, in vagrant VMs, often eth0 is the BRIDGE.
By default, flannel selects the first interface (in practice, the one carrying the machine's default route), which in these VMs is eth0. So, if you have 3 nodes, each node will register the bridge IP as the public IP for its subnet.
This leads to a scenario where ALL nodes are using the SAME SUBNET for assigning IPs, which totally breaks the entire model of flannel subnets.
What you should do
Make sure you look at eth0, eth1, and so on, on each of your machines. Then set up your FLANNEL_OPTIONS (on Fedora this will be in /etc/sysconfig/flanneld), like so:
FLANNEL_OPTIONS="--iface=eth1"
if you have a network like this:
![]()
If you have an internal or other network which you are assigning to your machines, use THAT, not the first eth0, as flannel's preferred iface.
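One way to reason about which interface to pick is to compare addresses across nodes: an interface is only useful to flannel if its IP is different on every machine. The Python sketch below does that comparison. The node names, the eth0 addresses, and the second eth1 address are invented for illustration (only 192.168.4.101 appears in this post), and `distinct_ifaces` is a hypothetical helper, not part of flannel.

```python
def distinct_ifaces(nodes):
    """Return interface names whose IP differs on every node.

    nodes maps node name -> {interface name -> IPv4 address}.
    """
    if not nodes:
        return []
    # Only consider interfaces that exist on every node.
    names = set.intersection(*(set(ifaces) for ifaces in nodes.values()))
    good = []
    for name in sorted(names):
        ips = [ifaces[name] for ifaces in nodes.values()]
        if len(set(ips)) == len(ips):
            good.append(name)
    return good


# Hypothetical Vagrant-style data: eth0 has the same address on every VM,
# eth1 is the private network the machines actually talk over.
nodes = {
    "kube1": {"eth0": "10.0.2.15", "eth1": "192.168.4.101"},
    "kube2": {"eth0": "10.0.2.15", "eth1": "192.168.4.102"},
}

candidates = distinct_ifaces(nodes)
if candidates:
    print('FLANNEL_OPTIONS="--iface=%s"' % candidates[0])
```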
What you should see
To be sure: if you have two machines, when you bring them up, watching the flannel logs (journalctl -f -u flanneld) should show something similar to what we have below:
Jun 22 03:39:30 kube1.ha flanneld[23344]: I0622 03:39:30.005921 23344 main.go:247] Installing signal handlers
Jun 22 03:39:30 kube1.ha flanneld[23344]: I0622 03:39:30.006228 23344 main.go:205] Using 192.168.4.101 as external interface
Jun 22 03:39:30 kube1.ha flanneld[23344]: W0622 03:39:30.006264 23344 device.go:83] "flannel.1" already exists with incompatable configuration: vtep (external) interface: 3 vs 2; recreating device
Jun 22 03:39:30 kube1.ha flanneld[23344]: I0622 03:39:30.017812 23344 subnet.go:320] Picking subnet in range 80.1.0.0 ... 80.255.0.0
Jun 22 03:39:30 kube1.ha flanneld[23344]: I0622 03:39:30.020405 23344 subnet.go:83] Subnet lease acquired: 80.99.0.0/16
Jun 22 03:39:30 kube1.ha flanneld[23344]: I0622 03:39:30.020588 23344 main.go:215] VXLAN mode initialized
Jun 22 03:39:30 kube1.ha flanneld[23344]: I0622 03:39:30.020597 23344 vxlan.go:115] Watching for L2/L3 misses
Jun 22 03:39:30 kube1.ha flanneld[23344]: I0622 03:39:30.020604 23344 vxlan.go:121] Watching for new subnet leases
Jun 22 03:39:30 kube1.ha systemd[1]: Started Flanneld overlay address etcd agent.
Jun 22 03:39:30 kube1.ha flanneld[23344]: I0622 03:39:30.026963 23344 vxlan.go:184] Subnet added: 80.54.0.0/16
Jun 22 03:39:30 kube1.ha flanneld[23344]: I0622 03:39:30.026995 23344 vxlan.go:184] Subnet added: 80.1.0.0/16
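If you want to check these logs on several machines without eyeballing them, a small Python sketch like the one below can pull out the three things that matter: the external IP, this node's own lease, and the peer subnets it learned about. The regular expressions are written against the exact message wording in the sample above and may not match other flannel versions; `summarize` is a hypothetical helper.

```python
import re

# Patterns match the message wording in the log sample from this post.
IFACE_RE = re.compile(r"Using (\S+) as external interface")
LEASE_RE = re.compile(r"Subnet lease acquired: (\S+)")
PEER_RE = re.compile(r"Subnet added: (\S+)")


def summarize(log_text):
    """Extract the external IP, own lease, and peer subnets from flanneld logs."""
    iface = IFACE_RE.search(log_text)
    lease = LEASE_RE.search(log_text)
    return {
        "external_ip": iface.group(1) if iface else None,
        "lease": lease.group(1) if lease else None,
        "peers": PEER_RE.findall(log_text),
    }


# Lines copied from the journalctl output above.
SAMPLE = """\
Jun 22 03:39:30 kube1.ha flanneld[23344]: I0622 03:39:30.006228 23344 main.go:205] Using 192.168.4.101 as external interface
Jun 22 03:39:30 kube1.ha flanneld[23344]: I0622 03:39:30.020405 23344 subnet.go:83] Subnet lease acquired: 80.99.0.0/16
Jun 22 03:39:30 kube1.ha flanneld[23344]: I0622 03:39:30.026963 23344 vxlan.go:184] Subnet added: 80.54.0.0/16
Jun 22 03:39:30 kube1.ha flanneld[23344]: I0622 03:39:30.026995 23344 vxlan.go:184] Subnet added: 80.1.0.0/16
"""

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

The healthy sign is that the node's own lease does not show up among the peer subnets, and that the external IP is the one on the interface you intended.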
Then a quick test
Finally, you can do a quick test. Spin up two docker containers, one on each node; you should see that the first has an IP like 80.1.0.3, and the second should have 80.x.y.z (WHERE x.y isn't 1.0) :) .
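The same check can be written down as a tiny Python sketch: take the two container IPs (however you read them off each container) and confirm they fall in different per-machine subnets. The /16 prefix is an assumption taken from the lease in the log above, and the function names are made up for illustration.

```python
import ipaddress


def overlay_subnet(container_ip, prefix=16):
    """Return the per-machine subnet a container IP belongs to (assumes /16 leases)."""
    return ipaddress.ip_network("%s/%d" % (container_ip, prefix), strict=False)


def on_different_subnets(ip_a, ip_b, prefix=16):
    """True when two container IPs come from different machine subnets."""
    return overlay_subnet(ip_a, prefix) != overlay_subnet(ip_b, prefix)


if __name__ == "__main__":
    # One IP per node; the second address is an example value.
    print(on_different_subnets("80.1.0.3", "80.99.0.2"))
```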
WHY IS THIS IMPORTANT
The reason this is important is that flannel NEEDS A DIFFERENT SUBNET FOR each machine, because the entire basis for flannel routing is that all packets addressed to a given subnet first go to the same machine, where they are unwrapped and delivered locally to the right container. So if all machines have the same subnet, docker connections on the flannel overlay won't be routable.
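Here is a small Python sketch of that failure mode: given the subnet each machine ended up with, it reports every pair of machines whose subnets overlap, since an IP in the overlap has no single machine to be routed to. The host names and lease values in the example are illustrative, and `conflicting_leases` is a hypothetical helper.

```python
import ipaddress
from itertools import combinations


def conflicting_leases(leases):
    """Return pairs of hosts whose subnets overlap.

    leases maps host name -> subnet in CIDR notation.
    """
    nets = {host: ipaddress.ip_network(cidr) for host, cidr in leases.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(nets), 2)
        if nets[a].overlaps(nets[b])
    ]


if __name__ == "__main__":
    # Two machines that wrongly share a subnet, plus one healthy machine.
    broken = {"kube1": "80.1.0.0/16", "kube2": "80.1.0.0/16", "kube3": "80.54.0.0/16"}
    print(conflicting_leases(broken))
```

An empty result is what a working cluster should give you.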
By the way: To quick start, just run etcd from a container one-liner
If all this makes your head spin, and you're having etcd issues, the best thing you can do for testing is just run etcd from a container, so that you know it's working perfectly as a single-node service:
sudo docker run -t -i -p 8001:8001 -p 4001:4001 -p 2379:2379 quay.io/coreos/etcd:v2.0.10 --addr 0.0.0.0:4001 --name etcd-node1 --data-dir=/tmp/etcd4FlannelKube
Running etcd this way in a new cluster works fine, is simple, and allows you to focus on other parts of your distributed system.
FYI, a working flannel cluster config might look like this (generated using contrib/ansible).
/etc/sysconfig/flanneld:# Flanneld configuration options
/etc/sysconfig/flanneld:FLANNEL_ETCD="http://kube-master:2379"
/etc/sysconfig/flanneld:# etcd config key. This is the configuration key that flannel queries
/etc/sysconfig/flanneld:FLANNEL_ETCD_KEY="/cluster.local/network"
/etc/sysconfig/flanneld:# By default, we just add a good guess for the network interface on Vbox. Otherwise, Flannel will probably make the right guess.
/etc/sysconfig/flanneld:FLANNEL_OPTIONS="--iface=eth1"
/etc/firewalld/direct.xml: <rule priority="1" table="filter" ipv="ipv4" chain="FORWARD">-i flannel.1 -o docker0 -j ACCEPT -m comment --comment 'flannel subnet'</rule>
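The sysconfig file only tells flanneld where to look. The network definition itself lives in etcd as a JSON document under the `config` key beneath FLANNEL_ETCD_KEY. The Python sketch below builds that key and value. The 80.0.0.0/8 range, the /16 subnet length, and the vxlan backend are inferred from the log sample in this post, so treat them as assumptions; what the playbooks actually write may differ. Writing the value into etcd is left out here.

```python
import json

# Taken from the /etc/sysconfig/flanneld listing above.
FLANNEL_ETCD_KEY = "/cluster.local/network"


def network_config(network, subnet_len, backend="vxlan"):
    """Build the JSON document flannel reads from <prefix>/config."""
    return json.dumps(
        {"Network": network, "SubnetLen": subnet_len, "Backend": {"Type": backend}}
    )


# Assumed values, inferred from the flanneld log lines in this post.
config_key = FLANNEL_ETCD_KEY.rstrip("/") + "/config"
config_value = network_config("80.0.0.0/8", 16)

if __name__ == "__main__":
    print(config_key)
    print(config_value)
```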
Hacking around with Flannel
Flannel is surprisingly easy to hack around with and build for yourself. You can just clone it down from GitHub, and then run rm -rf bin/flanneld ; ./build ; bin/flanneld . That's pretty much all you need to do. Associated with this post is a minor PR to clean up flanneld options.
