It's pretty easy to do this locally in Kubernetes.
First, export:
KUBE_ENABLE_CLUSTER_DNS=true
API_HOST=
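With those set, a local cluster can be brought up with hack/local-up-cluster.sh from the kubernetes repo (a sketch of my assumed local setup; fill in API_HOST yourself):

```shell
# Bring up a local cluster with cluster DNS enabled.
# Run from the root of your kubernetes checkout; API_HOST is left for you to fill in.
export KUBE_ENABLE_CLUSTER_DNS=true
export API_HOST=
hack/local-up-cluster.sh
```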
Now, just grab https://github.com/coreos/etcd/blob/master/hack/kubernetes-deploy/etcd.yml and run it as a replication controller.
Modifying it to provide a load-balanced endpoint
Note that, if you want, you can set up a load-balanced endpoint by simply setting 'nodePort' and type: LoadBalancer in the service spec:
ports:
- name: etcd-client-port
  port: 2379
  protocol: TCP
  targetPort: 2379
  nodePort: 30000
selector:
  app: etcd
type: LoadBalancer
Then, you can simply run cluster-health to see if everybody is happy:
cluster/kubectl.sh exec etcd0 -- /usr/local/bin/etcdctl cluster-health
And the logs will look something like this:
2017-03-30 20:46:27.754253 I | etcdserver: setting up the initial cluster version to 3.1
2017-03-30 20:46:27.757384 N | etcdserver/membership: set the initial cluster version to 3.1
2017-03-30 20:46:27.757450 I | etcdserver/api: enabled capabilities for version 3.1
2017-03-30 20:46:28.373507 E | rafthttp: failed to dial ade526d28b1f92f7 on stream Message (dial tcp 10.0.0.120:2380: i/o timeout)
2017-03-30 20:46:28.373535 I | rafthttp: peer ade526d28b1f92f7 became inactive
2017-03-30 20:46:28.475156 I | rafthttp: peer ade526d28b1f92f7 became active
2017-03-30 20:46:28.475193 I | rafthttp: established a TCP streaming connection with peer ade526d28b1f92f7 (stream Message reader)
2017-03-30 20:46:28.475427 I | rafthttp: established a TCP streaming connection with peer ade526d28b1f92f7 (stream MsgApp v2 reader)
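To check each member rather than just etcd0, a quick loop works (pod names etcd0 through etcd2 assumed from the replication controller above):

```shell
# Run cluster-health from inside each etcd pod in turn.
for i in 0 1 2; do
  cluster/kubectl.sh exec etcd$i -- /usr/local/bin/etcdctl cluster-health
done
```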
Getting the Endpoints
In my case, the goal here is to get endpoints for the individual etcd instances so their metrics can be scraped and monitored. This can now be easily done as follows:
cluster/kubectl.sh get endpoints
NAME ENDPOINTS AGE
etcd-client 172.17.0.3:2379,172.17.0.4:2379,172.17.0.5:2379 23m
etcd0 172.17.0.3:2380,172.17.0.3:2379 23m
etcd1 172.17.0.4:2380,172.17.0.4:2379 23m
etcd2 172.17.0.5:2380,172.17.0.5:2379 23m
kubernetes 10.240.0.3:6443 29m
Note that in doing this, Kubernetes returns the Docker (pod) IP addresses.
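If you want those IPs programmatically (say, to feed a scrape config), jsonpath output works:

```shell
# Print just the pod IPs backing the etcd-client service, one per line.
cluster/kubectl.sh get endpoints etcd-client \
  -o jsonpath='{range .subsets[*].addresses[*]}{.ip}{"\n"}{end}'
```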
Debugging: Grabbing the local endpoints.
Interestingly, here we can see that 10.* addresses were given as variables to the etcd containers in kube. Those 10.* addresses are the Service cluster IPs, injected via the docker-link-style service environment variables, whereas kubectl get endpoints returns the 172.* pod IPs that actually back each service.
➜ kubernetes git:(local-up-conformance) ✗ sudo docker inspect 16c5d46a36c8 | grep ETCD1
"ETCD1_PORT_2380_TCP=tcp://10.0.0.223:2380",
"ETCD1_PORT_2380_TCP_ADDR=10.0.0.223",
"ETCD1_PORT_2380_TCP_PROTO=tcp",
"ETCD1_SERVICE_PORT_SERVER=2380",
"ETCD1_PORT_2379_TCP=tcp://10.0.0.223:2379",
"ETCD1_PORT=tcp://10.0.0.223:2379",
"ETCD1_PORT_2380_TCP_PORT=2380",
"ETCD1_SERVICE_HOST=10.0.0.223",
"ETCD1_SERVICE_PORT=2379",
"ETCD1_SERVICE_PORT_CLIENT=2379",
"ETCD1_PORT_2379_TCP_PROTO=tcp",
"ETCD1_PORT_2379_TCP_ADDR=10.0.0.223",
"ETCD1_PORT_2379_TCP_PORT=2379",
➜ kubernetes git:(local-up-conformance) ✗ sudo docker inspect 16c5d46a36c8 | grep ETCD2
"ETCD2_SERVICE_HOST=10.0.0.92",
"ETCD2_SERVICE_PORT_CLIENT=2379",
"ETCD2_PORT=tcp://10.0.0.92:2379",
"ETCD2_PORT_2379_TCP_PROTO=tcp",
"ETCD2_PORT_2379_TCP_ADDR=10.0.0.92",
"ETCD2_PORT_2379_TCP_PORT=2379",
"ETCD2_PORT_2380_TCP=tcp://10.0.0.92:2380",
"ETCD2_PORT_2380_TCP_ADDR=10.0.0.92",
"ETCD2_SERVICE_PORT_SERVER=2380",
"ETCD2_PORT_2379_TCP=tcp://10.0.0.92:2379",
"ETCD2_PORT_2380_TCP_PORT=2380",
"ETCD2_PORT_2380_TCP_PROTO=tcp",
"ETCD2_SERVICE_PORT=2379",
Getting individual endpoints data
One of the most interesting things you can do with a local etcd Kubernetes deployment is watch how it scales locally. To do this, you can pull metrics endpoints independently, i.e.
curl 172.17.0.3:2379/metrics
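For example, to pull one interesting family of metrics from every member (IPs taken from the endpoints listing above):

```shell
# Grab the proposal metrics from each etcd member's /metrics endpoint.
for ip in 172.17.0.3 172.17.0.4 172.17.0.5; do
  echo "== $ip =="
  curl -s "http://$ip:2379/metrics" | grep '^etcd_server_proposals'
done
```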
WARNING
Depending on the Fedora/CentOS version you are using, if you try to reach the endpoints and get something like "404 not found", you may need to restart networking on your system (https://bugzilla.redhat.com/show_bug.cgi?id=1183973).
Measuring etcd using /metrics
So now that we have a 'real' cluster running, let's start measuring stuff.
First, fire up Prometheus:
Create a configuration file that scrapes your etcd /metrics endpoints:
➜ docker-locust git:(master) ✗ cat /home/jayunit100/work/prometheus/conf.yml
# my global config
# Note that the targets == `cluster/kubectl.sh get endpoints`.
global:
  scrape_interval: 2s
  evaluation_interval: 10s
  # scrape_timeout is set to the global default (10s).
scrape_configs:
  - job_name: prometheus
    honor_labels: true
    # scrape_interval falls back to the configured global (2s).
    # scrape_timeout falls back to the global default (10s).
    # metrics_path defaults to '/metrics'.
    # scheme defaults to 'http'.
    static_configs:
      - targets: ['172.17.0.3:2379','172.17.0.4:2379','172.17.0.5:2379']
Now, mount that file to /etc/prometheus/prometheus.yml and start it:
sudo docker run -p 9090:9090 -v /home/jayunit100/work/prometheus/conf.yml:/etc/prometheus/prometheus.yml prom/prometheus
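You can sanity-check that Prometheus is actually scraping by querying the `up` series through its HTTP API (each etcd target should report a value of 1):

```shell
# Query the up series; one result per scrape target once scraping starts.
curl -s 'http://localhost:9090/api/v1/query?query=up'
```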
Side note: in GCE, you can open up your Prometheus port for your etcd cluster like so. Be aware that --source-ranges=0.0.0.0/0 actually exposes port 9090 to the whole world; narrow the source ranges if you only want access from your own machines:
gcloud compute firewall-rules create prometheus --allow tcp:9090 --source-tags="jayunit100-scale-perf-devserv-0" --source-ranges=0.0.0.0/0 --description="expose prometheus"
Now, do something interesting !
Then use etcddeath, a really simple suite of shell scripts that can increase load, break networking, and otherwise make etcd go crazy; in particular busycluster.sh and keyflood.sh.
This is super easy: just clone down etcddeath (https://github.com/jdumars/etcdeath/blob/master/stress/busycluster.sh), export a VICTIM, and get started.
VICTIM=172.17.0.5 ./keyflood.sh
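If you just want a quick stand-in for keyflood.sh (this is a hypothetical sketch, not the actual script), hammering the etcd v2 keys API with writes does the trick:

```shell
# Flood a single member with small writes via the etcd v2 HTTP API.
VICTIM=172.17.0.5
for i in $(seq 1 1000); do
  curl -s "http://$VICTIM:2379/v2/keys/flood$i" -XPUT -d value=bar$i > /dev/null
done
```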
There is also ./busycluster.sh, which doesn't destroy your disk I/O but simulates heavy load. Once you start stressing your cluster out, you'll see changes in snapshotting intervals and so on. This can be seen with "etcd_debugging_snap_save_marshalling_duration_seconds_bucket", which is a good metric of how long your snapshotting is taking. Over time, if this continues to increase unbounded, it can lead to pauses in etcd availability.
Results
Indeed, you can easily destroy a distributed etcd running in containers and learn how it behaves at scale, without a large cluster.
Etcd marshalling: higher histogram buckets increase once write strain exceeds disk I/O.
In another experiment (not visualized here), I also noticed that the etcd_server_proposals_pending metric was useful when the cluster was busy. Other metrics worth looking at are listed at https://github.com/coreos/etcd/blob/master/Documentation/metrics.md.
More hacking...
If etcd is growing, you can quickly grab all the keys using the v3 API. This has changed since v2, so it's worth posting here. Note that "ETCDCTL_API=3 etcdctl get / --prefix --keys-only" pulls *all* keys down with no need for recursion, because the keyspace is flattened (post-2.x). So, the two options we use here to get all keys are quite simple:
1) --prefix can be just about anything (i.e. /r, /e, /...). If you use / as the prefix, it will return everything in the entire database (again, due to the flat structure).
2) The --keys-only part is important here since the values are binary encoded, and you probably don't want all that muck in your terminal or in a plain-text file.
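Putting that together (the endpoint IP below is assumed from the endpoints listing earlier):

```shell
# Dump every key in the (flat) v3 keyspace.
ETCDCTL_API=3 etcdctl --endpoints=http://172.17.0.3:2379 get / --prefix --keys-only

# Or narrow to one prefix, e.g. everything Kubernetes stores under /registry.
ETCDCTL_API=3 etcdctl --endpoints=http://172.17.0.3:2379 get /registry --prefix --keys-only
```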
