I've been a big advocate of automating everything for a while now... But a recent project I was hacking on changed my mind.
The project is https://github.com/jayunit100/SparkStreamingCassandraDemo.
I initially set this project up as a POC/snippet of ETL from Spark into Cassandra. You know, one of those one-day hacks where you pull code in from a few blog posts and then make sure the integration glue all works.
what next?
After getting it working I thought, "Hey, this is great! Let me set up a mini Jenkins server and a Vagrantfile". So I spent a few hours on that. Then Jenkins went down. Then I found out that the Vagrantfile I had created depended on a somewhat unmaintained CentOS 7 box of my own in Vagrant Cloud. So the project very quickly became much more than a POC, and was now a maintenance nightmare: I was essentially building and maintaining my own immutable and reproducible big data stack.
it gets worse.
Later on, after I had my Vagrantfile working, I decided I wanted to containerize everything. So the project was going to have to change again: I would have to update the Vagrantfile, update the README to tell people to install Vagrant 1.6 and so on, and also get the Docker containers tested and working in a reproducible way.
a better approach.
When I'm hacking on a new project, I'm not too worried about reproducibility. I can trust myself that, if the project does end up needing reproducible deployments, I can string together a good Dockerfile/Vagrantfile after first just getting it working in an ad hoc sort of way.
so when to dockerize/vagrantize?
I still think there is value in doing this ridiculously early. Maybe wait one week of hacking on a project before you automate everything.
But if you're not sure, I say still err on the side of automation.
I never *really* regret automation that much. In general, I'm going to come full circle and say that in the long run there probably isn't such a thing as overautomation, but sometimes, in the short term, it sure can *seem to be* a distraction from the core work you're doing.
But *seem to be* != *is* :)