Problems include:
1) Data not being replicated to the datanodes (e.g. you reformat the namenode, and the datanodes are now out of sync with it).
2) No connection available (e.g. Hadoop keeps trying to connect to localhost:9000 and failing).
3) Permission errors or other cryptic exceptions.
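A quick way to confirm you are in this state, assuming the Hadoop bin directory and the JDK's jps tool are on your PATH, is to check which daemons are actually running and whether the DFS answers at all:

    jps                # NameNode, DataNode, etc. should show up in the list of Java processes
    hadoop fs -ls /    # fails with a connection error if the namenode at localhost:9000 is down

If the daemons are up but the DFS is still unusable, the reset below usually takes care of it.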
The solution is simple: pseudodistributed mode, by default, writes to /tmp (which aliases to /private/tmp on OS X). Thus, to clean up your pseudodistributed Hadoop DFS, you can simply do the following (a scripted version of these steps is sketched after the list):
1) Run stop-all.sh (or stop all Hadoop services in some other manner).
2) Remove everything in /tmp (careful here: I'm assuming you don't have anything important in /tmp; if you do, just remove everything that looks related to Hadoop).
3) Run hadoop namenode -format: this formats the namenode, starting things over from scratch.
4) Fire up the rest of the "cluster" by running "start-all.sh".
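If you end up doing this regularly, the four steps are easy to script. Here is a minimal sketch, assuming stop-all.sh, start-all.sh, and hadoop are on your PATH, and that nothing you care about lives under the Hadoop directories in /tmp:

    #!/bin/sh
    # Reset a pseudodistributed Hadoop DFS that lives under /tmp.
    stop-all.sh                # 1) stop all Hadoop daemons
    rm -rf /tmp/hadoop-*       # 2) remove the Hadoop working/data directories in /tmp
    hadoop namenode -format    # 3) format the namenode (may prompt for confirmation)
    start-all.sh               # 4) bring the "cluster" back up

After that, the DFS should come up empty and in sync, and you can recreate your directories and re-put your data.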