31.3.12

A Pyrrhic methodology for cleaning out Hadoop in pseudo-distributed mode.

Every once in a while, Hadoop goes totally haywire when I play with it in pseudo-distributed mode.

Problems include:

1) Data not being replicated to the data nodes (i.e. you do a namenode format, and the data nodes are now out of sync).

2) No connection available (i.e. Hadoop keeps trying to connect to localhost:9000 and failing; see the config note after this list).

3) Permissions problems or other cryptic exceptions.
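(For reference: that localhost:9000 address isn't magic. In the standard pseudo-distributed setup it comes from fs.default.name in conf/core-site.xml; the snippet below is the stock value from the Hadoop 1.x quickstart docs, so yours may differ.)

    <configuration>
      <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
      </property>
    </configuration>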

The solution is simple: pseudo-distributed mode, by default, writes everything under /tmp (which aliases to /private/tmp on OS X; specifically, hadoop.tmp.dir defaults to /tmp/hadoop-${user.name}). Thus, to clean up your pseudo-distributed Hadoop DFS, you can simply:

1) Run stop-all.sh (or stop all the Hadoop daemons in some other manner).

2) Remove everything in /tmp (careful here: I'm assuming you don't have anything important in /tmp; if you do, just remove the things that look Hadoop-related, e.g. the /tmp/hadoop-* directories).

3) Run hadoop namenode -format: this formats the namenode, starting things over from scratch.

4) Fire up the rest of the "cluster" by running start-all.sh.
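Putting those four steps together, here's a minimal sketch of the whole reset as one shell script. The /tmp/hadoop-"$USER"* glob is my assumption; it presumes hadoop.tmp.dir is still at its stock default and that Hadoop's bin/ scripts are on your PATH:

    #!/bin/sh
    # Reset a pseudo-distributed Hadoop install from scratch.
    # Assumes hadoop.tmp.dir is at its default of /tmp/hadoop-${user.name}
    # and that stop-all.sh / start-all.sh are on the PATH.

    stop-all.sh                      # 1) stop all Hadoop daemons
    rm -rf /tmp/hadoop-"$USER"*      # 2) remove the Hadoop dirs under /tmp
    hadoop namenode -format          # 3) reformat the namenode
    start-all.sh                     # 4) bring the "cluster" back up

One nicety: because step 2 deletes the old name directory, the -format step won't stop to ask for the usual "Re-format filesystem?" confirmation.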
