No. Even though the data is replicated across three nodes, a MapReduce computation is performed on only one copy of the data, not on all three. The master node knows exactly which node holds each block, so the calculation is scheduled on the node with the original copy. Only if that node stops responding, and is therefore assumed to have failed, is the required calculation re-run on a node holding the second replica.
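The scheduling behaviour described above can be sketched as follows. This is an illustrative simulation, not the real Hadoop API: the names `BLOCK_LOCATIONS`, `is_alive`, and `schedule_task` are hypothetical stand-ins for the master's block map, heartbeat check, and task scheduler.

```python
# Hypothetical sketch: the master knows which nodes hold each block,
# runs the task on the first live replica, and falls back to another
# replica only when a node is considered failed.

BLOCK_LOCATIONS = {
    "block_0001": ["node-a", "node-b", "node-c"],  # 3 replicas of one block
}

def is_alive(node, down_nodes):
    """Stand-in for the heartbeat check a real master performs."""
    return node not in down_nodes

def schedule_task(block_id, down_nodes=()):
    """Return the node the computation runs on: one replica, not all three."""
    for node in BLOCK_LOCATIONS[block_id]:
        if is_alive(node, down_nodes):
            return node
    raise RuntimeError(f"all replicas of {block_id} are unavailable")

# Normal case: the task runs on the node holding the first copy.
print(schedule_task("block_0001"))                         # node-a
# Failure case: node-a is down, so the second replica is used.
print(schedule_task("block_0001", down_nodes={"node-a"}))  # node-b
```

The point of the sketch is that replication exists for fault tolerance, not for parallel duplication of the same calculation: exactly one node runs the task at a time.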
Since the data is replicated thrice in HDFS, does it mean that any calculation done on one node will also be replicated on the other two? | Hadoop Questions
Comment: I think the answer is wrong. Correct me if I am wrong.
Comment: Yes, it is wrong; it should be replicated on all three, no matter whether you are doing the calculation on one node or another.
Comment: @nitish: It is not like that; replication is for fault tolerance. The most important thing to understand here is that the HDFS file system follows the WORM (Write Once, Read Many) rule, which means that once a file is written to HDFS it cannot be edited. When you process a file in HDFS you never change the original data; you only extract meaning from it.
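The WORM behaviour mentioned in the comments can be illustrated with a minimal sketch. The `WormStore` class below is purely hypothetical, not the real HDFS client API: it just models a store where a path can be written once, read many times, and never overwritten.

```python
# Minimal sketch of WORM (Write Once, Read Many) semantics: reads are
# unlimited, but a second write to the same path is rejected.

class WormStore:
    def __init__(self):
        self._files = {}

    def write(self, path, data):
        if path in self._files:
            # Mirrors HDFS rejecting in-place edits of an existing file.
            raise PermissionError(f"{path}: files are write-once")
        self._files[path] = data

    def read(self, path):
        return self._files[path]

store = WormStore()
store.write("/data/input.txt", "raw records")
print(store.read("/data/input.txt"))   # reading many times is fine
try:
    store.write("/data/input.txt", "edited records")  # editing is not allowed
except PermissionError as e:
    print(e)
```

This matches the point made above: processing a file in HDFS extracts information from it; it never modifies the original data.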