How can a distributed filesystem such as HDFS provide opportunities for optimization of a MapReduce operation?

A. Data represented in a distributed filesystem is already sorted.
B. Distributed filesystems must always be resident in memory, which is much faster than disk.
C. Data storage and processing can be co-located on the same node, so that most of the input data relevant to a Map or Reduce task will be present on local disk or in cache.
D. A distributed filesystem makes random access faster because of the presence of a dedicated node serving file metadata.

Ans: C

Data locality is the key optimization: the MapReduce scheduler tries to run each Map task on a node (or at least a rack) that already holds the corresponding HDFS block, so input is read from local disk rather than over the network. Option D is incorrect because the NameNode serves only file metadata; HDFS is optimized for large streaming reads, not fast random access.
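
As a minimal sketch of how this locality information is surfaced, the snippet below uses the Hadoop FileSystem API to ask the NameNode which hosts store each block of a file; the MapReduce framework consults the same information when placing Map tasks. The input path here is hypothetical, and the program assumes a Hadoop configuration (fs.defaultFS) pointing at an HDFS cluster.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocality {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Hypothetical input file; replace with a path on your cluster.
        Path input = new Path("/data/input/logs.txt");
        FileStatus status = fs.getFileStatus(input);

        // Ask the NameNode which DataNodes hold each block of the file.
        BlockLocation[] blocks =
            fs.getFileBlockLocations(status, 0, status.getLen());

        for (BlockLocation block : blocks) {
            // The scheduler prefers launching a Map task on one of these
            // hosts, so the task reads its input split from local disk.
            System.out.printf("offset=%d length=%d hosts=%s%n",
                block.getOffset(), block.getLength(),
                String.join(",", block.getHosts()));
        }
        fs.close();
    }
}

Each BlockLocation maps a byte range of the file to the set of DataNodes holding a replica, which is exactly what makes co-locating computation with storage possible.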
