In the previous post, we discussed HDFS and how Hadoop handles DataNode failure. In this post, we will discuss how Hadoop handles NameNode failure.
Hadoop 1.0 -> The NameNode is a single point of failure. Recovering from a failure is costly and time-consuming because the NameNode must replay the edit log accumulated since the last FS Image before it can serve requests again.
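Secondary NameNode -> A checkpointing helper that was already available in Hadoop 1.x (Hadoop 2.0 went further and added NameNode High Availability with a standby NameNode)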
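The Secondary NameNode is a separate node that periodically fetches the Hadoop file system snapshot (FS Image) and the edit logs from the NameNode and merges them into a new checkpoint. Each checkpoint shrinks the amount of edit log that has to be replayed at startup, which can significantly reduce recovery time. Upon NameNode failure, the Secondary NameNode does not take over automatically; an administrator can use its latest checkpoint to bring up a replacement NameNode, and any edits made after that checkpoint may be lost.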
However, the Secondary NameNode is not a backup for the NameNode and does not provide any redundancy or high availability for the Hadoop NameNode. It only reduces the downtime of the cluster.
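To make the checkpointing idea concrete, here is a minimal toy sketch in Python. It is not Hadoop code: the `NameNodeState` class, the `replay` and `checkpoint` functions, and the two-operation edit format are all hypothetical names made up for illustration. The only point is to show why merging the edit log into the FS Image leaves less to replay after a restart.

```python
from dataclasses import dataclass, field


@dataclass
class NameNodeState:
    """Toy model (hypothetical): the namespace is a dict of path -> metadata."""
    fsimage: dict = field(default_factory=dict)   # last persisted snapshot
    edit_log: list = field(default_factory=list)  # edits made since that snapshot


def apply_edit(namespace, edit):
    """Apply one (operation, path) edit to a namespace dict in place."""
    op, path = edit
    if op == "create":
        namespace[path] = {}
    elif op == "delete":
        namespace.pop(path, None)
    else:
        raise ValueError(f"unknown operation: {op}")


def replay(fsimage, edit_log):
    """Rebuild the namespace the way a restart would: snapshot plus every edit."""
    namespace = dict(fsimage)
    for edit in edit_log:
        apply_edit(namespace, edit)
    return namespace


def checkpoint(state):
    """Merge the edit log into a new snapshot and start a fresh, empty log."""
    merged = replay(state.fsimage, state.edit_log)
    return NameNodeState(fsimage=merged, edit_log=[])


# Three edits accumulate, then a checkpoint folds them into the snapshot.
state = NameNodeState()
state.edit_log += [("create", "/a"), ("create", "/b"), ("delete", "/a")]
state = checkpoint(state)

# One more edit arrives after the checkpoint.
state.edit_log.append(("create", "/c"))

# A restart now replays a single edit instead of four.
recovered = replay(state.fsimage, state.edit_log)
```

In this sketch, a restart right after the checkpoint only has to replay the one edit that arrived afterwards. If only the checkpoint survived a failure, the `/c` edit would be missing, which is why a checkpoint alone shortens recovery but is not a backup.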