Hadoop interview questions and answers - 2

10. What is the purpose of RecordReader in Hadoop?

In Hadoop, a Mapper only consumes (key, value) pairs. The RecordReader loads the data from its source, an InputSplit, and converts it into (key, value) pairs that the Mapper can read.
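As a rough illustration of that conversion, the sketch below mimics what the line-oriented reader behind TextInputFormat does: the key is the byte offset at which a line starts and the value is the line text. It deliberately avoids the Hadoop API so that it runs on its own; the names LineRecordSketch, Record and read are hypothetical and not part of Hadoop.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, dependency-free stand-in for a RecordReader (not the Hadoop class).
final class LineRecordSketch {

    // One (key, value) record: key = byte offset of the line start, value = line text.
    static final class Record {
        final long key;
        final String value;

        Record(long key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    // Turns raw input into one record per '\n'-separated line.
    static List<Record> read(String data) {
        List<Record> records = new ArrayList<>();
        byte[] bytes = data.getBytes(StandardCharsets.UTF_8);
        int start = 0; // byte offset where the current line begins
        for (int i = 0; i <= bytes.length; i++) {
            boolean atEnd = (i == bytes.length);
            if (atEnd || bytes[i] == '\n') {
                // Skip the empty remainder after a trailing newline.
                if (!(atEnd && i == start)) {
                    String line = new String(bytes, start, i - start, StandardCharsets.UTF_8);
                    records.add(new Record(start, line));
                }
                start = i + 1;
            }
        }
        return records;
    }
}
```

Calling LineRecordSketch.read("hello world\nfoo bar\n") yields two records, (0, "hello world") and (12, "foo bar"), which is the shape of input a map function receives for plain text files.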

11. Explain how the JobTracker schedules a task.

Each TaskTracker sends heartbeat messages to the JobTracker every few seconds to confirm that it is alive and functioning. Each heartbeat also tells the JobTracker how many slots the TaskTracker has available for new tasks.

When the JobTracker receives a request for MapReduce execution from a client, it contacts the NameNode to determine the location of the input data. It then picks the best TaskTracker nodes to execute the tasks, based on their proximity to the data and their available slots. It monitors the individual TaskTrackers and reports the overall status of the job back to the client.
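The selection step described above can be sketched as a small ranking function. This is a simplified model, not the JobTracker's actual code: it assumes a tracker on the node holding the data is preferred, then one on the same rack, then any other, and that trackers with no free slots are skipped. The names SchedulerSketch, Tracker, distance and pick are hypothetical.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical model of locality-aware task placement (not Hadoop's implementation).
final class SchedulerSketch {

    // What the scheduler knows about a TaskTracker from its latest heartbeat.
    static final class Tracker {
        final String host;
        final String rack;
        final int freeSlots;

        Tracker(String host, String rack, int freeSlots) {
            this.host = host;
            this.rack = rack;
            this.freeSlots = freeSlots;
        }
    }

    // Lower is better: 0 = data on the same node, 1 = same rack, 2 = elsewhere.
    static int distance(Tracker t, List<String> dataHosts, List<String> dataRacks) {
        if (dataHosts.contains(t.host)) return 0;
        if (dataRacks.contains(t.rack)) return 1;
        return 2;
    }

    // Picks the closest tracker that still has a free slot, if any.
    static Optional<Tracker> pick(List<Tracker> trackers,
                                  List<String> dataHosts,
                                  List<String> dataRacks) {
        Tracker best = null;
        int bestDistance = Integer.MAX_VALUE;
        for (Tracker t : trackers) {
            if (t.freeSlots <= 0) continue; // no capacity reported
            int d = distance(t, dataHosts, dataRacks);
            if (d < bestDistance) {
                best = t;
                bestDistance = d;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

For example, if the data lives on host n1 in rack r1 but n1 reports no free slots, pick falls back to another tracker in r1 before considering one in a different rack.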

A TaskTracker notifies the JobTracker when a task fails, and the JobTracker then decides what to do. It may even resubmit the failed task on another node.
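One way to picture the resubmission is a bounded retry loop that moves on to a different node after each failure. The sketch below assumes a fixed attempt limit and a caller-supplied function that reports whether the task succeeded on a given node; RetrySketch, runWithRetry and both parameters are hypothetical names, and the real JobTracker's failure handling is more involved.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of resubmitting a failed task on other nodes.
final class RetrySketch {

    // Tries the task on each candidate node in order, at most maxAttempts times.
    // Returns the node on which the task succeeded, or null if every attempt failed.
    static String runWithRetry(List<String> nodes,
                               Predicate<String> runTaskOn,
                               int maxAttempts) {
        int attempts = 0;
        for (String node : nodes) {
            if (attempts >= maxAttempts) break; // give up: attempt budget used
            attempts++;
            if (runTaskOn.test(node)) {
                return node; // task succeeded on this node
            }
            // Failure reported: fall through and try the next node.
        }
        return null;
    }
}
```

With candidate nodes n1, n2 and n3, a task that only succeeds on n3 completes there when four attempts are allowed, but is reported as failed when the limit is two.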