
HADOOP INSTALL SERIES: 7. Starting Hadoop components


The start-dfs.sh command, as the name suggests, starts the components necessary for HDFS: the NameNode, which manages the filesystem, and a single DataNode, which holds the data. The SecondaryNameNode is an availability aid that we'll discuss in a later chapter.

After starting these components, we use the JDK's jps utility to see which Java processes are running and, as the output looks good, we then use Hadoop's dfs utility to list the root of the HDFS filesystem; the first session sketch below shows this stage.

After this, we use start-mapred.sh to start the MapReduce components, this time the JobTracker and a single TaskTracker, and then use jps again to verify the result, as shown in the second sketch.

There is also a combined start-all.sh file that we'll use at a later stage, but in the early days it's useful to do a two-stage start-up to more easily verify the cluster configuration.
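If everything is configured correctly, the first stage might look like the following session sketch, assuming a pseudo-distributed Hadoop 1.x install with the bin directory on the PATH; the process IDs shown are illustrative and will differ on your machine:

$ start-dfs.sh
$ jps
1680 NameNode
1792 DataNode
1904 SecondaryNameNode
1988 Jps
$ hadoop dfs -ls /

Seeing the NameNode, DataNode, and SecondaryNameNode in the jps output, and getting a directory listing back from dfs -ls, tells us that HDFS is up and answering requests.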

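The second stage follows the same pattern; again, the process IDs are illustrative:

$ start-mapred.sh
$ jps
1680 NameNode
1792 DataNode
1904 SecondaryNameNode
2076 JobTracker
2190 TaskTracker
2254 Jps

With all five daemons listed, the single-node cluster is fully started. The matching stop-mapred.sh and stop-dfs.sh scripts shut the components down again in the reverse order.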