Running sample examples on YARN

Monday, 4 January 2016

Running the bundled sample MapReduce programs on YARN is straightforward. The Hadoop distribution ships with a set of basic MapReduce examples.

You can find them in $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-<HADOOP_VERSION>.jar.

The location of the file may differ depending on your Hadoop
installation folder structure.

Let’s store this directory in a YARN_EXAMPLES environment variable:
$ export YARN_EXAMPLES=$HADOOP_HOME/share/hadoop/mapreduce
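
Because the jar name embeds the Hadoop version, it can be handy to look the file up instead of typing the version by hand. The following is a minimal sketch, not part of Hadoop itself: find_examples_jar is a hypothetical helper name, and it assumes the jar sits directly in the directory passed as its first argument (defaulting to YARN_EXAMPLES).

```shell
# Hypothetical helper: print the path of the first examples jar found
# in the given directory (defaults to $YARN_EXAMPLES).
find_examples_jar() {
  examples_dir="${1:-$YARN_EXAMPLES}"
  for candidate in "$examples_dir"/hadoop-mapreduce-examples-*.jar; do
    # If the glob matches nothing it stays literal, so check for a real file.
    if [ -f "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  echo "No examples jar found in $examples_dir" >&2
  return 1
}
```

With this in place, something like EXAMPLES_JAR=$(find_examples_jar) would capture the path for later yarn jar commands, provided your installation follows the layout described above.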

Now the YARN_EXAMPLES environment variable points to the directory that holds the examples jar. To list all the available examples, run the jar without naming a program:

$ yarn jar $YARN_EXAMPLES/hadoop-mapreduce-examples-2.4.0.2.1.1.0-385.jar

An example program must be given as the first argument.

The valid program names are as follows:

aggregatewordcount : This is an aggregate-based map/reduce program that counts the words in the input files
aggregatewordhist : This is an aggregate-based map/reduce program that computes the histogram of the words in the input files
bbp : This is a map/reduce program that uses Bailey-Borwein-Plouffe to compute the exact digits of Pi
dbcount : This is an example job that counts the page view counts from a database
distbbp : This is a map/reduce program that uses a BBP-type formula to compute the exact bits of Pi
grep : This is a map/reduce program that counts the matches of a regex in the input
join : This is a job that effects a join over sorted, equally-partitioned datasets
multifilewc : This is a job that counts words from several files
pentomino : This is a map/reduce tile-laying program that finds solutions to pentomino problems
pi : This is a map/reduce program that estimates Pi using a quasi-Monte Carlo method
randomtextwriter : This is a map/reduce program that writes 10 GB of random textual data per node
randomwriter : This is a map/reduce program that writes 10 GB of random data per node
secondarysort : This is an example that defines a secondary sort to the reduce
sort : This is a map/reduce program that sorts the data written by the random writer
sudoku : This is a sudoku solver
teragen : This generates data for the terasort
terasort : This runs the terasort
teravalidate : This checks the results of terasort
wordcount : This is a map/reduce program that counts the words in the input files
wordmean : This is a map/reduce program that computes the average length of the words in the input files
wordmedian : This is a map/reduce program that computes the median length of the words in the input files
wordstandarddeviation : This is a map/reduce program that computes the standard deviation of the length of the words in the input files

These are the sample programs that ship with the Hadoop distribution by default.
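
To run one of the programs, pass its name and arguments after the jar path. The sketch below wraps that in a small function; run_example is a hypothetical name, the jar version is the one used in the listing command above, and the HDFS paths in the usage comments are placeholders you would replace with your own.

```shell
# Hypothetical wrapper: forwards a program name and its arguments
# to the examples jar. Adjust the version to match your installation.
run_example() {
  yarn jar "$YARN_EXAMPLES/hadoop-mapreduce-examples-2.4.0.2.1.1.0-385.jar" "$@"
}

# Usage (these need a running cluster, so they are left commented out):
#   run_example pi 16 1000
#     estimates Pi with 16 map tasks and 1000 samples per map
#   run_example wordcount /user/demo/input /user/demo/output
#     counts words; both HDFS paths are placeholders
```

For wordcount, the output directory should not already exist; the job refuses to overwrite it.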
