Monday, 4 January 2016

Running a sample Pi example

To run any application on top of YARN, use the following command syntax:
$ yarn jar <application_jar.jar> <arg0> <arg1>

To run the sample Pi example, which estimates the value of Pi with 16 maps and 10,000 samples per map, use the following command:
$ yarn jar $YARN_EXAMPLES/hadoop-mapreduce-examples-2.4.0.2.1.1.0-385.jar pi 16 10000

Note that we are using hadoop-mapreduce-examples-2.4.0.2.1.1.0-385.jar here. The JAR version may change depending on your installed Hadoop distribution.

Once you run the preceding command, you will see the logs generated by the application on the console, using the default logger configuration.

The default log level is INFO; you can change it by overriding the default logger settings, that is, by updating hadoop.root.logger=WARN,console in conf/log4j.properties.
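For reference, here is a minimal sketch of what that change looks like inside conf/log4j.properties (only the relevant line is shown; the rest of the file stays as shipped):

# Shipped default is hadoop.root.logger=INFO,console
# Lower the console verbosity to warnings only:
hadoop.root.logger=WARN,console

With the default INFO level, the Pi job produces console output similar to the following: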

Number of Maps  = 16
Samples per Map = 10000
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Wrote input for Map #5
Wrote input for Map #6
Wrote input for Map #7
Wrote input for Map #8
Wrote input for Map #9
Wrote input for Map #10
Wrote input for Map #11
Wrote input for Map #12
Wrote input for Map #13
Wrote input for Map #14
Wrote input for Map #15

Starting Job
11/09/14 21:12:02 INFO mapreduce.Job: map 0% reduce 0% 
11/09/14 21:12:09 INFO mapreduce.Job: map 25% reduce 0% 
11/09/14 21:12:11 INFO mapreduce.Job: map 56% reduce 0% 
11/09/14 21:12:12 INFO mapreduce.Job: map 100% reduce 0% 
11/09/14 21:12:12 INFO mapreduce.Job: map 100% reduce 100% 
11/09/14 21:12:12 INFO mapreduce.Job: Job job_1381790835497_0003 completed successfully 
11/09/14 21:12:19 INFO mapreduce.Job: Counters: 44        

        File System Counters
                FILE: Number of bytes read=358
                FILE: Number of bytes written=1365080
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=4214
                HDFS: Number of bytes written=215
                HDFS: Number of read operations=67
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=3
        Job Counters
                Launched map tasks=16
                Launched reduce tasks=1
                Data-local map tasks=14
                Rack-local map tasks=2
                Total time spent by all maps in occupied slots  (ms)=184421
                Total time spent by all reduces in occupied slots (ms)=8542
        Map-Reduce Framework
                Map input records=16
                Map output records=32
                Map output bytes=288
                Map output materialized bytes=448
                Input split bytes=2326
                Combine input records=0
                Combine output records=0
                Reduce input groups=2
                Reduce shuffle bytes=448
                Reduce input records=32
                Reduce output records=0
                Spilled Records=64
                Shuffled Maps =16
                Failed Shuffles=0
                Merged Map outputs=16
                GC time elapsed (ms)=195 
                CPU time spent (ms)=7740
                Physical memory (bytes) snapshot=6143396896
                Virtual memory (bytes) snapshot=23142254400
                Total committed heap usage (bytes)=43340769024
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0 
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=1848
        File Output Format Counters
                Bytes Written=98
Job Finished in 23.144 seconds 

Estimated value of Pi is 3.14127500000000000000

You can compare this example running over Hadoop 1.x with the same example running over YARN. The logs look almost identical, but you can clearly see the difference in performance. YARN is backward compatible with MapReduce 1.x, so existing MapReduce applications run on it without any code changes.

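Once the job finishes, you can also inspect it from the YARN side with the yarn command-line tool. This is just a quick sketch: the application ID below is derived from the job ID in the preceding log (job_1381790835497_0003), and the logs command assumes that log aggregation is enabled on your cluster:

$ yarn application -list -appStates FINISHED
$ yarn logs -applicationId application_1381790835497_0003
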
Running sample examples on YARN

Running the available sample MapReduce programs is a simple task with YARN. The Hadoop distribution ships with some basic MapReduce examples. 

You can find them inside $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-<HADOOP_VERSION>.jar.

The location of the file may differ depending on your Hadoop installation folder structure.

Let’s include this in the YARN_EXAMPLES path:
$ export YARN_EXAMPLES=$HADOOP_HOME/share/hadoop/mapreduce
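As a quick sanity check, you can confirm that the variable points at the examples JAR by listing it (the filename pattern matches the JAR mentioned earlier):

$ ls $YARN_EXAMPLES/hadoop-mapreduce-examples-*.jar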

Now, the YARN_EXAMPLES environment variable points to the location of all the sample examples, and you can reference them through it. To list all the available examples, type the following command on the console:

$ yarn jar $YARN_EXAMPLES/hadoop-mapreduce-examples-2.4.0.2.1.1.0-385.jar

An example program must be given as the first argument.

The valid program names are as follows:

  • aggregatewordcount : This is an aggregate-based map/reduce program that counts the words in the input files
  • aggregatewordhist : This is an aggregate-based map/reduce program that computes the histogram of the words in the input files
  • bbp : This is a map/reduce program that uses Bailey-Borwein-Plouffe to compute the exact digits of Pi
  • dbcount : This is an example job that counts the page view counts from a database
  • distbbp : This is a map/reduce program that uses a BBP-type formula to compute the exact bits of Pi
  • grep : This is a map/reduce program that counts the matches of a regex in the input
  • join : This is a job that effects a join over sorted, equally partitioned datasets
  • multifilewc : This is a job that counts words from several files
  • pentomino : This is a map/reduce tile-laying program that finds solutions to pentomino problems
  • pi : This is a map/reduce program that estimates Pi using a quasi-Monte Carlo method
  • randomtextwriter : This is a map/reduce program that writes 10 GB of random textual data per node
  • randomwriter : This is a map/reduce program that writes 10 GB of random data per node
  • secondarysort : This is an example that defines a secondary sort to the reduce
  • sort : This is a map/reduce program that sorts the data written by the random writer
  • sudoku : This is a sudoku solver
  • teragen : This generates data for the terasort
  • terasort : This runs the terasort
  • teravalidate : This checks the results of terasort
  • wordcount : This is a map/reduce program that counts the words in the input files
  • wordmean : This is a map/reduce program that counts the average length of the words in the input files
  • wordmedian : This is a map/reduce program that counts the median length of the words in the input files
  • wordstandarddeviation : This is a map/reduce program that counts the standard deviation of the length of the words in the input files


These are the sample examples that ship with the Hadoop distribution by default. 
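
For instance, to try the wordcount program from the preceding list, you can run it against a small input directory in HDFS. The following is a rough sketch: the paths /user/<username>/input and /user/<username>/output are placeholders for illustration, the Hadoop configuration XML files are used only as convenient sample input, and the output directory must not exist before the job runs:

$ hdfs dfs -mkdir -p /user/<username>/input
$ hdfs dfs -put $HADOOP_HOME/etc/hadoop/*.xml /user/<username>/input
$ yarn jar $YARN_EXAMPLES/hadoop-mapreduce-examples-2.4.0.2.1.1.0-385.jar wordcount /user/<username>/input /user/<username>/output
$ hdfs dfs -cat /user/<username>/output/part-r-00000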

