September 2013

Sunday 8 September 2013

Hadoop MapReduce Interview Questions

What is MapReduce?

It is a framework or a programming model that is used for processing large data sets over clusters of computers using distributed programming.

What are 'maps' and 'reduces'?

'Maps' and 'Reduces' are the two phases of processing a MapReduce job over data stored in HDFS. The 'Map' phase is responsible for reading data from the input location and, based on the input type, generating key-value pairs, that is, an intermediate output on the local machine. The 'Reducer' is responsible for processing the intermediate output received from the mapper and generating the final output.
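
To make the two phases concrete, here is a minimal in-memory sketch in plain Java. It does not use the Hadoop API; the class and method names (WordCountSketch, map, reduce, run) are hypothetical and only imitate the shape of a word-count job on a single machine.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java imitation of the map and reduce phases; this is not Hadoop code.
class WordCountSketch {

    // Map phase: emit an intermediate (word, 1) pair for every word in a line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.trim().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    // Reduce phase: sum all the values that were grouped under one key.
    static int reduce(String key, List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    // Driver: map every line, group the values by key, then reduce each group.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints {big=2, cluster=1, data=1}
        System.out.println(run(Arrays.asList("big data", "big cluster")));
    }
}
```

In a real cluster the grouping step in run() is what the framework does between the mappers and the reducers.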

What are the four basic parameters of a mapper?

In the typical word-count example, the four basic parameters of a mapper are LongWritable, Text, Text and IntWritable. The first two represent the input key and value types and the second two represent the intermediate output key and value types.

What are the four basic parameters of a reducer?

In the same example, the four basic parameters of a reducer are Text, IntWritable, Text and IntWritable. The first two represent the intermediate output key and value types and the second two represent the final output key and value types.

What do the master class and the output class do?

Master is defined to update the Master or the job tracker and the output class is defined to write data onto the output location.

What is the input type/format in MapReduce by default?

By default, the input type in MapReduce is 'text'.

Is it mandatory to set input and output type/format in MapReduce?

No, it is not mandatory to set the input and output type/format in MapReduce. By default, the cluster takes the input and the output type as 'text'.

What does the text input format do?

In text input format, each line is identified by its line offset, that is, the byte offset at which the line starts in the file. The key is the line offset and the value is the whole line of text. This is how the data gets processed by a mapper. The mapper will receive the 'key' as a 'LongWritable' parameter and the value as a 'Text' parameter.

What does job conf class do?

MapReduce needs to logically separate different jobs running on the same cluster. 'Job conf class' helps to do job level settings such as declaring a job in real environment. It is recommended that Job name should be descriptive and represent the type of job that is being executed.

What does conf.setMapperClass do?

conf.setMapperClass sets the mapper class and everything related to the map job, such as reading the data and generating key-value pairs out of the mapper.

What do sorting and shuffling do?

Sorting and shuffling are responsible for producing, for each unique key, a list of its values. Bringing identical keys together at one location is known as sorting. The process by which the sorted intermediate output of the mapper is sent across to the reducers is known as shuffling.

What does a split do?

Before the data is transferred from the hard disk location to the map method, there is a phase or method called the 'Split Method'. The split method pulls a block of data from HDFS to the framework. The Split class does not write anything; it reads data from the block and passes it to the mapper. By default, the split is taken care of by the framework. By default the split size is equal to the block size, and the split method is used to divide a block into a bunch of splits.

How can we change the split size if our commodity hardware has less storage space?

If our commodity hardware has less storage space, we can change the split size by writing the 'custom splitter'. There is a feature of customization in Hadoop which can be called from the main method.

What does a MapReduce partitioner do?

A MapReduce partitioner makes sure that all the values of a single key go to the same reducer, which also allows the map output to be distributed evenly over the reducers. It redirects the mapper output to the reducer by determining which reducer is responsible for a particular key.
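
The idea of "same key, same reducer" can be sketched with simple hash partitioning in plain Java. This is only an illustration of the technique, not the Hadoop Partitioner API; the class name HashPartitionSketch and the method partitionFor are hypothetical.

```java
// Plain-Java sketch of hash partitioning; this is not the Hadoop Partitioner API.
class HashPartitionSketch {

    // Map a key to a reducer index in the range [0, numReducers).
    // Masking with Integer.MAX_VALUE clears the sign bit, so the result is never negative.
    static int partitionFor(String key, int numReducers) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    public static void main(String[] args) {
        // The two "apple" records are always assigned to the same reducer.
        for (String key : new String[] {"apple", "banana", "apple"}) {
            System.out.println(key + " -> reducer " + partitionFor(key, 3));
        }
    }
}
```

Because the index depends only on the key, every value of that key lands on one reducer; how evenly the keys spread depends on the hash function and on the key distribution.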

How is Hadoop different from other data processing tools?

In Hadoop, based upon your requirements, you can increase or decrease the number of mappers without bothering about the volume of data to be processed. This is the beauty of parallel processing, in contrast to the other data processing tools available.

Can we rename the output file?

Yes, we can rename the output file by implementing a multiple output format class such as MultipleOutputFormat.

Why can't we do aggregation (addition) in a mapper? Why do we require a reducer for that?

We cannot do the aggregation (addition) in a mapper because sorting is not done in a mapper; the keys are sorted and grouped only on the way to the reducer. Each mapper works on its own input split and the map method is called once for each row, so while aggregating we would not keep track of the values from previous rows, and no single mapper sees all the values of a key.

What is Streaming?

Streaming is a feature with Hadoop framework that allows us to do programming using MapReduce in any programming language which can accept standard input and can produce standard output. It could be Perl, Python, Ruby and not necessarily be Java. However, customization in MapReduce can only be done using Java and not any other programming language.

What is a Combiner?

A 'Combiner' is a mini reducer that performs the local reduce task. It receives the input from the mapper on a particular node and sends the output to the reducer. Combiners help in enhancing the efficiency of MapReduce by reducing the quantum of data that is required to be sent to the reducers.
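
A rough plain-Java sketch of what a combiner achieves for a word count is shown here. It is not Hadoop code; CombinerSketch and combine are hypothetical names, and the input list stands in for the keys emitted by one mapper.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java sketch of local pre-aggregation; this is not the Hadoop combiner API.
class CombinerSketch {

    // Local reduce: collapse the words emitted by one mapper into partial counts,
    // so that one record per distinct word is sent on instead of one per occurrence.
    static Map<String, Integer> combine(List<String> mapperOutputKeys) {
        Map<String, Integer> partial = new LinkedHashMap<>();
        for (String key : mapperOutputKeys) {
            partial.merge(key, 1, Integer::sum);
        }
        return partial;
    }

    public static void main(String[] args) {
        List<String> emitted = Arrays.asList("a", "b", "a", "a");
        Map<String, Integer> combined = combine(emitted);
        // Prints: 4 records before, 2 after: {a=3, b=1}
        System.out.println(emitted.size() + " records before, "
                + combined.size() + " after: " + combined);
    }
}
```

This works for addition because partial sums can be added again by the reducer; an operation without that property would need a different combiner or none at all.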

What is the difference between an HDFS Block and Input Split?

HDFS Block is the physical division of the data and Input Split is the logical division of the data.

What happens in a TextInputFormat?

In TextInputFormat, each line in the text file is a record. Key is the byte offset of the line and value is the content of the line.
For instance, Key: LongWritable, value: Text.
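
The offset-as-key idea can be sketched in plain Java as follows. This is not the Hadoop record reader; LineOffsetSketch and records are hypothetical names, and the sketch assumes UTF-8 text with '\n' line endings.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-Java sketch of (byte offset, line) records; this is not Hadoop code.
class LineOffsetSketch {

    // Return (byte offset of the line start -> line text) for '\n'-separated text.
    static Map<Long, String> records(String content) {
        Map<Long, String> out = new LinkedHashMap<>();
        long offset = 0;
        for (String line : content.split("\n")) {
            out.put(offset, line);
            // Advance by the encoded length of the line plus one byte for '\n'.
            offset += line.getBytes(StandardCharsets.UTF_8).length + 1;
        }
        return out;
    }

    public static void main(String[] args) {
        // Prints {0=hello, 6=big data, 15=end}
        System.out.println(records("hello\nbig data\nend\n"));
    }
}
```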

What do you know about KeyValueTextInputFormat?

In KeyValueTextInputFormat, each line in the text file is a 'record'. The first separator character divides each line. Everything before the separator is the key and everything after the separator is the value.
For instance, Key: Text, value: Text.
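
Splitting at the first separator can be sketched like this in plain Java. The names KeyValueLineSketch and split are hypothetical, the tab separator is an assumption of the example, and treating a line without a separator as "key only, empty value" is also an assumption of this sketch.

```java
// Plain-Java sketch of splitting a line into key and value; this is not Hadoop code.
class KeyValueLineSketch {

    // Split a line at the FIRST occurrence of the separator.
    // Assumption: if the separator is absent, the whole line is the key and the value is empty.
    static String[] split(String line, char separator) {
        int pos = line.indexOf(separator);
        if (pos < 0) {
            return new String[] {line, ""};
        }
        return new String[] {line.substring(0, pos), line.substring(pos + 1)};
    }

    public static void main(String[] args) {
        String[] kv = split("user42\tclicked\thome", '\t');
        // Later separators stay inside the value.
        System.out.println("key=" + kv[0] + ", value=" + kv[1]);
    }
}
```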

What do you know about SequenceFileInputFormat?

SequenceFileInputFormat is an input format for reading in sequence files. Key and value are user defined. It is a specific compressed binary file format which is optimized for passing the data between the output of one MapReduce job to the input of some other MapReduce job.

What do you know about NLineInputFormat?

NLineInputFormat treats every 'n' lines of input as one split.

Saturday 7 September 2013

Java Interview Questions

What is the most important feature of Java?
Java is a platform independent language.

What do you mean by platform independence?
Platform independence means that we can write and compile the Java code on one platform (e.g. Windows) and execute the class on any other supported platform (e.g. Linux, Solaris, etc.).

What is a JVM?
JVM is Java Virtual Machine which is a run time environment for the compiled java class files.

Are JVMs platform independent?
JVMs are not platform independent. A JVM is a platform-specific runtime implementation provided by the vendor.

What is the difference between a JDK and a JVM?
JDK is Java Development Kit which is for development purpose and it includes execution environment also. But JVM is purely a run time environment and hence you will not be able to compile your source files using a JVM.

What is a pointer and does Java support pointers?
Pointer is a reference handle to a memory location. Improper handling of pointers leads to memory leaks and reliability issues hence Java doesn't support the usage of pointers.

What is the base class of all classes?
java.lang.Object

Does Java support multiple inheritance?
Java doesn't support multiple inheritance of classes. A class can, however, implement any number of interfaces.

Is Java a pure object oriented language?
Java uses primitive data types and hence is not a pure object oriented language.

Are arrays primitive data types?
In Java, Arrays are objects.

What is difference between Path and Classpath?
Path and Classpath are operating system level environment variables. Path is used to define where the system can find executable (.exe) files and Classpath is used to specify the location of .class files.

What are local variables?
Local variables are those which are declared within a block of code, such as a method. Local variables should be initialised before accessing them.

What are instance variables?
Instance variables are those which are defined at the class level. Instance variables need not be initialized before using them as they are automatically initialized to their default values.

How to define a constant variable in Java?
The variable should be declared as static and final. So only one copy of the variable exists for all instances of the class and the value can't be changed also.
static final double PI = 3.14; is an example of a constant.

Should a main() method be compulsorily declared in all java classes?
No not required. main() method should be defined only if the source class is a java application.

What is the return type of the main() method?
The main() method doesn't return anything, hence it is declared void.

Why is the main() method declared static?
main() method is called by the JVM even before the instantiation of the class hence it is declared as static.

What is the argument of the main() method?
The main() method accepts an array of String objects as its argument.

Can a main() method be overloaded?
Yes. You can have any number of main() methods with different method signature and implementation in the class.
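
A small sketch of an overloaded main() follows. The class name MainOverloadSketch and the field lastCalled are hypothetical and exist only so the call order can be observed.

```java
// Sketch: main() can be overloaded like any other static method.
class MainOverloadSketch {
    // Remembers which overload ran last, purely for illustration.
    static String lastCalled = "";

    // The entry point: only this signature is used by the JVM to start the program.
    public static void main(String[] args) {
        lastCalled = "main(String[])";
        main(42); // calling an overload is an ordinary method call
        System.out.println(lastCalled);
    }

    // An overloaded main with a different parameter list; the JVM never calls it directly.
    public static void main(int number) {
        lastCalled = "main(int) with " + number;
    }
}
```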

Can a main() method be declared final?
Yes. An inheriting class will then not be able to declare its own main(String[]) method.

Does the order of public and static declaration matter in main() method?
No. It doesn't matter but void should always come before main().

Can a source file contain more than one class declaration?
Yes, a single source file can contain any number of class declarations, but only one of the classes can be declared as public.

What is a package?
A package is a collection of related classes and interfaces. The package declaration should be the first statement in a Java source file.

Which package is imported by default?
The java.lang package is imported by default, even without an import statement.

Can a class declared as private be accessed outside its package?
Not possible.

Can a class be declared as protected?
A top-level class can't be declared as protected. Only class members, such as methods, fields and nested classes, can be declared as protected.

What is the access scope of a protected method?
A protected method can be accessed by the classes within the same package or by the subclasses of the class in any package.

What is the purpose of declaring a variable as final?
A final variable's value can't be changed. final variables should be initialized before using them.

What is the impact of declaring a method as final?
A method declared as final can't be overridden. A sub-class can't have the same method signature with a different implementation.

I don't want my class to be inherited by any other class. What should I do?
You should declare your class as final. But you can't define your class as final if it is an abstract class. A class declared as final can't be extended by any other class.

Can you give few examples of final classes defined in Java API?
java.lang.String, java.lang.Math are final classes.

How is final different from finally and finalize()?
final is a modifier which can be applied to a class or a method or a variable. final class can't be inherited, final method can't be overridden and final variable can't be changed. finally is an exception handling code section which gets executed whether an exception is raised or not by the try block code segment.
finalize() is a method of Object class which will be executed by the JVM just before garbage collecting object to give a final chance for resource releasing activity.

Can a class be declared as static?
We cannot declare a top-level class as static; only a nested class can be declared static.
public class Test
{
    // A static nested class needs no instance of the enclosing class.
    static class InnerClass
    {
        public static void InnerMethod()
        {
            System.out.println("Static Inner Class!");
        }
    }

    public static void main(String args[])
    {
        Test.InnerClass.InnerMethod();
    }
}
//output: Static Inner Class!

When will you define a method as static?
When a method needs to be accessed even before the creation of the object of the class then we should declare the method as static.

What are the restriction imposed on a static method or a static block of code?
A static method cannot refer to instance variables without creating an instance and cannot use the "this" keyword to refer to the instance.

I want to print "Hello" even before main() is executed. How will you achieve that?
Print the statement inside a static block of code. Static blocks get executed when the class gets loaded into the memory and even before the creation of an object. Hence it will be executed before the main() method. And it will be executed only once.
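
A minimal sketch of this answer follows. StaticBlockSketch and the LOG field are hypothetical names; LOG only records the order in which the pieces run.

```java
// Sketch: a static block runs when the class is initialized, before main() is entered.
class StaticBlockSketch {
    // Records the execution order so it can be inspected afterwards.
    static final StringBuilder LOG = new StringBuilder();

    // Executed once, during class initialization.
    static {
        LOG.append("Hello;");
        System.out.println("Hello");
    }

    public static void main(String[] args) {
        LOG.append("main;");
        System.out.println("main() started");
    }
}
```

Running the class prints "Hello" first and "main() started" second.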

What is the importance of static variable?
static variables are class level variables where all objects of the class refer to the same variable. If one object changes the value then the change gets reflected in all the objects.

Can we declare a static variable inside a method?
Static variables are class level variables and they can't be declared inside a method. If declared, the class will not compile.

What is an Abstract Class and what is its purpose?
A Class which doesn't provide complete implementation is defined as an abstract class. Abstract classes enforce abstraction.

Can an abstract class be declared final?
Not possible. An abstract class without being inherited is of no use and hence will result in compile time error.

What is the use of an abstract variable?
Variables can't be declared as abstract. only classes and methods can be declared as abstract.

Can you create an object of an abstract class?
Not possible. Abstract classes can't be instantiated.

Can an abstract class be defined without any abstract methods?
Yes it's possible. This is basically to avoid instance creation of the class.

Class C implements Interface I containing method m1 and m2 declarations. Class C has provided implementation for method m2. Can I create an object of Class C?
No not possible. Class C should provide implementation for all the methods in the Interface I. Since Class C didn't provide implementation for m1 method, it has to be declared as abstract. Abstract classes can't be instantiated.

Can a method inside an Interface be declared as final?
No, not possible. Doing so will result in a compilation error. public and abstract are the only applicable modifiers for a method declaration in an interface.

Can an Interface implement another Interface?
Interfaces don't provide implementation, hence an interface cannot implement another interface.

Can an Interface extend another Interface?
Yes an Interface can inherit another Interface, for that matter an Interface can extend more than one Interface.

Can a Class extend more than one Class?
Not possible. A Class can extend only one class but can implement any number of Interfaces.

Why is an Interface able to extend more than one Interface but a Class can't extend more than one Class?
Basically Java doesn't allow multiple inheritance, so a Class is restricted to extend only one Class. But an Interface is a pure abstraction model and doesn't have an inheritance hierarchy like classes (do remember that the base class of all classes is Object). So an Interface is allowed to extend more than one Interface.

Can an Interface be final?
Not possible. Doing so will result in a compilation error.

Can a class be defined inside an Interface?
Yes it's possible.

Can an Interface be defined inside a class?
Yes it's possible.

What is a Marker Interface?
An Interface which doesn't have any declaration inside but still enforces a mechanism.

Which object oriented Concept is achieved by using overloading and overriding?
Polymorphism.

Why does Java not support operator overloading?
Operator overloading makes the code very difficult to read and maintain. To maintain code simplicity, Java doesn't support operator overloading.

Can we define private and protected modifiers for variables in interfaces?
No.

What is Externalizable?
Externalizable is an Interface that extends the Serializable Interface. It lets a class control exactly what is written to and read from the stream, instead of relying on default serialization. It has two methods, writeExternal(ObjectOutput out) and readExternal(ObjectInput in).

What modifiers are allowed for methods in an Interface?
Only public and abstract modifiers are allowed for methods in interfaces (as of Java 7; Java 8 later added default and static interface methods).

What is a local, member and a class variable?
Variables declared within a method are "local" variables.
Variables declared within the class, i.e. not within any method, are "member" variables (global variables).
Variables declared within the class, i.e. not within any method, and defined as "static" are class variables.

What is an abstract method?
An abstract method is a method whose implementation is deferred to a subclass.

What value does read() return when it has reached the end of a file?
The read() method returns -1 when it has reached the end of a file.

Can a Byte object be cast to a double value?
No, an object cannot be cast to a primitive value.

What is the difference between a static and a non-static inner class?
An instance of a non-static inner class is always associated with an instance of the class's outer class. An instance of a static inner class is not associated with any instance of the outer class, so it can be created without one.

What is an object's lock and which objects have locks?
An object's lock is a mechanism that is used by multiple threads to obtain synchronized access to the object. A thread may execute a synchronized method of an object only after it has acquired the object's lock. All objects and classes have locks. A class's lock is acquired on the class's Class object.

What is the % operator?
It is referred to as the modulo or remainder operator. It returns the remainder of dividing the first operand by the second operand.
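
A short sketch of how % behaves, including with negative operands, follows. RemainderSketch and remainder are hypothetical names; Math.floorMod is shown only for contrast and requires Java 8 or later.

```java
// Sketch: behaviour of the % operator in Java.
class RemainderSketch {

    // Plain wrapper around % so the results can be checked.
    static int remainder(int a, int b) {
        return a % b;
    }

    public static void main(String[] args) {
        System.out.println(remainder(7, 3));        // 1
        System.out.println(remainder(-7, 3));       // -1: the sign follows the left operand
        System.out.println(remainder(7, -3));       // 1
        System.out.println(7.5 % 2);                // 1.5: % also accepts floating-point operands
        System.out.println(Math.floorMod(-7, 3));   // 2: floorMod follows the sign of the divisor
    }
}
```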

When can an object reference be cast to an interface reference?
An object reference can be cast to an interface reference when the object implements the referenced interface.

Which class is extended by all other classes?
The Object class is extended by all other classes.

Which non-Unicode letter characters may be used as the first character of an identifier?
The non-Unicode letter characters $ and _ may appear as the first character of an identifier.

What restrictions are placed on method overloading?
Two methods may not have the same name and argument list but different return types.

What is casting?
There are two types of casting, casting between primitive numeric types and casting between object references. Casting between numeric types is used to convert larger values, such as double values, to smaller values, such as byte values. Casting between object references is used to refer to an object by a compatible class, interface, or array type reference.

What is the return type of a program's main() method?
void.

If a variable is declared as private, where may the variable be accessed?
A private variable may only be accessed within the class in which it is declared.

What do you understand by private, protected and public?
These are accessibility modifiers. Private is the most restrictive, while public is the least restrictive. There is no real difference between protected and the default type (also known as package protected) within the context of the same package, however the protected keyword allows visibility to a derived class in a different package.

What is Downcasting ?
Downcasting is the casting from a general to a more specific type, i.e. casting down the hierarchy.

What modifiers may be used with an inner class that is a member of an outer class?
A (non-local) inner class may be declared as public, protected, private, static, final, or abstract.

How many bits are used to represent Unicode, ASCII, UTF-16, and UTF-8 characters?
A Java char uses 16 bits to represent a Unicode character and ASCII requires 7 bits. Although the ASCII character set uses only 7 bits, it is usually represented as 8 bits.
UTF-8 represents characters using 8, 16, 24, and 32 bit patterns.
UTF-16 uses 16-bit and larger bit patterns.

What restrictions are placed on the location of a package statement within a source code file?
A package statement must appear as the first line in a source code file (excluding blank lines and comments).

What is a native method?
A native method is a method that is implemented in a language other than Java.

What are order of precedence and associativity, and how are they used?
Order of precedence determines the order in which operators are evaluated in expressions. Associativity determines whether an expression is evaluated left-to-right or right-to-left.

Can an anonymous class be declared as implementing an interface and extending a class?
An anonymous class may implement an interface or extend a superclass, but may not be declared to do both.

What is the range of the char type?
The range of the char type is 0 to 2^16 - 1 (i.e. 0 to 65535).

What is the range of the short type?
The range of the short type is -(2^15) to 2^15 - 1 (i.e. -32,768 to 32,767).

Why isn't there operator overloading?
Because C++ has proven by example that operator overloading makes code almost impossible to maintain.

What does it mean that a method or field is "static"?
Static variables and methods are instantiated only once per class. In other words they are class variables, not instance variables. If you change the value of a static variable in a particular object, the value of that variable changes for all instances of that class. Static methods can be referenced with the name of the class rather than the name of a particular object of the class (though that works too). That's how library methods like System.out.println() work. out is a static field in the java.lang.System class.

Is null a keyword?
The null value is not a keyword.

Which characters may be used as the second character of an identifier, but not as the first character of an identifier?
The digits 0 through 9 may not be used as the first character of an identifier but they may be used after the first character of an identifier.

Is the ternary operator written x : y ? z or x ? y : z ?
It is written x ? y : z.

How is rounding performed under integer division?
The fractional part of the result is truncated. This is known as rounding toward zero.

If a class is declared without any access modifiers, where may the class be accessed?
A class that is declared without any access modifiers is said to have package access. This means that the class can only be accessed by other classes and interfaces that are defined within the same package.

Does a class inherit the constructors of its superclass?
A class does not inherit constructors from any of its superclasses.

Name the eight primitive Java types.
The eight primitive types are byte, char, short, int, long, float, double, and boolean.

What restrictions are placed on the values of each case of a switch statement?
During compilation, the values of each case of a switch statement must evaluate to a value that can be promoted to an int value.

What is the difference between a while statement and a do while statement?
A while statement checks at the beginning of a loop to see whether the next loop iteration should occur. A do while statement checks at the end of a loop to see whether the next iteration of a loop should occur. The do while statement will always execute the body of a loop at least once.

What modifiers can be used with a local inner class?
A local inner class may be final or abstract.

When does the compiler supply a default constructor for a class?
The compiler supplies a default constructor for a class if no other constructors are provided.

If a method is declared as protected, where may the method be accessed?
A protected method may only be accessed by classes or interfaces of the same package or by subclasses of the class in which it is declared.

What are the legal operands of the instanceof operator?
The left operand is an object reference or null value and the right operand is a class, interface, or array type.

Are true and false keywords?
The values true and false are not keywords.

What happens when you add a double value to a String?
The result is a String object.

What is the difference between an inner class and a nested class?
When a class is defined within the scope of another class, it is a nested class. A nested class that is not declared static is called an inner class; if it is declared static, it is a static nested class.

Can an abstract class be final?
An abstract class may not be declared as final.

What is numeric promotion?
Numeric promotion is the conversion of a smaller numeric type to a larger numeric type, so that integer and floating-point operations may take place. In numerical promotion, byte, char, and short values are converted to int values. The int values are also converted to long values, if necessary. The long and float values are converted to double values, as required.
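
The promotion rules can be observed through overload resolution, as in the sketch below. PromotionSketch and typeOf are hypothetical names; each typeOf overload simply reports the static type of the expression passed to it.

```java
// Sketch: observing binary numeric promotion through overload resolution.
class PromotionSketch {
    // Each overload reports the static type of its argument.
    static String typeOf(int x)    { return "int"; }
    static String typeOf(long x)   { return "long"; }
    static String typeOf(float x)  { return "float"; }
    static String typeOf(double x) { return "double"; }

    public static void main(String[] args) {
        byte b = 1;
        short s = 2;
        char c = 'a';
        System.out.println(typeOf(b + s));       // int: byte and short are promoted to int
        System.out.println(typeOf(c + 1));       // int: char is promoted to int
        System.out.println(typeOf(1 + 2L));      // long: int is promoted to long
        System.out.println(typeOf(2L + 1.0f));   // float: long is promoted to float
        System.out.println(typeOf(1.0f + 1.0));  // double: float is promoted to double
    }
}
```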

What is the difference between a public and a non-public class?
A public class may be accessed outside of its package. A non-public class may not be accessed outside of its package.

To what value is a variable of the boolean type automatically initialized?
The default value of the boolean type is false.

What is the difference between the prefix and postfix forms of the ++ operator?
The prefix form performs the increment operation and returns the incremented value. The postfix form returns the current value of the expression and then performs the increment operation on the variable.
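
A small sketch of the difference follows. IncrementSketch, prefix and postfix are hypothetical names; each method returns the value of the expression together with the value of the variable afterwards.

```java
import java.util.Arrays;

// Sketch: prefix versus postfix increment.
class IncrementSketch {

    // Returns {value of the expression ++x, value of x afterwards}.
    static int[] prefix(int x) {
        int result = ++x;
        return new int[] {result, x};
    }

    // Returns {value of the expression x++, value of x afterwards}.
    static int[] postfix(int x) {
        int result = x++;
        return new int[] {result, x};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(prefix(5)));   // [6, 6]
        System.out.println(Arrays.toString(postfix(5)));  // [5, 6]
    }
}
```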

What restrictions are placed on method overriding?
Overridden methods must have the same name, argument list, and return type. The overriding method may not limit the access of the method it overrides. The overriding method may not throw any exceptions that may not be thrown by the overridden method.

What is a Java package and how is it used?
A Java package is a naming context for classes and interfaces. A package is used to create a separate name space for groups of classes and interfaces. Packages are also used to organize related classes and interfaces into a single API unit and to control accessibility to these classes and interfaces.

What modifiers may be used with a top-level class?
A top-level class may be public, abstract, or final.

What is the difference between an if statement and a switch statement?
The if statement is used to select among two alternatives. It uses a boolean expression to decide which alternative should be executed. The switch statement is used to select among multiple alternatives. It uses an int expression to determine which alternative should be executed.

What are the practical benefits, if any, of importing a specific class rather than an entire package (e.g. import java.net.* versus import java.net.Socket)?

It makes no difference in the generated class files since only the classes that are actually used are referenced by the generated class file. There is another practical benefit to importing single classes, and this arises when two (or more) packages have classes with the same name.
Take java.util.Timer and javax.swing.Timer, for example. If I import java.util.* and javax.swing.* and then try to use "Timer", I get an error while compiling (the class name is ambiguous between both packages). Let's say what you really wanted was the javax.swing.Timer class, and the only classes you plan on using in java.util are Collection and HashMap.
In this case, some people will prefer to import java.util.Collection and import java.util.HashMap instead of importing java.util.*. This will now allow them to use Timer, Collection, HashMap, and other javax.swing classes without using fully qualified class names.

Can a method be overloaded based on different return type but same argument type ?
No, because the methods can be called without using their return type, in which case there is ambiguity for the compiler.

What happens to a static variable that is defined within a method of a class ?
Can't do it. You'll get a compilation error.

How many static initializers can you have ?
As many as you want, but the static initializers and class variable initializers are executed in textual order and may not refer to class variables declared in the class whose declarations appear textually after the use, even though these class variables are in scope.

What is the difference between method overriding and overloading?
Overriding is a method with the same name and arguments as in a parent, whereas overloading is the same method name but different arguments.

What is constructor chaining and how is it achieved in Java ?
A child object constructor always first needs to construct its parent (which in turn calls its parent constructor.). In Java it is done via an implicit call to the no-args constructor as the first statement.
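
The order of constructor calls can be sketched as follows. ChainingSketch, Parent, Child and ORDER are hypothetical names; ORDER only records which constructor body ran when.

```java
// Sketch: constructor chaining with the implicit super() and an explicit this().
class ChainingSketch {
    // Records the order in which constructor bodies run.
    static final StringBuilder ORDER = new StringBuilder();

    static class Parent {
        Parent() {
            ORDER.append("Parent ");
        }
    }

    static class Child extends Parent {
        Child() {
            // The compiler inserts super() here, because no explicit
            // this(...) or super(...) call is written as the first statement.
            ORDER.append("Child ");
        }

        Child(String name) {
            this(); // explicit chaining to another constructor of the same class
            ORDER.append(name);
        }
    }

    public static void main(String[] args) {
        new Child("done");
        System.out.println(ORDER); // Parent Child done
    }
}
```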

What is the difference between the Boolean & operator and the && operator?
If an expression involving the Boolean & operator is evaluated, both operands are evaluated. Then the & operator is applied to the operands. When an expression involving the && operator is evaluated, the first operand is evaluated. If the first operand returns a value of true then the second operand is evaluated. The && operator is then applied to the first and second operands. If the first operand evaluates to false, the evaluation of the second operand is skipped.
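
The difference shows up when the second operand has a side effect, as in this sketch. ShortCircuitSketch, touched and the two counting methods are hypothetical names; touched() counts how many times it is evaluated.

```java
// Sketch: & always evaluates both operands, && may skip the second one.
class ShortCircuitSketch {
    static int calls = 0;

    // Returns true and counts each evaluation.
    static boolean touched() {
        calls++;
        return true;
    }

    // Number of times the right operand is evaluated with &.
    static int evaluationsWithSingleAmpersand(boolean first) {
        calls = 0;
        boolean ignored = first & touched();
        return calls;
    }

    // Number of times the right operand is evaluated with &&.
    static int evaluationsWithDoubleAmpersand(boolean first) {
        calls = 0;
        boolean ignored = first && touched();
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(evaluationsWithSingleAmpersand(false)); // 1
        System.out.println(evaluationsWithDoubleAmpersand(false)); // 0
        System.out.println(evaluationsWithDoubleAmpersand(true));  // 1
    }
}
```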

Which Java operator is right associative?
The = operator is right associative.

Can a double value be cast to a byte?
Yes, a double value can be cast to a byte.

What is the difference between a break statement and a continue statement?
A break statement results in the termination of the statement to which it applies (switch, for, do, or while). A continue statement is used to end the current loop iteration and return control to the loop statement.

How are this() and super() used with constructors?
this() is used to invoke a constructor of the same class. super() is used to invoke a superclass constructor.

Can a for statement loop indefinitely?
Yes, a for statement can loop indefinitely. For example, consider the following: for(;;);

To what value is a variable of the String type automatically initialized?
The default value of a String type is null.

What is the difference between a field variable and a local variable?
A field variable is a variable that is declared as a member of a class. A local variable is a variable that is declared local to a method.

What does it mean that a class or member is final?
A final class cannot be inherited. A final method cannot be overridden in a subclass. A final field cannot be changed after it's initialized, and it must be initialized either where it's declared or in an initializer block or constructor.

What does it mean that a method or class is abstract?
An abstract class cannot be instantiated. Abstract methods may only be included in abstract classes. However, an abstract class is not required to have any abstract methods, though most of them do. Each subclass of an abstract class must override the abstract methods of its superclasses or it also should be declared abstract.

What is a transient variable?
A transient variable is a variable that is not serialized.

How does Java handle integer overflows and underflows?
It uses those low order bytes of the result that can fit into the size of the type allowed by the operation.
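
The wrap-around that results from keeping only the low-order bits can be seen in this sketch. OverflowSketch and add are hypothetical names; Math.addExact is shown only as a contrast and requires Java 8 or later.

```java
// Sketch: integer overflow wraps around silently in Java.
class OverflowSketch {

    // Ordinary int addition: keeps only the low-order 32 bits of the result.
    static int add(int a, int b) {
        return a + b;
    }

    public static void main(String[] args) {
        System.out.println(add(Integer.MAX_VALUE, 1));   // -2147483648
        System.out.println(add(Integer.MIN_VALUE, -1));  // 2147483647
        System.out.println((byte) 200);                  // -56: only the low-order 8 bits are kept
        try {
            Math.addExact(Integer.MAX_VALUE, 1);
        } catch (ArithmeticException e) {
            // addExact throws instead of wrapping around.
            System.out.println("addExact refused to overflow");
        }
    }
}
```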

What is the difference between the >> and >>> operators?
The >> operator carries the sign bit when shifting right. The >>> zero-fills bits that have been shifted out.
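
A short sketch of the two right-shift operators on a negative value follows. ShiftSketch, arithmetic and logical are hypothetical names.

```java
// Sketch: signed (>>) versus unsigned (>>>) right shift.
class ShiftSketch {

    // >> copies the sign bit into the vacated high-order positions.
    static int arithmetic(int value, int bits) {
        return value >> bits;
    }

    // >>> fills the vacated high-order positions with zeros.
    static int logical(int value, int bits) {
        return value >>> bits;
    }

    public static void main(String[] args) {
        System.out.println(arithmetic(-8, 1)); // -4
        System.out.println(logical(-8, 1));    // 2147483644
        System.out.println(arithmetic(8, 1));  // 4
        System.out.println(logical(8, 1));     // 4: no difference for non-negative values
    }
}
```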

Is sizeof a keyword?
The sizeof operator is not a keyword.
