Friday, September 4, 2015

Create WordCount example using mapreduce framework in hadoop using eclipse run on windows

Create WordCount example using mapreduce framework in hadoop using eclipse run on windows
1. Open Eclipse
2. Create new java project and named it as - mapreduce_demo
3. Create Java class with name WordCount.java

How map reduce will work in hadoop
Approach:1
input.txt file having the following data..
(map)
--------------------------------------------------------------------------------------
abc def ghi jkl mno pqr stu vwx    P1- abc 1 def 1 ghi 1 jkl 1 mno-1 pqr 1  stu 1 vwx 1
abc def ghi jkl mno pqr stu vwx    P2- abc 1 def 1 ghi 1 jkl 1 mno-1 pqr 1  stu 1 vwx 1
abc def ghi jkl mno pqr stu vwx    P3- abc 1 def 1 ghi 1 jkl 1 mno-1 pqr 1  stu 1 vwx 1
abc def ghi jkl mno pqr stu vwx    P4- abc 1 def 1 ghi 1 jkl 1 mno-1 pqr 1  stu 1 vwx 1
abc def ghi jkl mno pqr stu vwx    P5- abc 1 def 1 ghi 1 jkl 1 mno-1 pqr 1  stu 1 vwx 1
abc def ghi jkl mno pqr stu vwx    P6- abc 1 def 1 ghi 1 jkl 1 mno-1 pqr 1  stu 1 vwx 1
abc def ghi jkl mno pqr stu vwx    P7- abc 1 def 1 ghi 1 jkl 1 mno-1 pqr 1  stu 1 vwx 1
abc def ghi jkl mno pqr stu vwx    P8- abc 1 def 1 ghi 1 jkl 1 mno-1 pqr 1  stu 1 vwx 1
abc def ghi jkl mno pqr stu vwx    P9- abc 1 def 1 ghi 1 jkl 1 mno-1 pqr 1  stu 1 vwx 1
abc def ghi jkl mno pqr stu vwx    P10- abc 1 def 1 ghi 1 jkl 1 mno-1 pqr 1  stu 1 vwx 1
---------------------------------------------------------------------------------------
map
-----
  processes one line at a time, as provided by the specified TextInputFormat
  emits a key-value pair of < , 1>.


Reducer
----------
The Reducer implementation, via the reduce method just sums up the values, which are the occurence counts for each key  
    emit(eachWord, sum)

4. Paste the below code in the WordCount.java file
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {

  public static class Map extends Mapper{

    public void map(LongWritable key, Text value, Context context
                    ) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        value.set(itr.nextToken());
        context.write(value, new IntWritable(1));
      }
    }
  }

  public static class Reduce extends Reducer {
   
   
    public void reduce(Text key, Iterable values,
                       Context context) throws IOException, InterruptedException {
     
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
       context.write(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf,"mywordcount");
    
    job.setJarByClass(WordCount.class);
    job.setMapperClass(Map.class);
    job.setReducerClass(Reduce.class);
    
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);
    
    Path outputPath = new Path(args[1]);
    
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    
    outputPath.getFileSystem(conf).delete(outputPath);
    
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
5. Set up the build path to avoid errors.
Right Click on the project(mapreduce_demo)->Properties->Java Build Path-> Libraries ->Add External Jars
select jar's from
c:\hadoop\hadoop-2.4.1\share\hadoop\common
 c:\hadoop\hadoop-2.4.1\share\hadoop\common\lib
 c:\hadoop\hadoop-2.4.1\share\hadoop\mapreduce
 c:\hadoop\hadoop-2.4.1\share\hadoop\mapreduce\lib
 
6. Once you completed the above steps, now we need to create jar to run the WordCount.java in hadoop.
7. Create Jar:

Right click on the project(mapreduce_demo)->Export->Jar(Under java)->Click Next->Next->Main Class->
Select WordCount->Click Finish
8. Now we have created jar successfully.
9. Before Executing the jar, need to start the (namenode,datanode,resourcemanager and nodemanager)
10. c:\hadoop\hadoop-2.4.1\sbin>start-dfs.cmd
11. c:\hadoop\hadoop-2.4.1\sbin>start-yarn.cmd
12. Check whether any inut files available in the hdfs system already if not create the same.

Hadoop basic commands:


c:\hadoop\hadoop-2.4.1\bin>hdfs dfs -ls input (If it is not created) then use below command to create directory
c:\hadoop\hadoop-2.4.1\bin>hdfs dfs -mkdir input
Copy any text file into input directory of hdfs
c:\hadoop\hadoop-2.4.1\bin>hdfs dfs -copyFromLocal input_file.txt input
Verify file has been copied to hdfs or not using below command
c:\hadoop\hadoop-2.4.1\bin>hdfs dfs -ls input
Verift the data of the file which you copied into hdfs
c:\hadoop\hadoop-2.4.1\bin>hdfs dfs -cat input.
 
13. Once above steps's done. Now run the mapreduce program using the following command
c:\hadoop\hadoop-2.4.1\bin>
yarn jar c:\hadoop\hadoop-2.4.1\wordcount.jar input/ output/

14. verify the result.

c:\hadoop\hadoop-2.4.1\bin>hdfs dfs -cat output
verify the status of the job details and output through web url

http://localhost:50075
http://localhost:8088/cluster


Friday, August 29, 2014

RPC (Remote Procedure Call)

RPC- Remote Procedure Call
 The term itself defines calling remote methods through client program.
In this post added both server and client in the same package.
Normally we have to write server classes which having business functionality
and client will only will access that methods.

Step 1: Start Eclipse
Step 2: Create new dynamic project(name it as you like Ex: rpc-test
Step 3: Create a server class(Ex:Calculator)under src folder
Step 4: Write methods as of your requirement

Ex:
package com.siva;

public class Calculator 
{
public int add(int i1, int i2) {
return i1 + i2;
}
public int substract(int i1, int i2) {
return i1 - i2;
}
public int multiply(int i1, int i2) {
return i1 * i2;
}
}

Step 5: We have to write configuration of xmlrpcservlet in web.xml
Step 6: Open web.xml, which is under WEB-INF/

XmlRpcServlet
org.apache.xmlrpc.webserver.XmlRpcServlet

enabledForExtensionstrue
Sets, whether the servlet supports vendor extensions for XML-RPC.




XmlRpcServlet
xmlrpc

Step 7: Need to declare server class location in the properties file name called-
XmlRpcServlet.properties under package - org.apache.xmlrpc.webserver
Step 8: XmlRpcServlet.properties file details as mentioned below

Calculator=com.siva.Calculator

The bold Calculator is being used in clent class to access the Calculator methods.

Step 9: Write the Client class

package com.siva;

import java.net.URL;
import org.apache.xmlrpc.client.XmlRpcClient;
import org.apache.xmlrpc.client.XmlRpcClientConfigImpl;

public class SimpleClient {

public SimpleClient() {

try {
System.out.println("Try to call calculator methods via XML-RPC...");
XmlRpcClientConfigImpl config = new XmlRpcClientConfigImpl();

//server ip addess or machine name/project name/servlet url pattern
config.setServerURL(new URL("http://127.0.0.1:8080/rpc-test/xmlrpc"));

XmlRpcClient client = new XmlRpcClient();
client.setConfig(config);

Object[] params = new Object[] { new Integer(2), new Integer(3) };
Integer result = (Integer) client.execute("Calculator.substract", params);

System.out.println("The returned values is: " + result);

} catch (Exception e) {
e.printStackTrace();
}
}

public static void main(String[] args) {
new SimpleClient();
}

}
in the above client class to call the RPC methods, first we have to provide the server ip addrss, url pattern name
While calling the method name which being used for to access the business logic.

Step 10: required jars

commons-logging-1.1.jar
ws-commons-util-1.0.2.jar
xmlrpc-client-3.1.3.jar
xmlrpc-common-3.1.3.jar
xmlrpc-server-3.1.3.jar

Step 11 : Required Tomcat server or any webserver to run the application

Step 12 : Start the server after adding project to the server and run the client application,
Run As Java Application, you will see the result.
If you want to call another methods , change the method name from Clicent class and stop start the
server.

I hope this post will help to get basic knowledge about RPC.



Thursday, July 31, 2014

struts example






Struts example

Below post shows how to build simple application with struts.

open eclipse -> project -> create dynamic Webproject->

provide name as - struts_sample [As you like]

select Dynamic web module version as 2.5

Then click on Finish button.

-> open the web.xml , which available inside WEB-INF folder

copy the below code and paste in side web.xml .
      
             action 
              org.apache.struts.action.ActionServlet     
               
                   config                   /WEB-INF/struts-config.xml              
              1
  
     
		       action   
		       *.do
    


-> Create xml and name it as struts-config.xml inside WEB-INF.
->open the struts-config.xml and paste the below code inside the struts-config.xml

	  
  


		 
		
			 
		
											
		 
		 
			
				
				
			
			
				
				
			
		
	
	


-> Now it's time to start our work like creating jsp files, form classes, action classes and need to add required jars.

-> first we will write jsp page. Here i am writting 4 jsp pages
1. login.jsp
2. welcome.jsp
3. adduser.jsp
4. success.jsp

-> create one folder inside WebContent with name jsp [WebContent/jsp/login.jsp] -- this is for login details which will have user name and password along with validation.
paste the below code inside login.jsp

	<%@taglib uri="http://jakarta.apache.org/struts/tags-html" prefix="html"%><%@taglib
	uri="http://jakarta.apache.org/struts/tags-bean" prefix="bean"%>



Login page



Login


-> create welcome.jsp [WebContent/jsp/welcome.jsp] -- this is for after successful login details .
paste the below code inside welcome.jsp
	
	<%@ page language="java" contentType="text/html; charset=ISO-8859-1"
	pageEncoding="ISO-8859-1"%>



Welcome page


	<%        String message = (String)request.getAttribute("message");    %>
	

Welcome User : <%= message %>

You have successfully logged in.

-> create adduser.jsp [WebContent/jsp/adduser.jsp] -- this contains user details along with required validation.
paste the below code inside adduser.jsp

	<%@taglib uri="http://jakarta.apache.org/struts/tags-html" prefix="html"%><%@taglib
	uri="http://jakarta.apache.org/struts/tags-bean" prefix="bean"%>



Add User























-> create success.jsp [WebContent/jsp/success.jsp] -- this page will display after successful user details.
paste the below code inside success.jsp
	<%@taglib uri="http://jakarta.apache.org/struts/tags-html" prefix="html"%><%@taglib
	uri="http://jakarta.apache.org/struts/tags-bean" prefix="bean"%>



Login page


	

success

successfully Added.
-> We need to create java classes.
-> create a package -> com.siva.form
-> com.siva.action

->create LoginForm.java inside com.siva.form package and paste the below code.
package com.siva.form;

import javax.servlet.http.HttpServletRequest;

import org.apache.struts.action.ActionErrors;
import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionMapping;
import org.apache.struts.action.ActionMessage;

/**
 * 
 * @author sivaraju
 *
 */
public class LoginForm extends ActionForm {
	/**
	 * 
	 */
	private static final long serialVersionUID = -1743531020254647308L;
	private String userName;
	private String password;

	public ActionErrors validate(ActionMapping mapping,
			HttpServletRequest request) {
		ActionErrors actionErrors = new ActionErrors();
		if (userName == null || userName.trim().equals("")) {
			actionErrors.add("userName", new ActionMessage("error.username"));
		}
		try {
			if (password == null || password.trim().equals("")) {
				actionErrors.add("password",
						new ActionMessage("error.password"));
			}
		} catch (Exception e) {
			e.printStackTrace();
		}
		return actionErrors;
	}

	public String getUserName() {
		return userName;
	}

	public void setUserName(String userName) {
		this.userName = userName;
	}

	public String getPassword() {
		return password;
	}

	public void setPassword(String password) {
		this.password = password;
	}
}
->create UserForm.java inside com.siva.form and paste below code

  package com.siva.form;

import javax.servlet.http.HttpServletRequest;

import org.apache.struts.action.ActionErrors;
import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionMapping;
import org.apache.struts.action.ActionMessage;

/**
 * 
 * @author sivaraju
 *
 */
public class UserForm extends ActionForm {
	/**
	 * 
	 */
	private static final long serialVersionUID = -2087861874662967982L;
	private String userName;
	private String sex;
	private int age;

	public ActionErrors validate(ActionMapping mapping,
			HttpServletRequest request) {
		ActionErrors actionErrors = new ActionErrors();
		try {
				if (userName == null || userName.trim().equals("")) {
					actionErrors.add("userName", new ActionMessage("error.username"));
				}
				if (sex == null || sex.trim().equals("")) {
					actionErrors.add("sex",new ActionMessage("error.sex"));
				}
				if (age <=0 ) {
					actionErrors.add("age",new ActionMessage("error.age"));
				}
			
		} catch (Exception e) {
			e.printStackTrace();
		}
		return actionErrors;
	}

	public String getUserName() {
		return userName;
	}

	public void setUserName(String userName) {
		this.userName = userName;
	}
	public String getSex() {
		return sex;
	}

	public void setSex(String sex) {
		this.sex = sex;
	}

	public int getAge() {
		return age;
	}

	public void setAge(int age) {
		this.age = age;
	}

	
}
-> create LoginAction.java inside com.siva.action pakage and paste the below code
  package com.siva.action;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.struts.action.Action;
import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionForward;
import org.apache.struts.action.ActionMapping;

import com.siva.form.LoginForm;

public class LoginAction extends Action {
	private static final String FAILURE = "failure";
	private static final String SUCCESS = "success";

	public ActionForward execute(ActionMapping mapping, ActionForm form,
			HttpServletRequest request, HttpServletResponse response)
			throws Exception {
		String target = null;
		LoginForm loginForm = (LoginForm) form;
		if (loginForm.getUserName().equals("admin")
				&& loginForm.getPassword().equals("admin123")) {
			target = SUCCESS;
			request.setAttribute("message", loginForm.getUserName());
		} else {
			target = FAILURE;
		}
		return mapping.findForward(target);
	}
}
-> create AddUserAction.java inside com.siva.action package and paste below code.
package com.siva.action;

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.struts.action.Action;
import org.apache.struts.action.ActionForm;
import org.apache.struts.action.ActionForward;
import org.apache.struts.action.ActionMapping;

import com.siva.form.UserForm;
/**
 * 
 * @author sivaraju
 *
 */
public class AddUserAction extends Action {
	private static final String SUCCESS = "success";

	/**
	 * 
	 */
	public ActionForward execute(ActionMapping mapping, ActionForm form,
			HttpServletRequest request, HttpServletResponse response)
			throws Exception {
		UserForm userForm = (UserForm) form;
		System.out.println("username["+userForm.getUserName()+"]");
		System.out.println("Age["+userForm.getAge()+"]");
		System.out.println("Sex["+userForm.getSex()+"]");
		//If this details need to store in db. Then need to get the db details from data source object. Store as per requirement.
		return mapping.findForward(SUCCESS);
	}
	
	
}
->create MessageResource.properties file inside src/resource [src/resource/MessageResource.properties] and paste the below code
label.username = User Name
label.password = Password
label.welcome = Welcome 
error.username =Username is required.
error.password = Password is required.


label.age = Age
label.sex = Sex

label.adduser = Add User


error.age = Age is required field
error.sex=Sex is required field 
-> Add the below jars inside lib folder.(Among below jar's few of the jar's may not be required]
    bcprov-jdk14-121.jar
    commons-digester-2.1.jar
    commons-fileupload-1.1.1.jar
    commons-io-1.2.jar
    commons-lang-2.4.jar
    org-apache-commons-logging.jar
    org.apache.commons.beanutils.jar
    servlet-api.jar
    struts.jar
	
	
-> Right click on the project-> Run As-> Run On Server-> [select the server which ever server you want to run the application]. Tomcat is prefered. -> http://localhost:8080/struts_sample/login.jsp [ after submitting it will redirect to welcome.jsp]- username - admin / password - admin123 -> http://localhost:8080/struts_sample/jsp/adduser.jsp [after submitting it will redirec tto success.jsp]. In this way stuts application will work.

Wednesday, February 27, 2013

Convert word document (.docx) to PDF


This post will describes how to convert word document to PDF using Java.

To convert document to Pdf we will have different type of approaches.
But in this post i am using  docx4j. It is one of the good API for conversion from XSLT to PDF and Word Document to PDF etc..


We can convert from document to Pdf with Simple java program.

Steps to follow.

Step1 :open Eclipse and create new java project- provide name as you like.

Step 2: Create new Java class  which ever you like (ex: ConvertDocToPDF )

Step 3: Paste the below lines of code inside main method of created java class


 try {


long start = System.currentTimeMillis();

// 1) Load DOCX into WordprocessingMLPackage

InputStream is = new FileInputStream(new File("test.docx"));
WordprocessingMLPackage wordMLPackage = WordprocessingMLPackage.load(is);
//If your header and body information got over lapped then use the below code
List sections = wordMLPackage.getDocumentModel().getSections();
for (int i = 0; i < sections.size(); i++) {

System.out.println("sections Size" + sections.size());
wordMLPackage.getDocumentModel().getSections().get(i).getPageDimensions().setHeaderExtent(3000);
}

//if you want use any Physical fonts then use the below code.

Mapper fontMapper = new IdentityPlusMapper();

PhysicalFont font = PhysicalFonts.getPhysicalFonts().get("Comic Sans MS");

fontMapper.getFontMappings().put("Algerian", font);

wordMLPackage.setFontMapper(fontMapper);

// 2) Prepare Pdf settings

PdfSettings pdfSettings = new PdfSettings();

// 3) Convert WordprocessingMLPackage to Pdf

org.docx4j.convert.out.pdf.PdfConversion conversion = new org.docx4j.convert.out.pdf.viaXSLFO.Conversion(wordMLPackage);

OutputStream out = new FileOutputStream(new File("test.pdf"));
conversion.output(out,pdfSettings);
System.err.println("Time taken to Generate pdf  "+ (System.currentTimeMillis() - start) + "ms");
} catch (Throwable e) {

e.printStackTrace();
}


Step 4: Now you can run the Java program, PDF will be generate for your Document file.

AddToAny

Contact Form

Name

Email *

Message *