- Hadoop
- Purpose
- This document describes how to set up and configure a single-node Hadoop installation so that you can quickly perform simple operations using Hadoop MapReduce and the Hadoop Distributed File System (HDFS).
- Tools Used
The tool used is Cloudera.
- Cloudera is revolutionizing enterprise data management by offering the first unified platform for Big Data: the Enterprise Data Hub. Cloudera offers enterprises one place to store, process, and analyze all their data, empowering them to extend the value of existing investments while enabling fundamentally new ways to derive value from their data.
- Cloudera was founded in 2008 and is currently the leading provider and supporter of Apache Hadoop for the enterprise. Cloudera also offers software for business-critical data challenges, including storage, access, management, analysis, security, and search.
- System requirements
- These 64-bit VMs require a 64-bit host OS and a virtualization product that can support a 64-bit guest OS.
- To use a VMware VM, you must use a player compatible with Workstation 8.x or higher:
- Player 4.x or higher
- Fusion 4.x or higher
- Older versions of Workstation can be used to create a new VM using the same virtual disk (VMDK file), but some features in VMware Tools are not available.
- The amount of RAM required varies by the run-time option you choose:
| CDH and Cloudera Manager Version | RAM Required by VM |
| --- | --- |
| CDH 5 (default) | 4+ GiB* |
| Cloudera Express | 8+ GiB* |
| Cloudera Enterprise (trial) | 10+ GiB* |
*Minimum recommended memory. If you are running workloads larger than the provided examples, consider allocating additional memory.
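If the VM has too little memory for the option you chose, you can raise its allocation before starting it. A hypothetical example using VirtualBox's VBoxManage tool (the VM name "Cloudera QuickStart" is an assumption; substitute the name shown in your VirtualBox library):

```shell
# Allocate 8 GiB of RAM to the VM (run while the VM is powered off).
# "Cloudera QuickStart" is a placeholder VM name, not a guaranteed default.
VBoxManage modifyvm "Cloudera QuickStart" --memory 8192
```

The same setting is also available in the VirtualBox GUI under Settings > System > Motherboard.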
- Installation
This section shows how to install the Cloudera QuickStart Virtual Machine.
- Use 7-Zip to extract the contents of the downloaded zip file.
- Run VirtualBox and then import the Cloudera VM into VirtualBox.
- Test Cloudera.
- Download WordCount.java from Sakai.
- Create a new project in Eclipse.
- Add references:
- File System: /usr/lib/hadoop/client-0.20
- File System: /usr/lib/hadoop
- File System: /usr/lib/hadoop/lib
- Create a folder named input inside the project and add a text file with some content.
- Run the program.
- Check the files in the output folder to see the result.
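For reference, the core logic of a WordCount job — a map phase that emits (word, 1) pairs and a reduce phase that sums the counts per word — can be sketched in plain Java with no Hadoop dependencies. The class and method names below are illustrative, not necessarily those in the WordCount.java file from Sakai:

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Minimal sketch of WordCount's map and reduce phases in plain Java.
// Names here are illustrative; the real job would use Hadoop's Mapper
// and Reducer classes and run over files in the input folder.
public class WordCountSketch {

    // "Map" phase: emit a (word, 1) pair for every token in a line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String token : line.toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) {
                pairs.add(new AbstractMap.SimpleEntry<>(token, 1));
            }
        }
        return pairs;
    }

    // "Reduce" phase: group the pairs by word and sum the counts.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            counts.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : new String[] {"hello hadoop", "hello world"}) {
            pairs.addAll(map(line));
        }
        System.out.println(reduce(pairs)); // {hadoop=1, hello=2, world=1}
    }
}
```

In the real job, Hadoop performs the grouping between the two phases (the shuffle) and writes the reducer's output to the output folder checked in the last step.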
- Eclipse
- General information about Eclipse.
Eclipse (www.eclipse.org) is an open-source Integrated Development Environment (IDE) originally developed by IBM. Eclipse is popular for Java application development (Java SE and Java EE) and Android apps. It also supports C/C++, PHP, Python, Perl, and other languages and web projects via extensible plug-ins. Eclipse is cross-platform and runs on Windows, Linux, and macOS.
- Installation.
- To use Eclipse for Java programming, you first need to install the Java Development Kit (JDK). See the guide "How to Install JDK (on Windows)".
- To install Eclipse, simply unzip the downloaded file into a directory of your choice (e.g., "d:\myproject").
- Running Eclipse.
- Create a new project in Eclipse (HelloWorld).
- Write your code and then run the project.
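The HelloWorld project's single class can look like this minimal sketch:

```java
// A minimal class for the HelloWorld Eclipse project described above.
public class HelloWorld {

    // Returning the greeting from a method keeps it easy to test.
    static String greeting() {
        return "Hello, World!";
    }

    public static void main(String[] args) {
        System.out.println(greeting()); // prints "Hello, World!"
    }
}
```

In Eclipse, place this in a file named HelloWorld.java inside the project's src folder and choose Run As > Java Application.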