Date:     10/30/13 11:47 AM
Subject:  NEJUG Meeting (Thu, Nov 14): Apache Hadoop and Its Next Generation Compute Platform, YARN
From:     Mark Johnson

OK, enough already!  Big Data this, Big Data that!  Why should I care as a Java programmer?
Should I care?  The answer, in my opinion, is yes, for two reasons: (1) you are in more demand if you can talk big data,
and (2) there are a lot of very cool problems to solve in the Java stack.  Come join us at the next NEJUG meeting
to find out about the applications around Big Data and Hadoop.  You might find this data stuff is not so
bad, and you might even find it a lot of fun.

Register now to reserve your seat

Title: Apache Hadoop and Its Next Generation Compute Platform, YARN
Presenter:  Vinod Kumar Vavilapalli
Location:  Constant Contact.

Date: Thu, Nov 14

Time: Doors open at 5pm, pizza at 5:30pm, and the presentation starts at 6pm


You might also be interested to know that you are eligible for a 35% discount on O'Reilly ebooks. Just use code DSUG35
when placing an order online at oreilly.com.


Presentation Overview:

Apache Hadoop has become the de facto platform for big data processing. It consists mainly of two components:
(1) the Hadoop Distributed File System (HDFS) - a distributed file system running on a cluster of commodity machines and
(2) MapReduce - a programming model for large-scale processing of data residing on HDFS. In this talk,
I'll introduce Hadoop and its main components, and what it is about Hadoop that is revolutionizing data processing.
I'll quickly explain how users write applications on top of Hadoop. I'll round it out with introductions to
various Hadoop ecosystem projects like Pig, Hive, HBase, etc.

Secondly, Apache Hadoop MapReduce has undergone a revolution to emerge as Apache Hadoop YARN, a generic
compute platform. This part of the talk will cover the details of Apache Hadoop YARN - its architecture,
applications and real-life usage. I'll conclude with how YARN is changing Hadoop into a general-purpose
processing platform by enabling users to run batch, stream-processing, graph workloads, and more -
all on the same cluster resources.


About our speaker:

Vinod Kumar Vavilapalli has been contributing to the Apache Hadoop project full-time since mid-2007.
At the Apache Software Foundation, he is a long-term Hadoop contributor, Hadoop committer, member of the
Apache Hadoop Project Management Committee and an Apache Software Foundation member. He is a MapReduce and YARN
go-to guy at Hortonworks, Inc. He has a Bachelor's degree in Computer Science & Engineering from the
Indian Institute of Technology Roorkee.

Vinod has been working on Hadoop for more than five years, and he still has fun doing it. He was involved in
HadoopOnDemand, Hadoop-0.20, CapacityScheduler, Hadoop security and MapReduce, and is now a lead developer and
the project lead for Apache Hadoop YARN. Before Hortonworks, he was at Yahoo!, working in the Grid team that
made Hadoop what it is today, running at large scale with up to tens of thousands of nodes.

Look forward to seeing you on Thu, Nov 14.

Mark Johnson
President, NEJUG
email: markfjohnson@gmail.com
