Date: 10/30/13 11:47 AM
Subject: NEJUG Meeting (Thu, Nov 14): Apache Hadoop and Its Next Generation Compute Platform, YARN
From: Mark Johnson

Ok, enough already! Big Data this, Big Data that! Why should I care as a Java programmer? Should I care? The answer, in my opinion, is yes, for 2 reasons: (1) you are in more demand if you can talk to big data, and (2) there are a lot of very cool Java problems in the Java stack. Come join us at the next NEJUG meeting to find out about the applications around Big Data and Hadoop. You might find this data stuff is not so bad, and you might even find it a lot of fun.

Register now to reserve your seat.

Title: Apache Hadoop and Its Next Generation Compute Platform, YARN
Presenter: Vinod Kumar Vavilapalli
Location: Constant Contact
Date: Thu, Nov 14
Time: Doors open at 5pm, pizza at 5:30pm, and the presentation starts at 6pm

You might also be interested to know you are eligible for a 35% discount on O'Reilly ebooks. Just use code DSUG35 when placing an order online at oreilly.com.

Presentation Overview:

Apache Hadoop has become the de facto platform for big data processing. It consists mainly of two components: (1) the Hadoop Distributed File System (HDFS), a distributed file system running on a cluster of commodity machines, and (2) MapReduce, a programming model for large-scale processing of data residing on HDFS. In this talk, I'll introduce Hadoop and its main components, and explain what it is in Hadoop that is revolutionizing data processing. I'll quickly explain how users write applications on top of Hadoop. I'll round out this part with introductions to various Hadoop ecosystem projects like Pig, Hive, HBase, etc.

Secondly, Apache Hadoop MapReduce has undergone a revolution to emerge as Apache Hadoop YARN, a generic compute platform. This part of the talk will cover the details of Apache Hadoop YARN: its architecture, its applications, and its real-life usage.
I'll conclude with how YARN is changing Hadoop into a general-purpose processing platform by enabling users to run batch, stream-processing, graph workloads, and more, all on the same cluster resources.

About our speaker:

Vinod Kumar Vavilapalli has been contributing to the Apache Hadoop project full-time since mid-2007. At the Apache Software Foundation, he is a long-term Hadoop contributor, a Hadoop committer, a member of the Apache Hadoop Project Management Committee, and an Apache Software Foundation member. He is a go-to guy for MapReduce and YARN at Hortonworks, Inc. He has a Bachelor's degree in Computer Science & Engineering from the Indian Institute of Technology Roorkee. Vinod has been working on Hadoop for more than 5 years, and he still has fun doing it. He was involved in HadoopOnDemand, Hadoop-0.20, CapacityScheduler, Hadoop security, and MapReduce, and is now a lead developer and the project lead for Apache Hadoop YARN. Before Hortonworks, he was at Yahoo!, working in the Grid team that made Hadoop what it is today, running at large scale with up to tens of thousands of nodes.

I look forward to seeing you on Thu, Nov 14.

Mark Johnson
President, NEJUG
Email: markfjohnson@gmail.com