Date: Jan 01/03/14 9:09 PM
Subject: Next week NEJUG meeting (Thu, Jan 9) - Signup now.
Sent By: Stevel Lintz

Next week, on Thursday, Jan 9, is the first NEJUG presentation meeting of 2014. The holiday break is over, and it's time to get back to your professional development and local area networking (the human kind). Register now to reserve a seat, and then join us for a great season opener, with Drew Farris speaking on "Taming Text."

It's not our rightly vaunted tool use that makes humans so special, but language. Since the advent of writing systems (about 4,200 years ago), we've been capturing that unique part of our humanity first in clay, then papyrus, then paper, and now bytes: about 20 terabytes just in the United States' Library of Congress, not including email, blogs, consumer product reviews, and so on.

The big problem with almost all that stuff is that it's (mostly) unstructured text. It's virtually meaningless (and thus programmatically useless) until it's processed within a useful context. This is what humans do naturally while reading, and what modern text-processing software is already doing to an ever-growing degree.

Join us at Constant Contact this coming Thursday to meet other Java and JVM developers and to learn more.

Register now to reserve your seat.

As always, the doors will open at 5pm for networking, pizza should arrive around 5:30pm, and the meeting will start promptly at 6pm.

Presentation Overview:

There is so much text in our lives, we are practically drowning in it. Fortunately, there are innovative tools and techniques for managing unstructured information that can throw the smart developer a much-needed lifeline. In this talk, based on the outline of the book Taming Text, you will receive an introduction to a variety of Java-based open source tools that aid in the development of search and NLP applications.
In this presentation, you will be introduced to useful techniques like full-text search, proper name recognition, clustering, tagging, information extraction, and summarization. We will explore real use cases as you systematically absorb the foundations upon which they are built. In a clear and concise style that avoids jargon, we will explain the subject in terms understandable without a background in statistics or natural language processing. Examples are in Java, but the concepts can be applied in any language.

About our speaker:

Drew Farris is a professional software developer and technology consultant whose interests focus on large-scale analytics, distributed computing and machine learning. Previously, he worked at TextWise, where he implemented a wide variety of text exploration, management and retrieval applications combining natural language processing, classification and visualization techniques. He has contributed to a number of open source projects including Apache Mahout, Lucene and Solr, and holds a master's degree in Information Resource Management from Syracuse University's iSchool and a B.F.A. in Computer Graphics.

You might also be interested to know you are eligible for a 50% discount on O'Reilly eBooks just for being a NEJUG member. Just use code DSUG when placing an order online at oreilly.com.

Stevel Lintz
NEJUG Advisory

To unsubscribe from future NEJUG mailings.