nlphackerclan

“We’re not building an ‘end solution’, but a platform for continuous research and development—making NASA the #1 leader in human-computer interaction.”

This project is solving the Data Treasure Hunting challenge.
Description

The aim of this project is not to create a list of keywords, but to devise a method that semantically recognizes both entered keywords and newly formed keywords, and links them to cross-agency documentation and other information, as well as to external documentation on the web.

The objective is to develop a solution, using machine-learning techniques, that cross-references not only single words but also compound words and, most difficult of all, direct and indirect oxymora, to relevant data and documentation.

Sponsor API & Reference: IBM Bluemix; IBM Watson on Bluemix for wiki creation and sample NLP apps; Microsoft Research for research and development.

Java build using NLP libraries in conjunction with the following metadata sources: data.nasa.gov, vocab.data.gov, and Project Open Data (https://project-open-data.cio.gov/schema/).
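To make the compound-word requirement concrete, here is a minimal Java sketch of a dictionary-based, longest-match-first keyword lookup. It is only a baseline for illustration and not the project's method: it does no semantic or machine-learning analysis, and the class name KeywordMatcher, its methods, and the sample dataset titles are all hypothetical. It assumes each dataset record has been reduced to a title plus a list of keyword phrases, as one might extract from Project Open Data style metadata.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical baseline: maps keyword phrases to dataset titles and
// recognizes compound keywords by preferring the longest known phrase.
class KeywordMatcher {
    // Normalized keyword phrase -> titles of datasets tagged with it.
    private final Map<String, List<String>> index = new HashMap<>();
    // Longest indexed phrase, in words; bounds the search window.
    private int maxPhraseLength = 1;

    // Registers a dataset under each of its keyword phrases.
    public void addDataset(String title, List<String> keywords) {
        for (String keyword : keywords) {
            String key = normalize(keyword);
            if (key.isEmpty()) {
                continue;
            }
            index.computeIfAbsent(key, k -> new ArrayList<>()).add(title);
            maxPhraseLength = Math.max(maxPhraseLength, key.split(" ").length);
        }
    }

    // Scans the query left to right; at each position it tries the longest
    // candidate phrase first, so "solar wind" wins over "wind" alone.
    public List<String> match(String query) {
        List<String> matches = new ArrayList<>();
        String normalized = normalize(query);
        if (normalized.isEmpty()) {
            return matches;
        }
        String[] tokens = normalized.split(" ");
        int i = 0;
        while (i < tokens.length) {
            int consumed = 1; // skip one unknown word if nothing matches
            int longest = Math.min(maxPhraseLength, tokens.length - i);
            for (int len = longest; len >= 1; len--) {
                String phrase = String.join(" ", Arrays.copyOfRange(tokens, i, i + len));
                if (index.containsKey(phrase)) {
                    matches.add(phrase);
                    consumed = len;
                    break;
                }
            }
            i += consumed;
        }
        return matches;
    }

    // Returns the dataset titles indexed under a phrase, or an empty list.
    public List<String> datasetsFor(String phrase) {
        return index.getOrDefault(normalize(phrase), Collections.emptyList());
    }

    // Lowercases and collapses every run of non-alphanumerics to one space.
    static String normalize(String text) {
        return text.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]+", " ").trim();
    }

    public static void main(String[] args) {
        KeywordMatcher matcher = new KeywordMatcher();
        // Made-up sample records, for illustration only.
        matcher.addDataset("Sample Dataset A", Arrays.asList("Solar Wind", "Mars"));
        matcher.addDataset("Sample Dataset B", Arrays.asList("wind"));
        System.out.println(matcher.match("Solar wind speed near Mars")); // prints [solar wind, mars]
    }
}
```

A lookup like this only finds phrases that already exist in the metadata. Recognizing newly formed keywords or oxymora would need the semantic techniques described above, for example models built with the NLP libraries listed under Resources.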

Project Information

License: Apache License 2.0 (Apache-2.0)

Source Code/Project URL: https://github.com/MrsLinzan/NLP_Hacker_Clan.git

Resources

Understanding Document Aboutness Step One: Identifying Salient Entities - http://research.microsoft.com/pubs/198455/msrtr13.pdf
Cognitive Computation Group (University of Illinois at Urbana-Champaign) - http://cogcomp.cs.illinois.edu/page/demos/
Apache OpenNLP - https://opennlp.apache.org/
The Stanford Natural Language Processing Group - http://nlp.stanford.edu/software/
Natural Language Processing with Python - http://www.nltk.org/book/
IBM Watson - https://console.ng.bluemix.net/solutions/watson
The Jython License - http://www.jython.org/license.html

Team