Data Treasure Hunting

Hashtags:

#humans, #datatreasurehunting, #advanced

Contact:

[email protected]

Tags:

Platform, Data Visualization

Background

In recent years, NASA and other government agencies worldwide have been publishing open data on the web in machine-readable, non-proprietary, no-cost formats (e.g., http://data.nasa.gov/). There is broad interest in new ways to search that publicly available data and to integrate these information assets into innovative databases and applications.

Inconsistent metadata (i.e., information such as keywords that empowers search engines such as Google to discover these assets) is a consistent challenge across organizations. The challenge is to develop a new technique or application that would enable anyone to add meaningful keywords to the descriptions of our data: keywords that describe the hidden potential of these assets, so that our data can be leveraged beyond space applications and connected to other data that may appear unrelated.

Challenge

Devise a clever way to discover good keywords to describe the potential, hidden, secondary uses of open data. For example, how might you discover that a particular information asset might be relevant to or benefit from other keywords, such as waste-processing or disaster-preparedness? Remember: without these additional and seemingly unrelated keywords, entrepreneurs like you might not discover and use open data to solve your most perplexing problems.

You can use any technique that might help discover new keywords. For example, a crowdsourcing application could display information about these assets online and ask people how the assets can be used. You may want to consider predictive analytics or machine-learning techniques that compare the metadata and the data of one information asset to another in order to find new keywords. Or, you might use the unique identifiers of the published data files to search the web, discover who has already used the data and for what purpose, then catalog it. In fact, you could even develop a clever solution to download the data itself and ‘squeeze it’ in order to generate new keywords.
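
One of the options above, comparing the metadata of one information asset to another, can be sketched as follows. This is a minimal illustration under stated assumptions, not a recommended solution: the asset records, identifiers, and keywords are made up, and TF-IDF with cosine similarity is just one simple choice of comparison technique. Keywords are borrowed from any asset whose description is sufficiently similar to the target's.

```python
import math
import re
from collections import Counter

# Hypothetical assets for illustration only; real records would come from
# an open data catalog. Identifiers, descriptions and keywords are made up.
ASSETS = {
    "asset-a": {
        "description": "Water recycling system performance on the space station",
        "keywords": ["space station", "life support"],
    },
    "asset-b": {
        "description": "Municipal water recycling and waste processing plant performance",
        "keywords": ["waste-processing", "water"],
    },
    "asset-c": {
        "description": "Lunar crater imagery from orbital cameras",
        "keywords": ["moon", "imagery"],
    },
}


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(descriptions):
    """Map each asset id to a sparse TF-IDF vector (dict of word -> weight)."""
    tokens = {aid: tokenize(text) for aid, text in descriptions.items()}
    n_docs = len(tokens)
    doc_freq = Counter()
    for words in tokens.values():
        doc_freq.update(set(words))
    vectors = {}
    for aid, words in tokens.items():
        counts = Counter(words)
        # Smoothed inverse document frequency, so no weight is ever zero.
        vectors[aid] = {
            w: c * (math.log((1 + n_docs) / (1 + doc_freq[w])) + 1.0)
            for w, c in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def suggest_keywords(target_id, assets, min_similarity=0.1):
    """Return (keyword, score) pairs borrowed from similar assets, best first."""
    vectors = tfidf_vectors({aid: a["description"] for aid, a in assets.items()})
    already_tagged = set(assets[target_id]["keywords"])
    best = {}
    for other_id, other in assets.items():
        if other_id == target_id:
            continue
        similarity = cosine(vectors[target_id], vectors[other_id])
        if similarity < min_similarity:
            continue
        for keyword in other["keywords"]:
            if keyword not in already_tagged:
                best[keyword] = max(best.get(keyword, 0.0), similarity)
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    for keyword, score in suggest_keywords("asset-a", ASSETS):
        print(f"{keyword}\t{score:.2f}")
```

The `min_similarity` threshold is an arbitrary starting value; in practice it would need tuning against real descriptions, and stop-word removal would likely improve the comparison.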

We are asking you not only to discover new keywords, but also to retain the log file that explains how these new keywords were discovered.
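
A discovery log of this kind could be kept as one JSON record per line. The sketch below is only an assumption about what such a log might contain: the field names (`asset_id`, `keyword`, `method`, `evidence`) and the example values are hypothetical, since the challenge does not prescribe a log format.

```python
import io
import json
from datetime import datetime, timezone


def log_keyword(stream, asset_id, keyword, method, evidence):
    """Append one JSON line recording how a keyword was discovered."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "asset_id": asset_id,    # hypothetical identifier of the information asset
        "keyword": keyword,      # the newly proposed keyword
        "method": method,        # e.g. "crowdsourcing" or "metadata-similarity"
        "evidence": evidence,    # free-text explanation of why it was proposed
    }
    stream.write(json.dumps(record, sort_keys=True) + "\n")
    return record


def read_log(stream):
    """Parse a JSON Lines log back into a list of records."""
    return [json.loads(line) for line in stream if line.strip()]


if __name__ == "__main__":
    # An in-memory buffer stands in for a real log file here.
    log = io.StringIO()
    log_keyword(
        log,
        asset_id="asset-a",
        keyword="waste-processing",
        method="metadata-similarity",
        evidence="Description overlaps with a hypothetical asset tagged waste-processing",
    )
    log.seek(0)
    for entry in read_log(log):
        print(entry["keyword"], "via", entry["method"])
```

Writing one self-describing record per keyword keeps the log append-only and easy to audit later; a real solution would pass an open file instead of the in-memory buffer.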

Considerations

A starter toolkit is now available. It includes the complete existing metadata and download links for information assets that were published on Open Data websites or by other agencies worldwide.

Sample Resources (Participants do not have to use these resources, and NASA in no way endorses any particular entity listed).
- https://project-open-data.cio.gov/schema - Provides a dictionary of existing metadata-fields in the popular data.json catalog that agencies use to prepare and upload information assets to their Open Data portals.
- http://www.engagedata.eu - The European Engage project that uses a crowdsourcing technique to address a similar problem.
- http://www.opendataresearch.org/project/2013/odb - The Open Data Barometer project that uses an expert system to address a similar problem.
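
As a starting point for working with the data.json catalog mentioned in the first resource, the sketch below reads a catalog and reports which assets carry few keywords. It assumes the `dataset`, `identifier`, `title`, `keyword`, and `distribution`/`downloadURL` fields of the Project Open Data metadata schema; the sample catalog, its identifiers, and its URL are invented for illustration and should be replaced with a real catalog file.

```python
import json
from collections import defaultdict

# Made-up catalog in the general shape of a data.json file (illustration only).
SAMPLE_CATALOG = """
{
  "dataset": [
    {
      "identifier": "example-001",
      "title": "Example water recycling telemetry",
      "keyword": ["Water", "life support"],
      "distribution": [{"downloadURL": "https://example.org/water.csv"}]
    },
    {
      "identifier": "example-002",
      "title": "Example crater imagery index",
      "keyword": ["moon"],
      "distribution": []
    }
  ]
}
"""


def summarize_catalog(catalog):
    """Map each dataset identifier to its title, keywords and download links."""
    summary = {}
    for entry in catalog.get("dataset", []):
        downloads = [
            dist["downloadURL"]
            for dist in entry.get("distribution", [])
            if "downloadURL" in dist
        ]
        summary[entry["identifier"]] = {
            "title": entry.get("title", ""),
            # Normalize case and whitespace so duplicates collapse.
            "keywords": sorted({k.strip().lower() for k in entry.get("keyword", [])}),
            "downloads": downloads,
        }
    return summary


def keyword_index(summary):
    """Invert the summary: keyword -> set of dataset identifiers using it."""
    index = defaultdict(set)
    for identifier, info in summary.items():
        for keyword in info["keywords"]:
            index[keyword].add(identifier)
    return dict(index)


def sparsely_tagged(summary, min_keywords=2):
    """List identifiers of datasets with fewer than min_keywords keywords."""
    return sorted(
        identifier
        for identifier, info in summary.items()
        if len(info["keywords"]) < min_keywords
    )


if __name__ == "__main__":
    summary = summarize_catalog(json.loads(SAMPLE_CATALOG))
    print("Needs more keywords:", sparsely_tagged(summary))
```

Datasets flagged by `sparsely_tagged` are natural first candidates for keyword discovery; the `min_keywords` cutoff is an arbitrary choice.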

The following projects are solving this challenge:
- Ecosystem Treasure Hunting Team
  This will be updated.
- NLP_HACKER_CLAN
  The aim of this project is not to create a list of keywords, but to devise a method that allows semantic recognition of entered keywords, and newly formed keywords, to provide reference to cross-agency documentation and other information, as well as external documentation via the web. T...
- VMCNSS (VMware Symantec Microsoft Cisco NetApp Solution).
  Project Name: VMCNSS (VMware Symantec Microsoft Cisco NetApp Solution). My ambition is to collect & recovery to protect all created & develop Data as done by hard work by resource. My project vision will achieve as below step by step Solution Infrastructure: 1. Make network infrastructure. 2...
- DHM - Data Hunters Macedonia
  DHM (Data Hunters Macedonia) 1. Already achieved 1.1. Discover new keywords using an API. 1.2. Storage and categorization of the gathered keywords 1.4. Using them in a ("web") application that we developed for Crowd-Sourcing 2. Future plans (Software) 2.0. Ranking the gathered information...
- Degrees of Data
  The problem facing data mining is that it's difficult to find relevant data given a keyword due to badly tagged data. We aim to solve this using the Twitter API. Humans have already tagged their tweets with relevant hashtags (that are more often than not related to each other). With this, we sear...
- KeyRecommender
  Our team will propose a Statistical Machine Learning model to learn the semantics of the text in an unsupervised way. We will use the Word Vector representation model to represent the text, then apply clustering techniques to identify the relevance between words.
- Metatron
  Metatron: Metadata improvements for public data sets. Metatron is a public dataset search portal for people who don't know in advance what data set they might want, comprising a data search front end with a catalog of general topics and a cloud-based back end comprising: Robo...
- Keyword Distillery
  Keyword Distillery, NASA Space Apps Challenge 2015 (Data Treasure Hunting). Generates a ranked list of keywords. Each keyword can generate a ranked list of databases which relate to it, ranked by the strength of the relationship. Usage: Generate a list of keywords you would like to map,...
- An Extendible Bluemix Framework for Improving keywords and themes in data.json Files.
  This framework uses Bluemix services to accomplish the following: Allow data set owners to solicit input about their data sets related to keywords and themes. The framework uses Relationship Extraction and Visual Recognition services to pull entities from user's social media feeds such as Fac...
- Smart Suit for Travelers
  Creation and conception of a multiplatform application Linked to a space suit 3 different space and time circumstances analysis -real time healthcare support -data mining -Enhance simulations -Adapt environment -Surpass time limit in space i
- Ap^2 - Analitycs Projections Program
  The aim of our project is to solve the problem of open data connection. We have used the NASA Open data and crossed them with other external open data to trace relationships between space and earth activities. Through a mobile app, users can choose existing keywords or add new ones and see how ...
- Kvasir
  Kvasir, pronounced Kwah-seer, is a tool which aims to create a centralized place for users to find open government data. This project aims to use open source data visualization tools to represent the government data in a much more user-friendly fashion. We plan to implement maps for data ...
- SMART SPACE SUIT
  Creation and conception of a multiplatform application Linked to a space suit 3 different space and time circumstances analysis real time healthcare support data mining Enhance simulations Adapt environment Surpass time limit in space
- Open Data Social Gold Miner
  Open Data Gold Digger (formerly known as Open Data Social Gold Miner). This project is focused on taking the conglomerates of raw data and mining this data into smaller streams of useful, relevant data. We have done this by utilizing the Project Open Data Metadata Schema v1.1 in analyzing a...
- opendatapedia.com
  Objectives: - Generate "new keywords" for opendata assets. - Create a free online and mobile source of easy to use open-data around the world. - Use Crowdsourcing for integrity of information. - dashboard of opendata trending in social. How to achieve: - Free Online Web "opendatapedia.co...
- Data Odyssey
  This project aims to solve the data treasure hunting challenge. This project will provide a manual way for users to search for data extensively when the search algorithms fail to find the relevant data. It will use this data to help create new keywords dynamically which may be used to generat...
- medapp
  I have made an application in which several salts that exist on Mars can be combined to form medicines that can be effective for the human body for surviving on Mars. I have created a database of 25-30 medicines in which I have combined several elements to form the effective...
- miningDTH
  Background: Every day, data from space are generated by NASA; however, these data need to be analysed to allow people to have a better understanding. For this challenge, the APOD catalog was chosen as a case study. The goal of this project is to build a dynamic knowledge using auto generated key...
- epilep.si
  Most (scientific) discoveries come from the outer edges of the »known«. On one hand we have »superhuman« astronauts with robust biosensors (brains) in space, and on the other hand there is an exponentially growing number of persons challenged with epilepsy (sensitive biosensors, brains). We a...
- NYSpaceTag
  NASA has a lot of data, but it's hard to find what it's about, and how it connects to other data. To solve the first, we have built a tagging system that extracts natural keywords from titles and descriptions. We ran this across not only NASA data, but on datasets from all government departments....
Welcome to the collaborative hackpad! You can use this open document to collaborate with others, self organize, or share important data. Please keep in mind that this document is community created and any views, opinions, or links do not reflect an official position of the Space Apps Challenge, NASA, or any of our partners.

Building a team or looking for one to join? Feel free to create a Matchmaking section at the bottom of the document to help in gathering great minds together!

If you want to edit this Hackpad, or have trouble viewing it, please create an account at spaceapps2015.hackpad.com

Data Treasure Hunting Hackpad: https://spaceapps2015.hackpad.com/ocl4A0iKVbo
