Provides a framework that lets data set owners increase the value of their data.json catalog files through end-user input and automatic analysis.

This project addresses the Data Treasure Hunting challenge.


This framework uses Bluemix services to accomplish the following:

Allow data set owners to solicit input on the keywords and themes of their data sets. The framework uses the Relationship Extraction and Visual Recognition services to pull entities from users' social media feeds, such as Facebook, Twitter, and Instagram. These entities are used to find relevant data sets, which are then presented to the user with a request for feedback on their keywords.
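The matching step described above can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes the catalog follows the common data.json layout (a top-level "dataset" list whose entries carry "title", "description", and "keyword"), and it assumes the entities have already been extracted from a user's feed. The function name find_relevant_datasets and the plain substring matching are hypothetical choices made for the example.

```python
def find_relevant_datasets(catalog, entities):
    """Return (title, matched_entities) pairs for data sets that mention an entity.

    Assumption: `catalog` is a parsed data.json dict with a "dataset" list.
    Matching is a plain case-insensitive substring test, chosen for brevity.
    """
    wanted = {e.lower() for e in entities}
    results = []
    for ds in catalog.get("dataset", []):
        # Combine the searchable metadata fields into one lowercase string.
        parts = [ds.get("title", ""), ds.get("description", "")]
        parts.extend(ds.get("keyword", []))
        text = " ".join(parts).lower()
        matched = sorted(e for e in wanted if e in text)
        if matched:
            results.append((ds.get("title", ""), matched))
    # Data sets matching the most entities come first; ties sort by title.
    results.sort(key=lambda r: (-len(r[1]), r[0]))
    return results
```

Each returned data set could then be shown to the user together with its current keywords, so the user can confirm them or suggest better ones.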

Allow data set owners to upload a data.json file and have it processed automatically by analytical APIs to derive themes and keywords.
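As a rough sketch of what automatic keyword derivation might look like, the example below ranks words from each data set's title and description by frequency. This simple term count is a stand-in for the analytical APIs, which are not specified here. The names suggest_keywords and enrich_catalog, the stopword list, and the "suggestedKeyword" field are all hypothetical and are not part of the data.json schema.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real service would use a fuller one.
STOPWORDS = {"the", "and", "for", "from", "with", "that", "this", "are", "was"}

def suggest_keywords(dataset, limit=5):
    """Suggest keywords for one data set entry using simple term frequency."""
    text = f"{dataset.get('title', '')} {dataset.get('description', '')}".lower()
    words = re.findall(r"[a-z]{3,}", text)  # alphabetic words of 3+ letters
    counts = Counter(w for w in words if w not in STOPWORDS)
    existing = {k.lower() for k in dataset.get("keyword", [])}
    # Most frequent first; ties are broken alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [w for w, _ in ranked if w not in existing][:limit]

def enrich_catalog(catalog, limit=5):
    """Attach suggestions to every entry under a hypothetical field name."""
    for ds in catalog.get("dataset", []):
        ds["suggestedKeyword"] = suggest_keywords(ds, limit)
    return catalog
```

Keeping the suggestions in a separate field, rather than overwriting "keyword", would leave the owner's original metadata intact until the suggestions are reviewed.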

The framework is designed to be extensible and could support further capabilities, such as:

  • Accepting the schema attributes of a data set as input, feeding them to the engine, and producing a ready-to-use data.json.
  • A service that analyzes the data in new ways and adds its results to the processing chain.
  • A service that converts the data to another format when needed and updates the data.json with the download location of the new format.
  • A service that creates keywords based on the nature of the data when the data is predominantly numerical.
  • A service that compares and correlates different data sets and generates a new data set from the results.

Project Information

License: IBM Public License 1.0 (IPL-1.0)

Source Code/Project URL:


Prototype -
Overview PDF -
Overview Video -


  • Jack Yarborough
  • David Shen
  • Nick Lloyd