We analyse data from the Astronomy Picture of the Day (APOD). Every day, a picture and a brief description of it are published. First, the description is analysed using natural language processing and data mining techniques. Second, a ranked list of keywords related to the picture is built. Finally, an API allows developers to obtain the top keywords for the picture of a specific date. In addition, a webpage was created to show how the API works.

This project is solving the Data Treasure Hunting challenge.



NASA generates data from space every day; however, these data need to be analysed so that people can understand them better. For this challenge, the APOD catalog was chosen as a case study. The goal of this project is to build a dynamic knowledge base from automatically generated keywords. The first step is to create a ranked list of words related to the picture. The API then serves this data so that the knowledge can be displayed in a friendly way, such as charts, videos, images and links. At the moment, a webpage that consumes the API and shows the knowledge via charts has been built.


First, a Google search on the picture title is performed to obtain extra knowledge about the picture. Second, nouns, proper nouns and adjectives are extracted from the knowledge gathered and from the picture description. Finally, a logarithmic term frequency technique is applied to rank the words.
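
The ranking step can be illustrated with a minimal Python sketch. It is not the project's actual code: the function name rank_keywords, the sample word list, and the choice of the common weighting 1 + log10(count) are all assumptions, since the exact variant of logarithmic term frequency used is not stated here. The sketch also assumes the words have already been filtered to nouns, proper nouns and adjectives.

```python
import math
from collections import Counter


def rank_keywords(tokens, top_k=5):
    """Rank words by logarithmic term frequency (assumed form: 1 + log10(count))."""
    # Count case-insensitively, so "Orion" and "orion" are the same term.
    counts = Counter(token.lower() for token in tokens)
    # The logarithm dampens the effect of very frequent words.
    weights = {term: 1.0 + math.log10(count) for term, count in counts.items()}
    # Highest weight first; ties are broken alphabetically for a stable order.
    ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


# Hypothetical candidate words, as if already extracted from a description.
candidates = ["nebula", "star", "nebula", "Orion", "bright", "star", "nebula", "gas"]
ranking = rank_keywords(candidates, top_k=3)
for term, weight in ranking:
    print(f"{term}: {weight:.3f}")
```

With this weighting, a word that appears three times scores about 1.48 rather than three times the score of a word that appears once, so repeated words lead the list without dominating it.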

Project Information

License: MIT license (MIT)

Source Code/Project URL:




  • Wagner Souza Santos
  • Cooper Li