Amazon Web Services’ latest open source project, Graph Notebook, lets data scientists analyze and visualize information stored in their companies’ graph databases. It is built from software components developed for Amazon Neptune, a managed graph database service.
AWS, Amazon’s cloud computing arm, announced the project on Wednesday.
Graph databases are designed to store individual records, such as customer names and dollar values, along with the relationships between them. The ability to establish connections between data points is where the real value of Graph Notebook lies.
Data is daunting, but not for Graph Notebook
Graph Notebook is designed to make it easy to work with information kept in graph databases. Even expert data scientists have a hard time manually finding patterns in a large collection of records, so it makes sense to convert the raw data into a visualization, which Graph Notebook makes possible.
Data scientists can use it to write queries in a Jupyter notebook that extract the specific subset of a database they want to work with. Graph Notebook then visualizes the query results.
The tool supports two query languages in particular: Gremlin, which works with a wide variety of major graph databases, and SPARQL, which is somewhat more niche.
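To make the idea of "extracting a subset" concrete, here is a minimal sketch in plain Python rather than in Gremlin or SPARQL. The vertices, edge labels, and the `orders_placed_by` helper are all hypothetical and exist only for illustration; they are not part of Graph Notebook or Neptune. The function does roughly what a Gremlin traversal such as `g.V().has('customer', 'name', 'Alice').out('placed').values('total')` would express against a real graph database.

```python
# Hypothetical in-memory graph (illustration only, not the Graph Notebook API).
# Vertices carry a label and properties; edges are (source, label, target).
vertices = {
    1: {"label": "customer", "name": "Alice"},
    2: {"label": "customer", "name": "Bob"},
    3: {"label": "order", "total": 120.0},
    4: {"label": "order", "total": 35.5},
}
edges = [
    (1, "placed", 3),
    (2, "placed", 4),
]


def orders_placed_by(name):
    """Return the totals of all orders reachable from the named customer."""
    # Step 1: find the customer vertices with the requested name.
    customer_ids = {
        vid
        for vid, props in vertices.items()
        if props["label"] == "customer" and props.get("name") == name
    }
    # Step 2: follow outgoing "placed" edges and read each order's total.
    return [
        vertices[dst]["total"]
        for src, label, dst in edges
        if src in customer_ids and label == "placed"
    ]


print(orders_placed_by("Alice"))
```

In a real notebook, the query would be sent to the database and only the matching subset would come back for visualization.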
The ingenuity of Graph Notebook
Both SPARQL and Gremlin are designed specifically for querying data stored in graph format, and Amazon Neptune supports both languages.
AWS said Graph Notebook can be used with the Neptune service or in other environments, such as an EC2 instance or a personal computer.
Query results are visualized as a chart of interconnected objects. Suppose the data describes warehouses and the travel routes delivery trucks take between them. In that case, the visualization represents each warehouse as a circle and each route between locations as a connection between the circles.
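The warehouse example can be sketched as a small data transformation. The snippet below is an illustrative assumption about how route records might be split into circles (nodes) and connections (links) before drawing; the warehouse names, distances, and the `to_nodes_and_links` helper are hypothetical and do not reflect Graph Notebook's internals.

```python
# Hypothetical query result: one row per delivery route between two warehouses.
routes = [
    {"from": "A", "to": "B", "km": 120},
    {"from": "B", "to": "C", "km": 75},
    {"from": "A", "to": "C", "km": 160},
]


def to_nodes_and_links(rows):
    """Split route rows into unique nodes (circles) and links (connections)."""
    # Every warehouse that appears at either end of a route becomes one node.
    nodes = sorted({r["from"] for r in rows} | {r["to"] for r in rows})
    # Every route becomes one link, keeping its distance as an attribute.
    links = [(r["from"], r["to"], r["km"]) for r in rows]
    return nodes, links


nodes, links = to_nodes_and_links(routes)
print(nodes)
print(links)
```

A visualization layer would then draw one circle per entry in `nodes` and one line per entry in `links`.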