paulhoule/gastrodon

gastrodon

Toolkit to display, analyze, and visualize data and documents based on RDF graphs and the SPARQL query language using Pandas, Jupyter, and other Python ecosystem tools.

Gastrodon Links SPARQL to Pandas

Gastrodon links databases that support the SPARQL protocol (more than ten!) to Pandas (http://pandas.pydata.org/), a popular Python library for analysis of tabular data. Pandas, in turn, is connected to a vast number of visualization, statistics, and machine learning tools, all of which work with Jupyter notebooks. The result is an ideal environment for telling stories that reveal the value of data, ontologies, taxonomies, and models.
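To make the SPARQL-to-table step concrete, here is a minimal sketch of turning a SELECT response in the standard SPARQL 1.1 JSON results format into rows of native Python values. It is not Gastrodon's code: the names `CASTS`, `term_to_python`, and `bindings_to_rows` are hypothetical, the sample payload is made up, and only a handful of XSD datatypes are handled.

```python
import json

XSD = "http://www.w3.org/2001/XMLSchema#"

# Illustrative subset of XSD datatypes mapped to Python constructors.
CASTS = {
    XSD + "integer": int,
    XSD + "decimal": float,
    XSD + "double": float,
    XSD + "boolean": lambda v: v in ("true", "1"),
}

def term_to_python(term):
    """Convert one binding term from the SPARQL JSON results format."""
    if term["type"] == "literal":
        cast = CASTS.get(term.get("datatype"), str)
        return cast(term["value"])
    return term["value"]  # URIs and blank nodes are kept as plain strings here

def bindings_to_rows(payload):
    """Return (column names, list of row dicts) for a SELECT response."""
    columns = payload["head"]["vars"]
    rows = [
        {name: term_to_python(b[name]) if name in b else None for name in columns}
        for b in payload["results"]["bindings"]
    ]
    return columns, rows

# Made-up sample response, shaped like what a SPARQL endpoint returns.
SAMPLE = json.loads("""
{
  "head": {"vars": ["item", "count"]},
  "results": {"bindings": [
    {"item": {"type": "uri", "value": "http://example.org/A"},
     "count": {"type": "literal", "value": "42",
               "datatype": "http://www.w3.org/2001/XMLSchema#integer"}}
  ]}
}
""")

columns, rows = bindings_to_rows(SAMPLE)
print(columns, rows)
```

Rows in this shape can be handed to `pandas.DataFrame(rows, columns=columns)` if Pandas is installed; unbound variables come through as `None`.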

In addition to remote databases, Gastrodon can do SPARQL queries over in-memory RDF graphs (from rdflib). It has facilities to copy subgraphs from one graph to another, making it possible to assemble local graphs that contain facts relevant to a particular decision, work on them intimately, and then store results in a permanent triple store.

Seamless Data Translation

Gastrodon mediates between three data models: (1) RDF, (2) Pandas/NumPy, and (3) native Python. Gastrodon lets you use Python variables in your SPARQL queries simply by prefixing the variable name with ?_. Unlike many RDF libraries, substitution works with both local and remote SPARQL endpoints. Gastrodon works with the Python type system to keep track of details such as "is this variable a URI or a string?" so that you don't have to.
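The idea behind the ?_ convention can be sketched in a few lines of plain Python. This is an illustration only, not how Gastrodon implements it: `Uri`, `to_sparql`, and `substitute` are hypothetical names, only a few Python types are covered, and the regular expression does not know about string literals or comments the way a real SPARQL parser would.

```python
import re

class Uri(str):
    """Hypothetical marker type: a string that should be rendered as an IRI."""

def to_sparql(value):
    """Render a Python value as a SPARQL term (small illustrative subset)."""
    if isinstance(value, Uri):           # check before str: Uri subclasses str
        return f"<{value}>"
    if isinstance(value, bool):          # check before int: bool subclasses int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        escaped = (value.replace("\\", "\\\\")
                        .replace('"', '\\"')
                        .replace("\n", "\\n"))
        return f'"{escaped}"'
    raise TypeError(f"cannot render {type(value).__name__} as a SPARQL term")

PLACEHOLDER = re.compile(r"\?_(\w+)")

def substitute(query, variables):
    """Replace each ?_name in the query with the rendered value of `name`."""
    def replace(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"no Python variable named {name!r}")
        return to_sparql(variables[name])
    return PLACEHOLDER.sub(replace, query)

country = Uri("http://example.org/France")
min_pop = 100000
query = ("SELECT ?city WHERE { ?city ex:country ?_country ; "
         "ex:population ?p . FILTER(?p > ?_min_pop) }")
result = substitute(query, {"country": country, "min_pop": min_pop})
print(result)
```

Because the type of the Python value decides the rendering, the same placeholder becomes an IRI in angle brackets for a `Uri` and a quoted literal for an ordinary string, which is the "URI or string?" bookkeeping described above.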

Query Intelligence

Gastrodon always has your back because it understands SPARQL. Gastrodon automatically keeps track of namespaces and adds the prefix declarations your queries need, keeping them short and sweet. Unlike many RDF libraries, Gastrodon supports variable substitution for queries in both local and remote stores. Gastrodon identifies GROUP BY variables and automatically makes them the index of the resulting Pandas DataFrames so that you can make common visualizations automatically.
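A rough sketch of the two query-inspection steps mentioned above, written with regular expressions rather than a real SPARQL parser. The registry `KNOWN_PREFIXES` and the functions `add_prefixes` and `group_by_variables` are hypothetical and are not Gastrodon's API; the GROUP BY helper only recognizes plain variables, not expressions.

```python
import re

# Hypothetical namespace registry; a real session would build this up over time.
KNOWN_PREFIXES = {
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "ex": "http://example.org/",
}

def add_prefixes(query, prefixes=KNOWN_PREFIXES):
    """Put PREFIX declarations in front of the query for prefixes it uses."""
    declared = set(re.findall(r"\bPREFIX\s+(\w+):", query, flags=re.IGNORECASE))
    # A candidate prefixed name looks like `prefix:local`; the registry lookup
    # below filters out accidental matches.
    used = set(re.findall(r"\b([A-Za-z]\w*):[A-Za-z_]", query))
    needed = sorted(p for p in used if p in prefixes and p not in declared)
    header = "".join(f"PREFIX {p}: <{prefixes[p]}>\n" for p in needed)
    return header + query

def group_by_variables(query):
    """Return the variable names listed in a simple GROUP BY clause."""
    match = re.search(r"\bGROUP\s+BY\s+((?:\?\w+\s*)+)", query, flags=re.IGNORECASE)
    return re.findall(r"\?(\w+)", match.group(1)) if match else []

query = ("SELECT ?type (COUNT(*) AS ?n) "
         "WHERE { ?s a ?type ; rdfs:label ?l } GROUP BY ?type")
full_query = add_prefixes(query)
index_columns = group_by_variables(query)
print(full_query)
print(index_columns)
```

With Pandas available, a result table could then be indexed with `df.set_index(index_columns)`, which is the kind of step that makes grouped results plot sensibly by default.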

Error messages you can understand

Many software packages ignore error handling, which is a big mistake, because poor error handling gets in the way of both everyday use and the learning process. Instead of making excuses, Gastrodon has intelligent error handling which adds to the convenience of data analysis and visualization with Gastrodon.

Jupyter native error messages

[Image: Awful Stack Trace]

Improved Error Messages with Gastrodon

[Image: Good Error Message]

Getting Started

Installation

Gastrodon requires Python 3.7. It is registered in the Python Package Index and can be installed by typing:

pip install gastrodon

on the command line. Note: Gastrodon downloads the packages it requires via pip. If you are running Anaconda (which works great with Gastrodon), you have a second package manager, conda, running in parallel with pip, which can install better versions of important software packages than the ones available from pip. In Anaconda, type the following to create an environment for Gastrodon:

conda create -n gastrodonSandbox python=3.6 anaconda
activate gastrodonSandbox
conda install jupyter IPython pandas matplotlib
pip install gastrodon

Documentation

The major documentation resources for Gastrodon itself are:

The following are reference documentation for tools you will use: