This library provides a NetworkX-style API for Neo4j Graph Data Science. You should be able to use it as you would NetworkX, but the algorithms run against Neo4j.
Here’s how you use it.
First let’s import our libraries and create an instance of the Neo4j driver:
import nxneo4j
from neo4j import GraphDatabase
driver = GraphDatabase.driver("bolt://localhost", auth=(user_name, password))
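The line above leaves user_name and password for you to define. A minimal sketch of one way to supply them, assuming the credentials live in environment variables named NEO4J_USER and NEO4J_PASSWORD (both names are assumptions made for this illustration, not something the library or the driver requires):

```python
import os


def load_credentials(env=None):
    """Return (user_name, password) read from an environment-style mapping.

    NEO4J_USER and NEO4J_PASSWORD are hypothetical variable names chosen
    for this sketch; use whatever your deployment defines.
    """
    env = os.environ if env is None else env
    try:
        return env["NEO4J_USER"], env["NEO4J_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"missing credential variable: {missing}") from None


# Demonstrated with an explicit mapping so the sketch runs without a real environment.
user_name, password = load_credentials(
    {"NEO4J_USER": "neo4j", "NEO4J_PASSWORD": "secret"}
)
```

The resulting pair can then be passed as auth=(user_name, password) when creating the driver.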
And now let’s create a Graph object around the driver:
G = nxneo4j.Graph(driver)
We can create a little graph:
G.add_node(1)
G.add_nodes_from([2, 3])
G.add_edge(1, 2)
G.add_edge(4, 5)
G.add_edges_from([(1, 2), (1, 3), (2, 3)])
And run algorithms against it:
>>> list(nxneo4j.community.label_propagation_communities(G))
[{1, 2, 3}, {4, 5}]
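The result splits into two groups because nodes 4 and 5 share no edge with nodes 1, 2 and 3. The sketch below is plain Python and is not the label propagation algorithm that Neo4j runs; it only computes the connected pieces of the same edge list, which on a graph this small coincide with the communities shown above.

```python
from collections import deque

# Same edges as the example above; nodes 4 and 5 appear only through their edge.
edges = [(1, 2), (4, 5), (1, 2), (1, 3), (2, 3)]


def connected_components(edges):
    """Group nodes that are reachable from one another (undirected)."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)

    seen, components = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from start.
        component, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


components = connected_components(edges)
print(components)
```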
We can also work with an existing graph. You can import the dataset used below by playing the following guide in the Neo4j Browser:
:play http://guides.neo4j.com/data_science/01_eda.html
And then click through the sections that import the data.
This graph contains nodes with a Character label that have INTERACTS1, INTERACTS2, INTERACTS3, and INTERACTS45 relationships between them.
Now let’s create a Graph object that knows about this graph.
First we’ll create a config map that we’ll pass in to the Graph() constructor.
config = {
    "node_label": "Character",
    "relationship_type": None,
    "identifier_property": "name"
}
We set:
- node_label is Character, so that we’ll only consider nodes with that label
- relationship_type is None, so that we’ll consider all relationship types in the graph
- identifier_property is the node property that we’ll use to identify each node from the networkx-neo4j API
G = nxneo4j.Graph(driver, config)
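One way to picture what the three settings select is as a Cypher pattern. The function below is purely illustrative: match_pattern is a hypothetical helper written for this sketch, and the query it builds is an assumption about the general shape of such a selection, not the query the library actually issues.

```python
def match_pattern(config):
    """Build an illustrative Cypher MATCH clause from a config map.

    Hypothetical helper: it only shows which nodes and relationships
    the settings describe, not how the library queries Neo4j.
    """
    label = config["node_label"]
    rel_type = config["relationship_type"]
    prop = config["identifier_property"]

    # An empty label or type means "no restriction" in the pattern.
    node = f":{label}" if label else ""
    rel = f":{rel_type}" if rel_type else ""
    return (
        f"MATCH (a{node})-[r{rel}]-(b{node}) "
        f"RETURN a.{prop} AS source, b.{prop} AS target"
    )


config = {
    "node_label": "Character",
    "relationship_type": None,
    "identifier_property": "name",
}
pattern = match_pattern(config)
print(pattern)
```

Setting relationship_type to one of the types in the graph, such as INTERACTS1, would narrow the pattern to that type only.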
We can find the most influential characters by executing the PageRank algorithm against this graph like this:
sorted_pagerank = sorted(nxneo4j.centrality.pagerank(G).items(), key=lambda x: x[1], reverse=True)
for character, score in sorted_pagerank[:10]:
    print(character, score)
Tyrion-Lannister 11.312183000000001
Stannis-Baratheon 7.211595999999998
Tywin-Lannister 6.997056000000001
Varys 6.228078
Theon-Greyjoy 4.6472225
Sansa-Stark 4.234794
Walder-Frey 3.2763000000000004
Robb-Stark 3.0024044999999995
Samwell-Tarly 2.9787575000000004
Jon-Snow 2.9183989999999995
Hopefully there are some familiar names there!
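The example above sorts every score and then keeps the first ten. When only the top few entries are needed, heapq.nlargest from the standard library avoids the full sort. The sketch below assumes the PageRank result is a dict mapping identifier to score, as the .items() call above suggests; top_k is a hypothetical helper name, and the scores are copied from the output above.

```python
import heapq

# Scores copied from the first four lines of the output above.
pagerank = {
    "Varys": 6.228078,
    "Tywin-Lannister": 6.997056000000001,
    "Tyrion-Lannister": 11.312183000000001,
    "Stannis-Baratheon": 7.211595999999998,
}


def top_k(scores, k):
    """Return the k (name, score) pairs with the highest scores, best first."""
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


top_three = top_k(pagerank, 3)
for character, score in top_three:
    print(character, score)
```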
What about if we want to find the shortest path between characters?
nxneo4j.path_finding.shortest_path(G, "Tyrion-Lannister", "Hodor")
['Tyrion-Lannister', 'Robb-Stark', 'Hodor']
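To make the result concrete, here is a plain-Python breadth-first search over a toy edge list. It is a sketch of the general idea only, not the algorithm Neo4j runs, and it treats relationships as undirected and unweighted, which is an assumption. The first two edges are implied by the path above; the remaining edges are made up for illustration.

```python
from collections import deque

# Toy edges: the first two are implied by the path above, the rest are invented.
edges = [
    ("Tyrion-Lannister", "Robb-Stark"),
    ("Robb-Stark", "Hodor"),
    ("Tyrion-Lannister", "Sansa-Stark"),
    ("Sansa-Stark", "Jon-Snow"),
    ("Jon-Snow", "Hodor"),
]


def shortest_path(edges, source, target):
    """Return a fewest-hops path from source to target, or None if unreachable."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, []).append(b)
        adjacency.setdefault(b, []).append(a)

    previous = {source: None}  # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the predecessor links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in adjacency.get(node, []):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


path = shortest_path(edges, "Tyrion-Lannister", "Hodor")
print(path)
```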
We can also partition the characters into communities:
communities = nxneo4j.community.label_propagation_communities(G)
sorted_communities = sorted(communities, key=lambda x: len(x), reverse=True)
for community in sorted_communities[:10]:
    print(list(community)[:10])
['Josmyn-Peckledon', 'Belwas', 'Rafford', 'Polliver', 'Petyr-Frey', 'Tristifer-IV-Mudd', 'Jeyne-Heddle', 'Urswyck', 'Falyse-Stokeworth', 'Hoster-Blackwood']
['Trystane-Martell', 'Blue-Bard', 'Matthos-Seaworth', 'Marya-Seaworth', 'Mors-Umber', 'Jaehaerys-I-Targaryen', 'Myrcella-Baratheon', 'Justin-Massey', 'Denys-Mallister', 'Clayton-Suggs']
['Oberyn-Martell', 'Nurse', 'Tommen-Baratheon', 'Tanda-Stokeworth', 'Garlan-Tyrell', 'Morgo', 'Qavo-Nogarys', 'Moon-Boy', 'Leonette-Fossoway', 'Allar-Deem']
['Owen', 'Jon-Snow', 'Gerrick-Kingsblood', 'Lanna-(Happy-Port)', 'Maekar-I-Targaryen', 'Gorne', 'Arron', 'Arson', 'Satin', 'Rast']
['Asha-Greyjoy', 'Palla', 'Squirrel', 'Tristifer-Botley', 'Yellow-Dick', 'Lorren', 'Jason-Mallister', 'Benfred-Tallhart', 'Kyra', 'Gynir']
['Harras-Harlaw', 'Baelor-Blacktyde', 'Dunstan-Drumm', 'Ralf-Stonehouse', 'Gorold-Goodbrother', 'Rodrik-Harlaw', 'Talbert-Serry', 'Sigfryd-Harlaw', 'Rodrik-Sparr', 'Wulfe']
['Alliser-Thorne', 'Othell-Yarwyck', 'Jaremy-Rykker', 'Ragwyle', 'Craster', 'Clubfoot-Karl', 'Blane', 'Donal-Noye', 'Halder', 'Mag-Mar-Tun-Doh-Weg']
['Tomard', 'Horton-Redfort', 'Lothor-Brune', 'Myranda-Royce', 'Grisel', 'Merrett-Frey', 'Loras-Tyrell', 'Nestor-Royce', 'Anya-Waynwood', 'Marillion']
['Marq-Piper', 'Rickard-Karstark', 'Margaery-Tyrell', 'Senelle', 'Hallis-Mollen', 'Harren-Hoare', 'Nan', 'Colen-of-Greenpools', 'Desmond-Grell', 'Edmure-Tully']
['Koss', 'Woth', 'Meralyn', 'Mad-Huntsman', 'Dobber', 'Ravella-Swann', 'Ternesio-Terys', 'Yoren', 'Amabel', 'Waif']
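Each community appears to be a set of node identifiers, so the ten names printed per community come out in arbitrary order. A common follow-up is to look up which community a given node belongs to. The sketch below assumes that set-of-identifiers shape and uses toy data matching the small graph from earlier; rank_communities and membership are hypothetical helper names.

```python
# Toy communities with the same shape as the result for the small graph earlier.
communities = [{4, 5}, {1, 2, 3}]


def rank_communities(communities):
    """Return the communities as a list, largest first."""
    return sorted(communities, key=len, reverse=True)


def membership(communities):
    """Map each node to the index of its community in the given list."""
    return {
        node: index
        for index, members in enumerate(communities)
        for node in members
    }


ranked = rank_communities(communities)
lookup = membership(ranked)
print(lookup[1] == lookup[3])  # nodes 1 and 3 share a community
```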
Centrality:
for module in dir(nxneo4j.centrality):
    if not module.startswith("__"):
        print(module)
betweenness_centrality
closeness_centrality
harmonic_centrality
pagerank
Community Detection:
for module in dir(nxneo4j.community):
    if not module.startswith("__"):
        print(module)
average_clustering
clustering
connected_components
label_propagation_communities
number_connected_components
triangles
Path Finding:
for module in dir(nxneo4j.path_finding):
    if not module.startswith("__"):
        print(module)
shortest_path
Shortest path currently only works if you provide both the source and target nodes.
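The three listings above repeat the same loop over dir(). A small helper can factor it out. In this sketch public_names is a hypothetical name, and the standard library's math module stands in for the nxneo4j submodules so that the example runs without the library installed.

```python
import math


def public_names(module):
    """Return the attribute names of a module that do not start with a double underscore."""
    return [name for name in dir(module) if not name.startswith("__")]


# math is only a stand-in; pass nxneo4j.centrality, nxneo4j.community
# or nxneo4j.path_finding to reproduce the listings above.
names = public_names(math)
print(names[:5])
```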
Not all the algorithms are translated yet. These are next on the list:
- Shortest path
- A-star