Python Object Graphs

Build Status Build Status (Windows) Test Coverage Documentation Status

objgraph is a module that lets you visually explore Python object graphs.

You’ll need graphviz if you want to draw the pretty graphs.

I recommend xdot for interactive use. pip install xdot should suffice; objgraph will automatically look for it in your PATH.

Installation and Documentation

pip install objgraph or download it from PyPI.

Documentation lives at https://mg.pov.lt/objgraph.

Quick start

Try this in a Python shell:

>>> x = []
>>> y = [x, [x], dict(x=x)]
>>> import objgraph
>>> objgraph.show_refs([y], filename='sample-graph.png')
Graph written to ....dot (... nodes)
Image generated as sample-graph.png

(If you’ve installed xdot, omit the filename argument to get the interactive viewer.)

You should see a graph like this:

[graph of objects reachable from y]

If you prefer to handle your own file output, you can provide a file object to the output parameter of show_refs and show_backrefs instead of a filename. The contents of this file will contain the graph source in DOT format.

Backreferences

Now try

>>> objgraph.show_backrefs([x], filename='sample-backref-graph.png')
... 
Graph written to ....dot (8 nodes)
Image generated as sample-backref-graph.png

and you’ll see

[graph of objects from which y is reachable]

Memory leak example

The original purpose of objgraph was to help me find memory leaks. The idea was to pick an object in memory that shouldn’t be there and then see what references are keeping it alive.

To get a quick overview of the objects in memory, use the imaginatively-named show_most_common_types():

>>> objgraph.show_most_common_types() 
tuple                      5224
function                   1329
wrapper_descriptor         967
dict                       790
builtin_function_or_method 658
method_descriptor          340
weakref                    322
list                       168
member_descriptor          167
type                       163

But that’s looking for a small needle in a large haystack. Can we limit our haystack to objects that were created recently? Perhaps.

Let’s define a function that “leaks” memory

>>> class MyBigFatObject(object):
...     pass
...
>>> def computate_something(_cache={}):
...     _cache[42] = dict(foo=MyBigFatObject(),
...                       bar=MyBigFatObject())
...     # a very explicit and easy-to-find "leak" but oh well
...     x = MyBigFatObject() # this one doesn't leak

We take a snapshot of all the objects counts that are alive before we call our function

>>> objgraph.show_growth(limit=3) 
tuple                  5228      5228
function               1330      1330
wrapper_descriptor      967       967

and see what changes after we call it

>>> computate_something()
>>> objgraph.show_growth() 
MyBigFatObject        2         2
dict                797         1

It’s easy to see MyBigFatObject instances that appeared and were not freed. I can pick one of them at random and trace the reference chain back to one of the garbage collector’s roots.

For simplicity’s sake let’s assume all of the roots are modules. objgraph provides a function, is_proper_module(), to check this. If you’ve any examples where that isn’t true, I’d love to hear about them (although see Reference counting bugs).

>>> import random
>>> objgraph.show_chain(
...     objgraph.find_backref_chain(
...         random.choice(objgraph.by_type('MyBigFatObject')),
...         objgraph.is_proper_module),
...     filename='chain.png') 
Graph written to ...dot (13 nodes)
Image generated as chain.png
[chain of references from a module to a MyBigFatObject instance]

It is perhaps surprising to find linecache at the end of that chain (apparently doctest monkey-patches it), but the important things – computate_something and its cache dictionary – are in there.

There are other tools, perhaps better suited for memory leak hunting: heapy, Dozer.

Reference counting bugs

Bugs in C-level reference counting may leave objects in memory that do not have any other objects pointing at them. You can find these by calling get_leaking_objects(), but you’ll have to filter out legitimate GC roots from them, and there are a lot of those:

>>> roots = objgraph.get_leaking_objects()
>>> len(roots) 
4621
>>> objgraph.show_most_common_types(objects=roots)
... 
tuple          4333
dict           171
list           74
instancemethod 4
listiterator   2
MemoryError    1
Sub            1
RuntimeError   1
Param          1
Add            1
>>> objgraph.show_refs(roots[:3], refcounts=True, filename='roots.png')
... 
Graph written to ...dot (19 nodes)
Image generated as roots.png
[GC roots and potentially leaked objects]

API Documentation

More examples, that also double as tests

History

I’ve developed a set of functions that eventually became objgraph when I was hunting for memory leaks in a Python program. The whole story – with illustrated examples – is in this series of blog posts:

And here’s the change log

Support and Development

The source code can be found in this Git repository: https://github.com/mgedmin/objgraph.

To check it out, use git clone https://github.com/mgedmin/objgraph.

Report bugs at https://github.com/mgedmin/objgraph/issues.

For more information, see Hacking on objgraph.