Map Reduce for Notebooks
Project description
Papermill is a tool for parameterizing, executing, and analyzing Jupyter Notebooks.
The goals for Papermill are:
Parameterizing notebooks
Executing and collecting metrics across the notebooks
Summarizing collections of notebooks
Installation
pip install papermill
In-Notebook bindings
Usage
Parameterizing a Notebook
To parameterize your notebook, designate a cell with the tag parameters. Papermill looks for the parameters cell and replaces the values defined there with the parameters passed in at execution time.
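As an illustration, a parameters cell is an ordinary code cell that holds default values, and the tag lives in the cell's metadata inside the notebook file. The sketch below builds such a cell as a plain dictionary, assuming the nbformat v4 cell layout. The names alpha and ratio mirror the execution examples in this document; the default values and the is_parameters_cell helper are hypothetical and not part of Papermill.

```python
import json

# A code cell as it might appear in a notebook file (nbformat v4 layout assumed).
# The "parameters" tag in the cell metadata is what marks the cell.
parameters_cell = {
    "cell_type": "code",
    "execution_count": None,
    "metadata": {"tags": ["parameters"]},
    "outputs": [],
    "source": ["alpha = 0.1\n", "ratio = 0.5"],  # hypothetical default values
}


def is_parameters_cell(cell):
    """Return True if the cell carries the 'parameters' tag (hypothetical helper)."""
    return "parameters" in cell.get("metadata", {}).get("tags", [])


print(is_parameters_cell(parameters_cell))
print(json.dumps(parameters_cell["metadata"]))
```

In practice the tag is usually added through the notebook editor's cell tag interface rather than by editing the JSON by hand.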
Executing a Notebook
The two ways to execute the notebook with parameters are through the Python API and through the command line interface.
Executing a Notebook via Python API
import papermill as pm

pm.execute_notebook(
    notebook_path='path/to/input.ipynb',
    output_path='path/to/output.ipynb',
    parameters=dict(alpha=0.6, ratio=0.1)
)
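To run the same notebook over several parameter combinations (the "map" step), one option is to expand a parameter grid into a list of jobs and execute each one. This is a minimal sketch, not a Papermill feature: build_jobs, run_jobs, and the output file naming scheme are hypothetical, and run_jobs assumes the execute_notebook keyword arguments shown above.

```python
import itertools
import os


def build_jobs(input_path, output_dir, grid):
    """Expand a parameter grid into (input, output, parameters) jobs."""
    names = sorted(grid)
    jobs = []
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        # Encode the parameter values in the output file name (hypothetical scheme).
        suffix = "_".join(f"{name}-{params[name]}" for name in names)
        output_path = os.path.join(output_dir, f"output_{suffix}.ipynb")
        jobs.append((input_path, output_path, params))
    return jobs


def run_jobs(jobs):
    """Execute every job; assumes the execute_notebook call shown above."""
    import papermill as pm  # imported lazily so build_jobs works without papermill

    for input_path, output_path, params in jobs:
        pm.execute_notebook(
            notebook_path=input_path,
            output_path=output_path,
            parameters=params,
        )


jobs = build_jobs("input.ipynb", "results", {"alpha": [0.6, 0.9], "ratio": [0.1]})
for _, output_path, params in jobs:
    print(output_path, params)
```

Calling run_jobs(jobs) would then produce one output notebook per parameter combination in the results directory.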
Executing a Notebook via CLI
$ papermill local/input.ipynb s3://bkt/output.ipynb -p alpha 0.6 -p l1_ratio 0.1
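When the command line call needs to be assembled from a script, the arguments can be built from a dictionary. The sketch below only uses the positional input and output paths and the -p flag shown in the command above; build_papermill_command is a hypothetical helper.

```python
def build_papermill_command(input_path, output_path, parameters):
    """Build the argument list for the papermill CLI call shown above."""
    command = ["papermill", input_path, output_path]
    for name, value in parameters.items():
        # Each parameter becomes one "-p NAME VALUE" triple.
        command += ["-p", name, str(value)]
    return command


command = build_papermill_command(
    "local/input.ipynb",
    "s3://bkt/output.ipynb",
    {"alpha": 0.6, "l1_ratio": 0.1},
)
print(" ".join(command))
```

The resulting list can be handed to subprocess.run(command, check=True) on a machine where papermill is installed and on the PATH.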
Recording Values to the Notebook
Users can save values to the notebook document to be consumed by other notebooks.
Record values to be saved with the notebook:
### notebook.ipynb
import papermill as pm
pm.record("hello", "world")
pm.record("number", 123)
pm.record("some_list", [1,3,5])
pm.record("some_dict", {"a":1, "b":2})
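Because recorded values are stored inside the notebook document, it is reasonable to assume they should be JSON-serializable; this is an assumption about the storage format, not something stated here. The sketch below checks candidate values before recording them. The is_json_serializable helper and the a_set entry are hypothetical.

```python
import json


def is_json_serializable(value):
    """Return True if value can be encoded as JSON (hypothetical pre-check)."""
    try:
        json.dumps(value)
    except (TypeError, ValueError):
        return False
    return True


candidates = {
    "hello": "world",
    "number": 123,
    "some_list": [1, 3, 5],
    "some_dict": {"a": 1, "b": 2},
    "a_set": {1, 2},  # sets have no JSON representation
}
recordable = {name: is_json_serializable(value) for name, value in candidates.items()}
print(recordable)
```

Values that fail the check, such as the set, could be converted to a list or dictionary first.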
Users can recover those values as a pandas DataFrame via the read_notebook function.
### summary.ipynb
import papermill as pm
nb = pm.read_notebook('notebook.ipynb')
nb.dataframe
Displaying Plots and Images Saved by Other Notebooks
Display a matplotlib histogram with the key name “matplotlib_hist”.
### notebook.ipynb
# Import plt and turn off interactive plotting to avoid double plotting.
import papermill as pm
import matplotlib.pyplot as plt; plt.ioff()
from ggplot import mpg
f = plt.figure()
plt.hist('cty', bins=12, data=mpg)
pm.display('matplotlib_hist', f)
Read in the notebook above and display the plot saved under “matplotlib_hist”.
### summary.ipynb
import papermill as pm
nb = pm.read_notebook('notebook.ipynb')
nb.display_output('matplotlib_hist')
Analyzing a Collection of Notebooks
Papermill can read in a directory of notebooks and provides the NotebookCollection interface for operating on them.
### summary.ipynb
import papermill as pm
nbs = pm.read_notebooks('/path/to/results/')
# Show named plot from 'train_1.ipynb'
# Accepts a key or list of keys to plot in order.
nbs.display_output('train_1.ipynb', 'matplotlib_hist')
# Dataframe for all notebooks in collection
nbs.dataframe.head(10)
File details
Details for the file papermill-0.6.4.tar.gz.
File metadata
- Download URL: papermill-0.6.4.tar.gz
- Upload date:
- Size: 24.4 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 37ff912194df106ecbc420ec62972f58c5276f6e9c23389379aea2ba8a9c556d
MD5 | eeaed8d8068b8f716bf68e5f76a45eb8
BLAKE2b-256 | 068b7fb8dc8d46d125525f553443b04aed0d352604741f37a8571bedb58a536c
File details
Details for the file papermill-0.6.4-py2-none-any.whl.
File metadata
- Download URL: papermill-0.6.4-py2-none-any.whl
- Upload date:
- Size: 11.6 kB
- Tags: Python 2
- Uploaded using Trusted Publishing? No
File hashes
Algorithm | Hash digest
---|---
SHA256 | 13307a996f1843fe7f141bc3514e936bfa17212914d009f01d459259d21d3f2b
MD5 | 13e80fe4b8b4fd78f23bc7871fd6f378
BLAKE2b-256 | 45d416eca205016853c4941a7b6a72b5af6ff9b4c595127ba6b6ed66d3850a48