
Storage Workflows for Notebooks

bookstore :books:

bookstore :books: provides tooling and workflow recommendations for storing :cd:, scheduling :calendar:, and publishing :book: notebooks.

The full documentation is hosted on ReadTheDocs.

How does bookstore work?

Automatic Notebook Versioning

Every save of a notebook creates an immutable copy of the notebook on object storage.

To simplify implementation, we currently rely on S3 as the object store, using versioned buckets.
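The versioning behavior described above can be illustrated with a toy in-memory model. This is only a sketch of the idea, not the S3 API and not bookstore's implementation: the class `ToyVersionedBucket`, its methods, and the version ID format are all hypothetical.

```python
import hashlib


class ToyVersionedBucket:
    """In-memory stand-in for a versioned object store (hypothetical, not S3)."""

    def __init__(self):
        # Maps key -> list of (version_id, body), oldest first.
        self._versions = {}

    def put(self, key, body):
        """Store a new immutable version under key and return its version ID."""
        history = self._versions.setdefault(key, [])
        digest = hashlib.sha256(body).hexdigest()[:8]
        version_id = f"{len(history) + 1}-{digest}"
        history.append((version_id, body))
        return version_id

    def get(self, key, version_id=None):
        """Return the latest body, or the body stored under a specific version ID."""
        history = self._versions.get(key)
        if not history:
            raise KeyError(key)
        if version_id is None:
            return history[-1][1]
        for vid, body in history:
            if vid == version_id:
                return body
        raise KeyError(version_id)

    def list_versions(self, key):
        """Return all version IDs for key, oldest first."""
        return [vid for vid, _ in self._versions.get(key, [])]


# Two saves of the same notebook path produce two retrievable versions.
bucket = ToyVersionedBucket()
key = "workspace/kylek/notebooks/mine.ipynb"
first = bucket.put(key, b'{"cells": []}')
second = bucket.put(key, b'{"cells": ["print(1)"]}')
```

The path stays the same across saves; only the list of version IDs grows, so an older copy remains reachable by its version ID.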

Storage Paths

All notebooks are archived to a single versioned S3 bucket with specific prefixes denoting the lifecycle of the notebook:

  • /workspace - where users edit
  • /published - public notebooks (to an organization)

Each notebook path is a namespace that an external service ties into the schedule. We archive off versions, keeping the path intact (until a user changes it).

Prefix                                   Intent
/workspace/kylek/notebooks/mine.ipynb    Notebook in “draft”
/published/kylek/notebooks/mine.ipynb    Current published copy
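Because the draft and published copies differ only in their prefix, the relationship between the two keys can be sketched as a prefix swap. The helper `swap_prefix` below is hypothetical and only illustrates the naming scheme; it is not part of bookstore.

```python
def swap_prefix(key, old="/workspace", new="/published"):
    """Return key with its leading lifecycle prefix replaced.

    Raises ValueError if key does not live under the old prefix.
    """
    if key != old and not key.startswith(old + "/"):
        raise ValueError(f"{key!r} is not under {old!r}")
    return new + key[len(old):]


draft_key = "/workspace/kylek/notebooks/mine.ipynb"
published_key = swap_prefix(draft_key)
```

Checking for `old + "/"` rather than just `old` keeps a key such as `/workspaces/...` from being mistaken for one under `/workspace`.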

Scheduled notebooks will also be referred to by the notebook key, and we'll need to be able to surface version IDs as well.

Transitioning to this Storage Plan

Since most people are on a regular filesystem, we'll start with writing to the /workspace prefix as Archival Storage (writing on save using a post_save_hook for a Jupyter contents manager).
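As a rough illustration of archiving on save, the sketch below defines a hook with the keyword arguments that the classic notebook's `FileContentsManager.post_save_hook` receives (`model`, `os_path`, `contents_manager`). It copies each saved notebook into a local directory that stands in for the /workspace prefix. The directory `ARCHIVE_ROOT` and the function `archive_on_save` are assumptions made for this example; bookstore itself writes to S3 and may be structured differently.

```python
import os
import shutil
import tempfile

# Hypothetical local directory standing in for the /workspace prefix of the bucket.
ARCHIVE_ROOT = os.path.join(tempfile.gettempdir(), "bookstore-archive-sketch", "workspace")


def archive_on_save(model, os_path, contents_manager=None, **kwargs):
    """Copy a just-saved notebook into ARCHIVE_ROOT and return the archived path.

    Returns None for anything that is not a notebook.
    """
    if model.get("type") != "notebook":
        return None
    # Keep the notebook's path relative to the server root, when it is known.
    root_dir = getattr(contents_manager, "root_dir", None) or os.path.dirname(os_path)
    rel_path = os.path.relpath(os_path, root_dir)
    target = os.path.join(ARCHIVE_ROOT, rel_path)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    shutil.copyfile(os_path, target)
    return target


# In jupyter_notebook_config.py, the hook would be registered with:
# c.FileContentsManager.post_save_hook = archive_on_save
```

The local file on the regular filesystem remains the working copy; the hook only adds an archival write after each save.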

Publishing

The bookstore publishing endpoint is a serverextension to the classic Jupyter server. This means you will need to explicitly enable the serverextension to use the endpoint.

To do so, run:

jupyter serverextension enable --py bookstore

To enable it only for the current environment, run:

jupyter serverextension enable --py bookstore --sys-prefix

Installation

bookstore requires Python 3.6 or higher.

Note: This requirement applies to the Python environment of the Jupyter server where bookstore is installed. Your notebooks can still be run in Python 2 or Python 3.

  1. Clone this repo.
  2. At the repo's root, enter in the Terminal: python3 -m pip install . (Tip: don't forget the dot at the end of the command)

Configuration

# jupyter config
# At ~/.jupyter/jupyter_notebook_config.py for user installs on macOS
# See https://jupyter.readthedocs.io/en/latest/projects/jupyter-directories.html for other places to plop this

from bookstore import BookstoreContentsArchiver

c.NotebookApp.contents_manager_class = BookstoreContentsArchiver

# All Bookstore settings are centralized on one config object so you don't have to configure it for each class
c.BookstoreSettings.workspace_prefix = "/workspace/kylek/notebooks"
c.BookstoreSettings.published_prefix = "/published/kylek/notebooks"

c.BookstoreSettings.s3_bucket = "<bucket-name>"

# Note: if bookstore is used from an EC2 instance with the right IAM role, you don't
# have to specify these
c.BookstoreSettings.s3_access_key_id = "<AWS Access Key ID / IAM Access Key ID>"
c.BookstoreSettings.s3_secret_access_key = "<AWS Secret Access Key / IAM Secret Access Key>"
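One way to avoid hard-coding credentials in the config file is to read them from the environment. The sketch below is an assumption-laden example rather than a documented bookstore feature: the helper `s3_credentials_from_env` is hypothetical, and it assumes the conventional `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` variable names.

```python
import os


def s3_credentials_from_env(environ=os.environ):
    """Return the bookstore S3 settings found in environ, skipping unset ones."""
    mapping = {
        "s3_access_key_id": "AWS_ACCESS_KEY_ID",
        "s3_secret_access_key": "AWS_SECRET_ACCESS_KEY",
    }
    return {
        setting: environ[var]
        for setting, var in mapping.items()
        if environ.get(var)
    }


# In jupyter_notebook_config.py, where Jupyter provides the config object `c`:
# for name, value in s3_credentials_from_env().items():
#     setattr(c.BookstoreSettings, name, value)
```

If neither variable is set, nothing is assigned, which matches the case noted above where an IAM role supplies the credentials.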

Developing

If you are developing bookstore, you will want to run the CI tests locally and make releases.

Use CONTRIBUTING.md to learn more about contributing. Use running_ci_locally.md to learn more about running CI tests locally. Use running_python_tests.md to learn about running tests locally. Use RELEASING.md to learn more about releasing bookstore.
