Please refer to the v1.0.1 documentation; the code for v1.0 is identical to the code for v1.0.1.
See https://docs.cleanlab.ai/ if you want to browse the documentation (including for past versions).
In the `cleanlab` repository, we've configured GitHub Actions to perform the following automatically:
- When a commit is pushed to the `master` branch, a new version of the `master` docs will be built and deployed to the `cleanlab-docs` repository.
- When a release is published, a new version of the docs with the corresponding release tag will be built and deployed as a new folder in the `cleanlab-docs` repository. Redirection to the `stable` version of the docs will be changed to this newly released one, accessible via a link on the docs' site sidebar. All the older versions will remain available in the `cleanlab-docs` repo, accessible by manually entering the subdirectory in the URL.
- When a user manually runs the workflow, one of the above will happen depending on the user's selection to run from a `branch` or `tag`.
If you'd like to build our docs locally or remotely yourself, or want to know more about the steps taken in the GitHub Pages workflow, read on!
pip install -r docs/requirements.txt
- Install Pandoc.
- If you don't already have it, install wget. This can be done with `brew` on macOS: `brew install wget`
- [Optional] Create a new branch, make your code changes, and then `git commit` them. ONLY COMMITTED CHANGES WILL BE REFLECTED IN THE DOCS BUILD WITH `sphinx-multiversion`. Use `sphinx-build` instead if you don't want to commit some test changes but still want to see their corresponding docs.
- Build the docs with either:
  - If you're building from a branch (usually the `master` branch):
    sphinx-multiversion docs/source cleanlab-docs -D smv_branch_whitelist=YOUR_BRANCH_NAME -D smv_tag_whitelist=None
  - If you're building from a tag (usually the tag of the stable release):
    sphinx-multiversion docs/source cleanlab-docs -D smv_branch_whitelist=None -D smv_tag_whitelist=YOUR_TAG_NAME
    Note: To also build docs for another branch or tag, run the above command again changing only the `YOUR_BRANCH_NAME` or `YOUR_TAG_NAME` placeholder.
  - If you want to test out some changes without committing them, you can build from your current working directory tree (where any uncommitted changes are saved locally):
    sphinx-build docs/source cleanlab-docs
    This won't properly produce/display other versions of the docs, but that shouldn't matter if you are just trying to test some local edits to the current version. If some notebooks are giving you trouble (e.g. due to runtime or dependencies), you can simply delete those .ipynb files before calling `sphinx-build`.

  Fast build: Executing the Jupyter Notebooks (i.e., the `.ipynb` files) that make up some portion of the docs, such as the tutorials, takes a long time. If you want to skip rendering these, set the environment variable `SKIP_NOTEBOOKS=1`. You can either set this using `export SKIP_NOTEBOOKS=1` or do this inline with `SKIP_NOTEBOOKS=1 sphinx-multiversion ...`.

  Skipping specific notebooks: If you want to skip rendering a few specific notebooks during your local build, the best way to do this is to temporarily move the files outside the `cleanlab` folder (so `nbsphinx` will not find them), then build the docs, and finally move the files back (to ensure they will not be deleted when pushed to GitHub).

  Example workflow for skipping notebooks, given that our current working directory is the `cleanlab` root folder and we want to ignore the `audio.ipynb` notebook:
  - Create an empty folder outside of the cleanlab folder:
    mkdir ../ignore_notebooks
  - Move the notebook to ignore in the local build to the newly created folder:
    mv docs/source/tutorials/audio.ipynb ../ignore_notebooks
  - Build the docs locally, using `sphinx-build` as it does not require you to commit your changes:
    sphinx-build docs/source cleanlab-docs
  - Move the notebook back to its original location:
    mv ../ignore_notebooks/audio.ipynb docs/source/tutorials

  While building the docs with `sphinx-multiversion`, your terminal might output `unknown config value 'smv_branch_whitelist' in override, ignoring` and `unknown config value 'smv_tag_whitelist' in override, ignoring`. This is because the `smv_branch_whitelist` and `smv_tag_whitelist` config values are only used by `sphinx-multiversion`, but may also be checked by `sphinx` or other extensions that do not use them. Hence, these messages can be safely ignored as long as the docs are built correctly.
- [Optional] To show dynamic versioning and version warning banners:
  - Copy the `docs/_templates/versioning.js` file to the `cleanlab-docs/` directory.
  - In the copied `versioning.js` file:
    - find `placeholder_version_number` and replace it with the latest release tag name, and
    - find `placeholder_commit_hash` and replace it with the `master` branch commit hash.
- [Optional] To redirect site visits from `/` or `/stable` to the stable version of the docs:
  - Create a copy of the `docs/_templates/redirect-to-stable.html` file and rename it as `index.html`.
  - In this `index.html` file, find `stable_url` and replace it with `/cleanlab-docs/YOUR_LATEST_RELEASE_TAG_NAME/index.html`.
  - Copy this `index.html` to:
    - `cleanlab-docs/`, and
    - `cleanlab-docs/stable/`.
- The docs for each branch and/or tag can be found in the `cleanlab-docs/` directory. Open any of the `index.html` files in your browser to view the docs:

      cleanlab-docs
      |   index.html (redirects to stable release of the docs)
      |   versioning.js (for dynamic versioning and version warning banner)
      |
      └───YOUR_BRANCH_NAME (e.g. master)
      │       index.html
      │       ...
      │
      └───YOUR_TAG_NAME_1 (e.g. your stable release tag name)
      │       index.html
      │       ...
      │
      └───YOUR_TAG_NAME_2 (e.g. an old release tag name)
      │       index.html
      │       ...
      │
      └───stable
      │       index.html (redirects to stable release of the docs)
      │
      └───...

  Note: If you're building the docs from a working directory tree, the docs will be found at the top of the `cleanlab-docs/` directory:

      cleanlab-docs
      |   index.html (docs for the working directory tree)
      |   ...
      |
      └───...

  This may overwrite some of the files in `cleanlab-docs/`, like `index.html` from the previous step.
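
The two optional post-build steps above (filling in `versioning.js` and creating the redirect page) can also be scripted. The following is a minimal sketch, not something shipped in the repository: the function names `fill_versioning_js` and `make_stable_redirect` are hypothetical, and the sketch assumes the templates contain the literal strings `placeholder_version_number`, `placeholder_commit_hash`, and `stable_url` exactly as described above, and that the tag name contains no `|`, `&`, or `\` characters.

```shell
# Hypothetical helpers (not part of the cleanlab repo) for the optional post-build steps.

# fill_versioning_js TEMPLATE OUTPUT_DIR TAG COMMIT_HASH
# Writes OUTPUT_DIR/versioning.js from TEMPLATE with both placeholders filled in.
fill_versioning_js() {
    sed -e "s|placeholder_version_number|$3|g" \
        -e "s|placeholder_commit_hash|$4|g" \
        "$1" > "$2/versioning.js"
}

# make_stable_redirect TEMPLATE OUTPUT_DIR TAG
# Writes the redirect page to OUTPUT_DIR/index.html and OUTPUT_DIR/stable/index.html.
make_stable_redirect() {
    mkdir -p "$2/stable"
    sed "s|stable_url|/cleanlab-docs/$3/index.html|g" "$1" > "$2/index.html"
    cp "$2/index.html" "$2/stable/index.html"
}

# Example usage from the cleanlab root folder (tag and hash are placeholders):
#   fill_versioning_js docs/_templates/versioning.js cleanlab-docs YOUR_LATEST_RELEASE_TAG_NAME MASTER_COMMIT_HASH
#   make_stable_redirect docs/_templates/redirect-to-stable.html cleanlab-docs YOUR_LATEST_RELEASE_TAG_NAME
```

Writing to a new file instead of editing in place leaves the templates under `docs/_templates/` unmodified, so the working tree stays clean after the build.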
- Fork the `cleanlab` repository.
- Create a new repository named `cleanlab-docs` and a new branch named `master`.
- In the `cleanlab-docs` repo, configure GitHub Pages; under the Source section, select the `master` branch and `/(root)` folder. Take note of the URL where your site is published.
- Generate an SSH deploy key pair and add the keys to your repos as follows:
  - In the `cleanlab-docs` repo, go to Settings > Deploy Keys > Add deploy key and add your public key with the Allow write access option enabled.
  - In the `cleanlab` repo, go to Settings > Secrets > New repository secret and add your private key as a secret named `ACTIONS_DEPLOY_KEY`.
- In the `cleanlab` repo, check that you have the GitHub Pages workflow under the repo's Actions tab. This should be created automatically from `.github\workflows\gh-pages.yaml`. This workflow can be activated by any of the 3 triggers below:
  - A push to the `master` branch in the `cleanlab` repo.
  - Publication of a new release in the `cleanlab` repo.
  - A manual run from the Run workflow option, selecting either the `master` branch or one of the release tags.
- Activate the workflow with any of the 3 triggers listed above and wait for it to complete.
- Navigate to the URL where your GitHub Pages site is published (the one noted in step 3, when configuring GitHub Pages). The default URL should have the format https://repository_owner.github.io/cleanlab-docs/.
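
For the deploy-key step above, one common way to create the key pair is `ssh-keygen`. This is a sketch under assumptions: the file name `gh-pages`, the key type, the comment string, and the helper name `make_deploy_key` are illustrative choices, not something the repository prescribes.

```shell
# make_deploy_key DIRECTORY
# Creates a passphrase-less SSH key pair in DIRECTORY (hypothetical file name: gh-pages).
make_deploy_key() {
    ssh-keygen -q -t ed25519 -C "cleanlab-docs deploy key" -N "" -f "$1/gh-pages"
}

# Example usage:
#   make_deploy_key .
#   gh-pages.pub (public key)  -> cleanlab-docs repo, Settings > Deploy Keys, with write access
#   gh-pages     (private key) -> cleanlab repo, Settings > Secrets, named ACTIONS_DEPLOY_KEY
```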
GitHub Actions automatically builds and deploys the docs' build artifacts when triggered. If you delete and recreate a release tag, the docs for this tag will be rebuilt and redeployed, hence overwriting the existing artifacts with the new ones.
On rare occasions, you may want to update the docs without deleting and recreating the release tag, for example, when you want to fix a typo in the docs, but you've already deployed your tag to PyPI or Conda. This can be done by manually adding specific docs' build artifacts to the `cleanlab/cleanlab-docs` repo. These steps are for users who have `push` permission to the `cleanlab/cleanlab` and `cleanlab/cleanlab-docs` repos.
- If you haven't already done so, clone the `cleanlab/cleanlab` repo.
- Make the necessary code changes.
- Run `git add` and `git commit` for the changes.
- `git push` to the `cleanlab/cleanlab` repo. As this is pushed from a non-`master` branch, GitHub Actions will only build but not deploy the docs' build artifacts.
- Navigate to github.com/cleanlab/cleanlab in your browser, select the "Actions" tab, under "Workflow", click "GitHub Pages", then select the workflow that was triggered by the previous step.
- Ensure that the workflow has completed running.
- Scroll to the bottom of the page, under "Artifacts", click "docs-html" to download the docs' build artifacts.
- Unzip "docs-html.zip" and open the "docs-html" folder.
- Identify the files you would like to replace, i.e., the corresponding files creating the pages on docs.cleanlab.ai.
- Replace these files in github.com/cleanlab/cleanlab-docs by uploading the new ones to the corresponding version folder in the `master` branch of the `cleanlab/cleanlab-docs` repo.
⚠️ Any build artifacts manually added to `cleanlab/cleanlab-docs` that do not live in the `master` branch of the `cleanlab/cleanlab` repo will be lost in future versions of cleanlab docs. So any edit made in the v2.0.0 docs which you also want to have in the v2.0.1, v2.0.2, etc. docs needs to be introduced as a PR to the `cleanlab/cleanlab` repo as well.
⚠️ Currently, if you update a stable/old version (say `vXXX`) of the tutorials from the latest master branch version, the install of the cleanlab package in the notebooks/Colabs will be wrong. To remedy this, you need to update the cleanlab version in all `.ipynb` files inside these folders: cleanlab-docs/vXXX/tutorials/ and cleanlab-docs/vXXX/_sources/. The tutorial `.html` pages will also have wrong Colab links. Currently, you have to update the `.html` files in cleanlab-docs/vXXX/tutorials/ to replace these Colab links with the proper links (replace `/master/` in the link with `/vXXX/` for the version you are building docs for).
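
The Colab link fix described above can be scripted. A minimal sketch, assuming the Colab links are exactly the lines that contain `colab.research.google.com`; the helper name `fix_colab_links` is hypothetical, and the cleanlab version in the `.ipynb` files still needs to be updated separately.

```shell
# fix_colab_links TUTORIALS_DIR VERSION
# Rewrites /master/ to /VERSION/ on the lines of TUTORIALS_DIR/*.html that mention Colab.
fix_colab_links() {
    for page in "$1"/*.html; do
        [ -e "$page" ] || continue   # the glob matched no .html files
        sed "/colab\.research\.google\.com/ s|/master/|/$2/|g" "$page" > "$page.tmp" &&
            mv "$page.tmp" "$page"
    done
}

# Example usage:
#   fix_colab_links cleanlab-docs/vXXX/tutorials vXXX
```

Restricting the substitution to lines that mention Colab avoids touching any other `/master/` paths that may appear in the same page.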
We've configured GitHub Actions to run the GitHub Pages workflow (gh-pages.yaml) to build and deploy our docs' static files. Here's a breakdown of what this workflow does in the background:
- Spin up an Ubuntu server.
- Install Pandoc, a document converter required by `nbsphinx` to generate static sites from notebooks (`.ipynb`).
- Check out the `cleanlab` repository.
- Set up Python and cache dependencies.
- Install dependencies for the docs from `docs/requirements.txt`.
- Run Sphinx with the `sphinx-multiversion` wrapper to build the docs' static site files. These files will be output to the `cleanlab-docs/` directory.
- Get the latest release tag name and insert it in the `versioning.js` file. The `index.html` of each doc version will read this as a variable and display it beside the stable hyperlink.
- Insert the latest commit hash in the `versioning.js` file. The `index.html` of each doc version will read this as a variable and display it beside the developer hyperlink.
- Copy the `versioning.js` file to the `cleanlab-docs/` folder.
If the workflow is triggered by a new release, generate the redirecting HTML, which redirects site visits to the stable version:
- Insert the relative path to the stable docs in the `redirect-to-stable.html` file (the redirecting HTML).
- Copy the `redirect-to-stable.html` file to `cleanlab-docs/index.html` and `cleanlab-docs/stable/index.html`.
- Deploy the `cleanlab-docs/` folder to the `master` branch of the `cleanlab/cleanlab-docs` repo.
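
Two of the steps above need the latest release tag name and the latest commit hash. How `gh-pages.yaml` obtains them is not shown here; the sketch below is one way to get comparable values with plain git, assuming the latest release is the most recent tag reachable from the checked-out commit. The function names `latest_tag` and `head_commit` are hypothetical.

```shell
# latest_tag [REPO_DIR]
# Prints the most recent tag reachable from HEAD (assumed here to be the latest release).
latest_tag() {
    git -C "${1:-.}" describe --tags --abbrev=0
}

# head_commit [REPO_DIR]
# Prints the full hash of the currently checked-out commit.
head_commit() {
    git -C "${1:-.}" rev-parse HEAD
}

# Example usage from the cleanlab root folder:
#   echo "stable: $(latest_tag), developer: $(head_commit)"
```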
Each tutorial is a Jupyter notebook (unexecuted .ipynb file) that will be executed during CI for the version displayed at docs.cleanlab.ai using nbsphinx. Some basic linting is also applied to ensure proper notebook formatting such as no trailing newlines at the end of cells. Here are some tips when adding a new tutorial notebook:
- Make sure to clear all cell outputs before you `git commit` a tutorial. The outputs of cells should never be tracked in git; these outputs are automatically constructed for display on docs.cleanlab.ai during CI, which executes all notebooks in the folder docs/source/.
- For cells which contain code that should not be executed during CI, make sure the cell type is Markdown and use proper syntax to make the contents look like code.
- To hide certain Jupyter cells that should not be shown in the docs.cleanlab.ai web version of a tutorial, set the following in the cell's metadata:
"metadata": {
"nbsphinx": "hidden"
}
This includes cells that install dependencies and cells that run tests to verify the notebook has executed correctly. These cells will still be visible when the notebook is run in Colab or locally in Jupyter, so make sure to add a comment explaining their purpose at the top.
- If you develop the notebook in a virtualenv, make sure that the end of the raw .ipynb file contains the following when you are done:
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
}
instead of containing your own virtualenv in there. CI will FAIL if you instead list your own virtualenv here!
-
When adding dependencies to a tutorial:
- Make sure to update docs/requirements.txt which lists all extra dependencies installed during CI to build the docs.
- Add a comment in a hidden cell (not displayed on docs.cleanlab.ai) stating which versions of the dependencies you used.
- Think carefully whether each dependency is really necessary and if its future versions will be stable / compatible with future versions of existing dependencies.
-
Don't forget to update docs/source/index.rst with a short title and docs/source/tutorials/index.rst to ensure your tutorial is properly linked. Otherwise it will not appear on docs.cleanlab.ai!
-
Ask yourself:
- How can I make this tutorial run faster without sacrificing educational value? Perhaps use a smaller subsample of the dataset, a smaller/pretrained model, etc.
- What sections of this tutorial are least vital? Consider creating a separate Examples notebook that features those.
All of our tutorials are quickstart guides that should run quite fast. Longer/comprehensive notebooks are better added in Examples.
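
Two of the tips above (no stored cell outputs, default kernelspec) are easy to check before committing. The sketch below is a heuristic, not the linting that CI runs: it assumes the notebook was saved by Jupyter in its usual indented JSON layout, and the helper name `check_notebook` is hypothetical.

```shell
# check_notebook NOTEBOOK.ipynb
# Heuristic pre-commit check; prints a message and returns non-zero if a problem is found.
check_notebook() {
    status=0
    # A cleared cell is saved as "outputs": [] ; a line ending in "[" means outputs are stored.
    if grep -Eq '"outputs": \[$' "$1"; then
        echo "$1: cell outputs are not cleared" >&2
        status=1
    fi
    # The kernelspec name should be the default python3, not the name of a virtualenv.
    if ! grep -q '"name": "python3"' "$1"; then
        echo "$1: kernelspec name is not python3" >&2
        status=1
    fi
    return $status
}

# Example usage:
#   check_notebook docs/source/tutorials/audio.ipynb
```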
-
Verify your new docstrings adhere to our documentation format guidelines
-
To ensure documentation for new source code files is linked from the main page, don't forget to update: docs/source/index.rst