doc curation

A package for curating doc file collections. Prominent features:

Scrape texts from various sites, such as Wikisource. See example here. (PS: Consider contributing to the raw_etexts repo.)
OCR a PDF with Google Drive. The package automatically splits the PDF into 25-page chunks and OCRs each chunk individually. See usage example here, function here.
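To illustrate the 25-page splitting described above, here is a minimal sketch of how page ranges could be computed. The helper name `page_chunks` is hypothetical and is not part of doc_curation; the package's own splitting logic may differ.

```python
def page_chunks(num_pages, chunk_size=25):
    """Return 1-indexed, inclusive (first_page, last_page) tuples.

    Hypothetical helper for illustration only; not part of doc_curation.
    """
    if num_pages < 0 or chunk_size < 1:
        raise ValueError("num_pages must be >= 0 and chunk_size >= 1")
    return [
        (start, min(start + chunk_size - 1, num_pages))
        for start in range(1, num_pages + 1, chunk_size)
    ]


# A 60-page PDF yields two full chunks and one shorter final chunk.
print(page_chunks(60))  # [(1, 25), (26, 50), (51, 60)]
```

Each tuple could then be handed to whatever tool extracts that page range before the chunk is uploaded for OCR.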

For users

Autogenerated docs on readthedocs (might be broken).
Manually and periodically generated docs here.
For detailed examples and help, please see individual module files in this package.

Installation or upgrade:

For the stable version: pip install doc_curation -U
For the latest code: pip install git+https://github.com/sanskrit-coders/doc_curation/@master -U
Web.
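After installing or upgrading, it can be useful to confirm which version is active. The following is a small sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is an assumption made for this example, and the distribution name is assumed to match the one used with pip above.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Prints the version string if doc_curation is installed, otherwise None.
print(installed_version("doc_curation"))
```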

Usage:

Enable the Google Drive API and download a service account key file that has Google Drive API access.

from doc_curation import pdf
# Path to the PDF to be OCRed.
pdf_file = '/home/file.pdf'
# Path to the service account key file downloaded above.
key_file = '/home/key.json'
pdf.split_and_ocr_on_drive(pdf_file, key_file)
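Since the OCR call depends on two local files, a quick check before uploading can save a failed run. This is a sketch only: `check_ocr_inputs` is a hypothetical helper, not part of doc_curation, and the only package call used is the `pdf.split_and_ocr_on_drive(pdf_file, key_file)` call shown above.

```python
import os


def check_ocr_inputs(pdf_file, key_file):
    """Return a list of human-readable problems; empty means inputs look fine.

    Hypothetical helper for illustration only; not part of doc_curation.
    """
    problems = []
    if not pdf_file.lower().endswith(".pdf"):
        problems.append(f"Not a .pdf file name: {pdf_file}")
    for label, path in (("PDF", pdf_file), ("Key file", key_file)):
        if not os.path.isfile(path):
            problems.append(f"{label} not found: {path}")
    return problems


if __name__ == "__main__":
    pdf_file = "/home/file.pdf"
    key_file = "/home/key.json"
    problems = check_ocr_inputs(pdf_file, key_file)
    if problems:
        print("\n".join(problems))
    else:
        # Import only when needed, so the check itself works without the package.
        from doc_curation import pdf
        pdf.split_and_ocr_on_drive(pdf_file, key_file)
```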

For contributors

Contact

Have a problem or question? Please head to GitHub.

Packaging

~/.pypirc should contain your PyPI login credentials.

python setup.py bdist_wheel
twine upload dist/* --skip-existing

Build documentation

Sphinx HTML docs can be generated with cd docs; make html

Testing

Run pytest in the root directory.

Auxiliary tools

pyup

