- This repository provides a set of tools for digitizing books.
- The quality of the scanned document largely determines the quality of the digitized result.
- These tools assume that:
- The book is scanned as 2-page spreads in grayscale.
- The document area is the same on all pages.
- The scanned data is skewed and has shadows coming from the binding margin.
- CMake (version 3.20.0 or higher)
- OpenCV
- After scanning, create a set of "d.png" files. If you have a PDF file, extract the PNG files with Acrobat or another tool.
- Use the split tool to separate each 2-page spread into single-page PNG files.
./execute <path of png files>
The wildcard * can be used in the file specification. Note that the split files are written to the current directory, so take care not to overwrite existing files.
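The splitting step can be sketched as follows. This is a minimal illustration in plain C++ rather than the repository's actual code: `GrayImage`, `cropColumns`, and `splitSpread` are hypothetical names, and the sketch assumes the binding sits at the horizontal center of the spread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical minimal grayscale image: row-major, one byte per pixel.
struct GrayImage {
    std::size_t width = 0;
    std::size_t height = 0;
    std::vector<std::uint8_t> pixels;  // expected size: width * height
};

// Copy the column range [x0, x1) of src into a new image.
GrayImage cropColumns(const GrayImage& src, std::size_t x0, std::size_t x1) {
    assert(x0 <= x1 && x1 <= src.width);
    assert(src.pixels.size() == src.width * src.height);
    GrayImage out;
    out.width = x1 - x0;
    out.height = src.height;
    out.pixels.reserve(out.width * out.height);
    for (std::size_t y = 0; y < src.height; ++y) {
        const auto rowBegin =
            src.pixels.begin() + static_cast<std::ptrdiff_t>(y * src.width);
        out.pixels.insert(out.pixels.end(),
                          rowBegin + static_cast<std::ptrdiff_t>(x0),
                          rowBegin + static_cast<std::ptrdiff_t>(x1));
    }
    return out;
}

// Split a 2-page spread at its vertical center line (assumption: the
// binding is centered). For an odd width the right page gets the extra column.
std::pair<GrayImage, GrayImage> splitSpread(const GrayImage& spread) {
    const std::size_t mid = spread.width / 2;
    return {cropColumns(spread, 0, mid),
            cropColumns(spread, mid, spread.width)};
}
```

Calling `splitSpread(spread)` returns the left page as `.first` and the right page as `.second`; loading and saving PNG files is left out of this sketch.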
- To obtain good skew correction, remove the shadows around the edges.
./execute <path of png files>
However, this tool does not provide automatic shadow detection, so you have to change the parameters of the white fill region in main.cpp, or fill the shadows by hand with a tool such as Photoshop or GIMP.
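To illustrate what a white fill region does, here is a minimal sketch in plain C++. It is not the code from main.cpp: `GrayImage`, `Margins`, and `fillMarginsWhite` are hypothetical names, and the margin widths stand in for the parameters you would tune by hand.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal grayscale image: row-major, one byte per pixel.
struct GrayImage {
    std::size_t width = 0;
    std::size_t height = 0;
    std::vector<std::uint8_t> pixels;  // expected size: width * height
};

// Hypothetical margin widths, in pixels, measured from each edge.
struct Margins {
    std::size_t left;
    std::size_t right;
    std::size_t top;
    std::size_t bottom;
};

// Paint every pixel that lies inside one of the margins white (255),
// leaving the inner document area untouched.
void fillMarginsWhite(GrayImage& img, const Margins& m) {
    assert(img.pixels.size() == img.width * img.height);
    for (std::size_t y = 0; y < img.height; ++y) {
        for (std::size_t x = 0; x < img.width; ++x) {
            // Written with additions to avoid unsigned underflow.
            const bool inside = x >= m.left && x + m.right < img.width &&
                                y >= m.top && y + m.bottom < img.height;
            if (!inside) {
                img.pixels[y * img.width + x] = 255;
            }
        }
    }
}
```

Because the shadow from the binding margin usually falls on one side of each page, the four margins are kept independent so that only the affected edges need a wide fill.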
- Rotate the picture using line detection.
./execute <path of png files>
This tool also requires parameter adjustment for good line detection.
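The idea behind rotating by a detected line can be sketched as follows. This is an assumption-laden illustration, not the tool's implementation: it supposes a line detector has already returned the two endpoints of a line that should be horizontal (for example a text baseline), and `Point`, `skewAngleDegrees`, and `rotateAbout` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 2D point in image coordinates.
struct Point {
    double x;
    double y;
};

constexpr double kPi = 3.14159265358979323846;

// Angle of the line a-b relative to the x axis, in degrees, normalized
// to (-90, 90] so that the order of the endpoints does not matter.
double skewAngleDegrees(const Point& a, const Point& b) {
    double angle = std::atan2(b.y - a.y, b.x - a.x) * 180.0 / kPi;
    if (angle > 90.0) angle -= 180.0;
    if (angle <= -90.0) angle += 180.0;
    return angle;
}

// Rotate point p about center c by the given angle in degrees.
// Rotating by -skewAngleDegrees(a, b) makes the line a-b horizontal.
Point rotateAbout(const Point& p, const Point& c, double degrees) {
    const double rad = degrees * kPi / 180.0;
    const double dx = p.x - c.x;
    const double dy = p.y - c.y;
    return {c.x + dx * std::cos(rad) - dy * std::sin(rad),
            c.y + dx * std::sin(rad) + dy * std::cos(rad)};
}
```

In practice the detected lines are noisy, which is why the detection parameters need tuning; one common approach is to take the median angle over many detected lines before rotating the whole page.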
- Use an affine transform to place the pages at the same position.
- Bind the 2 separated pages and compress them in JPEG format.
- OCR, compress, and index (please use Acrobat).
- Finish.