A dataset for scientific document figure classification
We provide scientific document images taken from articles published in CVPR, ECCV and ICCV. We do not hold any copyright on these figure images. We provide a Python script for downloading the PDF files from IEEE and CVF. Please make sure that you have access to these websites.
Convert all the PDF files to image files. To do this, download PDFBox:
git clone https://github.com/jobinkv/DocFigure.git
cd DocFigure
wget http://mirrors.estointernet.in/apache/pdfbox/2.0.14/pdfbox-app-2.0.14.jar
python readAnotation.py
This will create a folder of sub-images inside a folder named images.
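How readAnotation.py uses the PDFBox jar is not described here. As a rough illustration of the PDF-to-image step only, the sketch below calls the PDFBox 2.x command-line tool `PDFToImage` from Python. The helper names (`build_pdftoimage_cmd`, `convert_all`), the 150 dpi default, the PNG format, and the directory layout are assumptions for illustration and are not taken from the repository.

```python
import subprocess
from pathlib import Path

# Jar name taken from the wget step above; adjust if you downloaded another version.
PDFBOX_JAR = "pdfbox-app-2.0.14.jar"


def build_pdftoimage_cmd(pdf_path, out_dir, dpi=150, jar=PDFBOX_JAR):
    """Build the PDFBox PDFToImage command for one PDF (hypothetical helper)."""
    pdf_path = Path(pdf_path)
    # PDFBox appends the page number and the file extension to this prefix.
    prefix = str(Path(out_dir) / (pdf_path.stem + "_"))
    return [
        "java", "-jar", str(jar), "PDFToImage",
        "-imageType", "png",
        "-dpi", str(dpi),
        "-outputPrefix", prefix,
        str(pdf_path),
    ]


def convert_all(pdf_dir, out_dir):
    """Render every PDF in pdf_dir to PNG pages in out_dir. Requires Java."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        subprocess.run(build_pdftoimage_cmd(pdf, out_dir), check=True)
```

For example, `convert_all("pdfs", "page_images")` would render each page of every PDF in a (hypothetical) `pdfs` folder. Building the command as a list avoids shell quoting problems with file names that contain spaces.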
Trained model link
To test the trained model, run:
python testTrainedModel.py --trainedFigClassModel '/downloaded/path/to/epoch_9_loss_0.04706_testAcc_0.96867_X_resnext101_docSeg.pth' --inputImage '/path/of/inputimage/for/testing'
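If you want to classify several images, one option is to call the script once per file. The sketch below is a minimal wrapper that assumes `--inputImage` takes the path of a single image file and that the script is run from the DocFigure folder. The helper names and the list of image suffixes are hypothetical; only the two command-line flags come from the command above.

```python
import subprocess
import sys
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}  # assumed input formats


def build_test_cmd(model_path, image_path, script="testTrainedModel.py"):
    """Build one testTrainedModel.py invocation using the two flags shown above."""
    return [
        sys.executable, script,
        "--trainedFigClassModel", str(model_path),
        "--inputImage", str(image_path),
    ]


def classify_folder(model_path, image_dir):
    """Run the test script once per image in image_dir (hypothetical helper)."""
    for image in sorted(Path(image_dir).iterdir()):
        if image.suffix.lower() in IMAGE_SUFFIXES:
            subprocess.run(build_test_cmd(model_path, image), check=True)
```

Each call reloads the model, so this is convenient rather than fast. For a large folder it would be better to load the model once inside a single Python process.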