In this paper, we introduce a novel dataset called CORD, which stands for Consolidated Receipt Dataset for post-OCR parsing. To the best of our knowledge, this is the first publicly available dataset that includes both box-level text and parsing class annotations. The parsing class labels are provided at two levels. The eight superclasses include store, payment, menu, subtotal, and total. These superclasses are subdivided into 54 subclasses; for example, store has nine subclasses, including name, address, telephone, and fax.
The dataset also provides line annotations for the serialization task, a newly emerging problem that combines the two tasks.
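To make the two-level labeling concrete, here is a minimal Python sketch of how such labels might be represented and split. The dotted "superclass.subclass" string format, the `LABELS` mapping, and the `split_label` helper are all assumptions for illustration, not the dataset's confirmed schema; only class names mentioned in the description above are listed.

```python
# Hypothetical two-level label map. Only names mentioned in the description
# are included; the real dataset has eight superclasses and 54 subclasses.
LABELS = {
    "store": ["name", "address", "telephone", "fax"],  # 4 of the 9 store subclasses
    "payment": [],
    "menu": [],
    "subtotal": [],
    "total": [],
}


def split_label(label):
    """Split an assumed 'superclass.subclass' string into its two levels."""
    superclass, _, subclass = label.partition(".")
    if superclass not in LABELS:
        raise ValueError(f"unknown superclass: {superclass!r}")
    return superclass, subclass


print(split_label("store.name"))
```

Keeping the superclass as a separate field would let a parser be evaluated at either level of granularity, though whether the released annotations are organized this way should be checked against the dataset itself.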
License
CC-BY 4.0
NusaCatalogue: https://indonlp.github.io/nusa-catalogue/card.html?cord_v2