Organizations

@plone @collective @koodaamo @zopefoundation @beanstalkd

A Docker-powered service for PDF document layout analysis. This service provides a powerful and flexible PDF analysis service. The service allows for the segmentation and classification of differen…

Python 113 12 Updated Sep 27, 2024

LOTUS: The semantic query engine - process data with LMs as easily as writing pandas code

Python 264 18 Updated Sep 27, 2024

PDF Table Extractor is an innovative Python project designed to tackle the challenge of extracting tables from scanned PDF documents. Leveraging advanced optical character recognition (OCR) and ima…

Jupyter Notebook 13 2 Updated Mar 27, 2024

Table Transformer (TATR) is a deep learning model for extracting tables from unstructured documents (PDFs and images). This is also the official repository for the PubTables-1M dataset and GriTS ev…

Python 2,216 247 Updated Jun 24, 2024

Distributed Training Over-The-Internet

615 22 Updated Aug 27, 2024

A library for company name parsing based on cleanco

Python 4 Updated Jul 29, 2024

BluOS API client

Python 1 Updated Sep 11, 2024

Qdrant - High-performance, massive-scale Vector Database for the next generation of AI. Also available in the cloud https://cloud.qdrant.io/

Rust 19,965 1,349 Updated Sep 27, 2024

tiktoken is a fast BPE tokeniser for use with OpenAI's models.

Python 11,930 810 Updated Aug 15, 2024

DSPy: The framework for programming—not prompting—foundation models

Python 17,345 1,325 Updated Sep 28, 2024

Pydantic extension for annotating autocorrecting fields.

Python 205 3 Updated Jun 20, 2024

An RDF-based representation of the HTML vocabulary to express HTML documents in RDF, rendering them semantic.

Python 21 7 Updated Sep 29, 2024

Tools for reading and fusing live data streams from Polar OH1 (PPG) and H10 (ECG) sensors. pip install polarpy.

Python 11 4 Updated Mar 26, 2023

Python client for Polar AccessLink API.

Python 2 2 Updated Oct 1, 2019

Python client for Polar web API.

Python 1 Updated Dec 28, 2019

A free, open-source, multiplatform, universal viewer and toolbox intended for, but not limited to, timeseries storage files like EEG, EMG, ECG, BioImpedance, etc.

C 11 35 Updated Jul 6, 2014

Developer APIs to Accelerate LLM Projects

Jupyter Notebook 1,361 133 Updated Aug 12, 2024

Full text search in your Pandas dataframe

Python 199 7 Updated Aug 9, 2024

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

HTML 8,613 703 Updated Sep 29, 2024

Fast and robust date extraction from web pages, with Python or on the command-line

Python 118 26 Updated Sep 2, 2024

Heuristic-based boilerplate removal tool

Python 719 80 Updated May 9, 2024

Fast Python port of arc90's readability tool, updated to match the latest readability.js!

Python 2,651 349 Updated Aug 15, 2024

A Python 3 compatible version of goose http://goose3.readthedocs.io/en/latest/index.html

HTML 810 99 Updated Aug 20, 2024

img2table is a table identification and extraction Python Library for PDF and images, based on OpenCV image processing

Python 528 74 Updated Sep 1, 2024

Camelot: PDF Table Extraction for Humans

Python 3,645 355 Updated Jan 5, 2023

Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.

Python 6,464 657 Updated Aug 29, 2024

Fork of https://github.com/infoscout/weighted-levenshtein

Python 2 Updated Dec 28, 2022
Python 2 1 Updated May 5, 2023

A collection of 24 x 24 dp SVG spinners! (CSS & SMIL)

SVG 6,035 818 Updated May 5, 2023

borb is a library for reading, creating and manipulating PDF files in Python.

Python 3,373 148 Updated Aug 26, 2024