Starred repositories
Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.
Extension for pie to include taggers with their models and pre/postprocessors
A simple and efficient tool to parallelize Pandas operations on all available CPUs
An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.
A CoNLL-U parser that takes a CoNLL-U formatted string and turns it into a nested Python dictionary.
Interactive Widgets for the Jupyter Notebook
Targeted language identifier, based on FastText and Hunspell.
Tool to fix bitexts and tag near-duplicates for removal
Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus.
All languages stopwords collection
Script for converting a forum hosted by Forumactif into a phpBB forum.
Your PyTorch AI Factory - Flash enables you to easily configure and run complex AI recipes for over 15 tasks across 7 data domains
Implementation of Nougat Neural Optical Understanding for Academic Documents
The Next Generation Multi-Modality Superintelligence
Open Source Neural Machine Translation and (Large) Language Models in PyTorch
Obtain Word Alignments using Pretrained Language Models (e.g., mBERT)
Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
Neural Machine Translation (NMT) tutorial. Data preprocessing, model training, evaluation, and deployment.
⚡ A Fast, Extensible Progress Bar for Python and CLI