OrianeN

Starred repositories

Jupyter Notebook 23 Updated Sep 23, 2024

Repository to track the progress in Natural Language Processing (NLP), including the datasets and the current state-of-the-art for the most common NLP tasks.

Python 22,561 3,615 Updated Jul 28, 2024

Extension for pie to include taggers with their models and pre/postprocessors

Python 11 2 Updated May 30, 2024

A simple and efficient tool to parallelize Pandas operations on all available CPUs

Python 3,643 210 Updated Jul 9, 2024

A Neural Framework for MT Evaluation

Python 491 76 Updated Jul 29, 2024

An interactive NVIDIA-GPU process viewer and beyond, the one-stop solution for GPU process management.

Python 4,665 146 Updated Sep 11, 2024

A CoNLL-U parser that takes a CoNLL-U formatted string and turns it into a nested python dictionary.

Python 308 50 Updated Sep 19, 2024
Python 1 2 Updated Sep 20, 2024

Interactive Widgets for the Jupyter Notebook

TypeScript 3,139 948 Updated Sep 12, 2024

Pre-filtering step for bicleaner

Python 4 2 Updated Jul 26, 2024

Jupyter Interactive Notebook

Jupyter Notebook 11,619 4,879 Updated Sep 9, 2024

Targeted language identifier, based on FastText and Hunspell.

Python 27 4 Updated Sep 4, 2024

Tool to fix bitexts and tag near-duplicates for removal

Python 29 3 Updated Aug 19, 2024

Bicleaner is a parallel corpus classifier/cleaner that aims at detecting noisy sentence pairs in a parallel corpus.

Python 148 22 Updated Jun 18, 2024

Translation Memory Open-source Purifier

Python 32 10 Updated Nov 6, 2022

The source code for the TIRA Shared Task Platform

Python 14 9 Updated Sep 27, 2024

All languages stopwords collection

JavaScript 420 76 Updated Jan 7, 2024

Script for converting a forum hosted by Forumactif into a phpBB forum.

Python 21 13 Updated Aug 18, 2022

Your PyTorch AI Factory - Flash enables you to easily configure and run complex AI recipes for over 15 tasks across 7 data domains

Python 1,740 213 Updated Oct 8, 2023

Convert HTML to Markdown

Python 1,028 135 Updated Jul 14, 2024

Implementation of Nougat Neural Optical Understanding for Academic Documents

Python 8,814 560 Updated Apr 16, 2024

The Next Generation Multi-Modality Superintelligence

Python 69 10 Updated Sep 3, 2024

Open Source Neural Machine Translation and (Large) Language Models in PyTorch

Python 6,722 2,248 Updated Jun 27, 2024

Efficient Low-Memory Aligner

C 136 30 Updated Sep 3, 2024

Obtain Word Alignments using Pretrained Language Models (e.g., mBERT)

Python 347 47 Updated Nov 7, 2023

Plumb a PDF for detailed information about each char, rectangle, line, et cetera — and easily extract text and tables.

Python 6,460 657 Updated Aug 29, 2024

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

Python 5,160 496 Updated Sep 26, 2024

Neural Machine Translation (NMT) tutorial. Data preprocessing, model training, evaluation, and deployment.

Jupyter Notebook 148 28 Updated Apr 17, 2024

⚡ A Fast, Extensible Progress Bar for Python and CLI

Python 28,421 1,349 Updated Aug 17, 2024