Starred repositories

Get your documents ready for gen AI

Python 7,390 356 Updated Nov 8, 2024

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,144 208 Updated Nov 6, 2024

PDF to Markdown with vision models

Python 5,938 316 Updated Nov 7, 2024

No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents

Python 2,450 150 Updated Nov 8, 2024

Interactive Tools for Machine Learning, Deep Learning and Math

2,642 307 Updated Aug 11, 2024

Graph-structured Indices for Scalable, Fast, Fresh and Filtered Approximate Nearest Neighbor Search

C 1,127 219 Updated Nov 4, 2024

Dispatch and distribute your ML training to "serverless" clusters in Python, like PyTorch for ML infra. Iterable, debuggable, multi-cloud/on-prem, identical across research and production.

Python 974 37 Updated Nov 8, 2024

A native PyTorch Library for large model training

Python 2,577 200 Updated Nov 5, 2024

dstack is a lightweight, open-source alternative to Kubernetes & Slurm, simplifying AI container orchestration with multi-cloud & on-prem support. It natively supports NVIDIA, AMD, & TPU.

Python 1,490 153 Updated Nov 7, 2024

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 31,463 3,746 Updated Nov 8, 2024

Reverse Engineering the Abstraction and Reasoning Corpus

Jupyter Notebook 184 32 Updated Oct 7, 2024

SkyPilot: Run AI and batch jobs on any infra (Kubernetes or 12 clouds). Get unified execution, cost savings, and high GPU availability via a simple interface.

Python 6,755 503 Updated Nov 8, 2024

Annotated version of the Mamba paper

Jupyter Notebook 455 18 Updated Feb 27, 2024

Machine Learning Engineering Open Book

Python 11,603 705 Updated Nov 6, 2024

[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model

Python 2,966 197 Updated Sep 19, 2024

DocLLM: A layout-aware generative language model for multimodal document understanding

112 5 Updated Jan 3, 2024

Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs

Python 2,182 143 Updated Nov 8, 2024

A Unified Library for Parameter-Efficient and Modular Transfer Learning

Jupyter Notebook 2,566 346 Updated Nov 2, 2024

Algorithms, papers, datasets, performance comparisons for Document AI. Continuously updating.

161 3 Updated Aug 29, 2024

Serving multiple LoRA fine-tuned LLMs as one

Python 979 46 Updated May 8, 2024

S-LoRA: Serving Thousands of Concurrent LoRA Adapters

Python 1,744 95 Updated Jan 21, 2024

Sampling profiler for Python programs

Rust 12,792 429 Updated Nov 2, 2024

Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.

HTML 9,049 747 Updated Nov 7, 2024

An unofficial PyTorch implementation of ERNIE-Layout, which was originally released through PaddleNLP.

Python 98 11 Updated Nov 15, 2023

docTR (Document Text Recognition) - a seamless, high-performing & accessible library for OCR-related tasks powered by Deep Learning.

Python 3,827 436 Updated Nov 8, 2024

Example of a Python monorepo using pip, the Poetry backend, and Pants

Python 81 11 Updated Dec 4, 2023

Domain Specific Language for the Abstraction and Reasoning Corpus

Python 206 46 Updated Oct 11, 2024

Python 4 Updated Apr 5, 2024

batched loras

Python 336 15 Updated Sep 6, 2023

We identify the desiderata for a comprehensive benchmark and propose Visually Rich Document Understanding (VRDU). VRDU contains two datasets that represent several challenges: rich schema including…

74 5 Updated Feb 8, 2023