Stars
A fusion of a linear layer and a cross entropy loss, written for PyTorch in Triton.
A collection of memory efficient attention operators implemented in the Triton language.
FlagGems is an operator library for large language models implemented in Triton Language.
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
A collection of GPT system prompts and various prompt injection/leaking knowledge.
A Native-PyTorch Library for LLM Fine-tuning
Automatically create Faiss k-NN indices with optimal similarity search parameters.
Easily use and train state of the art late-interaction retrieval methods (ColBERT) in any RAG pipeline. Designed for modularity and ease-of-use, backed by research.
Experiment of using Tangent to autodiff triton
Simple, minimal implementation of the Mamba SSM in one file of PyTorch.
Simple and efficient PyTorch-native transformer text generation in <1000 lines of Python.
Scripts for fine-tuning Meta Llama3 with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supportin…
Machine Learning Engineering Open Book
Command-line sampling profiler for macOS and Linux
Best inference performance optimization framework for HuggingFace Diffusers on NVIDIA GPUs.
🛋 The AI and Generative Art platform for everyone
Multipack distributed sampler for fast padding-free training of LLMs
Accessible large language models via k-bit quantization for PyTorch.
A guidance language for controlling large language models.
A curated list of practical guide resources of LLMs (LLMs Tree, Examples, Papers)
FFCV-SSL: Fast Forward Computer Vision for Self-Supervised Learning.
A playbook for systematically maximizing the performance of deep learning models.
Cramming the training of a (BERT-type) language model into limited compute.
Fast and memory-efficient exact attention
uploadcare / pillow-simd
Forked from python-pillow/Pillow. The friendly PIL fork.