A collection of memory efficient attention operators implemented in the Triton language.
Triton implementation of FlashAttention-2 with support for custom masks.
ViT inference in Triton, because why not?
Triton implementation of bi-directional (non-causal) linear attention.
🧠️🖥️2️⃣️0️⃣️0️⃣️1️⃣️💾️📜️ The sourceCode:Triton category for AI2001, containing Triton programming language datasets
LAMB go brrr
A container of various PyTorch neural network modules written in Triton.
🌳️🌐️#️⃣️ The Bliss Browser Triton (ClosedAI) language support module, allowing Triton (ClosedAI) programs to be written and run within the browser.
Writing TensorRT plugins using Triton and Python
Triton implementation for FISTA (Experimental)