
You like pytorch? You like micrograd? You love tinygrad! ❤️

Python 26,368 2,895 Updated Sep 28, 2024

A retargetable MLIR-based machine learning compiler and runtime toolkit.

C 2,578 577 Updated Sep 28, 2024

LonestarGPU: Irregular algorithms parallelized for GPUs

C 33 11 Updated Nov 11, 2019

Benchmark for measuring the performance of sparse and irregular memory access.

C 72 15 Updated Sep 26, 2024

Library for specialized dense and sparse matrix operations, and deep learning primitives.

C 842 181 Updated Sep 28, 2024

The book "Performance Analysis and Tuning on Modern CPU"

TeX 2,115 157 Updated Sep 26, 2024

ARM GCC inline assembly reference manual - Chinese edition

HTML 113 40 Updated Jul 24, 2023

Hands-On Practical MLIR Tutorial

C 296 40 Updated Oct 20, 2023

Simple, portable, and self-contained stacktrace library for C++11 and newer

C 663 69 Updated Sep 17, 2024

Unicode routines (UTF8, UTF16, UTF32) and Base64: billions of characters per second using SSE2, AVX2, NEON, AVX-512, RISC-V Vector Extension. Part of Node.js, WebKit/Safari and Bun.

C 1,116 69 Updated Sep 27, 2024

transformer tokenizers (e.g. BERT tokenizer) in C (WIP)

C 10 3 Updated Apr 7, 2022

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 27,549 4,053 Updated Sep 29, 2024

This repository contains integer operators on GPUs for PyTorch.

Python 178 48 Updated Sep 29, 2023

AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

Python 4,538 363 Updated Sep 13, 2024

📋 A list of open LLMs available for commercial use.

10,966 700 Updated Jul 5, 2024

GPTQ inference TVM kernel

Cuda 35 1 Updated Apr 25, 2024

Vulkan/CUDA/HIP/OpenCL/Level Zero/Metal Fast Fourier Transform library

C 1,521 91 Updated Sep 27, 2024

Virtual whiteboard for sketching hand-drawn like diagrams

TypeScript 1 Updated May 3, 2023

Provides very lightweight outcome<T> and result<T> (non-Boost edition)

C 702 62 Updated Sep 4, 2024

Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)

Python 25,328 2,910 Updated Sep 2, 2024

Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences. https://huggingface.co/sp…

Python 1,662 103 Updated Aug 29, 2023

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

Python 132,703 26,444 Updated Sep 28, 2024

Running large language models on a single GPU for throughput-oriented scenarios.

Python 9,145 540 Updated Sep 27, 2024

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 36,421 5,721 Updated Aug 19, 2024

Semantic cache for LLMs. Fully integrated with LangChain and llama_index.

Python 7,119 502 Updated Sep 18, 2024

C 81 22 Updated Sep 23, 2022

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Python 36,551 4,505 Updated Sep 25, 2024

Python 1,414 108 Updated May 12, 2023

onnxruntime-extensions: A specialized pre- and post- processing library for ONNX Runtime

C 321 82 Updated Sep 26, 2024