Ying Xiong Taka152

🎯

Focusing

48 followers · 40 following

Microsoft
Beijing, China

Starred repositories (each entry: language, stars, forks, last updated)

google-ai-edge / model-explorer

A modern model graph visualizer and debugger

JavaScript 974 73 Updated Sep 12, 2024

google / aqt

Python 247 25 Updated Sep 13, 2024

HazyResearch / ThunderKittens

Tile primitives for speedy kernels

Cuda 1,485 57 Updated Sep 12, 2024

meta-llama / llama3

The official Meta Llama 3 GitHub site

Python 26,077 2,921 Updated Aug 12, 2024

karpathy / llm.c

LLM training in simple, raw C/CUDA

Cuda 23,245 2,582 Updated Aug 26, 2024

NVIDIA / warp

A Python framework for high performance GPU simulation and graphics

Python 4,086 226 Updated Sep 9, 2024

zeux / calm

CUDA/Metal accelerated language model inference

C 364 13 Updated Sep 3, 2024

stas00 / ml-engineering

Machine Learning Engineering Open Book

Python 10,949 652 Updated Sep 12, 2024

xai-org / grok-1

Grok open release

Python 49,414 8,326 Updated Aug 30, 2024

hpcaitech / Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Python 21,591 2,076 Updated Aug 9, 2024

openai / transformer-debugger

Python 4,005 231 Updated Jun 4, 2024

hashhar / dash-contrib-docset-feeds

A collection of Dash's user-contributed docset feeds for use with Zeal

Shell 424 26 Updated Sep 2, 2024

microsoft / Olive

Olive: Simplify ML Model Finetuning, Conversion, Quantization, and Optimization for CPUs, GPUs and NPUs.

Python 1,511 159 Updated Sep 13, 2024

outline / outline

The fastest knowledge base for growing teams. Beautiful, realtime collaborative, feature packed, and markdown compatible.

TypeScript 27,278 2,177 Updated Sep 13, 2024

microsoft / onnxruntime-genai

Generative AI extensions for onnxruntime

C 418 95 Updated Sep 13, 2024

google / gemma_pytorch

The official PyTorch implementation of Google's Gemma models

Python 5,239 499 Updated Jul 31, 2024

karpathy / minbpe

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Python 9,025 830 Updated Jul 1, 2024

IST-DASLab / marlin

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

Python 557 45 Updated Sep 4, 2024

AutoGPTQ / AutoGPTQ

An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.

Python 4,319 461 Updated Aug 19, 2024

wejoncy / QLLM

A general 2-8 bit quantization toolbox supporting GPTQ/AWQ/HQQ, with easy export to ONNX/ONNX Runtime.

Python 141 12 Updated Aug 28, 2024

SJTU-IPADS / PowerInfer

High-speed Large Language Model Serving on PCs with Consumer-grade GPUs

C 7,867 403 Updated Sep 6, 2024

astral-sh / ruff

An extremely fast Python linter and code formatter, written in Rust.

Rust 30,921 1,021 Updated Sep 13, 2024

ml-explore / mlx

MLX: An array framework for Apple silicon

C 16,416 935 Updated Sep 13, 2024

LC044 / WeChatMsg

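Extract WeChat chat history and export it to HTML, Word, and Excel documents for permanent storage; analyze the chat history to generate an annual chat report; and use the chat data to train a personal AI chat assistant.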

Python 32,975 3,458 Updated Jul 20, 2024

linexjlin / GPTs

leaked prompts of GPTs

28,267 3,801 Updated Sep 9, 2024

plasma-umass / scalene

Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals

Python 11,564 388 Updated Sep 9, 2024

google-deepmind / graphcast

Python 4,500 559 Updated Aug 20, 2024

google / latexify_py

A library to generate LaTeX expressions from Python code.

Python 7,156 379 Updated May 13, 2024

microsoft / inshellisense

IDE-style command-line autocomplete

TypeScript 8,352 184 Updated Sep 10, 2024

intel / intel-extension-for-transformers

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

Python 2,104 206 Updated Aug 26, 2024

Starred topics

Compiler

Natural language processing