- Bay Area
- yasenh.github.io
Stars
Generate release PRs based on the conventionalcommits.org spec
A project for 3D multi-object tracking
AITemplate is a Python framework which renders neural networks into high-performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
ClickHouse® is a real-time analytics DBMS
A curated list of radar datasets, detection, tracking and fusion
Refine high-quality datasets and visual AI models
Easily turn large sets of image urls to an image dataset. Can download, resize and package 100M urls in 20h on one machine.
This repo is meant to serve as a guide for Machine Learning/AI technical interviews.
🎉 Modern CUDA Learn Notes with PyTorch: CUDA Cores, Tensor Cores, fp32/tf32, fp16/bf16, fp8/int8, flash_attn, rope, sgemm, hgemm, sgemv, warp/block reduce, elementwise, softmax, layernorm, rmsnorm.
ShellCheck, a static analysis tool for shell scripts
Radar Camera Fusion in Autonomous Driving
A framework for managing and maintaining multi-language pre-commit hooks.
Modular visual interface for GDB in Python
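A Chinese-language tutorial on C++ Templates. Unlike the well-known book C++ Templates, this series teaches C++ Templates as a Turing-complete language, with the aim of helping readers gain a thorough command of metaprogramming. (Work in progress)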
Learn how to design large-scale systems. Prep for the system design interview. Includes Anki flashcards.
Implementation of popular deep learning networks with TensorRT network definition API
Collaborative Collection of C++ Best Practices. This online resource is part of Jason Turner's collection of C++ Best Practices resources. See README.md for more information.
NVIDIA DLA-SW, the recipes and tools for running deep learning workloads on NVIDIA DLA cores for inference applications.
A GPU benchmark tool for evaluating GPUs and CPUs on mixed operational intensity kernels (CUDA, OpenCL, HIP, SYCL, OpenMP)
A project demonstrating how to use the libs of cuPCL.