
Organizations

@icgrp


llama3 implementation one matrix multiplication at a time

Jupyter Notebook 13,243 1,063 Updated May 23, 2024

Make PyTorch models up to 40% faster! Thunder is a source-to-source compiler for PyTorch. It enables using different hardware executors at once, across one or thousands of GPUs.

Python 1,153 77 Updated Oct 7, 2024

A curated list of practical guide resources for LLMs (LLMs Tree, Examples, Papers)

9,351 715 Updated May 31, 2024

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

C 256 51 Updated Oct 7, 2024
Jupyter Notebook 335 19 Updated Oct 4, 2023

Run compilers interactively from your web browser and interact with the assembly

TypeScript 16,186 1,726 Updated Oct 7, 2024

Optimizing SGEMM kernel functions on NVIDIA GPUs to a close-to-cuBLAS performance.

Cuda 269 44 Updated Nov 28, 2021

A tool based on Excalidraw to create stop motion animations and slides.

TypeScript 485 37 Updated Oct 2, 2024

Making DAG construction easier

Python 239 11 Updated Sep 9, 2024
VHDL 2 Updated Apr 20, 2021

Helper scripts for building Vivado and vivado_hls projects with CMake.

CMake 1 Updated Jan 27, 2021

Examples shown as part of the tutorial "Productive parallel programming on FPGA with high-level synthesis".

C 190 46 Updated Nov 14, 2021

A cheat sheet for mathematical notation in code form

15,023 1,073 Updated Mar 8, 2022

FPGA SoC Linux: Device Tree Overlay, FPGA Manager, U-Boot, Linux kernel, and Debian 11 images (for Xilinx Zynq UltraScale+ MPSoC)

125 38 Updated Jul 23, 2023

Example for ZynqMP-FPGA-XRT (Xilinx Runtime for ZynqMP-FPGA-Linux)

Ruby 6 1 Updated Jul 6, 2020

XRT (Xilinx Runtime) for ZynqMP-FPGA-Linux

Makefile 5 1 Updated May 19, 2023

Tool for updating the contents of BlockRAMs found in Xilinx 7 series bitstreams.

LLVM 17 4 Updated Feb 9, 2022

Scalable systolic array-based matrix-matrix multiplication implemented in Vivado HLS for Xilinx FPGAs.

C 297 51 Updated Mar 15, 2022

Soba frontend

1 Updated Feb 23, 2020
Python 1 Updated Feb 23, 2020

A booklet on machine learning systems design with exercises. NOT the repo for the book "Designing Machine Learning Systems"

HTML 8,978 1,415 Updated Apr 15, 2023

A collection of out-of-tree LLVM passes for teaching and learning

C 2,929 387 Updated Jul 28, 2024

Intro to Creative Coding workshop with p5.js and Tone.js

750 56 Updated Nov 22, 2022

A high-level performance analysis tool for FPGA-based accelerators

C 18 7 Updated Jun 2, 2017

A step-by-step blog for LLVM (v9.0.0 or v11.0.0) beginners, with detailed documents and comments. It records how I learned LLVM and used it to complete a full project for FPGA high-level synthesis.

C 101 23 Updated Jun 17, 2022

Saga is a mobile app that lets users team up and compete in local scavenger hunts comprised of challenging riddles, augmented reality games, and geolocation puzzles. It's like Escape Room, but for …

Objective-C 1 Updated Jul 28, 2017