- Stuttgart, Germany
- http://indiegamr.com
Stars
Official code implementation of General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model
"LightRAG: Simple and Fast Retrieval-Augmented Generation"
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
High-resolution models for human tasks.
TypeScript notebook for rapid prototyping
A simple, easy-to-hack GraphRAG implementation
High performance HTML and CSS renderer powered by WGPU
mPLUG-Owl: The Powerful Multi-modal Large Language Model Family
TypeScript AI agent platform with autonomous agents, software developer agents, AI code review agents and more
faster-whisper livestream translation, OBS noise reduction, dual language subtitles
Speech-to-text, text-to-speech, speaker recognition, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 …
Zero-dependency native Node.js screenshot library for Mac, Windows, and Linux.
This package allows you to retrieve precise information about active and open windows on Windows, MacOS, and Linux. You can obtain the position, size, title, and other properties of windows.
Official Implementation of "Lumina-mGPT: Illuminate Flexible Photorealistic Text-to-Image Generation with Multimodal Generative Pretraining"
A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
Code for our INTERSPEECH paper Simul-Whisper: Attention-Guided Streaming Whisper with Truncation Detection
A Python package to build AI-powered real-time audio applications
Run PyTorch LLMs locally on servers, desktop and mobile
A small Rust library that lets you get position, size, title and a few other properties of the active window on Windows, MacOS and Linux
Automate code reviews, patching and documentation with self-hosted LLM workflows.
SGLang is a fast serving framework for large language models and vision language models.
Langtrace 🔍 is an open-source, OpenTelemetry-based end-to-end observability tool for LLM applications, providing real-time tracing, evaluations and metrics for popular LLMs, LLM frameworks, vector…
PraisonAI application combines AutoGen and CrewAI or similar frameworks into a low-code solution for building and managing multi-agent LLM systems, focusing on simplicity, customisation, and effici…
One-click deploy of a Knowledge Graph powered RAG (GraphRAG) in Azure
A modular graph-based Retrieval-Augmented Generation (RAG) system