Stars
Automate browser-based workflows with LLMs and Computer Vision
📃 A better UX for chat, writing content, and coding with LLMs.
Open source hot backup tool for InnoDB and XtraDB databases
An open-source RAG-based tool for chatting with your documents.
TEN Agent is the world’s first real-time multimodal agent integrated with the OpenAI Realtime API, RTC, and features weather checks, web search, vision, and RAG capabilities.
Demo setup for a web app backend using FastAPI and async SQLAlchemy
CursorCore: Assist Programming through Aligning Anything
Generate descriptions from product images in multiple languages with AI
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
Inpaint anything using Segment Anything and inpainting models.
Code for Machine Learning for Algorithmic Trading, 2nd edition.
m3u8 [m3u8-downloader]: online video extraction tool for streaming media download, video download, m3u8 download, and Bilibili video download; desktop client for Windows and Mac
The First Multimodal Search Engine Pipeline and Benchmark for LMMs
Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
Real-time voice-changer for voice-chat, etc. Will support many different voice-filters and features in the future. 🎵
The framework/SDK that lets you build browser-controlling agents in 3 lines of code. Join the chat at https://discord.gg/umgnyQU2K8
A python tool that uses GPT-4, FFmpeg, and OpenCV to automatically analyze videos, extract the most interesting sections, and crop them for an improved viewing experience.
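Welcome to the world of Web3, home to a large collection of Web3 example projects and high-quality learning resources. Join us and a million fellow developers to explore and shape the prosperity of the future world. Act now and start your Web3 journey!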
PresentationGen is a SpringBoot web application that generates PPT files using an LLM.
State-of-the-art zero-shot voice conversion and singing voice conversion with in-context learning
A feature-rich command-line audio/video downloader
Discover and converse with advanced AI models like Mistral, LLAMA2, and GPT-3.5 from leading sources like OLLAMA, Hugging Face, and OpenAI. Easily extract insights from PDFs, web pages, and YouTube…