Streamline the fine-tuning process for multimodal models: PaliGemma, Florence-2, and Qwen2-VL
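As a rough illustration of what such a fine-tuning pipeline typically involves, here is a minimal LoRA sketch for Qwen2-VL using Hugging Face Transformers and PEFT. The checkpoint name, target modules, and hyperparameters are illustrative assumptions, not the listed repository's actual API.

```python
# Minimal LoRA fine-tuning sketch for a vision-language model (Qwen2-VL).
# Checkpoint, target modules, and hyperparameters are illustrative
# assumptions, not the API of the repository listed above.
import torch
from transformers import Qwen2VLForConditionalGeneration, AutoProcessor
from peft import LoraConfig, get_peft_model

model_id = "Qwen/Qwen2-VL-2B-Instruct"
processor = AutoProcessor.from_pretrained(model_id)
model = Qwen2VLForConditionalGeneration.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # requires the accelerate package
)

# Wrap the attention projections with LoRA adapters so only a small
# set of low-rank matrices is trained instead of the full model.
lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # an assumption; tune per model
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()
```

From here, the wrapped model can be trained with a standard Transformers `Trainer` loop over image-text pairs prepared by the processor.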
🔥🔥 LLaVA: Extending LLaVA with Phi-3 and LLaMA-3 (LLaVA LLaMA-3, LLaVA Phi-3)
Azure OpenAI (demos, documentation, accelerators).
Phi-3.5 for Mac: Locally-run Vision and Language Models for Apple Silicon
Chat with Phi-3.5/3 Vision LLMs. Phi-3.5-vision is a lightweight, state-of-the-art open multimodal model built on datasets that include synthetic data and filtered publicly available websites, with a focus on very high-quality, reasoning-dense data across both text and vision.
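For context on how such a chat app drives the model, here is a minimal sketch of local inference with Phi-3.5-vision via Hugging Face Transformers. The prompt template and settings follow the public model card and are assumptions, not this repository's code.

```python
# Minimal Phi-3.5-vision chat sketch with Hugging Face Transformers.
# Prompt template and settings follow the public model card; treat them
# as assumptions rather than this repository's exact implementation.
import torch
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor

model_id = "microsoft/Phi-3.5-vision-instruct"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # Phi-3.5-vision ships custom modeling code
    _attn_implementation="eager",
)
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

# One image per <|image_N|> placeholder in the chat prompt.
image = Image.open("example.jpg")
prompt = "<|user|>\n<|image_1|>\nDescribe this image.<|end|>\n<|assistant|>\n"
inputs = processor(prompt, [image], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=128)
# Strip the prompt tokens before decoding the model's reply.
reply = processor.batch_decode(
    output_ids[:, inputs["input_ids"].shape[1]:], skip_special_tokens=True
)[0]
print(reply)
```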
Phi-3-Vision model test - running locally
Microsoft Phi-3 Vision, the first multimodal model from Microsoft: demo with Hugging Face