Simple Gradio application integrated with Hugging Face multimodal models to support a visual question answering chatbot and other features
docker text-to-speech computer-vision gradio vlm visual-question-answering llm mllm vision-foundation-model image-text-to-text florence-2 xtts-v2 mini-internvl
Updated Aug 16, 2024 - Python