Starred repositories
Activity Launcher creates shortcuts for any installed app and its hidden activities, making them easy to launch
Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by the OpenAI Solutions team.
Auto_Jobs_Applier_AIHawk is a tool that automates the job application process. Using artificial intelligence, it enables users to apply for multiple job offers in an automated and personalized…
Explore VLM-Eval, a framework for evaluating Video Large Language Models, enhancing your video analysis with cutting-edge AI technology.
Recipes for shrinking, optimizing, customizing cutting edge vision models. 💜
Unified Efficient Fine-Tuning of 100 LLMs (ACL 2024)
This repo is for Amazon ML Challenge 2024. The challenge was to develop a Machine Learning model to extract product details directly from the product images.
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone
Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
g1: Using Llama-3.1 70b on Groq to create o1-like reasoning chains
Streamlines the fine-tuning process for multimodal models: PaliGemma, Florence-2, and Qwen2-VL
Strong and Open Vision Language Assistant for Mobile Devices
【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Use Florence 2 to auto-label data for use in training fine-tuned object detection models.
Surveillance Perspective Human Action Recognition Dataset: 7759 Videos from 14 Action Classes, aggregated from multiple sources, all cropped spatio-temporally and filmed from a surveillance-camera …
This repository provides a comprehensive collection of research papers on multimodal representation learning, all of which are cited and discussed in our recently accepted survey: https://…
This repository is built in association with our position paper, "Multimodality for NLP-Centered Applications: Resources, Advances and Frontiers". As part of this release we share the informati…
A quick exploration into fine-tuning Florence-2
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
How to use bounding boxes with the Gemini API
Platform to experiment with the AI Software Engineer. Terminal based. NOTE: Very different from https://gptengineer.app
🔥🔥🔥Latest Papers, Codes and Datasets on Vid-LLMs.
Course on LLMs: Building Personalized Customer Chatbots
VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)
[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted fo…
Official Implementation for "StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery" (ICCV 2021 Oral)