visual-language-models

Here are 16 public repositories matching this topic...

THUDM / CogVLM

a state-of-the-art-level open visual language model | multimodal pretrained model

pretrained-models language-model multi-modal cross-modality visual-language-models

Updated May 29, 2024
Python

camel-ai / crab

CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents. https://crab.camel-ai.org/

multi-agent-systems gui-automation large-language-models language-model-agent visual-language-models

Updated Nov 22, 2024
Python

bilel-bj / ROSGPT_Vision

Commanding robots using only Language Models' prompts

robotics language-models ros2 robotic-vision large-language-models llm prompt-engineering chatgpt language-models-are-next robotic-design-patterns prompting-robotic-modalities visual-language-models

Updated Aug 7, 2024
Python

hk-zh / language-conditioned-robot-manipulation-models

https://arxiv.org/abs/2312.10807

reinforcement-learning imitation-learning robot-manipulation neural-symbolic foundation-models visual-language-models language-conditioned-learning large-languge-models

Updated Dec 1, 2024

AlignGPT-VL / AlignGPT

Official repo for "AlignGPT: Multi-modal Large Language Models with Adaptive Alignment Capability"

large-language-models multimodal-large-language-models visual-language-models

Updated Jul 12, 2024
Python

tianyu-z / VCR

Official Repo for the paper: VCR: Visual Caption Restoration. Check arxiv.org/pdf/2406.06462 for details.

benchmark deep-learning visual-language-models

Updated Dec 18, 2024
Python

Sid2697 / HOI-Ref

Code implementation for paper titled "HOI-Ref: Hand-Object Interaction Referral in Egocentric Vision"

dataset dataset-generation vlm hand-object-interaction egocentric-vision large-language-models visual-language-models

Updated Apr 16, 2024
Python

xinyanghuang7 / Basic-Visual-Language-Model

Build a simple, basic multimodal large model from scratch. 🤖

visual-language-learning large-language-models visual-language-models multimodel-large-language-model

Updated Jun 19, 2024
Python

jaisidhsingh / CoN-CLIP

Implementation of the "Learn No to Say Yes Better" paper.

deep-learning pytorch multimodal compositionality image-captions image-text-matching visual-language-models

Updated Nov 2, 2024
Python

amathislab / wildclip

Scene and animal attribute retrieval from camera trap data with domain-adapted vision-language models

behavior computer-vision clip camera-trap computervision visual-language-models

Updated Mar 8, 2024
Python

sduzpf / UAP_VLP

Universal Adversarial Perturbations for Vision-Language Pre-trained Models

deep-neural-networks adversarial-attacks visual-language-models

Updated Oct 4, 2024
Python

declare-lab / Sealing

[NAACL 2024] Official implementation of the paper "Self-Adaptive Sampling for Efficient Video Question Answering on Image-Text Models"

multimodality video-understanding video-question-answering visual-language-models naacl2024

Updated Jul 25, 2024
Python

csebuetnlp / IllusionVQA

This repository contains the data and code of the paper titled "IllusionVQA: A Challenging Optical Illusion Dataset for Vision Language Models"

vqa vqa-dataset optical-illusions visual-language-models

Updated Oct 6, 2024
Jupyter Notebook

GraphPKU / CoI

Chain of Images for Intuitively Reasoning

chatbot llama multimodal chatgpt llava visual-language-models gpt4v dalle3 chain-of-throught chain-of-image

Updated Nov 29, 2023
Python

CristianoPatricio / concept-based-interpretability-VLM

Code for the paper "Towards Concept-based Interpretability of Skin Lesion Diagnosis using Vision-Language Models", ISBI 2024 (Oral).

deep-learning medical-imaging clip interpretability explainable-ai skin-lesion-classification melanoma-diagnosis concept-based-explanations visual-language-models ieee-isbi

Updated Jun 5, 2024
Jupyter Notebook

laclouis5 / uform-coreml-converters

CLI for converting UForm models to CoreML.

transformers coreml coremltools uform visual-language-models

Updated Jan 12, 2024
Python
