Sichuan University
- https://lin-yijie.github.io
Stars
A collection of AWESOME things about Optimal Transport in Deep Learning
[ECCV 2024] official code for "Long-CLIP: Unlocking the Long-Text Capability of CLIP"
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
Source code of our MM'22 paper Partially Relevant Video Retrieval
The Paper List of Large Multi-Modality Model, Parameter-Efficient Finetuning, Vision-Language Pretraining, Conventional Image-Text Matching for Preliminary Insight.
Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.
This is a summary of research on noisy correspondence. There may be omissions. If anything is missing please get in touch with us. Our emails: [email protected] [email protected] qinyang.gm…
Official implementation of "Decoupled Contrastive Multi-View Clustering with High-Order Random Walks", [AAAI 2024].
Multi-granularity Correspondence Learning from Long-term Noisy Videos [ICLR 2024, Oral]
Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, B…
A playbook for systematically maximizing the performance of deep learning models.
This repo contains the code and data of "Graph Matching with Bi-level Noisy Correspondence".
✨✨Latest Advances on Multimodal Large Language Models
Official code for VisProg (CVPR 2023 Best Paper!)
[CVPR'22 Oral] Temporal Alignment Networks for Long-term Video. Tengda Han, Weidi Xie, Andrew Zisserman.
Temporal Alignment Representations with Contrastive Learning
A list of awesome papers and cool resources on optimal transport and its applications in general! As you will notice, this list is currently mostly focused on optimal transport for machine learning…
Implementation of Denoising Diffusion Probabilistic Model in Pytorch
A Chinese-language guide to prompting ChatGPT. Usage guides for a variety of scenarios. Learn how to make it do what you say.
Google Drive Public File Downloader when Curl/Wget Fails
PyTorch implementation for Dual Contrastive Prediction for Incomplete Multi-view Representation Learning (TPAMI'22)
PyTorch GPU distributed training code for MIL-NCE HowTo100M
Audio Visual Instance Discrimination with Cross-Modal Agreement
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch