JINA EMBEDDINGS 2 - Open Source Text Embeddings for Long Documents

1️⃣ Text embedding models are powerful tools for representing text as fixed-size vectors.
2️⃣ Most existing open-source models, especially those built on architectures like BERT, struggle to represent lengthy documents.
3️⃣ Jina Embeddings v2, an open-source text embedding model, addresses this issue.
4️⃣ Jina Embeddings v2 can encode long documents of up to 8192 tokens.
5️⃣ Jina Embeddings v2 achieves state-of-the-art performance on the MTEB benchmark.
6️⃣ Jina Embeddings v2 also matches the performance of OpenAI’s proprietary text-embedding-ada-002 model.

➡️ Jina Embeddings v2 (base model) link: https://lnkd.in/gYv_Xnhq
➡️ Jina Embeddings v2 (small model) link: https://lnkd.in/gSektkWa

✔️ For complete details, refer to the paper (paper link in the comments)

#nlproc #nlp #deeplearning #datascience #ai #generativeai #embeddings
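For anyone who wants to try it, here is a minimal sketch of encoding text and comparing two embeddings. It assumes the base model is published on Hugging Face under the ID `jinaai/jina-embeddings-v2-base-en` (the post only gives shortened links, so please verify), and that the model's custom code exposes an `encode` method when loaded with `trust_remote_code=True`. The helper names `cosine_similarity` and `embed_texts` are made up for illustration.

```python
import math

# Assumed Hugging Face model ID; the post only provides shortened links.
MODEL_ID = "jinaai/jina-embeddings-v2-base-en"


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def embed_texts(texts, max_length=8192):
    """Encode texts into fixed-size vectors.

    Downloads the model on first call. The 8192-token limit comes from
    the post; `encode` and its `max_length` argument are assumed to be
    provided by the model's remote code.
    """
    # Imported lazily so the similarity helper works without transformers.
    from transformers import AutoModel

    model = AutoModel.from_pretrained(MODEL_ID, trust_remote_code=True)
    return model.encode(texts, max_length=max_length)
```

To use it, call `embed_texts(["How is the weather today?", "What is the current weather like today?"])` and pass the two returned vectors to `cosine_similarity`. Semantically close sentences should score near 1.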
Kalyan KS, is there any guide to implementing this with LangChain?
Thanks for the good 😊
Paper link: https://arxiv.org/pdf/2310.19923.pdf