Kalyan KS’ Post

JINA EMBEDDINGS 2 - Open Source Text Embeddings for Long Documents

1️⃣ Text embedding models are powerful tools for representing text as fixed-size vectors.
2️⃣ Most existing open-source models, especially those built on architectures like BERT, struggle to represent lengthy documents.
3️⃣ Jina Embeddings v2, an open-source text embedding model, addresses this issue.
4️⃣ Jina Embeddings v2 is capable of encoding long documents of up to 8192 tokens.
5️⃣ Jina Embeddings v2 achieves state-of-the-art performance on the MTEB benchmark.
6️⃣ It also matches the performance of OpenAI’s proprietary text-embedding-ada-002 model.

➡️ Jina Embeddings v2 (base model) link: https://lnkd.in/gYv_Xnhq
➡️ Jina Embeddings v2 (small model) link: https://lnkd.in/gSektkWa

✔️ For complete details, refer to the paper (paper link in the comments)

#nlproc #nlp #deeplearning #datascience #ai #generativeai #embeddings
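
To make the "fixed-size vectors" point concrete, here is a minimal sketch in plain Python. It does not use Jina Embeddings v2 or any real embedding model: `toy_embed`, `cosine`, and `DIM` are hypothetical names, and the hashing trick is only a stand-in chosen so the example runs without downloads. It shows the one property the post relies on: a short text and a very long text both map to vectors of the same length, so they can be compared directly.

```python
import hashlib
import math

DIM = 16  # toy dimensionality chosen for illustration; not the size used by any real model


def toy_embed(text, dim=DIM):
    """Hash each whitespace token into one of `dim` buckets, then L2-normalise.

    A stand-in for a learned embedding model, not the Jina method.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.md5(token.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % dim
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


short_doc = "text embeddings"
long_doc = " ".join(["long documents need long context windows"] * 500)

short_vec = toy_embed(short_doc)
long_vec = toy_embed(long_doc)

# Both vectors have the same length regardless of input size.
print(len(short_vec), len(long_vec))
print(cosine(short_vec, long_vec))
```

A real model replaces the hashing step with a learned encoder, and its maximum input length (8192 tokens for Jina Embeddings v2, per the post) determines how much of a long document actually influences the vector.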

Shriman Narayan

Generative AI engineer, NLP engineer, LLMs, ChatBots, LangChain engineer

9mo

Kalyan KS, is there any guide to implementing this with LangChain?

Meenakshi A.

Technologist & Believer in Systems for People and People for Systems

9mo

Thanks for the good 😊
