Quick Start | Documentation | LangChain and LlamaIndex Support | Discord | www.getzep.com
Zep is a long-term memory service for AI Assistant apps. With Zep, you can provide AI assistants with the ability to recall past conversations, no matter how distant, while also reducing hallucinations, latency, and cost.
Zep persists and recalls chat histories, and automatically generates summaries and other artifacts from these chat histories. It also embeds messages and summaries, enabling you to search Zep for relevant context from past conversations. Zep does all of this asynchronously, ensuring these operations don't impact your users' chat experience. Data is persisted to a database, allowing you to scale out when growth demands.
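One way to picture the asynchronous behavior described above is a store that persists a message immediately and defers slower enrichment to a background task. The following is a minimal Python sketch under that assumption; `ToyMemoryStore`, `_enrich`, and `drain` are hypothetical names for illustration and do not reflect Zep's internals.

```python
import asyncio


class ToyMemoryStore:
    """Hypothetical in-memory store; NOT Zep's implementation."""

    def __init__(self) -> None:
        self.messages: list[str] = []
        self.summaries: list[str] = []
        self._tasks: set[asyncio.Task] = set()

    async def _enrich(self, message: str) -> None:
        # Stand-in for slow work such as summarization or embedding.
        await asyncio.sleep(0.01)
        self.summaries.append(message[:20])

    async def add_message(self, message: str) -> None:
        # Persist right away, then schedule enrichment without awaiting it,
        # so the caller (the chat loop) is not blocked.
        self.messages.append(message)
        task = asyncio.create_task(self._enrich(message))
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)

    async def drain(self) -> None:
        # Wait for pending background work (useful in tests or at shutdown).
        await asyncio.gather(*self._tasks)


async def demo() -> tuple[int, int, int]:
    store = ToyMemoryStore()
    await store.add_message("Who was Octavia Butler?")
    pending = len(store.summaries)  # enrichment has not run yet
    await store.drain()
    return len(store.messages), pending, len(store.summaries)
```

Calling `asyncio.run(demo())` shows the message stored before its enrichment completes, which is the property that keeps the chat path fast.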
Zep also provides a simple, easy-to-use abstraction for document vector search called Document Collections. This is designed to complement Zep's core memory features, but is not designed to be a general-purpose vector database.
Zep allows you to be more intentional about constructing your prompt by including:
- a few recent messages, added automatically, with the number customized for your app;
- a summary of recent conversations prior to those messages;
- and/or contextually relevant summaries or messages surfaced from the entire chat session;
- and/or relevant business data from Zep Document Collections.
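The prompt ingredients listed above can be combined in many ways. Below is a rough Python sketch of one possible assembly step. `PromptContext` and `build_prompt` are hypothetical helpers, not part of any Zep SDK, and the section labels are arbitrary choices for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PromptContext:
    """Hypothetical container for the context pieces listed above."""
    recent_messages: list[str]
    summary: str = ""
    relevant_memories: list[str] = field(default_factory=list)
    documents: list[str] = field(default_factory=list)


def build_prompt(ctx: PromptContext, question: str, last_n: int = 4) -> str:
    """Assemble a prompt from whichever context pieces are present."""
    sections = []
    if ctx.summary:
        sections.append("Summary of earlier conversation:\n" + ctx.summary)
    if ctx.relevant_memories:
        sections.append("Relevant past messages:\n" + "\n".join(ctx.relevant_memories))
    if ctx.documents:
        sections.append("Relevant business data:\n" + "\n".join(ctx.documents))
    # Only the last `last_n` messages are included verbatim.
    recent = ctx.recent_messages[-last_n:] if last_n > 0 else []
    if recent:
        sections.append("Recent messages:\n" + "\n".join(recent))
    sections.append("Question: " + question)
    return "\n\n".join(sections)
```

Keeping each piece optional means the same function works whether or not summaries, retrieved memories, or document results are available for a given turn.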
Zep Cloud is a managed service with Zep Open Source at its core. In addition to Zep Open Source's memory management features, Zep Cloud offers:
- Fact Extraction: Automatically build fact tables from conversations, without having to define a data schema upfront.
- Dialog Classification: Instantly and accurately classify chat dialog. Understand user intent and emotion, segment users, and more. Route chains based on semantic context, and trigger events.
- Structured Data Extraction: Quickly extract business data from chat conversations using a schema you define. Understand what your Assistant should ask for next in order to complete its task.
With increased LLM context lengths, it may be tempting to include an entire chat history in a prompt, alongside RAG results and other instructions. Unfortunately, we've seen poor recall, hallucinations, and slow and expensive inference as a result.
Our goal with Zep is to elevate the layer of abstraction for memory management. We believe developer productivity is best served by infrastructure with well-designed abstractions, rather than building persistence, summarization, extraction, embedding management, and search from the ground up.
No. Zep uses embeddings and vector database capabilities under the hood to power many of its features, but is not designed to be a general-purpose vector database.
Users, Sessions, and Chat Messages are first-class abstractions in Zep. This allows simple and flexible management of chat memory, including the execution of Right To Be Forgotten requests and other privacy compliance-related tasks with a single API call.
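To illustrate why modeling Users, Sessions, and Messages explicitly makes a forget request simple, here is a toy in-memory Python sketch. `ToySession`, `ToyMemoryService`, and `delete_user` are hypothetical and say nothing about Zep's actual schema or API; the point is only that ownership links let one call remove everything a user owns.

```python
from dataclasses import dataclass, field


@dataclass
class ToySession:
    session_id: str
    user_id: str
    messages: list[str] = field(default_factory=list)


class ToyMemoryService:
    """Hypothetical in-memory model; NOT Zep's schema or API."""

    def __init__(self) -> None:
        self.users: dict[str, dict] = {}
        self.sessions: dict[str, ToySession] = {}

    def add_user(self, user_id: str, **metadata) -> None:
        self.users[user_id] = metadata

    def add_session(self, session_id: str, user_id: str) -> None:
        if user_id not in self.users:
            raise KeyError(f"unknown user: {user_id}")
        self.sessions[session_id] = ToySession(session_id, user_id)

    def add_message(self, session_id: str, content: str) -> None:
        self.sessions[session_id].messages.append(content)

    def delete_user(self, user_id: str) -> int:
        """Remove a user plus every session (and its messages) they own.

        Returns the number of sessions deleted.
        """
        owned = [sid for sid, s in self.sessions.items() if s.user_id == user_id]
        for sid in owned:
            del self.sessions[sid]
        self.users.pop(user_id, None)
        return len(owned)
```

Because messages live inside sessions and sessions point to a user, the caller never has to enumerate individual messages to honor a deletion request.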
Yes - Zep offers Python & TypeScript/JS SDKs for easy integration with your Assistant app. We also have examples of using Zep with popular frameworks - see below.
Yes - the Zep team and community contributors have built integrations with Zep, making it simple to, for example, drop Zep's memory components into a LangChain app. Please see the Zep Documentation and your favorite framework's documentation for more.
Zep Open Source relies on an external LLM API service to function. OpenAI, Azure OpenAI, Anthropic, and OpenAI-compatible APIs are all supported.
- 🏎️ Quick Start Guide: Docker deployment, and coding, in < 5 minutes.
- 📚 Zep By Example: Learn how to use Zep by example.
- 🦙 Building Apps with LlamaIndex
- 🦜⛓️ Building Apps with LangChain
- 🛠️ Getting Started with TypeScript/JS or Python
import uuid

# Assumes `client` is an initialized Zep client, `user_id` is a string, and
# CreateUserRequest, Session, Message, and Memory are imported from zep-python.

# Create a user
user_request = CreateUserRequest(
    user_id=user_id,
    email="jane.smith@example.com",  # placeholder address
    first_name="Jane",
    last_name="Smith",
    metadata={"foo": "bar"},
)
new_user = client.user.add(user_request)

# Create a chat session
session_id = uuid.uuid4().hex  # A new session identifier
session = Session(
    session_id=session_id,
    user_id=user_id,
    metadata={"foo": "bar"},
)
client.memory.add_session(session)

# Add a chat message to the session
history = [
    {"role": "human", "content": "Who was Octavia Butler?"},
]
messages = [Message(role=m["role"], content=m["content"]) for m in history]
memory = Memory(messages=messages)
client.memory.add_memory(session_id, memory)

# Get all sessions for user_id
sessions = client.user.get_sessions(user_id)
// Use Zep as the memory for a LangChain.js ConversationChain
const memory = new ZepMemory({
    sessionId,
    baseURL: zepApiURL,
    apiKey: zepApiKey,
});
const chain = new ConversationChain({ llm: model, memory });
const response = await chain.run(
    {
        input: "What is the book's relevance to the challenges facing contemporary society?",
    },
);
Hybrid similarity search over a document collection with text input and JSONPath filters (TypeScript)
const query = "Who was Octavia Butler?";
const searchResults = await collection.search({ text: query }, 3);

// Search for documents using both text and metadata
const metadataQuery = {
    where: { jsonpath: '$[*] ? (@.genre == "scifi")' },
};
const newSearchResults = await collection.search(
    {
        text: query,
        metadata: metadataQuery,
    },
    3
);
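Conceptually, the hybrid search above narrows candidates by metadata and then ranks what remains by relevance to the text query. The Python sketch below imitates that idea in memory, under loud assumptions: a plain equality filter stands in for the JSONPath expression, and shared-word counts stand in for embedding similarity. `hybrid_search` and `_tokens` are hypothetical helpers, not Zep functions.

```python
def _tokens(text: str) -> set[str]:
    # Lowercased words with surrounding punctuation stripped.
    return {w.strip(".,!?\"'") for w in text.lower().split()} - {""}


def hybrid_search(docs: list[dict], text: str, where: dict, limit: int = 3) -> list[dict]:
    """Filter docs by metadata equality, then rank by word overlap with `text`."""
    query = _tokens(text)
    # Step 1: metadata filter (stand-in for the JSONPath `where` clause).
    candidates = [
        d for d in docs
        if all(d["metadata"].get(k) == v for k, v in where.items())
    ]
    # Step 2: rank survivors by a toy relevance score.
    ranked = sorted(
        candidates,
        key=lambda d: len(query & _tokens(d["content"])),
        reverse=True,
    )
    return ranked[:limit]
```

Filtering before ranking means documents outside the requested genre can never appear in the results, no matter how similar their text is.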
from llama_index import VectorStoreIndex, SimpleDirectoryReader
from llama_index.vector_stores import ZepVectorStore
from llama_index.storage.storage_context import StorageContext

vector_store = ZepVectorStore(
    api_url=zep_api_url,
    api_key=zep_api_key,
    collection_name=collection_name
)

documents = SimpleDirectoryReader("documents/").load_data()
storage_context = StorageContext.from_defaults(vector_store=vector_store)
index = VectorStoreIndex.from_documents(
    documents,
    storage_context=storage_context
)
# Search by embedding vector, rather than text query
# embedding is a list of floats
results = collection.search(
    embedding=embedding, limit=5
)
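For readers unfamiliar with embedding search, the following self-contained Python sketch shows the general technique: score every stored vector against the query vector and keep the top `limit`. It assumes cosine similarity as the metric, which may differ from what Zep uses, and `search_by_embedding` here is a hypothetical stand-in rather than the SDK method.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search_by_embedding(
    index: dict[str, list[float]],
    embedding: list[float],
    limit: int = 5,
) -> list[tuple[str, float]]:
    """Return up to `limit` (doc_id, score) pairs, best match first."""
    scored = [
        (doc_id, cosine_similarity(embedding, vec))
        for doc_id, vec in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]
```

This brute-force scan is linear in the number of documents; real vector stores typically use an approximate index to avoid comparing against every vector.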
Please see the Zep Quick Start Guide for important configuration information.
docker compose up
Looking for other deployment options?
Please see the Zep Development Guide for important beta information and usage instructions.
pip install zep-python
or
npm i @getzep/zep-js