LLM Guardrails & Observability for Developers

Protect against jailbreaks, hallucinations, and malicious inputs, while monitoring LLM behavior with detailed traces, spans, and metrics.

Lightning-fast evaluators

Check user inputs and LLM outputs with our built-in evaluators, or create your own custom guardrails. We support text, conversations, RAG, and even tools.

Jailbreak & Prompt Injection

Prevent your LLM from being jailbroken through prompt injection and other emerging threat vectors.

Groundedness

Check that the LLM output is grounded in the RAG context documents rather than hallucinated.

Sensitive Topics

Detect political, legal, medical, or even religious content and prevent it from being discussed.

Profanity & Hate

Filter out offensive and threatening language as well as hate speech from your LLM.

PII Detection

Catch personally identifiable information in your LLM outputs, especially when using RAG.

JSON Validation

Ensure LLM output is valid JSON, and conforms to a specific JSON Schema.

Secrets Detector

Detect the leaking of private keys, tokens, and other secrets within your outputs.

Competitor Blocklist

Block competitors from being mentioned in your LLM outputs and avoid embarrassment.

Tone & Mood

Identify and grade the tone and mood of your LLM outputs to ensure they are appropriate.

Language Detection

Ensure the language of the LLM output matches the language of the user input to avoid language confusion.

Topical Relevance

Compare input and output embeddings to ensure the LLM is on topic and relevant.

Emotion Analysis

Detect the emotions in the content and ensure they are appropriate.

Sentiment Analysis

Determine whether a payload has a positive, negative, or neutral sentiment.

Embeddings Similarity

Compare how similar two embeddings are (e.g., user input vs LLM output, or vs a reference text).
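
As an illustration of what an embeddings comparison can look like, here is a minimal sketch using cosine similarity. This page does not say which similarity measure the evaluators use, so cosine similarity is an assumption, and the function names and the 0.8 threshold are hypothetical.

```typescript
// Cosine similarity between two embedding vectors of equal length.
// Returns a value in [-1, 1]; higher means more similar.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("Embeddings must be non-empty and have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("Cosine similarity is undefined for a zero vector");
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical pass/fail rule: treat the output as on topic when the
// similarity reaches a minimum score that you choose (0.8 is arbitrary).
function isOnTopic(input: number[], output: number[], minScore = 0.8): boolean {
  return cosineSimilarity(input, output) >= minScore;
}
```

In practice the two vectors would come from the same embedding model, for example one for the user input and one for the LLM output or a reference text.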

Our evaluators integrate with the world's leading AI providers.

Meta AI
AWS Bedrock AI
Azure AI
OpenAI
Google AI
Anthropic AI

Use our SDK to integrate Modelmetry into your application.

import { ModelmetryClient } from "@modelmetry/sdk"

const modelmetry = new ModelmetryClient()
const guardrails = modelmetry.guardrails()

const result = await guardrails.checkText(
  "What does the employee handbook say about vacation time during a busy period?",
  { guardrailId: "grd_jaohsfzgcbd523hbt1grwmvp" },
)

if (result.failed) {
  // More data is available in the check for debugging
  // (scores, the evaluation(s) that failed).
  for (const entry of result.summarisedEntries) {
    console.log(entry)
  }
  // handle the failure
  return "Sorry user, I cannot help you with this query at the moment."
}

// carry on as normal
Download @modelmetry/sdk on NPM
Download modelmetry-sdk on PyPI
View @modelmetry on GitHub
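
The check shown in the SDK snippet can be wrapped in a small helper that only calls your model when the user input passes. This is a sketch, not part of the SDK: the `CheckResult` and `TextGuardrail` interfaces are hypothetical and only mirror the fields used in the snippet (`failed`, `summarisedEntries`, `checkText`), and `answerWithGuardrail` and `callModel` are illustrative names.

```typescript
// Hypothetical result shape, limited to the fields used in the SDK snippet.
interface CheckResult {
  failed: boolean;
  summarisedEntries: unknown[];
}

// Hypothetical minimal interface: anything exposing checkText as in the snippet.
interface TextGuardrail {
  checkText(text: string, options: { guardrailId: string }): Promise<CheckResult>;
}

const FALLBACK = "Sorry user, I cannot help you with this query at the moment.";

// Check the user input first; call the model only if the guardrail passes.
async function answerWithGuardrail(
  guardrails: TextGuardrail,
  guardrailId: string,
  userInput: string,
  callModel: (input: string) => Promise<string>,
): Promise<string> {
  const result = await guardrails.checkText(userInput, { guardrailId });
  if (result.failed) {
    // Log the entries for debugging, then return a safe fallback.
    for (const entry of result.summarisedEntries) {
      console.log(entry);
    }
    return FALLBACK;
  }
  return callModel(userInput);
}
```

Keeping the guardrail behind a narrow interface like this also makes the flow easy to unit test with a stub in place of the real client.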

Powerful features. Simple to use.

Modelmetry is designed, from the ground up, to be a powerful yet simple-to-use platform for managing and monitoring your LLM applications.

Guardrails

Enforce safety and quality guidelines for your LLM application with real-time checks on user inputs and model outputs.

Advanced Grading

Create dead-simple or highly customizable pass/fail rules for evaluations with an expressive grading system which offers granular control.

Metrics

Track key performance indicators (KPIs) like latency, cost, and token usage for optimized performance and cost management.

Observability

Use instrumentation, traces, spans, and events to monitor and gain deep insights into your LLM application's behavior.

Evaluators

Leverage advanced, pre-built evaluators for detecting threats, jailbreaks, toxicity, and other qualitative metrics.

Customizable

Tailor Modelmetry to your specific needs with configurable evaluations, metrics, and automations. Or use sane defaults.

Webhooks

Integrate seamlessly with your existing tools and services through flexible webhook integrations.

Automations

Streamline workflows and enforce policies by triggering actions based on real-time data from your LLM application.

Open-Source SDK

Easily instrument your LLM application with our lightweight, open-source SDKs for Python and TypeScript.

Custom Roles

Manage access across your team with granular custom roles and permissions.

Changelog

Stay up-to-date with the latest changes and improvements to Modelmetry.

  • today
    Add RAG documents to User messages in the payload editor
    feature
  • Dec 12, 2024
    Autocomplete Hints in Automation Rule Editor
    improvement
  • Dec 1, 2024
    Dashboards with analytics
    feature
  • Nov 18, 2024
    Clickable hints in grading editor
    improvement
  • last month
    New flexible grading system
    feature
  • last month
    Evaluate a payload by config (JS SDK)
    improvement
  • last month
    Copy and paste snippets to use guardrails in your code
    feature
  • Oct 7, 2024
    Collapsible sidebar for more
    improvement
  • Sep 18, 2024
    Test evaluators in app
    feature
  • Sep 14, 2024
    Attach secrets to evaluators
    improvement
15 more changes

Simple pricing

Our pricing is simple and transparent.

Hobby

$99.99 /year
  • Evaluators
    All
  • Traces
    Unlimited
  • Findings
    Unlimited
  • Guardrail Checks
    500/mth
  • Spans
    500/mth
  • Team Members
    Includes 1
  • Roles
    Standard

Business

$699.99 /year
  • Evaluators
    All
  • Traces
    Unlimited
  • Findings
    Unlimited
  • Guardrail Checks
    200,000/mth
  • Spans
    200,000/mth
  • Team Members
    Includes 5
  • Roles
    Custom

Have any questions? Use our live chat or email us at [email protected]

What is Modelmetry?

Modelmetry is a platform designed to enhance the safety, quality, and appropriateness of data and models in applications, such as chatbots, that use Large Language Models (LLMs). It offers a comprehensive suite of evaluators covering critical aspects such as emotion analysis, PII leak detection, text moderation, relevancy, and security threat detection.

With customizable guardrails, early termination options, and detailed metrics and scores, Modelmetry ensures that your LLM applications meet high standards of performance and safety. This robust framework provides actionable insights, safeguarding the integrity and effectiveness of your AI-driven solutions.

Who is Modelmetry for?

Modelmetry is ideal for developers and software engineers aiming to ensure their AI-driven applications are safe, reliable, and compliant with regulations.

Do you have a free plan or free trial?

Not at this stage. We prefer to focus on profitability so that we can provide a better service to our paying customers. We offer a very affordable Hobby plan at $9.99/month so that you can try our service without breaking the bank.

Can I build my own evaluators, or can I only use built-in ones?

Modelmetry allows you to create and use custom evaluators tailored to your specific needs. This flexibility enables you to define criteria and metrics that are most relevant to your application, ensuring a more precise and effective evaluation process. You can also simply use an LLM-as-a-Judge evaluator with your own prompt.

Is Modelmetry open source?

Our client SDKs are open source. Our backend is proprietary because, well, it's our secret sauce. We can export all your data upon request.

We believe in keeping it simple for our customers and this means minimizing the hurdles to adoption. Our platform works out of the box; it's simple yet comprehensive. And if you want to leave, we can export all your data for you to download!

How does Modelmetry handle data privacy and security?

Modelmetry is committed to protecting your data privacy and security. We do not access payloads on your behalf, ever. We are a security-focused company and have implemented robust measures to ensure the confidentiality and integrity of your data. We use encryption, secure connections, and other industry-standard security practices to safeguard your data.

Do you store inputs and outputs?

We do store inputs and outputs so you can review them alongside metrics and scores. We do not access your payloads unless you have authorised us to do so.