
improvement: Add vertex embeddings support #622

Merged
merged 7 commits on Oct 1, 2024

Conversation

elentaure (Contributor)

Title:

  • Add support for embeddings in Vertex AI

Description:
Based on the embeddings code for Google AI Studio, modified to adapt to the Vertex API differences.

@narengogi (Contributor) left a comment:

I tried making a request to this provider locally; the request failed when it was OpenAI-compliant. Suggested changes are in the comments.

Failing request

curl --location 'http://localhost:8787/v1/embeddings' \
--header 'x-portkey-provider: vertex-ai' \
--header 'x-portkey-vertex-region: us-central1' \
--header 'Content-Type: application/json' \
--header 'x-portkey-api-key: jd-' \
--header 'Authorization: ya29.c....' \
--header 'x-portkey-vertex-project-id: {{YOUR_PROJECT_ID}}' \
--data-raw '{
    "model": "textembedding-gecko@001",
    "input": "Hello this is a test"
}'
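For context, Vertex's text-embedding endpoint expects an `instances` array rather than OpenAI's `input` field, which is the transformation the gateway has to perform. A minimal sketch of that mapping (the helper name `toVertexBody` is hypothetical, not gateway code):

```typescript
// Sketch: how an OpenAI-style embeddings body maps onto the Vertex
// text-embeddings request shape ({ instances: [{ content }] }).
interface OpenAIEmbedBody {
  model: string;
  input: string | string[];
}

function toVertexBody(body: OpenAIEmbedBody): { instances: { content: string }[] } {
  // Normalize string input to an array, then wrap each item as an instance.
  const inputs = Array.isArray(body.input) ? body.input : [body.input];
  return { instances: inputs.map((content) => ({ content })) };
}

const vertexBody = toVertexBody({
  model: 'textembedding-gecko@001',
  input: 'Hello this is a test',
});
// returns { instances: [{ content: 'Hello this is a test' }] }
```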

@@ -68,3 +70,49 @@ export interface VertexLlamaChatCompleteStreamChunk {
created?: number;
provider?: string;
}

export const GoogleErrorResponseTransform: (
Contributor:

This method can be moved to a utils.ts file (preferred), or just to the embed.ts file if it's only being used there.

Contributor (Author):

Moved

import { GOOGLE_VERTEX_AI } from '../../globals';
import { generateInvalidProviderResponseError } from '../utils';

export const GoogleEmbedConfig: ProviderConfig = {
Contributor:

The request structure is incorrect: you've typed `params` as `VertexEmbedParams`, which wouldn't be OpenAI-compliant. The gateway transforms an OpenAI embeddings request into a Vertex embeddings request, so the code should be as below:

export interface EmbedInstancesData {
  content: string;
}

export const GoogleEmbedConfig: ProviderConfig = {
  input: {
    param: 'instances',
    required: true,
    transform: (params: EmbedParams): Array<EmbedInstancesData> => {
      const instances = Array<EmbedInstancesData>();
      if (Array.isArray(params.input)) {
        params.input.forEach((text) => {
          instances.push({
            content: text,
          });
        });
      } else {
        instances.push({
          content: params.input,
        });
      }
      return instances;
    },
  },
  parameters: {
    param: 'parameters',
    required: false,
  },
};
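As a quick sanity check, the suggested `transform` fans both input shapes (string and string array) into `instances`. Here is the same logic extracted as a standalone, runnable sketch, with the relevant types inlined for self-containment:

```typescript
// Types inlined so the sketch runs on its own; in the gateway these
// come from the provider type definitions.
interface EmbedInstancesData {
  content: string;
}
interface EmbedParams {
  input: string | string[];
}

// Same logic as the suggested transform, extracted as a plain function.
const transform = (params: EmbedParams): Array<EmbedInstancesData> => {
  const instances = Array<EmbedInstancesData>();
  if (Array.isArray(params.input)) {
    params.input.forEach((text) => {
      instances.push({ content: text });
    });
  } else {
    instances.push({ content: params.input });
  }
  return instances;
};

// transform({ input: ['a', 'b'] }) returns [{ content: 'a' }, { content: 'b' }]
// transform({ input: 'x' }) returns [{ content: 'x' }]
```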

Contributor:

A better implementation, with support for `task_type`:

export interface EmbedInstancesData {
  content: string;
  task_type?: TASK_TYPE | string;
}

enum TASK_TYPE {...}

interface GoogleEmbedParams extends EmbedParams {
  task_type?: TASK_TYPE | string;
}

export const GoogleEmbedConfig: ProviderConfig = {
  input: {
    param: 'instances',
    required: true,
    transform: (params: GoogleEmbedParams): Array<EmbedInstancesData> => {
      const instances = Array<EmbedInstancesData>();
      if (Array.isArray(params.input)) {
        params.input.forEach((text) => {
          instances.push({
            content: text,
            task_type: params.task_type,
          });
        });
      } else {
        instances.push({
          content: params.input,
          task_type: params.task_type,
        });
      }
      return instances;
    },
  },
  parameters: {
    param: 'parameters',
    required: false,
  },
};

Contributor (Author):

Modified according to the suggestions. Please check whether it now works correctly with task type and the SDK.

@narengogi (Contributor) left a comment:

Looks good!!

@VisargD (Collaborator) commented on Oct 1, 2024:

Thanks for the PR! We will merge this today.

narengogi added a commit to Portkey-AI/docs-core that referenced this pull request Oct 1, 2024
@VisargD VisargD merged commit 4840893 into Portkey-AI:main Oct 1, 2024
1 check passed
@VisargD (Collaborator) commented on Oct 3, 2024:

Hey @elentaure - I noticed one difference while going through the Vertex embeddings documentation. The gateway currently builds the usage object from `response.metadata.billableCharacterCount`, but that gives the character count, not the token count. To stay OpenAI-schema compliant, it would be better to use `statistics.token_count` for the usage object.

Documentation for the tokenCount parameter: https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text-embeddings-api#response_body

{
  "predictions": [
    {
      "embeddings": {
        "statistics": {
          "truncated": boolean,
          "token_count": integer
        },
        "values": [ number ]
      }
    }
  ]
}
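Given this response schema, the fix amounts to summing `statistics.token_count` across predictions when building the OpenAI-style usage object. A hedged sketch of that change (the function name `toOpenAIUsage` is illustrative; the actual gateway response transform may be structured differently):

```typescript
// Shape of one prediction in the Vertex text-embeddings response body,
// per the schema quoted above.
interface VertexEmbedPrediction {
  embeddings: {
    statistics: { truncated: boolean; token_count: number };
    values: number[];
  };
}

// Sketch: derive an OpenAI-style usage object from statistics.token_count
// instead of metadata.billableCharacterCount (which counts characters).
function toOpenAIUsage(predictions: VertexEmbedPrediction[]) {
  const promptTokens = predictions.reduce(
    (sum, p) => sum + p.embeddings.statistics.token_count,
    0
  );
  return { prompt_tokens: promptTokens, total_tokens: promptTokens };
}
```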

I will raise a quick PR to make this change.

3 participants