feat: openai default model uses gpt-4o-mini #1526

Merged
4 commits merged on Sep 9, 2024
2 changes: 1 addition & 1 deletion docs/components/llms/models/litellm.mdx
@@ -12,7 +12,7 @@ config = {
"llm": {
"provider": "litellm",
"config": {
"model": "gpt-3.5-turbo",
"model": "gpt-4o-mini",
"temperature": 0.2,
"max_tokens": 1500,
}
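For context, here is a minimal sketch of how the updated LiteLLM snippet above might be used. Only the config dict comes from the file being changed; the `mem0` import and the `Memory.from_config` loader are assumptions about how these docs are consumed, and the API key value is a placeholder:

```python
import os
from mem0 import Memory  # assumed entry point for the docs this snippet lives in

# An OpenAI key is assumed, since gpt-4o-mini is served through the OpenAI API.
os.environ["OPENAI_API_KEY"] = "sk-..."

config = {
    "llm": {
        "provider": "litellm",
        "config": {
            "model": "gpt-4o-mini",  # new default introduced by this PR
            "temperature": 0.2,
            "max_tokens": 1500,
        },
    }
}

m = Memory.from_config(config)  # sketch only; loader name is an assumption
```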
2 changes: 1 addition & 1 deletion embedchain/configs/chroma.yaml
@@ -5,7 +5,7 @@ app:
llm:
provider: openai
config:
model: 'gpt-3.5-turbo'
model: 'gpt-4o-mini'
temperature: 0.5
max_tokens: 1000
top_p: 1
2 changes: 1 addition & 1 deletion embedchain/configs/full-stack.yaml
@@ -10,7 +10,7 @@ chunker:
llm:
provider: openai
config:
model: 'gpt-3.5-turbo'
model: 'gpt-4o-mini'
temperature: 0.5
max_tokens: 1000
top_p: 1
2 changes: 1 addition & 1 deletion embedchain/configs/opensearch.yaml
@@ -8,7 +8,7 @@ app:
llm:
provider: openai
config:
model: 'gpt-3.5-turbo'
model: 'gpt-4o-mini'
temperature: 0.5
max_tokens: 1000
top_p: 1
8 changes: 4 additions & 4 deletions embedchain/docs/api-reference/advanced/configuration.mdx
@@ -20,7 +20,7 @@ app:
llm:
provider: openai
config:
model: 'gpt-3.5-turbo'
model: 'gpt-4o-mini'
temperature: 0.5
max_tokens: 1000
top_p: 1
@@ -82,7 +82,7 @@ cache:
"llm": {
"provider": "openai",
"config": {
"model": "gpt-3.5-turbo",
"model": "gpt-4o-mini",
"temperature": 0.5,
"max_tokens": 1000,
"top_p": 1,
@@ -140,7 +140,7 @@ config = {
'llm': {
'provider': 'openai',
'config': {
'model': 'gpt-3.5-turbo',
'model': 'gpt-4o-mini',
'temperature': 0.5,
'max_tokens': 1000,
'top_p': 1,
@@ -206,7 +206,7 @@ Alright, let's dive into what each key means in the yaml config above:
2. `llm` Section:
- `provider` (String): The provider for the language model, which is set to 'openai'. You can find the full list of llm providers in [our docs](/components/llms).
- `config`:
- `model` (String): The specific model being used, 'gpt-3.5-turbo'.
- `model` (String): The specific model being used, 'gpt-4o-mini'.
- `temperature` (Float): Controls the randomness of the model's output. A higher value (closer to 1) makes the output more random.
- `max_tokens` (Integer): Controls how many tokens are used in the response.
- `top_p` (Float): Controls the diversity of word selection. A higher value (closer to 1) makes word selection more diverse.
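To tie the keys above together, here is a brief sketch of loading this configuration programmatically. It reuses the Python dict form shown earlier in this file; the `config=` keyword on `App.from_config` and the query string are assumptions for illustration:

```python
from embedchain import App

# Same structure as the dict shown above; only the keys explained in this section are set.
config = {
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o-mini",   # the new default model
            "temperature": 0.5,       # lower values give more deterministic output
            "max_tokens": 1000,       # cap on response length
            "top_p": 1,               # nucleus-sampling cutoff
        },
    }
}

app = App.from_config(config=config)   # dict-based variant of the config_path call used below
print(app.query("What is Embedchain?"))  # illustrative query
```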
6 changes: 3 additions & 3 deletions embedchain/docs/components/llms.mdx
@@ -62,7 +62,7 @@ app = App.from_config(config_path="config.yaml")
llm:
provider: openai
config:
model: 'gpt-3.5-turbo'
model: 'gpt-4o-mini'
temperature: 0.5
max_tokens: 1000
top_p: 1
@@ -205,7 +205,7 @@ app = App.from_config(config_path="config.yaml")
llm:
provider: azure_openai
config:
model: gpt-3.5-turbo
model: gpt-4o-mini
deployment_name: your_llm_deployment_name
temperature: 0.5
max_tokens: 1000
@@ -887,7 +887,7 @@ response = app.chat("Which companies did Elon Musk found?")
llm:
provider: openai
config:
model: gpt-3.5-turbo
model: gpt-4o-mini
temperature: 0.5
max_tokens: 1000
token_usage: true
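Because the azure_openai snippet above carries only the model and deployment name, a short sketch of how it might be loaded is included here. The file name, the query, and the environment variable names (borrowed from LangChain's AzureChatOpenAI conventions, not from this diff) are all assumptions:

```python
import os
from embedchain import App

# Assumed Azure credentials; variable names follow LangChain's AzureChatOpenAI
# conventions and are not defined anywhere in this PR.
os.environ["AZURE_OPENAI_API_KEY"] = "..."
os.environ["AZURE_OPENAI_ENDPOINT"] = "https://<your-resource>.openai.azure.com/"

# "azure_config.yaml" is a hypothetical file containing the azure_openai block above,
# i.e. provider: azure_openai with model: gpt-4o-mini and your deployment name.
app = App.from_config(config_path="azure_config.yaml")
print(app.query("Hello from Azure"))
```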
2 changes: 1 addition & 1 deletion embedchain/docs/examples/rest-api/create.mdx
@@ -32,7 +32,7 @@ app:
llm:
provider: openai
config:
model: "gpt-3.5-turbo"
model: "gpt-4o-mini"
temperature: 0.5
max_tokens: 1000
top_p: 1
2 changes: 1 addition & 1 deletion embedchain/docs/get-started/faq.mdx
@@ -122,7 +122,7 @@ You can achieve this by setting `stream` to `true` in the config file.
llm:
provider: openai
config:
model: 'gpt-3.5-turbo'
model: 'gpt-4o-mini'
temperature: 0.5
max_tokens: 1000
top_p: 1
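The FAQ hunk above is about streaming, so here is a small sketch of the same config in dict form with `stream` enabled. How the streamed tokens are surfaced (printed incrementally versus returned as an iterator) depends on the embedchain version, so the final call is illustrative only:

```python
from embedchain import App

config = {
    "llm": {
        "provider": "openai",
        "config": {
            "model": "gpt-4o-mini",
            "temperature": 0.5,
            "max_tokens": 1000,
            "top_p": 1,
            "stream": True,  # dict equivalent of the FAQ's `stream: true`
        },
    }
}

app = App.from_config(config=config)
app.query("Summarize the indexed documents.")  # tokens arrive incrementally when streaming is on
```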
87 changes: 54 additions & 33 deletions embedchain/embedchain/config/model_prices_and_context_window.json
@@ -1,6 +1,6 @@
{
"openai/gpt-4": {
"max_tokens": 4096,
"max_tokens": 4096,
"max_input_tokens": 8192,
"max_output_tokens": 4096,
"input_cost_per_token": 0.00003,
@@ -13,6 +13,20 @@
"input_cost_per_token": 0.000005,
"output_cost_per_token": 0.000015
},
"gpt-4o-mini": {
"max_tokens": 4096,
"max_input_tokens": 128000,
"max_output_tokens": 4096,
"input_cost_per_token": 0.00000015,
"output_cost_per_token": 0.00000060
},
"gpt-4o-mini-2024-07-18": {
"max_tokens": 4096,
"max_input_tokens": 128000,
"max_output_tokens": 4096,
"input_cost_per_token": 0.00000015,
"output_cost_per_token": 0.00000060
},
"openai/gpt-4o-2024-05-13": {
"max_tokens": 4096,
"max_input_tokens": 128000,
@@ -153,7 +167,7 @@
"openai/text-embedding-ada-002": {
"max_tokens": 8191,
"max_input_tokens": 8191,
"output_vector_size": 1536,
"output_vector_size": 1536,
"input_cost_per_token": 0.0000001,
"output_cost_per_token": 0.000000
},
@@ -176,7 +190,7 @@
"max_output_tokens": 4096,
"input_cost_per_token": 0.000002,
"output_cost_per_token": 0.000002
},
},
"openai/gpt-3.5-turbo-instruct": {
"max_tokens": 4096,
"max_input_tokens": 8192,
@@ -197,6 +211,13 @@
"max_output_tokens": 4096,
"input_cost_per_token": 0.000005,
"output_cost_per_token": 0.000015
},
"azure/gpt-4o-mini": {
"max_tokens": 4096,
"max_input_tokens": 128000,
"max_output_tokens": 4096,
"input_cost_per_token": 0.00000015,
"output_cost_per_token": 0.00000060
},
"azure/gpt-4-turbo-2024-04-09": {
"max_tokens": 4096,
@@ -325,7 +346,7 @@
"max_input_tokens": 8191,
"input_cost_per_token": 0.00000002,
"output_cost_per_token": 0.000000
},
},
"mistralai/mistral-tiny": {
"max_tokens": 8191,
"max_input_tokens": 32000,
@@ -595,77 +616,77 @@
"max_tokens": 8192,
"max_input_tokens": 32760,
"max_output_tokens": 8192,
"input_cost_per_token": 0.00000025,
"input_cost_per_token": 0.00000025,
"output_cost_per_token": 0.0000005
},
"vertexai/gemini-1.0-pro": {
"vertexai/gemini-1.0-pro": {
"max_tokens": 8192,
"max_input_tokens": 32760,
"max_output_tokens": 8192,
"input_cost_per_token": 0.00000025,
"input_cost_per_token": 0.00000025,
"output_cost_per_token": 0.0000005
},
"vertexai/gemini-1.0-pro-001": {
"vertexai/gemini-1.0-pro-001": {
"max_tokens": 8192,
"max_input_tokens": 32760,
"max_output_tokens": 8192,
"input_cost_per_token": 0.00000025,
"input_cost_per_token": 0.00000025,
"output_cost_per_token": 0.0000005
},
"vertexai/gemini-1.0-pro-002": {
"vertexai/gemini-1.0-pro-002": {
"max_tokens": 8192,
"max_input_tokens": 32760,
"max_output_tokens": 8192,
"input_cost_per_token": 0.00000025,
"input_cost_per_token": 0.00000025,
"output_cost_per_token": 0.0000005
},
"vertexai/gemini-1.5-pro": {
"vertexai/gemini-1.5-pro": {
"max_tokens": 8192,
"max_input_tokens": 1000000,
"max_output_tokens": 8192,
"input_cost_per_token": 0.000000625,
"input_cost_per_token": 0.000000625,
"output_cost_per_token": 0.000001875
},
"vertexai/gemini-1.5-flash-001": {
"max_tokens": 8192,
"max_input_tokens": 1000000,
"max_output_tokens": 8192,
"input_cost_per_token": 0,
"input_cost_per_token": 0,
"output_cost_per_token": 0
},
"vertexai/gemini-1.5-flash-preview-0514": {
"max_tokens": 8192,
"max_input_tokens": 1000000,
"max_output_tokens": 8192,
"input_cost_per_token": 0,
"input_cost_per_token": 0,
"output_cost_per_token": 0
},
"vertexai/gemini-1.5-pro-001": {
"vertexai/gemini-1.5-pro-001": {
"max_tokens": 8192,
"max_input_tokens": 1000000,
"max_output_tokens": 8192,
"input_cost_per_token": 0.000000625,
"input_cost_per_token": 0.000000625,
"output_cost_per_token": 0.000001875
},
"vertexai/gemini-1.5-pro-preview-0514": {
"vertexai/gemini-1.5-pro-preview-0514": {
"max_tokens": 8192,
"max_input_tokens": 1000000,
"max_output_tokens": 8192,
"input_cost_per_token": 0.000000625,
"input_cost_per_token": 0.000000625,
"output_cost_per_token": 0.000001875
},
"vertexai/gemini-1.5-pro-preview-0215": {
"vertexai/gemini-1.5-pro-preview-0215": {
"max_tokens": 8192,
"max_input_tokens": 1000000,
"max_output_tokens": 8192,
"input_cost_per_token": 0.000000625,
"input_cost_per_token": 0.000000625,
"output_cost_per_token": 0.000001875
},
"vertexai/gemini-1.5-pro-preview-0409": {
"max_tokens": 8192,
"max_input_tokens": 1000000,
"max_output_tokens": 8192,
"input_cost_per_token": 0.000000625,
"input_cost_per_token": 0.000000625,
"output_cost_per_token": 0.000001875
},
"vertexai/gemini-experimental": {
@@ -682,7 +703,7 @@
"max_images_per_prompt": 16,
"max_videos_per_prompt": 1,
"max_video_length": 2,
"input_cost_per_token": 0.00000025,
"input_cost_per_token": 0.00000025,
"output_cost_per_token": 0.0000005
},
"vertexai/gemini-1.0-pro-vision": {
@@ -692,7 +713,7 @@
"max_images_per_prompt": 16,
"max_videos_per_prompt": 1,
"max_video_length": 2,
"input_cost_per_token": 0.00000025,
"input_cost_per_token": 0.00000025,
"output_cost_per_token": 0.0000005
},
"vertexai/gemini-1.0-pro-vision-001": {
@@ -702,7 +723,7 @@
"max_images_per_prompt": 16,
"max_videos_per_prompt": 1,
"max_video_length": 2,
"input_cost_per_token": 0.00000025,
"input_cost_per_token": 0.00000025,
"output_cost_per_token": 0.0000005
},
"vertexai/claude-3-sonnet@20240229": {
@@ -713,7 +734,7 @@
"output_cost_per_token": 0.000015
},
"vertexai/claude-3-haiku@20240307": {
"max_tokens": 4096,
"max_tokens": 4096,
"max_input_tokens": 200000,
"max_output_tokens": 4096,
"input_cost_per_token": 0.00000025,
@@ -727,49 +748,49 @@
"output_cost_per_token": 0.000075
},
"cohere/command-r": {
"max_tokens": 4096,
"max_tokens": 4096,
"max_input_tokens": 128000,
"max_output_tokens": 4096,
"input_cost_per_token": 0.00000050,
"output_cost_per_token": 0.0000015
},
"cohere/command-light": {
"max_tokens": 4096,
"max_tokens": 4096,
"max_input_tokens": 4096,
"max_output_tokens": 4096,
"input_cost_per_token": 0.000015,
"output_cost_per_token": 0.000015
},
"cohere/command-r-plus": {
"max_tokens": 4096,
"max_tokens": 4096,
"max_input_tokens": 128000,
"max_output_tokens": 4096,
"input_cost_per_token": 0.000003,
"output_cost_per_token": 0.000015
},
"cohere/command-nightly": {
"max_tokens": 4096,
"max_tokens": 4096,
"max_input_tokens": 4096,
"max_output_tokens": 4096,
"input_cost_per_token": 0.000015,
"output_cost_per_token": 0.000015
},
"cohere/command": {
"max_tokens": 4096,
"max_tokens": 4096,
"max_input_tokens": 4096,
"max_output_tokens": 4096,
"input_cost_per_token": 0.000015,
"output_cost_per_token": 0.000015
},
"cohere/command-medium-beta": {
"max_tokens": 4096,
"max_tokens": 4096,
"max_input_tokens": 4096,
"max_output_tokens": 4096,
"input_cost_per_token": 0.000015,
"output_cost_per_token": 0.000015
},
"cohere/command-xlarge-beta": {
"max_tokens": 4096,
"max_tokens": 4096,
"max_input_tokens": 4096,
"max_output_tokens": 4096,
"input_cost_per_token": 0.000015,
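As a worked example of the pricing metadata added for gpt-4o-mini above, the cost of a call is just the token counts multiplied by the per-token rates. The helper below is a sketch for illustration, not the code embedchain itself uses to consume this JSON:

```python
# Per-token prices copied from the new "gpt-4o-mini" entry above.
GPT_4O_MINI_PRICES = {
    "input_cost_per_token": 0.00000015,
    "output_cost_per_token": 0.00000060,
}

def estimate_cost(prices: dict, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request given token counts and per-token rates."""
    return (
        input_tokens * prices["input_cost_per_token"]
        + output_tokens * prices["output_cost_per_token"]
    )

# 10,000 prompt tokens + 1,000 completion tokens:
# 10_000 * 0.00000015 + 1_000 * 0.0000006 = 0.0015 + 0.0006 = 0.0021 USD
print(estimate_cost(GPT_4O_MINI_PRICES, 10_000, 1_000))
```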
2 changes: 1 addition & 1 deletion embedchain/embedchain/llm/azure_openai.py
@@ -26,7 +26,7 @@ def _get_answer(prompt: str, config: BaseLlmConfig) -> str:
chat = AzureChatOpenAI(
deployment_name=config.deployment_name,
openai_api_version=str(config.api_version) if config.api_version else "2024-02-01",
model_name=config.model or "gpt-3.5-turbo",
model_name=config.model or "gpt-4o-mini",
temperature=config.temperature,
max_tokens=config.max_tokens,
streaming=config.stream,
2 changes: 1 addition & 1 deletion embedchain/embedchain/llm/openai.py
@@ -52,7 +52,7 @@ def _get_answer(self, prompt: str, config: BaseLlmConfig) -> str:
messages.append(SystemMessage(content=config.system_prompt))
messages.append(HumanMessage(content=prompt))
kwargs = {
"model": config.model or "gpt-3.5-turbo",
"model": config.model or "gpt-4o-mini",
"temperature": config.temperature,
"max_tokens": config.max_tokens,
"model_kwargs": config.model_kwargs or {},
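Both LLM classes rely on the same `config.model or "gpt-4o-mini"` fallback, so an unset model now resolves to gpt-4o-mini. Here is a standalone sketch of that behaviour; the `_Config` dataclass is a stand-in for `BaseLlmConfig`, not the real class:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class _Config:  # stand-in for BaseLlmConfig, for illustration only
    model: Optional[str] = None
    temperature: float = 0.5
    max_tokens: int = 1000

def build_kwargs(config: _Config) -> dict:
    # Mirrors the kwargs construction in openai.py above.
    return {
        "model": config.model or "gpt-4o-mini",  # falls back to the new default
        "temperature": config.temperature,
        "max_tokens": config.max_tokens,
    }

print(build_kwargs(_Config())["model"])                # -> gpt-4o-mini
print(build_kwargs(_Config(model="gpt-4o"))["model"])  # -> gpt-4o
```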