litellm_2024_10_29 #8
Conversation
* docs 1k rps load test
* docs load testing
* docs load testing litellm
* docs load testing
* clean up load test doc
* docs prom metrics for load testing
* docs using prometheus on load testing
* doc load testing with prometheus
* fixes for required values for gcs bucket
* docs gcs bucket logging
* add yaml with all router settings
* add docs for router settings
* docs router settings litellm settings
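The router-settings docs added here are only referenced, not shown. A minimal sketch of how such settings map onto `litellm.Router` in Python; the model names, env-var names, and tuning values below are illustrative assumptions, not values from this PR:

```python
# Sketch: the router settings documented for the proxy config can also be
# passed to litellm.Router directly. All names/values here are placeholders.
import os
from litellm import Router

router = Router(
    model_list=[
        {
            "model_name": "gpt-4o",  # alias callers use
            "litellm_params": {
                "model": "azure/gpt-4o",  # underlying deployment (placeholder)
                "api_key": os.environ["AZURE_API_KEY"],
                "api_base": os.environ["AZURE_API_BASE"],
            },
        }
    ],
    num_retries=2,                       # retry failed calls
    timeout=30,                          # per-request timeout, seconds
    routing_strategy="simple-shuffle",   # default strategy
)

response = router.completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "hello"}],
)
```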
* add prompt caching for latest models
* add cache_read_input_token_cost for prompt caching models
* fix(litellm_logging.py): ensure cache hits are scrubbed if "turn_off_message_logging" is enabled
* fix(sagemaker.py): fix streaming to raise error immediately. Fixes BerriAI#6054
* (fixes) gcs bucket key based logging (BerriAI#6044)
* fixes for gcs bucket logging
* fix StandardCallbackDynamicParams
* fix - gcs logging when payload is not serializable
* add test_add_callback_via_key_litellm_pre_call_utils_gcs_bucket
* working success callbacks
* linting fixes
* fix linting error
* add type hints to functions
* fixes for dynamic success and failure logging
* fix for test_async_chat_openai_stream
* fix handle case when key based logging vars are set as os.environ/ vars
* fix prometheus track cooldown events on custom logger (BerriAI#6060)
* (docs) add 1k rps load test doc (BerriAI#6059)
* docs 1k rps load test
* docs load testing
* docs load testing litellm
* docs load testing
* clean up load test doc
* docs prom metrics for load testing
* docs using prometheus on load testing
* doc load testing with prometheus
* (fixes) docs + qa - gcs key based logging (BerriAI#6061)
* fixes for required values for gcs bucket
* docs gcs bucket logging
* bump: version 1.48.12 → 1.48.13
* ci/cd run again
* bump: version 1.48.13 → 1.48.14
* update load test doc
* (docs) router settings - on litellm config (BerriAI#6037)
* add yaml with all router settings
* add docs for router settings
* docs router settings litellm settings
* (feat) OpenAI prompt caching models to model cost map (BerriAI#6063)
* add prompt caching for latest models
* add cache_read_input_token_cost for prompt caching models
* fix(litellm_logging.py): check if param is iterable. Fixes BerriAI#6025 (comment)
* fix(factory.py): support passing an "assistant_continue_message" to prevent bedrock error. Fixes BerriAI#6053
* fix(databricks/chat): handle streaming responses
* fix(factory.py): fix linting error
* fix(utils.py): unify anthropic + deepseek prompt caching information to openai format. Fixes BerriAI#6069
* test: fix test
* fix(types/utils.py): support all openai roles. Fixes BerriAI#6052
* test: fix test

---------
Co-authored-by: Ishaan Jaff <[email protected]>
* add /key/health endpoint
* add /key/health endpoint
* fix return from /key/health
* update doc string
* fix doc string for /key/health
* add test for /key/health
* fix linting
* docs /key/health
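The new `/key/health` endpoint is only named above. A hedged sketch of calling it against a running proxy; the base URL, key, and the POST method are assumptions here, not confirmed by this PR:

```python
# Hypothetical usage sketch: query the new /key/health endpoint on a
# running LiteLLM proxy. Base URL, HTTP method, and key are placeholders.
import httpx

PROXY_BASE = "http://localhost:4000"   # assumption: local proxy
VIRTUAL_KEY = "sk-1234"                # assumption: virtual key to check

resp = httpx.post(
    f"{PROXY_BASE}/key/health",
    headers={"Authorization": f"Bearer {VIRTUAL_KEY}"},
)
print(resp.json())  # expected to report whether key-based logging is healthy
```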
* add cache_read_input_token_cost for prompt caching models
* add prompt caching for latest models
* add openai cost calculator
* add openai prompt caching test
* fix lint check
* add note on how usage._cache_read_input_tokens is used
* fix cost calc whisper openai
* use output_cost_per_second
* add input_cost_per_second
* add azure o1 models to model cost map
* add azure o1 cost tracking
* fix azure cost calc
* add get llm provider test
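A small sketch of what the "get llm provider test" above exercises, assuming `get_llm_provider` is importable from the top-level package and the azure o1 model name is illustrative:

```python
# Sketch: confirm an azure o1 model resolves to the azure provider.
# Model name is a placeholder; return shape is litellm's 4-tuple.
from litellm import get_llm_provider

model, provider, _api_key, _api_base = get_llm_provider(model="azure/o1-preview")
print(provider)  # expected: "azure"
```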
…BerriAI#6079)
In model_prices_and_context_window.json, openrouter/* models all have litellm_provider set as "openrouter", except for four openrouter/openai/* models, which were set to "openai". I suppose they must be set to "openrouter", so one can know it should use the openrouter API for this model.
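For context, a hedged before/after sketch of the shape of one affected entry; the specific model name and the extra fields are invented for illustration, only the litellm_provider change is the point:

```python
# Sketch of the fix described above, written as Python dicts mirroring the
# JSON cost map. Model name and other fields are illustrative placeholders.
entry_before = {
    "openrouter/openai/gpt-4o": {
        "litellm_provider": "openai",      # wrong: would route via OpenAI API
        "mode": "chat",
    }
}
entry_after = {
    "openrouter/openai/gpt-4o": {
        "litellm_provider": "openrouter",  # fixed: routes via OpenRouter API
        "mode": "chat",
    }
}
```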
…older (BerriAI#6080)
* refactor gcs bucket
* add readme
* fix: fix type-checking errors
* fix: fix additional type-checking errors
* fix: additional type-checking error fixes
* fix: fix additional type-checking errors
* fix: additional type-check fixes
* fix: fix all type-checking errors + add pyright to ci/cd
* fix: fix incorrect import
* ci(config.yml): use mypy on ci/cd
* fix: fix type-checking errors in utils.py
* fix: fix all type-checking errors on main.py
* fix: fix mypy linting errors
* fix(anthropic/cost_calculator.py): fix linting errors
* fix: fix mypy linting errors
* fix: fix linting errors
* docs(prompt_caching.md): add prompt caching cost calc example to docs
* docs(prompt_caching.md): add proxy examples to docs
* feat(utils.py): expose new helper `supports_prompt_caching()` to check if a model supports prompt caching
* docs(prompt_caching.md): add docs on checking model support for prompt caching
* build: fix invalid json
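A minimal sketch of the new helper in use. The signature (a model string returning a bool) is inferred from the commit message, the top-level export and the model name are assumptions:

```python
# Sketch: check prompt-caching support before relying on cached-token
# pricing. Model name is a placeholder; signature inferred from the commit.
import litellm

if litellm.supports_prompt_caching(model="anthropic/claude-3-5-sonnet-20240620"):
    print("cache_read_input_token_cost will apply to cached input tokens")
```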
* fix: enable new "disable_prisma_schema_update" flag
* build(config.yml): remove setup remote docker step
* ci(config.yml): give container time to start up
* ci(config.yml): update test
* build(config.yml): actually start docker
* build(config.yml): simplify grep check
* fix(prisma_client.py): support reading disable_schema_update via env vars
* ci(config.yml): add test to check if all general settings are documented
* build(test_General_settings.py): check available dir
* ci: check ../ repo path
* build: check ./
* build: fix test
…BerriAI#6071)
* fix(utils.py): fix pydantic obj to schema creation for vertex endpoints. Fixes BerriAI#6027
* test(test_completion.py): skip test - avoid hitting gemini rate limits
* fix(common_utils.py): fix ruff linting error
Code cov - add checks for patch and overall repo
…ser_to_regen_tokens (admin ui / auth fix)
Allow internal user to call /key/{token}/regenerate
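A hedged sketch of what this fix enables: an internal user regenerating one of their own keys via the endpoint named above. The URL and both tokens are placeholders:

```python
# Sketch: with this fix, a user with the internal-user role can call
# /key/{token}/regenerate on their own key. All values are placeholders.
import httpx

PROXY_BASE = "http://localhost:4000"
USER_KEY = "sk-internal-user-key"   # internal user's own credential
TOKEN_TO_REGEN = "sk-old-key"       # key being regenerated (placeholder)

resp = httpx.post(
    f"{PROXY_BASE}/key/{TOKEN_TO_REGEN}/regenerate",
    headers={"Authorization": f"Bearer {USER_KEY}"},
)
print(resp.json())  # new key expected on success
```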
* fix(utils.py): support passing dynamic api base to validate_environment. Returns True if just api base is required and api base is passed
* fix(litellm_pre_call_utils.py): feature flag sending client headers to llm api. Fixes BerriAI#6410
* fix(anthropic/chat/transformation.py): return correct error message
* fix(http_handler.py): add error response text in places where we expect it
* fix(factory.py): handle base case of no non-system messages to bedrock. Fixes BerriAI#6411
* feat(cohere/embed): Support cohere image embeddings. Closes BerriAI#6413
* fix(__init__.py): fix linting error
* docs(supported_embedding.md): add image embedding example to docs
* feat(cohere/embed): use cohere embedding returned usage for cost calc
* build(model_prices_and_context_window.json): add embed-english-v3.0 details (image cost + "supports_image_input" flag)
* fix(cohere_transformation.py): fix linting error
* test(test_proxy_server.py): cleanup test
* test: cleanup test
* fix: fix linting errors
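The cohere image-embedding support above, as a hedged sketch. The data-URI input format is an assumption modeled on the docs example the commit references; the image file is a placeholder:

```python
# Sketch of the new cohere image-embedding path. Input format (base64
# data URI) is assumed from the referenced docs example.
import base64
import litellm

with open("cat.png", "rb") as f:  # placeholder image file
    b64 = base64.b64encode(f.read()).decode("utf-8")

response = litellm.embedding(
    model="cohere/embed-english-v3.0",      # model named in the commit
    input=[f"data:image/png;base64,{b64}"],
)
print(response.usage)  # cohere-returned usage now drives cost calculation
```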
(proxy audit logs) fix serialization error on audit logs
* add /user/delete call
* ui show modal asking if you want to delete user
* fix delete user modal
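A hedged sketch of the new `/user/delete` call outside the UI. The endpoint name comes from the commit; the request payload shape and auth values are assumptions:

```python
# Hedged sketch of the new /user/delete flow. The "user_ids" payload key
# and all values are assumptions, not confirmed by this PR.
import httpx

resp = httpx.post(
    "http://localhost:4000/user/delete",               # placeholder proxy URL
    headers={"Authorization": "Bearer sk-master-key"},  # placeholder admin key
    json={"user_ids": ["user-1234"]},                   # assumed payload shape
)
print(resp.status_code)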
* testing for failure events prometheus
* set set_llm_deployment_failure_metrics
* test_async_post_call_failure_hook
* unit testing for all prometheus functions
* fix linting
BerriAI#6449)
* add type for dd llm obs request ob
* working dd llm obs
* datadog use well defined type
* clean up
* unit test test_create_llm_obs_payload
* fix linting
* add datadog_llm_observability
* add datadog_llm_observability
* docs DD LLM obs
* run testing again
* document DD_ENV
* test_create_llm_obs_payload
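A minimal, hedged sketch of enabling the `datadog_llm_observability` callback added here. DD_ENV is documented in the commit itself; DD_API_KEY/DD_SITE are standard Datadog settings but assumptions in this context, as is passing the callback name as a string in the SDK:

```python
# Sketch: enable the Datadog LLM Observability callback from this commit.
# Env-var values and the model name are placeholders.
import os
import litellm

os.environ["DD_API_KEY"] = "your-dd-api-key"  # placeholder (standard DD var)
os.environ["DD_SITE"] = "us5.datadoghq.com"   # placeholder (standard DD var)
os.environ["DD_ENV"] = "dev"                  # tags payloads; documented here

litellm.callbacks = ["datadog_llm_observability"]  # assumed string registration

resp = litellm.completion(
    model="gpt-4o-mini",  # placeholder model
    messages=[{"role": "user", "content": "hi"}],
)
```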
* fix(azure.py): handle /openai/deployment in azure api base
* fix(factory.py): fix faulty anthropic tool result translation check. Fixes BerriAI#6422
* fix(gpt_transformation.py): add support for parallel_tool_calls to azure. Fixes BerriAI#6440
* fix(factory.py): support anthropic prompt caching for tool results
* fix(vertex_ai/common_utils): don't pop non-null required field. Fixes BerriAI#6426
* feat(vertex_ai.py): support code_execution tool call for vertex ai + gemini. Closes BerriAI#6434
* build(model_prices_and_context_window.json): Add "supports_assistant_prefill" for bedrock claude-3-5-sonnet v2 models. Closes BerriAI#6437
* fix(types/utils.py): fix linting
* test: update test to include required fields
* test: fix test
* test: handle flaky test
* test: remove e2e test - hitting gemini rate limits
* docs(exception_mapping.md): add missing exception types. Fixes Aider-AI/aider#2120 (comment)
* fix(main.py): register custom model pricing with specific key. Ensure custom model pricing is registered to the specific model+provider key combination
* test: make testing more robust for custom pricing
* fix(redis_cache.py): instrument otel logging for sync redis calls. Ensures complete coverage for all redis cache calls
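The custom-pricing registration fix above, sketched with `litellm.register_model` (litellm's documented way to add pricing entries). The deployment name and cost values are illustrative; the commit's point is that the entry is now keyed to the specific model+provider combination:

```python
# Sketch: register custom pricing keyed to a specific model+provider
# combination, per the fix above. Names and costs are placeholders.
import litellm

litellm.register_model(
    {
        "azure/my-gpt4o-deployment": {
            "litellm_provider": "azure",
            "input_cost_per_token": 0.0000025,
            "output_cost_per_token": 0.00001,
        }
    }
)
# completion_cost should now resolve pricing via the model+provider key.
```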
…used when expected (BerriAI#6471)
* test test_dual_cache_get_set
* unit testing for dual cache
* fix async_set_cache_sadd
* test_dual_cache_local_only
* docs(exception_mapping.md): add missing exception types. Fixes Aider-AI/aider#2120 (comment)
* fix(main.py): register custom model pricing with specific key. Ensure custom model pricing is registered to the specific model+provider key combination
* test: make testing more robust for custom pricing
* fix(redis_cache.py): instrument otel logging for sync redis calls. Ensures complete coverage for all redis cache calls
* refactor: pass parent_otel_span for redis caching calls in router. Allows for more observability into what calls are causing latency issues
* test: update tests with new params
* refactor: ensure e2e otel tracing for router
* refactor(router.py): add more otel tracing across router. Catch all latency issues for router requests
* fix: fix linting error
* fix(router.py): fix linting error
* fix: fix test
* test: fix tests
* fix(dual_cache.py): pass ttl to redis cache
* fix: fix param
…riAI#6484)
* fix logging DB fails on prometheus
* unit testing log to otel wrapper
* unit testing for service logger + prometheus
* use LATENCY buckets for service logging
* fix service logging
The pull request #8 has too many files changed.
We can only review pull requests with up to 300 changed files, and this pull request has 1040.
Caution: Review failed. The pull request is closed.

Walkthrough

This pull request introduces comprehensive updates across the LiteLLM project, focusing on documentation enhancements, configuration improvements, and expanded support for various AI providers and observability tools. Key changes include updates to CircleCI configuration, documentation restructuring, new integration support for platforms like Langtrace and Literal AI, and modifications to caching, pass-through endpoints, and provider-specific implementations.
Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant LiteLLMProxy
    participant AIProvider
    participant ObservabilityTool
    User->>LiteLLMProxy: Make API Request
    LiteLLMProxy->>AIProvider: Forward Request
    AIProvider-->>LiteLLMProxy: Return Response
    LiteLLMProxy->>ObservabilityTool: Log Telemetry
    LiteLLMProxy-->>User: Return Response
```
User description
Title
Relevant issues
Type
🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test
Changes
[REQUIRED] Testing - Attach a screenshot of any new tests passing locally
If UI changes, send a screenshot/GIF of working UI fixes
PR Type
Enhancement, Tests, Other
Description
* Introduced extensive test cases for Bedrock completion functionality, embedding, and error handling scenarios.
* Added a new handler for Vertex AI embeddings with synchronous and asynchronous methods, including logging and error handling.
* Enhanced logging system with dynamic callbacks, payload standardization, and new integrations.
* Expanded proxy types with support for user roles, routes, and enhanced metadata handling.
* Marked DynamoDB wrapper as deprecated and refactored imports for better organization.
* Added a placeholder module for Azure AI rerank transformations.
* Adjusted import paths in load tests to align with the new module structure.
* Enhanced exception handling and logging for LLM API integrations.
* Deleted numerous outdated files and modules, including caching, integrations, and experimental files.
Changes walkthrough 📝

2 files:

* test_bedrock_completion.py: Comprehensive test suite for Bedrock completion and embedding.
  tests/local_testing/test_bedrock_completion.py
  * …configurations.
  * …scenarios.
  * …streaming responses.
* test_embedding.py: Expanded test coverage for embeddings including image and caching.
  tests/local_testing/test_embedding.py
  * …and providers.
  * …headers.
  * …modes.

7 files:

* dynamo_db.py: Deprecation notice and refactoring for DynamoDB wrapper.
  litellm/proxy/db/dynamo_db.py
  * …set_env_vars_based_on_arn method for improved clarity.
* embedding_handler.py: New Vertex AI embedding handler with async support.
  litellm/llms/vertex_ai_and_google_ai_studio/vertex_embeddings/embedding_handler.py
  * …asynchronous methods.
  * …operations.
* litellm_logging.py: Enhanced logging system with dynamic callbacks and payload standardization.
  litellm/litellm_core_utils/litellm_logging.py
  * …DynamicLoggingCache class to manage logging client initialization and prevent memory leaks.
  * …Logging class with dynamic callback processing and standardized logging payloads.
  * …for callback functions.
  * …functions.
* _types.py: Expanded proxy types with enhanced roles, routes, and metadata handling.
  litellm/proxy/_types.py
  * …callback validation.
  * …authentication fields.
  * …organization management.
  * …configuration and logging.
* transformation.py: Introduced transformation module for Azure AI rerank.
  litellm/llms/azure_ai/rerank/transformation.py
  * …/rerank
  * …formats.
* exception_mapping_utils.py: Enhanced exception handling and logging for LLM API integrations.
  litellm/litellm_core_utils/exception_mapping_utils.py
  * …_get_response_headers function to extract headers from exceptions.
  * …exception_type function for detailed exception mapping across various LLM providers.
  * …exception_logging for enhanced debugging and logging of exceptions.
  * …_add_key_name_and_team_to_alert helper for metadata enrichment in error messages.
* __init__.py: Initialized Bedrock LLM chat module with handlers.
  litellm/llms/bedrock/chat/__init__.py
  * …BedrockConverseLLM and BedrockLLM handlers.

1 file:

* test_loadtest_router_withs3_cache.py: Adjusted import path for Cache module in load test.
  cookbook/litellm_router_load_test/test_loadtest_router_withs3_cache.py
  * …Cache to align with new module structure.

101 files
Summary by CodeRabbit

Based on the comprehensive summary of changes, here are the release notes:

Release Notes v1.50.2
* New Features
* Improvements
* Bug Fixes
* Documentation