litellm_2024_10_29 #8

Merged
merged 1,956 commits into khulnasoft:main
Jan 8, 2025

Conversation

@FortiShield FortiShield commented Jan 8, 2025

User description

Title

Relevant issues

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes

[REQUIRED] Testing - Attach a screenshot of any new tests passing locally

If UI changes, send a screenshot/GIF of working UI fixes


PR Type

Enhancement, Tests, Other


Description

  • Introduced extensive test cases for Bedrock completion functionality, embedding, and error handling scenarios.

  • Added a new handler for Vertex AI embeddings with synchronous and asynchronous methods, including logging and error handling.

  • Enhanced logging system with dynamic callbacks, payload standardization, and new integrations.

  • Expanded proxy types with support for user roles, routes, and enhanced metadata handling.

  • Marked DynamoDB wrapper as deprecated and refactored imports for better organization.

  • Added a placeholder module for Azure AI rerank transformations.

  • Adjusted import paths in load tests to align with the new module structure.

  • Enhanced exception handling and logging for LLM API integrations.

  • Deleted numerous outdated files and modules, including caching, integrations, and experimental files.


Changes walkthrough 📝

Relevant files

Tests (2 files)

test_bedrock_completion.py: Comprehensive test suite for Bedrock completion and embedding.
tests/local_testing/test_bedrock_completion.py

  • Introduced extensive test cases for Bedrock completion functionality.
  • Added parameterized tests for various models, streaming, and guardrail configurations.
  • Implemented tests for tool usage, embedding, and error handling scenarios.
  • Included utility functions for encoding images and processing streaming responses.
  +1920/-0
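
A minimal sketch of the kind of parameterized Bedrock completion test described above, assuming AWS credentials are configured in the environment; the model IDs and assertions are illustrative, not the PR's actual tests:

    import pytest
    import litellm

    @pytest.mark.parametrize(
        "model",
        [
            "bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
            "bedrock/amazon.titan-text-express-v1",
        ],
    )
    def test_bedrock_completion_smoke(model):
        # Assumes AWS credentials/region are configured in the environment.
        response = litellm.completion(
            model=model,
            messages=[{"role": "user", "content": "Hello, how are you?"}],
            max_tokens=10,
        )
        # litellm normalizes provider responses to the OpenAI format.
        assert response.choices[0].message.content is not None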

test_embedding.py: Expanded test coverage for embeddings, including image embeddings and caching.
tests/local_testing/test_embedding.py

  • Added new test cases for embedding functionalities with various models and providers.
  • Introduced support for testing image embeddings and rate-limit headers.
  • Enhanced existing tests with parameterized inputs and asynchronous modes.
  • Added caching tests for Bedrock embeddings.
  +334/-41
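
A similarly minimal embedding test sketch in the spirit of the cases listed above; the Bedrock model name is illustrative and credentials are assumed to be configured:

    import litellm

    def test_bedrock_embedding_smoke():
        # Assumes AWS credentials/region are configured in the environment.
        response = litellm.embedding(
            model="bedrock/amazon.titan-embed-text-v1",
            input=["good morning from litellm"],
        )
        # Embedding responses follow the OpenAI-style data[i]["embedding"] shape.
        assert len(response.data) > 0
        assert isinstance(response.data[0]["embedding"], list)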

Enhancement (7 files)

dynamo_db.py: Deprecation notice and refactoring for the DynamoDB wrapper.
litellm/proxy/db/dynamo_db.py

  • Marked the file as deprecated with a note about PostgreSQL support.
  • Refactored imports for better organization and removed unused imports.
  • Updated the set_env_vars_based_on_arn method for improved clarity.
  +22/-314

embedding_handler.py: New Vertex AI embedding handler with async support.
litellm/llms/vertex_ai_and_google_ai_studio/vertex_embeddings/embedding_handler.py

  • Added a new handler for Vertex AI embeddings with synchronous and asynchronous methods.
  • Implemented request transformation and response handling for embedding operations.
  • Integrated logging for pre-call and post-call operations.
  • Included error handling for HTTP and timeout exceptions.
  +236/-0
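
This is not the actual litellm implementation, but a rough sketch of the sync/async handler pattern described above, using httpx and illustrative names:

    import httpx

    class VertexEmbeddingHandlerSketch:
        """Illustrative only: shows the shape of a sync + async embedding handler."""

        def __init__(self, timeout: float = 600.0) -> None:
            self.timeout = timeout

        def embedding(self, url: str, payload: dict, headers: dict) -> dict:
            # pre-call logging would run here
            try:
                response = httpx.post(url, json=payload, headers=headers, timeout=self.timeout)
                response.raise_for_status()
            except httpx.TimeoutException as e:
                raise RuntimeError(f"Vertex AI embedding request timed out: {e}") from e
            except httpx.HTTPStatusError as e:
                raise RuntimeError(f"Vertex AI embedding request failed: {e.response.text}") from e
            # post-call logging and response transformation would run here
            return response.json()

        async def aembedding(self, url: str, payload: dict, headers: dict) -> dict:
            async with httpx.AsyncClient(timeout=self.timeout) as client:
                response = await client.post(url, json=payload, headers=headers)
                response.raise_for_status()
                return response.json()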

litellm_logging.py: Enhanced logging system with dynamic callbacks and payload standardization.
litellm/litellm_core_utils/litellm_logging.py

  • Introduced a DynamicLoggingCache class to manage logging client initialization and prevent memory leaks.
  • Enhanced the Logging class with dynamic callback processing and standardized logging payloads.
  • Added support for new logging integrations and improved error handling for callback functions.
  • Refactored and standardized logging payload generation with helper functions.
  +1193/-512
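
A hypothetical usage sketch of the DynamicLoggingCache described above (the class body itself is reproduced in the reviewer guide further down); the credential keys are placeholders and the import path is assumed to match the file listed here:

    from litellm.litellm_core_utils.litellm_logging import DynamicLoggingCache

    credentials = {"langfuse_public_key": "pk-example", "langfuse_secret": "sk-example"}
    cache = DynamicLoggingCache()

    client = cache.get_cache(credentials=credentials, service_name="langfuse")
    if client is None:
        client = object()  # stand-in for a real per-credential logging client
        cache.set_cache(credentials=credentials, service_name="langfuse", logging_obj=client)

    # Subsequent requests with the same credentials reuse the cached client
    # instead of constructing a new one per call.
    assert cache.get_cache(credentials=credentials, service_name="langfuse") is client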

_types.py: Expanded proxy types with enhanced roles, routes, and metadata handling.
litellm/proxy/_types.py

  • Added new enums and classes for user roles, routes, and logging callback validation.
  • Introduced support for pass-through endpoints and enhanced JWT authentication fields.
  • Refactored and expanded request/response models for team and organization management.
  • Improved metadata handling and added new utility classes for configuration and logging.
  +445/-124
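
Illustrative only (not the actual litellm definitions): a small sketch of how role enums and route groupings of the kind listed above might be modelled with enum and pydantic:

    import enum
    from typing import List

    from pydantic import BaseModel

    class UserRole(str, enum.Enum):
        PROXY_ADMIN = "proxy_admin"
        INTERNAL_USER = "internal_user"
        TEAM = "team"
        CUSTOMER = "customer"

    class RouteGroup(BaseModel):
        name: str
        routes: List[str]
        allowed_roles: List[UserRole]

    management_routes = RouteGroup(
        name="management",
        routes=["/key/generate", "/team/new"],
        allowed_roles=[UserRole.PROXY_ADMIN],
    )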

transformation.py: Introduced transformation module for Azure AI rerank.
litellm/llms/azure_ai/rerank/transformation.py

  • Added a placeholder file for translating between Cohere and Azure AI /rerank formats.
  +3/-0

exception_mapping_utils.py: Enhanced exception handling and logging for LLM API integrations.
litellm/litellm_core_utils/exception_mapping_utils.py

  • Introduced a _get_response_headers function to extract headers from exceptions.
  • Added an exception_type function for detailed exception mapping across various LLM providers.
  • Implemented exception_logging for enhanced debugging and logging of exceptions.
  • Added an _add_key_name_and_team_to_alert helper for metadata enrichment in error messages.
  +2144/-1
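
A minimal sketch of what a header-extraction helper like _get_response_headers might look like; this is an assumption about the approach, not the code in this PR:

    from typing import Optional

    import httpx

    def get_response_headers_sketch(exception: Exception) -> Optional[httpx.Headers]:
        # Many provider SDK errors keep the original HTTP response on the exception;
        # if present, surface its headers (e.g. rate-limit headers) to the caller.
        response = getattr(exception, "response", None)
        if response is not None and hasattr(response, "headers"):
            return response.headers
        return None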

__init__.py: Initialized the Bedrock LLM chat module with handlers.
litellm/llms/bedrock/chat/__init__.py

  • Added imports for the BedrockConverseLLM and BedrockLLM handlers.
  +2/-0

Miscellaneous (1 file)

test_loadtest_router_withs3_cache.py: Adjusted import path for the Cache module in the load test.
cookbook/litellm_router_load_test/test_loadtest_router_withs3_cache.py

  • Updated the import path for Cache to align with the new module structure.
  +1/-1

Additional files (101 files)

    config.yml +675/-30
    requirements.txt +1/-1     
    devcontainer.json +2/-1     
    .dockerignore +6/-0     
    ghcr_deploy.yml +54/-5   
    ghcr_helm_deploy.yml +4/-1     
    .pre-commit-config.yaml +8/-3     
    Dockerfile +9/-6     
    check_file_length.py [link]   
    codecov.yaml +32/-0   
    LiteLLM_HuggingFace.ipynb +148/-634
    grafana_dashboard.json +807/-0 
    readme.md +10/-2   
    test_loadtest_openai_client.py +1/-1     
    clickhouse.py +0/-72   
    clickhouse_insert_logs.py +0/-39   
    mlflow_langchain_tracing_litellm_proxy.ipynb +312/-0 
    create_views.py +212/-0 
    update_unassigned_teams.py +27/-0   
    Dockerfile.ghcr_base +2/-2     
    Chart.yaml +2/-2     
    README.md +34/-3   
    deployment.yaml +13/-68 
    values.yaml +9/-7     
    docker-compose.yml +8/-0     
    Dockerfile.alpine +5/-2     
    Dockerfile.custom_ui +2/-2     
    Dockerfile.database +9/-6     
    Dockerfile.non_root +84/-0   
    build_admin_ui.sh [link]   
    entrypoint.sh [link]   
    assistants.md +33/-0   
    batches.md +181/-0 
    all_caches.md +74/-10 
    caching_api.md +7/-4     
    local_caching.md +4/-4     
    audio.md +316/-0 
    input.md +29/-24 
    json_mode.md +8/-2     
    prefix.md +119/-0 
    prompt_caching.md +502/-0 
    stream.md +73/-1   
    usage.md +51/-0   
    vision.md +145/-0 
    data_security.md +84/-3   
    async_embedding.md +1/-1     
    moderation.md +1/-1     
    supported_embedding.md +54/-0   
    enterprise.md +14/-7   
    exception_mapping.md +27/-12 
    code_quality.md +12/-0   
    index.md +50/-7   
    load_test.md +2/-511 
    load_test_advanced.md +209/-0 
    load_test_rpm.md +348/-0 
    load_test_sdk.md +87/-0   
    argilla.md +67/-0   
    arize_integration.md +3/-1     
    braintrust.md +1/-1     
    callbacks.md +1/-0     
    gcs_bucket_integration.md +3/-47   
    helicone_integration.md +1/-1     
    langfuse_integration.md +43/-28 
    langsmith_integration.md +13/-5   
    langtrace_integration.md +63/-0   
    literalai_integration.md +122/-0 
    logfire_integration.md +1/-1     
    opentelemetry_integration.md +78/-0   
    opik_integration.md +95/-0   
    sentry.md +1/-1     
    traceloop_integration.md +0/-36   
    oidc.md +41/-1   
    old_guardrails.md +355/-0 
    bedrock.md +293/-0 
    cohere.md +253/-0 
    google_ai_studio.md +223/-0 
    langfuse.md +132/-0 
    vertex_ai.md +859/-0 
    dbally.md +3/-0     
    prompt_injection.md +1/-86   
    ai21.md +187/-5 
    anthropic.md +355/-11
    aws_sagemaker.md +402/-12
    azure.md +357/-46
    azure_ai.md +89/-3   
    bedrock.md +312/-16
    cerebras.md +145/-0 
    cohere.md +89/-4   
    custom_llm_server.md +248/-3 
    fireworks_ai.md +19/-1   
    gemini.md +310/-4 
    github.md +0/-1     
    groq.md +0/-1     
    huggingface.md +223/-246
    jina_ai.md +24/-0   
    litellm_proxy.md +89/-0   
    nvidia_nim.md +93/-0   
    openai.md +48/-1   
    openai_compatible.md +1/-1     
    palm.md +6/-0     
    Additional files not shown

    💡 PR-Agent usage: Comment /help "your question" on any pull request to receive relevant information

    Summary by CodeRabbit

    Based on the comprehensive summary of changes, here are the release notes:

    Release Notes v1.50.2

    New Features

    • Added support for image embeddings across multiple providers
    • Introduced Langtrace AI integration for logging and evaluation
    • Enhanced Literal AI integration with multi-step trace support
    • Added pass-through endpoints for Bedrock, Cohere, Google AI Studio, Langfuse, and Vertex AI
    • Expanded Anthropic support with prompt caching and function/tool calling
    • Added OpenTelemetry integration for tracing

    Improvements

    • Updated documentation for multiple providers (AI21, AWS SageMaker, etc.)
    • Enhanced error handling and exception mapping
    • Improved caching mechanisms across different models
    • Added support for more embedding models and providers
    • Expanded observability integrations (Argilla, Opik, Logfire)

    Bug Fixes

    • Refined import paths for various modules
    • Updated configuration management for proxy and SDK
    • Improved error reporting and logging mechanisms

    Documentation

    • Comprehensive updates to provider-specific documentation
    • Added detailed integration guides for observability tools
    • Improved quick start and usage instructions

    ishaan-jaff and others added 30 commits October 4, 2024 16:56
    * docs 1k rps load test
    
    * docs load testing
    
    * docs load testing litellm
    
    * docs load testing
    
    * clean up load test doc
    
    * docs prom metrics for load testing
    
    * docs using prometheus on load testing
    
    * doc load testing with prometheus
    * fixes for required values for gcs bucket
    
    * docs gcs bucket logging
    * add yaml with all router settings
    
    * add docs for router settings
    
    * docs router settings litellm settings
    * add prompt caching for latest models
    
    * add cache_read_input_token_cost for prompt caching models
    * fix(litellm_logging.py): ensure cache hits are scrubbed if "turn_off_message_logging" is enabled
    
    * fix(sagemaker.py): fix streaming to raise error immediately
    
    Fixes BerriAI#6054
    
    * (fixes)  gcs bucket key based logging  (BerriAI#6044)
    
    * fixes for gcs bucket logging
    
    * fix StandardCallbackDynamicParams
    
    * fix - gcs logging when payload is not serializable
    
    * add test_add_callback_via_key_litellm_pre_call_utils_gcs_bucket
    
    * working success callbacks
    
    * linting fixes
    
    * fix linting error
    
    * add type hints to functions
    
    * fixes for dynamic success and failure logging
    
    * fix for test_async_chat_openai_stream
    
    * fix handle case when key based logging vars are set as os.environ/ vars
    
    * fix prometheus track cooldown events on custom logger (BerriAI#6060)
    
    * (docs) add 1k rps load test doc  (BerriAI#6059)
    
    * docs 1k rps load test
    
    * docs load testing
    
    * docs load testing litellm
    
    * docs load testing
    
    * clean up load test doc
    
    * docs prom metrics for load testing
    
    * docs using prometheus on load testing
    
    * doc load testing with prometheus
    
    * (fixes) docs + qa - gcs key based logging  (BerriAI#6061)
    
    * fixes for required values for gcs bucket
    
    * docs gcs bucket logging
    
    * bump: version 1.48.12 → 1.48.13
    
    * ci/cd run again
    
    * bump: version 1.48.13 → 1.48.14
    
    * update load test doc
    
    * (docs) router settings - on litellm config  (BerriAI#6037)
    
    * add yaml with all router settings
    
    * add docs for router settings
    
    * docs router settings litellm settings
    
    * (feat)  OpenAI prompt caching models to model cost map (BerriAI#6063)
    
    * add prompt caching for latest models
    
    * add cache_read_input_token_cost for prompt caching models
    
    * fix(litellm_logging.py): check if param is iterable
    
    Fixes BerriAI#6025 (comment)
    
    * fix(factory.py): support passing an "assistant_continue_message" to prevent bedrock error
    
    Fixes BerriAI#6053
    
    * fix(databricks/chat): handle streaming responses
    
    * fix(factory.py): fix linting error
    
    * fix(utils.py): unify anthropic + deepseek prompt caching information to openai format
    
    Fixes BerriAI#6069
    
    * test: fix test
    
    * fix(types/utils.py): support all openai roles
    
    Fixes BerriAI#6052
    
    * test: fix test
    
    ---------
    
    Co-authored-by: Ishaan Jaff <[email protected]>
    * add /key/health endpoint
    
    * add /key/health endpoint
    
    * fix return from /key/health
    
    * update doc string
    
    * fix doc string for /key/health
    
    * add test for /key/health
    
    * fix linting
    
    * docs /key/health
    * add cache_read_input_token_cost for prompt caching models
    
    * add prompt caching for latest models
    
    * add openai cost calculator
    
    * add openai prompt caching test
    
    * fix lint check
    
    * add not on how usage._cache_read_input_tokens is used
    
    * fix cost calc whisper openai
    
    * use output_cost_per_second
    
    * add input_cost_per_second
    * add azure o1 models to model cost map
    
    * add azure o1 cost tracking
    
    * fix azure cost calc
    
    * add get llm provider test
    …BerriAI#6079)
    
    In model_prices_and_context_window.json, openrouter/* models all have litellm_provider set as "openrouter", except for four openrouter/openai/* models, which were set to "openai".
    I suppose they must be set to "openrouter", so one can know it should use the openrouter API for this model.
    * fix: fix type-checking errors
    
    * fix: fix additional type-checking errors
    
    * fix: additional type-checking error fixes
    
    * fix: fix additional type-checking errors
    
    * fix: additional type-check fixes
    
    * fix: fix all type-checking errors + add pyright to ci/cd
    
    * fix: fix incorrect import
    
    * ci(config.yml): use mypy on ci/cd
    
    * fix: fix type-checking errors in utils.py
    
    * fix: fix all type-checking errors on main.py
    
    * fix: fix mypy linting errors
    
    * fix(anthropic/cost_calculator.py): fix linting errors
    
    * fix: fix mypy linting errors
    
    * fix: fix linting errors
    * docs(prompt_caching.md): add prompt caching cost calc example to docs
    
    * docs(prompt_caching.md): add proxy examples to docs
    
    * feat(utils.py): expose new helper `supports_prompt_caching()` to check if a model supports prompt caching
    
    * docs(prompt_caching.md): add docs on checking model support for prompt caching
    
    * build: fix invalid json
    * fix: enable new "disable_prisma_schema_update" flag
    
    * build(config.yml): remove setup remote docker step
    
    * ci(config.yml): give container time to start up
    
    * ci(config.yml): update test
    
    * build(config.yml): actually start docker
    
    * build(config.yml): simplify grep check
    
    * fix(prisma_client.py): support reading disable_schema_update via env vars
    
    * ci(config.yml): add test to check if all general settings are documented
    
    * build(test_General_settings.py): check available dir
    
    * ci: check ../ repo path
    
    * build: check ./
    
    * build: fix test
    …BerriAI#6071)
    
    * fix(utils.py): fix  fix pydantic obj to schema creation for vertex endpoints
    
    Fixes BerriAI#6027
    
    * test(test_completion.pyu): skip test - avoid hitting gemini rate limits
    
    * fix(common_utils.py): fix ruff linting error
    ishaan-jaff and others added 24 commits October 25, 2024 16:50
    Code cov - add checks for patch and overall repo
    …ser_to_regen_tokens
    
    (admin ui / auth fix) Allow internal user to call /key/{token}/regenerate
    * fix(utils.py): support passing dynamic api base to validate_environment
    
    Returns True if just api base is required and api base is passed
    
    * fix(litellm_pre_call_utils.py): feature flag sending client headers to llm api
    
    Fixes BerriAI#6410
    
    * fix(anthropic/chat/transformation.py): return correct error message
    
    * fix(http_handler.py): add error response text in places where we expect it
    
    * fix(factory.py): handle base case of no non-system messages to bedrock
    
    Fixes BerriAI#6411
    
    * feat(cohere/embed): Support cohere image embeddings
    
    Closes BerriAI#6413
    
    * fix(__init__.py): fix linting error
    
    * docs(supported_embedding.md): add image embedding example to docs
    
    * feat(cohere/embed): use cohere embedding returned usage for cost calc
    
    * build(model_prices_and_context_window.json): add embed-english-v3.0 details (image cost + "supports_image_input" flag)
    
    * fix(cohere_transformation.py): fix linting error
    
    * test(test_proxy_server.py): cleanup test
    
    * test: cleanup test
    
    * fix: fix linting errors
    (proxy audit logs) fix serialization error on audit logs
    * add /user/delete call
    
    * ui show modal asking if you want to delete user
    
    * fix delete user modal
    * testing for failure events prometheus
    
    * set set_llm_deployment_failure_metrics
    
    * test_async_post_call_failure_hook
    
    * unit testing for all prometheus functions
    
    * fix linting
    BerriAI#6449)
    
    * add type for dd llm obs request ob
    
    * working dd llm obs
    
    * datadog use well defined type
    
    * clean up
    
    * unit test test_create_llm_obs_payload
    
    * fix linting
    
    * add datadog_llm_observability
    
    * add datadog_llm_observability
    
    * docs DD LLM obs
    
    * run testing again
    
    * document DD_ENV
    
    * test_create_llm_obs_payload
    * fix(azure.py): handle /openai/deployment in azure api base
    
    * fix(factory.py): fix faulty anthropic tool result translation check
    
    Fixes BerriAI#6422
    
    * fix(gpt_transformation.py): add support for parallel_tool_calls to azure
    
    Fixes BerriAI#6440
    
    * fix(factory.py): support anthropic prompt caching for tool results
    
    * fix(vertex_ai/common_utils): don't pop non-null required field
    
    Fixes BerriAI#6426
    
    * feat(vertex_ai.py): support code_execution tool call for vertex ai + gemini
    
    Closes BerriAI#6434
    
    * build(model_prices_and_context_window.json): Add "supports_assistant_prefill" for bedrock claude-3-5-sonnet v2 models
    
    Closes BerriAI#6437
    
    * fix(types/utils.py): fix linting
    
    * test: update test to include required fields
    
    * test: fix test
    
    * test: handle flaky test
    
    * test: remove e2e test - hitting gemini rate limits
    * docs(exception_mapping.md): add missing exception types
    
    Fixes Aider-AI/aider#2120 (comment)
    
    * fix(main.py): register custom model pricing with specific key
    
    Ensure custom model pricing is registered to the specific model+provider key combination
    
    * test: make testing more robust for custom pricing
    
    * fix(redis_cache.py): instrument otel logging for sync redis calls
    
    ensures complete coverage for all redis cache calls
    …used when expected (BerriAI#6471)
    
    * test test_dual_cache_get_set
    
    * unit testing for dual cache
    
    * fix async_set_cache_sadd
    
    * test_dual_cache_local_only
    * docs(exception_mapping.md): add missing exception types
    
    Fixes Aider-AI/aider#2120 (comment)
    
    * fix(main.py): register custom model pricing with specific key
    
    Ensure custom model pricing is registered to the specific model+provider key combination
    
    * test: make testing more robust for custom pricing
    
    * fix(redis_cache.py): instrument otel logging for sync redis calls
    
    ensures complete coverage for all redis cache calls
    
    * refactor: pass parent_otel_span for redis caching calls in router
    
    allows for more observability into what calls are causing latency issues
    
    * test: update tests with new params
    
    * refactor: ensure e2e otel tracing for router
    
    * refactor(router.py): add more otel tracing acrosss router
    
    catch all latency issues for router requests
    
    * fix: fix linting error
    
    * fix(router.py): fix linting error
    
    * fix: fix test
    
    * test: fix tests
    
    * fix(dual_cache.py): pass ttl to redis cache
    
    * fix: fix param
    …riAI#6484)
    
    * fix logging DB fails on prometheus
    
    * unit testing log to otel wrapper
    
    * unit testing for service logger + prometheus
    
    * use LATENCY buckets for service logging
    
    * fix service logging

    @sourcery-ai sourcery-ai bot left a comment

    The pull request #8 has too many files changed.

    We can only review pull requests with up to 300 changed files, and this pull request has 1040.


    coderabbitai bot commented Jan 8, 2025

    Caution

    Review failed

    The pull request is closed.

    Walkthrough

    This pull request introduces comprehensive updates across the LiteLLM project, focusing on documentation enhancements, configuration improvements, and expanded support for various AI providers and observability tools. Key changes include updates to CircleCI configuration, documentation restructuring, new integration support for platforms like Langtrace and Literal AI, and modifications to caching, pass-through endpoints, and provider-specific implementations.

    Changes

    File/Directory Change Summary
    .circleci/config.yml Downgraded CircleCI version, added new jobs for testing, modified dependency management
    .circleci/requirements.txt Updated OpenAI package version from 1.34.0 to 1.52.0
    .devcontainer/devcontainer.json Added new VS Code extension for autopep8
    .dockerignore Added entries for files and directories to be ignored during Docker builds
    .github/workflows/ghcr_deploy.yml Added new job for building and pushing Docker images, updated existing jobs
    .github/workflows/ghcr_helm_deploy.yml Introduced a linting step for Helm charts
    .gitignore Added entry to ignore specific files
    .pre-commit-config.yaml Replaced mypy with pyright for type checking
    Dockerfile Modified paths and specified package versions
    codecov.yaml Enhanced configuration for code coverage tracking
    cookbook/LiteLLM_HuggingFace.ipynb Updated documentation and code for Hugging Face integration
    docs/my-website/docs/ Extensive documentation updates across multiple files, including new integrations, provider support, and observability tools
    deploy/charts/litellm-helm/ Updated Helm chart version and application version
    docker/ Various Dockerfile modifications, including path updates and dependency management

    Sequence Diagram

    sequenceDiagram
        participant User
        participant LiteLLMProxy
        participant AIProvider
        participant ObservabilityTool
    
        User->>LiteLLMProxy: Make API Request
        LiteLLMProxy->>AIProvider: Forward Request
        AIProvider-->>LiteLLMProxy: Return Response
        LiteLLMProxy->>ObservabilityTool: Log Telemetry
        LiteLLMProxy-->>User: Return Response
    

    Poem

    🐰 A Rabbit's Ode to LiteLLM's Grand Update

    Configs dance, docs unfurl with grace,
    Integrations bloom in every space,
    CircleCI spins its testing wheel,
    Observability reveals its zeal,
    LiteLLM hops forward, innovation's trait! 🚀



    @khulnasoft-bot khulnasoft-bot merged commit 4dc97a9 into khulnasoft:main Jan 8, 2025
    0 of 2 checks passed

    PR Reviewer Guide 🔍

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 5 🔵🔵🔵🔵🔵
    🧪 PR contains tests
    🔒 Security concerns

    Sensitive information exposure:
    The code handles sensitive data like API keys and authentication tokens. While there are some redaction mechanisms in place (e.g., scrub_sensitive_keys_in_metadata), the extensive logging functionality increases the risk of accidentally exposing sensitive information. Additional validation and redaction should be added, particularly around the metadata handling and logging callbacks.

    ⚡ Recommended focus areas for review

    Code Complexity

    The logging module has been significantly refactored with complex logic for handling different logging scenarios and callbacks. The complexity increases risk of bugs and makes maintenance harder. Consider breaking down into smaller, more focused classes/methods.

    Error Handling

    Multiple catch-all exception blocks that swallow errors with only logging. This could hide important errors. Consider more specific exception handling and propagating critical errors.

    messages=self.messages,
    end_user=self.model_call_details.get("user", "default"),
    response_obj=result,
    start_time=start_time,
    end_time=end_time,

    Memory Management

    The DynamicLoggingCache implementation uses an in-memory cache without size limits or eviction policies. This could lead to memory leaks in high-volume scenarios.

    class DynamicLoggingCache:
        """
        Prevent memory leaks caused by initializing new logging clients on each request.
    
        Relevant Issue: https://github.com/BerriAI/litellm/issues/5695
        """
    
        def __init__(self) -> None:
            self.cache = InMemoryCache()
    
        def get_cache_key(self, args: dict) -> str:
            args_str = json.dumps(args, sort_keys=True)
            cache_key = hashlib.sha256(args_str.encode("utf-8")).hexdigest()
            return cache_key
    
        def get_cache(self, credentials: dict, service_name: str) -> Optional[Any]:
            key_name = self.get_cache_key(
                args={**credentials, "service_name": service_name}
            )
            response = self.cache.get_cache(key=key_name)
            return response
    
        def set_cache(self, credentials: dict, service_name: str, logging_obj: Any) -> None:
            key_name = self.get_cache_key(
                args={**credentials, "service_name": service_name}
            )
            self.cache.set_cache(key=key_name, value=logging_obj)
            return None
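
    If that concern were to be addressed, one possible direction (a sketch, not part of this PR) is a size-bounded cache with least-recently-used eviction:

    from collections import OrderedDict
    from typing import Any, Optional

    class BoundedLoggingCache:
        """Hypothetical size-limited variant: evicts the least-recently-used entry."""

        def __init__(self, max_size: int = 100) -> None:
            self.max_size = max_size
            self._cache: "OrderedDict[str, Any]" = OrderedDict()

        def get(self, key: str) -> Optional[Any]:
            if key not in self._cache:
                return None
            self._cache.move_to_end(key)  # mark as recently used
            return self._cache[key]

        def set(self, key: str, value: Any) -> None:
            self._cache[key] = value
            self._cache.move_to_end(key)
            if len(self._cache) > self.max_size:
                self._cache.popitem(last=False)  # evict the oldest entry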


    PR Code Suggestions ✨

    Explore these optional code suggestions:

    Category / Suggestion / Score
    General
    Clean up environment variables after tests to prevent test pollution

    Add cleanup of environment variables after tests to avoid test pollution. The
    LITELLM_LOCAL_MODEL_COST_MAP environment variable is set but never cleaned up.

    tests/local_testing/test_completion_cost.py [633-634]

    -os.environ["LITELLM_LOCAL_MODEL_COST_MAP"] = "True"
    -litellm.model_cost = litellm.get_model_cost_map(url="")
    +try:
    +    os.environ["LITELLM_LOCAL_MODEL_COST_MAP"] = "True"
    +    litellm.model_cost = litellm.get_model_cost_map(url="")
    +    # Test code here
    +finally:
    +    del os.environ["LITELLM_LOCAL_MODEL_COST_MAP"]
    Suggestion importance[1-10]: 8

    Why: Environment variable cleanup is crucial for test isolation. Not cleaning up can affect other tests and create hard-to-debug issues.

    Add proper exception handling and logging for file operations

    Handle potential exceptions when loading the anthropic tokenizer JSON file to avoid
    silent failures.

    litellm/utils.py [140-146]

     try:
         with resources.open_text("litellm.llms.tokenizers", "anthropic_tokenizer.json") as f:
             json_data = json.load(f)
         claude_json_str = json.dumps(json_data)
    -except:
    -    claude_json_str=""
    +except Exception as e:
    +    print_verbose(f"Error loading anthropic tokenizer: {str(e)}")
    +    claude_json_str = ""
    Suggestion importance[1-10]: 4

    Why: While the suggestion improves error visibility by logging the specific exception, the existing code already handles the failure case adequately by setting a default empty string.

    Possible issue
    Add defensive null check to prevent potential NullPointerException when accessing object attributes

    Add a null check before accessing the "choices" attribute of the ModelResponse
    object to avoid potential NullPointerException.

    litellm/utils.py [644-648]

     if (
         isinstance(original_response, ModelResponse)
    +    and hasattr(original_response, "choices")
         and len(original_response.choices) > 0
     ):
    Suggestion importance[1-10]: 7

    Why: The suggestion adds a valuable safety check using hasattr() before accessing the "choices" attribute, which could prevent runtime errors if the ModelResponse object is malformed.

    Add validation to handle missing configuration file paths

    Add error handling for the case when config_file_path is None but
    user_config_file_path is also None. Currently this could lead to undefined behavior.

    litellm/proxy/proxy_server.py [1380]

     file_path = config_file_path or user_config_file_path
    +if file_path is None:
    +    raise ValueError("No config file path provided - both config_file_path and user_config_file_path are None")
    Suggestion importance[1-10]: 7

    Why: The suggestion addresses a potential source of runtime errors by validating configuration file paths early. This is important for system stability and better error messaging.

    Add type validation for numeric configuration parameters

    Add type validation for health_check_interval to prevent potential runtime errors if
    an invalid type is provided.

    litellm/proxy/proxy_server.py [1344-1347]

    -if health_check_interval is not None and isinstance(health_check_interval, float):
    -    await asyncio.sleep(health_check_interval)
    +if health_check_interval is not None:
    +    if not isinstance(health_check_interval, (int, float)):
    +        raise TypeError("health_check_interval must be a number")
    +    await asyncio.sleep(float(health_check_interval))
    Suggestion importance[1-10]: 7

    Why: The suggestion enhances type safety by validating health_check_interval parameter and supporting both int and float types, preventing potential runtime errors.

    Add defensive error handling for missing or None response cost values

    Add error handling for edge cases where response_cost is None in
    CustomLoggingHandler. Currently if the cost calculation fails, response_cost remains
    None which could cause issues in dependent code.

    tests/local_testing/test_completion_cost.py [37-38]

     def log_success_event(self, kwargs, response_obj, start_time, end_time):
    -    self.response_cost = kwargs["response_cost"]
    +    self.response_cost = kwargs.get("response_cost", 0.0)
    +    if self.response_cost is None:
    +        self.response_cost = 0.0
    Suggestion importance[1-10]: 7

    Why: The suggestion improves code robustness by safely handling missing or None response_cost values, preventing potential NoneType errors in production code.

    Add input validation to prevent processing of null values

    Add type checking for the "messages" variable to prevent potential TypeError when
    it"s None.

    litellm/utils.py [543]

    -messages = args[1] if len(args) > 1 else kwargs.get("input", None)
    +messages = args[1] if len(args) > 1 else kwargs.get("input")
    +if messages is None:
    +    raise ValueError("'messages' or 'input' parameter is required")
    Suggestion importance[1-10]: 6

    Why: The suggestion adds important input validation that could prevent downstream issues, though the current code already handles None values through the get() method's default parameter.

    Add error handling for subprocess operations

    Add error handling for subprocess.Popen() call to catch and handle potential OS
    errors when starting Ollama server.

    litellm/proxy/proxy_server.py [1310]

    -subprocess.Popen(command, stdout=devnull, stderr=devnull)
    +try:
    +    subprocess.Popen(command, stdout=devnull, stderr=devnull)
    +except OSError as e:
    +    verbose_proxy_logger.error(f"Failed to start Ollama server: {str(e)}")
    +    raise
    Suggestion importance[1-10]: 6

    Why: The suggestion improves error handling for OS-level operations, providing better debugging information when Ollama server fails to start.

