Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: inference engine provider and accessibility #2470

Merged
merged 5 commits into from
Mar 25, 2024

Conversation

louis-jan
Copy link
Contributor

@louis-jan louis-jan commented Mar 22, 2024

Describe Your Changes

This PR lets the frontend control the inference engine directly. It helps manage logic before UX constraints, making the user experience smoother without waiting for events later on.

Currently:
Event-Driven:

graph TD;
    A[Chat Shell] -->|broadcast events| B[Extensions];
    B -->|broadcast events| A;
Loading

Pros:

  • Easily adaptable to different frameworks and languages.
  • Components communicate through events, enabling easier maintenance and updates.
  • Extensions can own their logic controllers, allowing for customization.

Cons:

  • Managing event subscriptions across components can be challenging.
  • Introducing events may complicate state management.
  • Handling events in frontend applications can add complexity.
  • App lacks awareness of available inference engines, potentially resulting in timeouts.
  • Inability to implement pre-processing UX constraints until extension broadcasts event messages.
  • Can't be directly used by other extensions or tools

Updated:
Direct Request (Engine Provider):

graph TD;
    A[Inference Extensions] --> |register engines| CoreSDK
    C[Chat Shell] -->|imports|CoreSDK
    Extensions -->|imports|CoreSDK
    CoreSDK --> |uses| Engines;
    Engines -->|inference| Outputs;
Loading

Pros:

  • Frontend has direct control over the logic flow.
  • Avoids overhead of event handling.
  • Simpler to scale in smaller systems.
  • App now possesses knowledge of all installed inference engines.
  • Easy to add UX constraints before inference.
  • Able to be imported and used from other extensions/tools

Cons:

  • May restrict extension of functionalities.
  • Components are tightly coupled, reducing adaptability.
  • May pose challenges as system complexity grows.

The core package has been restructured to accommodate compatibility with both browser and Node.js environments. It now includes two main directories:

  • browser: Contains modules optimized for web browsers, with the main entry point defined in browser/index.ts.
  • node: Houses modules tailored for Node.js, with the main entry point defined in node/index.ts.

Fixes Issues

Self Checklist

graph TD;
    A[Chat Shell] -->|broadcast events| B[Assistant Extension];
    B -->|broadcast events| C[Inference Extension] -->|broadcast events| A;
Loading

->

graph TD;
    AssistantExtension --> |registers Tools| CoreSDK;
    App --> |imports| CoreSDK;
    Extension --> |imports| CoreSDK;
    CoreSDK --> |uses| Tools;
    Tools --> |inference|Engines
Loading

@louis-jan louis-jan requested a review from namchuai March 22, 2024 15:33
@github-actions github-actions bot added the type: chore Maintenance, integration, packaging related label Mar 22, 2024
@louis-jan louis-jan force-pushed the chore/load-unload-model-sync branch 4 times, most recently from 7c8f7e9 to 7fb5548 Compare March 23, 2024 13:47
@louis-jan louis-jan marked this pull request as ready for review March 23, 2024 13:47
@louis-jan louis-jan requested a review from urmauur March 23, 2024 13:47
@louis-jan louis-jan force-pushed the chore/load-unload-model-sync branch from 7fb5548 to b914301 Compare March 23, 2024 13:48
@louis-jan louis-jan marked this pull request as draft March 23, 2024 13:49
@louis-jan louis-jan force-pushed the chore/load-unload-model-sync branch 3 times, most recently from 767b06d to ec3196d Compare March 24, 2024 06:51
@louis-jan louis-jan mentioned this pull request Mar 24, 2024
28 tasks
@louis-jan louis-jan force-pushed the chore/load-unload-model-sync branch from ec3196d to 24677e4 Compare March 24, 2024 16:16
@louis-jan louis-jan force-pushed the chore/load-unload-model-sync branch 3 times, most recently from ec3196d to 4df4851 Compare March 25, 2024 05:19
@louis-jan louis-jan force-pushed the chore/load-unload-model-sync branch from 4df4851 to 9551996 Compare March 25, 2024 05:25
@louis-jan louis-jan changed the title refactor: load and unload model promises refactor: inference engine provider and accessibility Mar 25, 2024
@louis-jan louis-jan force-pushed the chore/load-unload-model-sync branch from d9c8ff8 to bdd2d09 Compare March 25, 2024 05:33
@louis-jan louis-jan force-pushed the chore/load-unload-model-sync branch 2 times, most recently from beed065 to 26711fd Compare March 25, 2024 06:34
@louis-jan louis-jan force-pushed the chore/load-unload-model-sync branch from 26711fd to 14a6746 Compare March 25, 2024 09:20
@louis-jan louis-jan marked this pull request as ready for review March 25, 2024 12:03
@louis-jan louis-jan merged commit 66f7d3d into dev Mar 25, 2024
4 checks passed
@louis-jan louis-jan deleted the chore/load-unload-model-sync branch March 25, 2024 17:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: chore Maintenance, integration, packaging related
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

2 participants