
🐕 Batch: Refactoring Test workflows in models #1484

Open

DhanshreeA opened this issue Jan 3, 2025 · 3 comments

DhanshreeA commented Jan 3, 2025

Summary

This issue will encompass efforts to reconcile, clean up, and enhance our test (and build) pipelines for individual models.

We currently have a test module and CLI command (ersilia test ...) that can check a given model for functionality, completeness, and correctness. In addition to this, we also have a testing playground - a test utility which checks a given model for functionality, completeness, and correctness; and is able to simulate running one or more models on a user's system.
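
For context, this is roughly how the command would be exercised in a CI step; it is only a sketch, the model id is a placeholder, and the exact arguments accepted by ersilia test should be checked against the ModelTester documentation linked below:

```yaml
# Minimal sketch of a CI step that exercises a single model with the ersilia CLI.
# The model id and any extra flags are placeholders, not a confirmed interface.
- name: Run ersilia test on a single model
  run: |
    ersilia test eos3b5e   # replace with the model id under test
```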

The existing test step in our model pipeline is largely redundant in the face of these functionalities: it is naive by comparison, only checking for null values in model predictions, and it is not robust to the different ways a model might serialize its outputs. Moreover, the Docker build pipelines are bloated with code that can be removed in favor of a single workflow that tests the built images. We also need to handle testing for ARM and AMD builds more intelligently: currently we only test the AMD images, but we have recently seen models build successfully for the ARM platform and then fail to actually work.

Furthermore, we need to revisit H5 serialization within Ersilia, and also include tests for this functionality at the level of testing models.

Each of the objectives below should be considered individual tasks, and should be addressed in separate PRs referencing this issue.

Objective(s)

  • Consolidate the following input-output combinations in the testing scenarios covered by the ersilia test command:
  1. Input = CSV - Output = CSV
  2. Input = CSV - Output = HDF5
  3. Input = CSV - Output = JSON
  4. Input = SMILES - Output = CSV
  5. Input = SMILES - Output = HDF5
  6. Input = SMILES - Output = JSON
  • For the test-model.yml workflow, we should remove the current testing logic (L128-L144) in favor of only using the ersilia test command. We also want to upload the logs and results generated by this command as artifacts with a retention period of 14 days (see the workflow sketch after this list).
  • Similarly, the test-model-pr.yml workflow should only use the ersilia test command. The same conditions apply for handling and uploading the logs and results as artifacts with a retention period of 14 days.
  • Refactor the upload-ersilia-pack.yml and upload-bentoml.yml workflows to only build and publish model images (for both ARM and AMD), i.e. we can remove the testing logic from these workflows. These images should be tagged dev.
  • Refactor the testing playground to work with specific model ids, as well as image tags.
  • Create a new test workflow for Docker builds that is triggered after the Upload model to DockerHub workflow. This workflow should utilise the Testing Playground utility from Ersilia and test the built model image (however it gets built, i.e. using Ersilia Pack or legacy approaches). It should run on a matrix of ubuntu-latest and macos-latest, so that we also test the ARM images (see the second sketch after this list). Based on the results of this workflow, we can tag the images latest and identify which architectures they successfully work on.
  • The Post model upload workflow should run last and update the necessary metadata stores (Airtable, S3 JSON) and the README. At this point we can remove the step that creates testing issues for community members from this workflow.
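
As a reference for the test-model.yml and test-model-pr.yml objectives above, here is a hedged sketch of what the slimmed-down test steps could look like. The file paths, log redirection, and artifact name are assumptions; actions/upload-artifact with retention-days is standard GitHub Actions usage:

```yaml
# Hypothetical replacement for the removed inline testing logic:
# run only the ersilia test command and keep its logs/results as artifacts.
- name: Run ersilia test
  run: |
    set -o pipefail
    # Capture stdout/stderr so the full log can be uploaded afterwards.
    ersilia test "$MODEL_ID" 2>&1 | tee ersilia-test.log
  env:
    MODEL_ID: eos3b5e  # placeholder model id

- name: Upload test logs and results
  if: always()  # upload even if the test step failed
  uses: actions/upload-artifact@v4
  with:
    name: ersilia-test-output
    path: |
      ersilia-test.log
      *.json        # assumed location of the test results file(s)
    retention-days: 14
```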
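For the new post-build test workflow, a sketch of the trigger and OS matrix follows; the playground invocation itself is a placeholder, since its refactored interface (model id plus image tag) is exactly what this issue proposes:

```yaml
# Hypothetical workflow triggered after the image-publishing workflow completes.
name: Test built model images
on:
  workflow_run:
    workflows: ["Upload model to DockerHub"]
    types: [completed]

jobs:
  test-image:
    if: ${{ github.event.workflow_run.conclusion == 'success' }}
    strategy:
      fail-fast: false
      matrix:
        # ubuntu-latest covers AMD64; macos-latest runners are Apple Silicon (ARM).
        os: [ubuntu-latest, macos-latest]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - name: Run the testing playground against the dev image
        run: |
          # Placeholder invocation: the refactored playground is expected to take
          # a model id and an image tag, e.g. `... --model $MODEL_ID --tag dev`.
          echo "test $MODEL_ID:dev on ${{ matrix.os }}"
        env:
          MODEL_ID: eos3b5e  # placeholder model id
```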

Documentation

  1. ModelTester class used in the test CLI command: https://ersilia.gitbook.io/ersilia-book/ersilia-model-hub/developer-docs/model-tester
  2. Testing Playground utility: https://ersilia.gitbook.io/ersilia-book/ersilia-model-hub/developer-docs/testing-playground
DhanshreeA removed the status in Ersilia Model Hub Jan 3, 2025
Abellegese self-assigned this Jan 3, 2025
GemmaTuron changed the title from "🐕 Batch: Refactoring Test workfllows in models" to "🐕 Batch: Refactoring Test workflows in models" Jan 3, 2025
@GemmaTuron (Member) commented:

Hi @Abellegese or @DhanshreeA

Can you clarify whether only the test command needs to be modified (according to point 1), or whether both the test command and the playground will be modified?
The test command only tests the model from source, right? And the only modification we will make to it for now is to test all the different combinations of input and output, which was not happening previously? Once an output is generated, whatever the format, the next step is to check that the output has the required length, is not None, etc.?
More specifically, what are the modifications to be made in the testing playground? Maybe opening one issue with more details for each task would be helpful as those get tackled.
I would also add that documenting in GitBook is an important part of each task.

@Abellegese (Contributor) commented:

Hey @GemmaTuron, our plan is to update both pipelines for this functionality. I am creating one issue for both.

@Abellegese (Contributor) commented:

A few more details about the features have been given here: #1488.
