Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drop setUpClass in reward tester #1895

Merged
merged 5 commits into from
Aug 5, 2024
Merged

Drop setUpClass in reward tester #1895

merged 5 commits into from
Aug 5, 2024

Conversation

qgallouedec
Copy link
Member

@qgallouedec qgallouedec commented Aug 2, 2024

Main change

I realized that the tests interfered with each other because they shared the same model object.

(For a bit of context: a test failed when it was named my_test2, but not when it was named my_test, as the tests are run in alphabetical order).

The solution is to reload the model each time, which since it's cached, isn't an overhead.

Should we propage this change everywhere?

Also

Do not report to wandb: when tests are not run in parallel, wandb instances interfere, and create weird errors, e.g. :

wandb.sdk.lib.config_util.ConfigError: Attempted to change value of key "model/num_parameters" from 1658681 to 1663801

This problem doesn't occur in the CI, probably because they are run in parallel, but locally it does occur.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@qgallouedec qgallouedec requested a review from kashif August 2, 2024 15:23
@qgallouedec qgallouedec merged commit 3320623 into main Aug 5, 2024
4 of 10 checks passed
@qgallouedec qgallouedec deleted the drop_setup_in_test branch August 5, 2024 14:01
qgallouedec added a commit to yumeng5/trl that referenced this pull request Aug 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants