Checksum validation with hf_hub_download on model files. #2364
Comments
Hi @JGSweets, thanks for opening the issue. The 2 PRs you've linked are only related to "downloading to a local directory", not the generic "downloading into the HF cache directory" workflow. If we add such a validation, we would do it for both. The main problem with checking the file integrity after a download is the time it takes to do it:
Is your feature request related to a problem? Please describe.
After reviewing #1738 and #2223, it looks like file checksums are only computed on the cache dir under specific conditions. Ideally, a user could explicitly force a checksum check after download, as well as on retrieval from the cache, to ensure the integrity of the files on every use.
It's possible I misunderstood the code or discussion though.
Describe the solution you'd like
Add an input arg and an environment variable to enforce checksums on the retrieved files for each hf_hub_download call.
Describe alternatives you've considered
Pre-downloading files manually and manually checking file integrity before using the cached files.
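For reference, the manual check described above could look roughly like the sketch below. It uses only the Python standard library and is not part of huggingface_hub. The helper names (sha256_of_file, verify_file) and the MY_APP_VERIFY_CHECKSUMS variable are hypothetical, made up for this illustration, and the expected hash is assumed to come from a source you trust.

```python
import hashlib
import os
from pathlib import Path

# Hypothetical variable name for this sketch; NOT a huggingface_hub setting.
VERIFY_ENV_VAR = "MY_APP_VERIFY_CHECKSUMS"


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path, expected_sha256, enforce=None):
    """Check a file against an expected SHA-256 digest.

    If `enforce` is None, the (hypothetical) environment variable decides.
    Returns True when the file was checked and matched, False when the
    check was skipped, and raises ValueError on a mismatch.
    """
    if enforce is None:
        value = os.environ.get(VERIFY_ENV_VAR, "0")
        enforce = value.lower() in ("1", "true", "yes")
    if not enforce:
        return False
    actual = sha256_of_file(path)
    if actual != expected_sha256.lower():
        raise ValueError(
            f"Checksum mismatch for {Path(path).name}: "
            f"expected {expected_sha256}, got {actual}"
        )
    return True
```

The path passed to verify_file could be the one returned by hf_hub_download(repo_id=..., filename=...). Hashing reads the whole file, so running this on every retrieval from the cache adds time proportional to the file size.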