Support proper serialisation of variational states when running under MPI #1831

Open
wants to merge 1 commit into
base: master

PhilipVinc
Member

When running under MPI, we currently serialise only the sampler_state of the first rank and ignore the ones on the other ranks.
This is wrong: when you load the saved state, every rank gets the same seed and the same chains.
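
To make the problem concrete, here is a minimal sketch that simulates the ranks inside a single process, using Python's standard `random.Random` as a stand-in for the per-rank sampler state. This is not NetKet's code: `N_RANKS`, `make_rngs` and `draw` are hypothetical names, and the real sampler_state holds more than an RNG.

```python
import random

N_RANKS = 4  # hypothetical number of MPI ranks, simulated in one process


def make_rngs(base_seed):
    # One independent generator per simulated rank (stand-in for sampler_state).
    return [random.Random(base_seed + rank) for rank in range(N_RANKS)]


def draw(rngs, n=3):
    # Each simulated rank draws n numbers from its own generator.
    return [[rng.random() for _ in range(n)] for rng in rngs]


rngs = make_rngs(1234)

# Checkpoint: either only rank 0's state, or one state per rank.
rank0_only = rngs[0].getstate()
per_rank = [rng.getstate() for rng in rngs]

# What an uninterrupted run would produce next on each rank.
expected = draw(rngs)

# Restore from rank 0's state only: every rank receives the same state.
restored_bad = [random.Random() for _ in range(N_RANKS)]
for rng in restored_bad:
    rng.setstate(rank0_only)
bad = draw(restored_bad)

# Restore from the per-rank states: each rank gets its own state back.
restored_good = [random.Random() for _ in range(N_RANKS)]
for rng, state in zip(restored_good, per_rank):
    rng.setstate(state)
good = draw(restored_good)

all_ranks_identical = all(row == bad[0] for row in bad)
stream_reproduced = good == expected
```

With the rank-0-only checkpoint, `all_ranks_identical` is true, so the ranks are no longer independent. Only the per-rank checkpoint reproduces the original stream.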

I want to get NetKet working with proper checkpointing, so serialisation has to work correctly first.

This seems to be what is needed to fully serialise variational states together with the stream of samples they generate.

To avoid raising errors, there are three cases when you attempt to load a saved state:

  • If you have the same number of ranks and the same number of chains, just load it and all is good.
  • If you have a different number of ranks but the same total number of chains, just load the chains and use some random seed (we cannot restore perfectly, but at least the chains are already thermalised).
  • If you have a different number of chains, simply load the weights and throw a warning.
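
The three cases above could be sketched as a small decision function. This is only an illustration under stated assumptions, not the code in this PR: `Layout`, `RestoreMode` and `choose_restore_mode` are hypothetical names, and `n_chains` is assumed to be the total number of chains across all ranks.

```python
import warnings
from dataclasses import dataclass
from enum import Enum


class RestoreMode(Enum):
    FULL = "full"                  # weights, chains and per-rank seeds
    CHAINS_ONLY = "chains_only"    # weights and chains, fresh random seed
    WEIGHTS_ONLY = "weights_only"  # weights only


@dataclass(frozen=True)
class Layout:
    # Hypothetical description of a run: rank count and total chain count.
    n_ranks: int
    n_chains: int


def choose_restore_mode(saved, current):
    # A different total number of chains means the sampler state cannot be reused.
    if saved.n_chains != current.n_chains:
        warnings.warn(
            "Number of chains differs from the saved state: "
            "loading the weights only."
        )
        return RestoreMode.WEIGHTS_ONLY
    # Same chains but a different rank count: chains can be redistributed,
    # but the per-rank seeds cannot be restored exactly.
    if saved.n_ranks != current.n_ranks:
        return RestoreMode.CHAINS_ONLY
    # Same ranks and same chains: everything can be restored.
    return RestoreMode.FULL
```

For example, a state saved with 4 ranks and 16 chains and loaded on 2 ranks with 16 chains would fall into the second case.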

I'd like some opinions.
