I would love to train GPT-2 with a larger BPE tokenizer, perhaps even Llama 3's tokenizer, since it has a vocab size of 128K. However, this code will not work with a tokenizer that has a large vocab. Is there an easy way to add this?
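For context, here is a minimal sketch of how one might wire a large-vocab tokenizer into a GPT-2 model using Hugging Face transformers; the tokenizer name and config values are illustrative assumptions, not this repo's code:

```python
# Hypothetical sketch: training GPT-2 from scratch with a ~128K-vocab BPE
# tokenizer via Hugging Face transformers. Model/tokenizer names and config
# values below are assumptions for illustration.
from transformers import AutoTokenizer, GPT2Config, GPT2LMHeadModel

# Load a large-vocab BPE tokenizer (the Llama 3 repo is gated; any BPE
# tokenizer with a ~128K vocabulary would be handled the same way).
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B")

# Build a GPT-2 model whose embedding and output layers match the new vocab.
config = GPT2Config(
    vocab_size=len(tokenizer),  # ~128K instead of GPT-2's default 50257
    n_positions=1024,
    n_embd=768,
    n_layer=12,
    n_head=12,
)
model = GPT2LMHeadModel(config)

# Note: with a 128K vocab, the tied embedding/LM-head matrix alone is about
# 128_000 * 768 ≈ 98M parameters, comparable to the rest of GPT-2 small,
# so memory and optimizer-state usage grow accordingly.
```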