Running Llama 3.1 on macOS with an M2 chip produces errors #1784
Comments
Looks like a transformers bug: huggingface/transformers#31744. But Llama 3.1 needs a newer transformers release. Maybe stick to GGUF?
It worked fine with GGUF, though I had to install a different package version than the one recommended. I will propose a pull request to make it possible to start with Llama 3.1.
See #1789, which is how I made Llama 3.1 GGUF work on a Mac M2.
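As an aside on the GGUF workaround: a GGUF file is identified by the 4-byte magic header `b"GGUF"` (per the GGUF specification), which makes it easy to sanity-check that a multi-gigabyte download actually completed as a GGUF file. A minimal sketch (the `is_gguf` helper is illustrative, not part of h2ogpt):

```python
# Check whether a file carries the GGUF magic header (b"GGUF").
# `is_gguf` is a hypothetical helper for sanity-checking downloads.
import os
import struct
import tempfile

def is_gguf(path: str) -> bool:
    """Return True if the file starts with the GGUF magic bytes."""
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"

# Demo with a tiny stand-in file (a real model would be several GB):
with tempfile.NamedTemporaryFile(delete=False, suffix=".gguf") as tmp:
    tmp.write(b"GGUF" + struct.pack("<I", 3))  # magic + a version field
    path = tmp.name

print(is_gguf(path))  # True
os.remove(path)
```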
I tried to run h2ogpt with this command:

```
python generate.py --base_model=meta-llama/Meta-Llama-3.1-8B-Instruct --use_auth_token=...
```

and it triggered errors:
```
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:128009 for open-end generation.
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
thread exception: Traceback (most recent call last):
  File "/Users/.../h2ogpt/src/utils.py", line 524, in run
    self._return = self._target(*self._args, **self._kwargs)
  File "/Users/.../h2ogpt/src/gen.py", line 4288, in generate_with_exceptions
    func(*args, **kwargs)
  File "/Users/.../miniconda3/envs/h2ogpt/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/Users/.../miniconda3/envs/h2ogpt/lib/python3.10/site-packages/transformers/generation/utils.py", line 1727, in generate
    model_kwargs["attention_mask"] = self._prepare_attention_mask_for_generation(
  File "/Users/.../miniconda3/envs/h2ogpt/lib/python3.10/site-packages/transformers/generation/utils.py", line 493, in _prepare_attention_mask_for_generation
    raise ValueError(
ValueError: Can't infer missing attention mask on `mps` device. Please provide an `attention_mask` or use a different device.
```
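The logic behind this error can be sketched in a few lines: when no mask is supplied, transformers normally infers one by treating pad tokens as masked-out positions, but Llama 3.1 reuses its eos token (128009) as the pad token, so a trailing eos is indistinguishable from padding and inference must fail. The helper below is a hypothetical illustration of that check, not the library's actual code:

```python
# Hypothetical sketch of why the attention mask cannot be inferred when
# pad_token_id == eos_token_id (mirrors the idea behind transformers'
# _prepare_attention_mask_for_generation, not its actual implementation).

def infer_attention_mask(input_ids, pad_token_id, eos_token_id):
    if pad_token_id == eos_token_id:
        # A trailing eos is indistinguishable from padding, so any
        # inferred mask could silently hide real tokens.
        raise ValueError("Can't infer missing attention mask: pad token equals eos token")
    # Otherwise, pad positions are unambiguous: mask them out.
    return [0 if tok == pad_token_id else 1 for tok in input_ids]

# With distinct pad/eos ids the mask is unambiguous:
print(infer_attention_mask([5, 7, 128009, 0, 0], pad_token_id=0, eos_token_id=128009))
# [1, 1, 1, 0, 0]

# With Llama 3.1's shared id (128009 for both), inference must fail:
try:
    infer_attention_mask([5, 7, 128009], pad_token_id=128009, eos_token_id=128009)
except ValueError as exc:
    print(exc)
```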
It worked fine when I ran other, older models such as Llama 2.
Do you know what the source of this issue could be?
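The traceback itself points at the remedy: tokenize so that an explicit `attention_mask` accompanies `input_ids`, and pass both through to generation. The sketch below shows only that calling pattern with toy stand-ins (`ToyTokenizer` and `toy_generate` are hypothetical; in real transformers code, `tokenizer(text, return_tensors="pt")` already returns both keys, and `model.generate(**inputs, ...)` consumes them):

```python
# Toy stand-ins illustrating the calling pattern the error asks for.
# (`ToyTokenizer` and `toy_generate` are hypothetical, not transformers APIs.)

class ToyTokenizer:
    eos_token_id = 128009  # Llama 3.1's eos, which also serves as pad

    def __call__(self, text):
        ids = [ord(c) for c in text]
        # Returning the mask alongside the ids is the crucial part:
        return {"input_ids": ids, "attention_mask": [1] * len(ids)}

def toy_generate(input_ids, attention_mask=None, pad_token_id=None):
    # Mirrors the failure mode: without an explicit mask, refuse to guess.
    if attention_mask is None:
        raise ValueError("Can't infer missing attention mask on `mps` device.")
    return input_ids + [pad_token_id]  # dummy one-token continuation

tok = ToyTokenizer()
inputs = tok("hi")
# Unpacking the full encoding passes the mask and avoids the error:
out = toy_generate(**inputs, pad_token_id=tok.eos_token_id)
print(out)  # [104, 105, 128009]
```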