Issues: huggingface/trl
Labels:
❓ question: Seeking clarification or more information
🏋 SFT: Related to SFT
🏋 DPO: Related to DPO
🏋 PPO: Related to PPO
🏋 Online DPO: Related to Online DPO
✨ enhancement: New feature or request
🐛 bug: Something isn't working
🚀 deepspeed: Related to deepspeed
👶 good first issue: Good for newcomers
🧒 good second issue: Good for contributors with basic project familiarity
🙋 help wanted: Open invitation for community members to contribute
⏳ needs more info: Additional information or clarification is required to proceed

#2424 · How to calculate the loss of multi-turn dialogue training data? · ❓ question, 🏋 SFT · opened Dec 2, 2024 by NUMB1234
#2423 · precompute_ref_log_probs not working correctly? · 🏋 DPO · opened Dec 2, 2024 by dakru012 · 7 of 9 tasks
#2422 · Let DPOTrainer Support padding_free · 🏋 DPO, ✨ enhancement, 🧒 good second issue, 🙋 help wanted · opened Dec 1, 2024 by fzyzcjy
#2418 · Does simply passing the adapter folders in the model_path allow continued training of the LoRA adapters? · opened Nov 29, 2024 by iBibek
#2416 · Online DPO Quite Slow Compared with Previous Versions · opened Nov 29, 2024 by zcw0201 · 7 of 9 tasks
#2415 · Add gen_text Argument for Custom Text Generation During Fine-tuning · opened Nov 29, 2024 by dame-cell
#2410 · Online DPO Meets Error When Using Deepspeed for Speed Up · 🐛 bug, 🚀 deepspeed, 🏋 Online DPO · opened Nov 28, 2024 by zcw0201 · 7 of 9 tasks
#2406 · Add use_dora and init_lora_weights to ModelConfig and get_peft_config · ✨ enhancement · opened Nov 28, 2024 by hommayushi3
#2390 · SFTTrainer usage · ❓ question, 🏋 SFT · opened Nov 25, 2024 by Humauaca · 2 of 3 tasks
#2387 · eos_token config in PPOTrainer · ✨ enhancement, 👶 good first issue, 🏋 PPO · opened Nov 23, 2024 by kechunFIVE
#2383 · adding DRO trainer · ✨ enhancement, 🙋 help wanted · opened Nov 22, 2024 by morLev · 2 of 3 tasks
#2382 · DPO does not work for FIM task with non-instruct model · 🏋 DPO, ❓ question · opened Nov 22, 2024 by AML14 · 7 of 9 tasks
#2377 · PPO Example Script Accelerator error: initialize your accelerator via `accelerator = Accelerator()` · opened Nov 21, 2024 by hitzkrieg · 2 of 4 tasks
#2376 · ValueError: Predictions and/or references don't match the expected format. · 🐛 bug, 🏋 SFT · opened Nov 20, 2024 by scarafoni · 3 of 4 tasks
#2375 · AttributeError: 'DistributedDataParallel' object has no attribute 'policy' when saving model using PPOTrainer · 🐛 bug, 🏋 PPO · opened Nov 20, 2024 by AsiaLootus · 8 of 9 tasks
#2371 · The DPO reward accuracy value is only 0 or 1 · 🏋 DPO, ⏳ needs more info, ❓ question · opened Nov 20, 2024 by carrot0117
#2363 · PPO manual reward functions · 🏋 PPO, ❓ question · opened Nov 18, 2024 by schmidtj3
#2361 · Contributing new distillation related trainers · opened Nov 16, 2024 by YihanCao123 · 1 of 3 tasks
#2358 · Question about the logprobs of the policy-generated sentences in PPO trainer · opened Nov 15, 2024 by yanghh2000 · 6 of 9 tasks
#2357 · PPOTrainer with HuggingFace PreTrainedModelWrapper Models · opened Nov 14, 2024 by Mrinh212375 · 7 of 9 tasks
#2356 · How to train from scratch? Can you provide the code · ❓ question · opened Nov 14, 2024 by sankexin · 5 of 9 tasks
Filter: updated:>2024-11-29 (issues updated in the last three days)