
GitHub repo for the JAMIA paper "BioInstruct: Instruction Tuning of Large Language Models for Biomedical Natural Language Processing"


bio-nlp/BioInstruct


BioInstruct

🔬 Exciting breakthrough in BioNLP! 🧬

We're thrilled to introduce BioInstruct, a dataset of 25,000 tailored instructions for enhancing LLMs such as Llama on biomedical tasks. Our research shows remarkable gains in question answering (QA), information extraction (IE), and text generation.

🌟 Highlights:

  • 17.3% boost in QA accuracy
  • 5.7% increase in IE F1 score
  • 96% improvement in text generation tasks

Combining instruction tuning with multi-task learning, we also find that the performance gain is significantly higher when the LLM is instruction fine-tuned on closely related tasks.

For more details, please check out our paper.

Dataset

The BioInstruct dataset is available as a Hugging Face dataset.
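As a rough illustration of how the data might be used for instruction tuning, the sketch below turns one record into a prompt string. It assumes each record carries `instruction`, `input`, and `output` fields, which is a common layout for instruction datasets but is not confirmed here, so check the dataset card first. The helper names `format_prompt` and `load_records`, the prompt template, and the sample record are all hypothetical. `load_records` relies on the `datasets` library and takes the dataset identifier as an argument, since the identifier should be copied from the dataset's Hugging Face page.

```python
from typing import Dict, List


def format_prompt(record: Dict[str, str]) -> str:
    """Build a prompt from one instruction record.

    Assumes 'instruction' and optional 'input' keys (an assumption about
    the schema, not something this README states).
    """
    instruction = record["instruction"].strip()
    # Treat a missing or None 'input' the same as an empty one.
    context = (record.get("input") or "").strip()
    if context:
        return f"Instruction: {instruction}\nInput: {context}\nResponse:"
    return f"Instruction: {instruction}\nResponse:"


def load_records(dataset_id: str, split: str = "train") -> List[Dict[str, str]]:
    """Load records with the Hugging Face `datasets` library.

    `dataset_id` is whatever identifier the dataset page shows; it is
    deliberately not hard-coded in this sketch.
    """
    from datasets import load_dataset  # lazy import; needs `pip install datasets`

    return list(load_dataset(dataset_id, split=split))


if __name__ == "__main__":
    # Hypothetical record, written only to show the expected shape.
    sample = {
        "instruction": "List the medications mentioned in the note.",
        "input": "Patient was started on metformin and lisinopril.",
        "output": "metformin, lisinopril",
    }
    print(format_prompt(sample))
```

During fine-tuning, the `output` field would typically be appended after `Response:` as the target text. The template itself is a design choice and can be swapped for whatever format the base model expects.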

Citation Information

@article{Tran2024Bioinstruct,
    author = {Tran, Hieu and Yang, Zhichao and Yao, Zonghai and Yu, Hong},
    title = "{BioInstruct: instruction tuning of large language models for biomedical natural language processing}",
    journal = {Journal of the American Medical Informatics Association},
    pages = {ocae122},
    year = {2024},
    month = {06},
    issn = {1527-974X},
    doi = {10.1093/jamia/ocae122},
    url = {https://doi.org/10.1093/jamia/ocae122},
    eprint = {https://academic.oup.com/jamia/advance-article-pdf/doi/10.1093/jamia/ocae122/58084577/ocae122.pdf},
}

Contribute

Have a specific task and instruction you'd like an LLM to perform in a clinical setting? Raise a new issue here! Your contributions will aid in refining LLMs to be more effective and relevant in healthcare environments.
