
🎛 Jump into LLM

This is an awesome repository designed to guide individuals into the world of LLMs (Large Language Models). It is intended for those who already have some knowledge of deep learning, so it may not be suitable for complete beginners.

The repository contains a curated list of papers, websites, and videos to help deepen your understanding of the LLM field. Contributions and discussions are always welcome!

Index

| Step | Section | Subsections |
| --- | --- | --- |
| 1 | Starting Point | Transformer |
| 2 | Understanding the Training Paradigm of LLMs | Pre-Training & Fine-Tuning |
| 3 | Understanding the Training Paradigm of LLMs | Parameter-Efficient Fine-Tuning (PEFT) |
| 4 | Understanding the Training Paradigm of LLMs | In-Context Learning (ICL) |
| Appendix | Understanding NLP Tasks | |

🚩 Starting Point

🤖 Transformer

If you're new to the LLM field, start by understanding the Transformer architecture, which is the foundational building block of most LLMs. Keep in mind that the T in GPT (a model you likely associate with LLMs) stands for Transformer!
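To make this concrete, here is a minimal sketch of scaled dot-product attention, the core operation inside every Transformer layer (an illustration in PyTorch for this guide, not code from the original paper):

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    """Single-head attention; q, k, v each have shape (seq_len, d_k)."""
    d_k = q.size(-1)
    # How strongly each query attends to each key, scaled for a stable softmax
    scores = q @ k.transpose(-2, -1) / d_k**0.5
    weights = F.softmax(scores, dim=-1)          # each row sums to 1
    return weights @ v                           # weighted average of the values

x = torch.randn(4, 8)                            # 4 tokens, 8-dimensional head
out = scaled_dot_product_attention(x, x, x)      # self-attention: q = k = v
print(out.shape)                                 # torch.Size([4, 8])
```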

⚙️ Understanding the Training Paradigm of LLMs

Pre-Training & Fine-Tuning

Once you understand the Transformer architecture and its functionality, it's essential to grasp the key training paradigms. Understanding LLMs goes beyond just knowing the network architecture—it also involves learning how these models acquire knowledge and how that knowledge is integrated into the network.

When you start reading papers in the LLM field, you'll likely come across terms like pre-training and fine-tuning (as well as zero-shot, one-shot, etc.). Before diving deeper, it's important to understand these concepts, which were introduced in the original GPT paper. Remember, GPT stands for Generative Pretrained Transformer!
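To make the two stages concrete, the sketch below illustrates the next-token prediction objective used in pre-training (a toy illustration, not code from the GPT paper; the stand-in `model` would be a real Transformer LM in practice). Fine-tuning typically reuses the same pre-trained weights and continues training on a task-specific dataset and loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size = 100
# Stand-in for a real Transformer LM: anything mapping token ids of shape
# (batch, seq) to logits of shape (batch, seq, vocab_size) fits here.
model = nn.Sequential(nn.Embedding(vocab_size, 32), nn.Linear(32, vocab_size))

def next_token_loss(token_ids):
    """The pre-training objective: predict token t+1 from tokens 0..t."""
    inputs, targets = token_ids[:, :-1], token_ids[:, 1:]
    logits = model(inputs)  # (batch, seq-1, vocab_size)
    return F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))

batch = torch.randint(0, vocab_size, (2, 16))  # two toy "sentences" of 16 tokens
print(next_token_loss(batch))  # the scalar loss minimized during pre-training
```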

Parameter-Efficient Fine-Tuning (PEFT)

Pre-training involves acquiring knowledge from a large corpus of data, while fine-tuning focuses on adapting the model to a specific task using a corresponding dataset. However, fully fine-tuning all the parameters of a network (known as full fine-tuning) is resource-intensive. To address this, several approaches that fine-tune only a subset of parameters have been introduced; these are referred to as Parameter-Efficient Fine-Tuning (PEFT).
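For a flavor of how PEFT works, here is a minimal sketch of LoRA, one widely used PEFT method: the pre-trained weights are frozen, and only a low-rank update is trained. This is an illustrative simplification, not a faithful reference implementation:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen pre-trained linear layer plus a trainable low-rank update."""
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        self.base.requires_grad_(False)  # freeze all pre-trained parameters
        self.A = nn.Parameter(torch.randn(base.in_features, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, base.out_features))  # zero init: no change at start
        self.scale = alpha / rank

    def forward(self, x):
        # Frozen path plus low-rank adapter path; only A and B receive gradients
        return self.base(x) + (x @ self.A @ self.B) * self.scale

layer = LoRALinear(nn.Linear(512, 512))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 8192 trainable parameters, vs. 262,656 in the full layer
```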

In-Context Learning (ICL)

The pre-training and fine-tuning paradigm yields task-specific expert models: following this approach, we need to fine-tune a separate model for each individual task.

However, a new paradigm has emerged that removes the boundaries between tasks, suggesting that pre-training alone is sufficient to handle multiple tasks without the need for fine-tuning. This approach, known as In-Context Learning (ICL), fully leverages the power of pre-training by eliminating the fine-tuning step. In ICL, task information—referred to as context—is provided as input, enabling the pre-trained model to adapt to specific tasks.
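A toy prompt makes this tangible. The example below follows the few-shot translation format shown in the GPT-3 paper; `generate` is a hypothetical stand-in for any pre-trained LM's completion call:

```python
# In-context learning: the task is specified entirely inside the prompt;
# no weights are updated. `generate` is a hypothetical completion helper.
prompt = """Translate English to French.

sea otter => loutre de mer
cheese => fromage
plush giraffe =>"""

# completion = generate(prompt)
# A well pre-trained model is expected to continue: " girafe peluche"
```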

The concept of In-Context Learning (ICL) was introduced with GPT-3. However, it is also valuable to read both the GPT-2 and GPT-3 papers to gain a comprehensive understanding of how these models and their capabilities evolved.

🧩 Understanding NLP Tasks

To fully grasp LLMs, it’s important to have a foundational understanding of Natural Language Processing (NLP) tasks. Each task pairs an input with a desired output and comes with its own objectives, benchmarks, and evaluation metrics (a toy example follows the table below).

| NLP Task | Type | Benchmarks |
| --- | --- | --- |
| Token Classification | Classification | Benchmarks |
| Translation | Seq2Seq | Benchmarks |
| Summarization | Seq2Seq | Benchmarks |
| Question Answering | Span Extraction / Extractive | Benchmarks |
  • Please find a more detailed explanation here!
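For a concrete feel of the input/output framing, here is a toy extractive question-answering example (SQuAD-style; the sample itself is made up for illustration):

```python
# One extractive QA example: the answer is a span copied from the context.
example = {
    "context": "The Transformer architecture was introduced in 2017.",
    "question": "When was the Transformer architecture introduced?",
    "answer": {"text": "2017", "start_char": 47},
}
assert example["context"][47:51] == example["answer"]["text"]
```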

🎊 Update Info

  • 24.09.08: The initial README has been updated.
    • The section on LLM models will be updated soon.
    • The section on understanding the internals of LLMs will be updated soon.
  • This README currently includes a curated selection of papers for beginners. Additional sub-README files for each section are being prepared.
