👋
Welcome to chat with me!

Reasoning in Large Language Models: Papers and Resources, including Chain-of-Thought and OpenAI o1 🍓

1,724 96 Updated Sep 28, 2024

Awesome-Jailbreak-on-LLMs is a collection of state-of-the-art, novel, exciting jailbreak methods on LLMs. It contains papers, code, datasets, evaluations, and analyses.

167 14 Updated Oct 14, 2024

An official codebase for the paper, "Measuring and Improving Semantic Diversity of Dialogue Generation", EMNLP 2022 Findings

Python 5 Updated Oct 17, 2022

Scripts for fine-tuning Meta Llama with composable FSDP & PEFT methods to cover single/multi-node GPUs. Supports default & custom datasets for applications such as summarization and Q&A. Supporting…

Jupyter Notebook 12,092 1,936 Updated Oct 15, 2024

clash-for-linux

Shell 1,132 439 Updated Dec 12, 2023
Python 896 188 Updated Jun 27, 2024

Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 29,349 3,393 Updated Oct 14, 2024

Core: Robust Factual Precision Scoring with Informative Sub-Claim Identification

Python 4 Updated Jul 3, 2024

A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation"

Python 280 41 Updated May 19, 2024

Official GitHub repo for the paper "Evaluating the Evaluation of Diversity in Natural Language Generation"

Python 19 2 Updated Feb 23, 2021

An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.

Jupyter Notebook 1,471 234 Updated Oct 9, 2024
Python 1,226 166 Updated Oct 15, 2024

A recipe for online RLHF and online iterative DPO.

Python 392 44 Updated Oct 7, 2024

AbstainQA, ACL 2024

Python 19 Updated Oct 11, 2024

Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs

Python 34 5 Updated Jul 5, 2024

A programming framework for agentic AI 🤖

C# 31,925 4,644 Updated Oct 15, 2024

RAID is the largest and most challenging benchmark for machine-generated text detectors. (ACL 2024)

Python 29 10 Updated Oct 4, 2024

List of papers on hallucination detection in LLMs.

638 50 Updated Oct 9, 2024
Python 29 Updated Feb 8, 2024

This repository collects all relevant resources about interpretability in LLMs

265 16 Updated Sep 19, 2024

Awesome things about LLM-powered agents. Papers / Repos / Blogs / ...

1,491 114 Updated Sep 15, 2024

Alpaca dataset from Stanford, cleaned and curated

Python 1,504 150 Updated Apr 14, 2023

A framework for few-shot evaluation of language models.

Python 6,686 1,776 Updated Oct 14, 2024

Code for the AAAI 2023 Paper "Real or Fake Text?: Investigating Human Ability to Detect Boundaries Between Human-Written and Machine-Generated Text"

Jupyter Notebook 16 Updated Jul 26, 2023

Instruction Tuning with GPT-4

HTML 4,182 301 Updated Jun 11, 2023

Set of tools to assess and improve LLM security.

Python 2,619 439 Updated Oct 14, 2024

The MiniAgents visualization tool for simulacra.

Python 13 Updated Apr 18, 2024

Robust recipes to align language models with human and AI preferences

Python 4,582 397 Updated Oct 7, 2024