An awesome-list repository and a comprehensive survey on the interpretability of LLM attention heads.

TeX 247 6 Updated Oct 26, 2024
Python 3 Updated Oct 19, 2024

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 and reasoning techniques.

4,883 271 Updated Oct 23, 2024

The code for AED, a method that helps LLMs defend against jailbreaks.

Python 2 Updated Jul 29, 2024

Official Code for Paper: Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications

Python 57 8 Updated Oct 4, 2024

S-Eval: Automatic and Adaptive Test Generation for Benchmarking Safety Evaluation of Large Language Models

39 3 Updated Oct 27, 2024

[EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"

78 1 Updated Sep 21, 2024

Using sparse coding to find distributed representations used by neural networks.

Jupyter Notebook 178 28 Updated Nov 10, 2023
Python 307 32 Updated Jul 19, 2024
Jupyter Notebook 30 3 Updated Jun 13, 2024

JailbreakBench: An Open Robustness Benchmark for Jailbreaking Language Models [NeurIPS 2024 Datasets and Benchmarks Track]

Python 213 20 Updated Sep 26, 2024

Repository for "StrongREJECT for Empty Jailbreaks" paper

Jupyter Notebook 105 5 Updated Aug 11, 2024

LLM training in simple, raw C/CUDA

Cuda 24,236 2,722 Updated Oct 2, 2024

Train transformer language models with reinforcement learning.

Python 9,858 1,248 Updated Oct 28, 2024

An Extensible Toolkit for Finetuning and Inference of Large Foundation Models. Large Models for All.

Python 8,258 826 Updated Oct 25, 2024

Papers and resources related to the security and privacy of LLMs 🤖

Python 423 31 Updated Sep 9, 2024

[ICML 2024] TrustLLM: Trustworthiness in Large Language Models

Python 456 43 Updated Sep 29, 2024

Persuasive Jailbreaker: we can persuade LLMs to jailbreak them!

HTML 255 19 Updated Oct 10, 2024

Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.

Jupyter Notebook 38,474 4,068 Updated Jul 28, 2024

SC-Safety: A Multi-Round Adversarial Safety Benchmark for Chinese Large Language Models

101 7 Updated Mar 15, 2024

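Use ChatGPT to summarize arXiv papers. Accelerates the whole research workflow: use ChatGPT for full-paper summarization, professional translation, polishing, peer review, and review responses.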

Python 18,404 1,930 Updated Apr 4, 2024

Reference implementation for DPO (Direct Preference Optimization)

Python 2,122 172 Updated Aug 11, 2024

Set of tools to assess and improve LLM security.

Python 2,658 442 Updated Oct 22, 2024

LLMs built upon Evol-Instruct: WizardLM, WizardCoder, WizardMath

Python 9,240 716 Updated Aug 5, 2024

Reading list on hallucination in LLMs. Check out our new survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models"

932 49 Updated Sep 4, 2024

The repository for the survey paper "Survey on Large Language Models Factuality: Knowledge, Retrieval and Domain-Specificity"

325 27 Updated Apr 25, 2024

Official inference library for Mistral models

Jupyter Notebook 9,668 855 Updated Oct 16, 2024

Generative Agents: Interactive Simulacra of Human Behavior

17,267 2,220 Updated Aug 5, 2024

A curated list of awesome LLM agents.

481 42 Updated Jul 1, 2024