hallucination

Here are 49 public repositories matching this topic...

Libr-AI / OpenFactVerification

Loki: Open-source solution designed to automate the process of verifying factuality

ai hallucination factuality

Updated Oct 3, 2024
Python

jxzhangjhu / Awesome-LLM-Uncertainty-Reliability-Robustness

Awesome-LLM-Robustness: a curated list of Uncertainty, Reliability and Robustness in Large Language Models

reliability calibration safety awesome-list uncertainty-quantification uncertainty-estimation robustness hallucination gpt-3 gpt-4 in-context-learning large-language-models prompt-engineering prompting llms chain-of-thought chatgpt

Updated Jun 18, 2024

BradyFU / Woodpecker

✨✨Woodpecker: Hallucination Correction for Multimodal Large Language Models. The first work to correct hallucinations in MLLMs.

multimodality hallucination hallucinations large-language-models llm mllm multimodal-large-language-models

Updated Jun 17, 2024
Python

amazon-science / RefChecker

RefChecker provides an automatic checking pipeline and a benchmark dataset for detecting fine-grained hallucinations generated by Large Language Models.

hallucination factuality llms

Updated Oct 12, 2024
Python

FuxiaoLiu / LRV-Instruction

[ICLR'24] Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning

evaluation vision vqa llama object-detection gpt evaluation-metrics iclr multimodal vision-and-language hallucination vicuna gpt-4 foundation-models prompt-engineering chatgpt llava iclr2024

Updated Mar 13, 2024
Python

tianyi-lab / HallusionBench

[CVPR'24] HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models

benchmark benchmarks lmm hallucination gpt-4 large-language-models llm llava large-vision-language-models vlms gpt-4v

Updated Sep 30, 2024
Python

IAAR-Shanghai / UHGEval

[ACL 2024] User-friendly evaluation framework: Eval Suite & Benchmarks: UHGEval, HaluEval, HalluQA, etc.

benchmark evaluation dataset openai hallucination huggingface huggingface-transformers ceval gpt-3 openai-api hallucinations gpt-4 large-language-models llm chatgpt qwen hallucination-evaluation hallucination-detection

Updated Oct 8, 2024
Python

IAAR-Shanghai / ICSFSurvey

Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasoning elevation🍓 and hallucination alleviation🍄.

decoding self-improvement knowledge-distillation data-augmentation reasoning self-consistency preference-learning hallucination self-correction attention-head large-language-models chain-of-thought large-language-model internal-consistency self-feedback self-refine self-correct

Updated Oct 26, 2024
Jupyter Notebook

xieyuquanxx / awesome-Large-MultiModal-Hallucination

😎 up-to-date & curated list of awesome LMM hallucinations papers, methods & resources.

multi-modal multimodal lmm hallucination

Updated Mar 23, 2024

ictnlp / TruthX

Code for ACL 2024 paper "TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space"

safety llama representation language-model mistral explainable-ai hallucination baichuan hallucinations gpt-4 truthfulness llm llms chatgpt chatglm llm-inference llama2 llama3

Updated Mar 26, 2024
Python

zjunlp / FactCHD

[IJCAI 2024] FactCHD: Benchmarking Fact-Conflicting Hallucination Detection

benchmark natural-language-processing knowledge dataset factual hallucination large-language-models factchd

Updated Apr 28, 2024
Python

yfzhang114 / LLaVA-Align

This is the official repo for Debiasing Large Visual Language Models, including a Post-Hoc debias method and a Visual Debias Decoding strategy.

hallucination debiasing large-vision-language-models

Updated Mar 28, 2024
Python

zjunlp / KnowledgeCircuits

[NeurIPS 2024] Knowledge Circuits in Pretrained Transformers

natural-language-processing artificial-intelligence transformer circuit interpretability hallucination large-language-models model-editing knowledge-editing knowledge-edting knowledge-circuit

Updated Oct 18, 2024
Python

HillZhang1999 / ICD

Code & Data for our Paper "Alleviating Hallucinations of Large Language Models through Induced Hallucinations"

decoding-algorithm hallucination large-language-models

Updated Feb 27, 2024
Python

AmourWaltz / Reliable-LLM

knowledge uncertainty reliable hallucination

Updated Sep 10, 2024
JavaScript

Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without a custom rubric or reference answer, using absolute or relative grading, and more. It also lists available tools, methods, repos, and code for hallucination detection, LLM evaluation, and grading.

nlp ai evaluation ml pytorch judge feedback-collection sota custom-dataset finetuning hallucination llm llm-evaluation hallucination-detection phi-3

Updated Jul 10, 2024
Jupyter Notebook

anlp-team / LTI_Neural_Navigator

"Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases" by Jiarui Li and Ye Yuan and Zehua Zhang