llama.cpp
DESCRIPTION

  llama.cpp is a machine learning library for large language models

LICENSE

  MIT

ORIGIN

  https://github.com/ggerganov/llama.cpp/
  8b3befc0e2ed8fb18b903735831496b8b0c80949
  2024-08-16

LOCAL MODIFICATIONS

  - See [jart] and [kawrakow] annotations
  - Remove MAP_POPULATE because it makes mmap(tinyllama) block for 100ms
    (see the mmap sketch after this list)
  - Refactor ggml.c, llama.cpp, and llava to use llamafile_open() APIs
  - Unify main, server, and llava-cli into single llamafile program
  - Make cuBLAS / hipBLAS optional by introducing tinyBLAS library
  - Add support to main() programs for Cosmo /zip/.args files
  - Introduce pledge() SECCOMP sandboxing to improve security
    (see the pledge sketch after this list)
  - Call exit() rather than abort() when GGML_ASSERT() fails
    (see the assertion sketch after this list)
  - Clamp bf16/f32 values before passing to K quantizers
  - Make GPU logger callback API safer and less generic
  - Write log to /dev/null when main.log fails to open
  - Make main and llava-cli print timings on ctrl-c
  - Make embeddings CLI program shell scriptable
  - Avoid bind() conflicts on port 8080 w/ server
  - Use runtime dispatching for matmul quants
  - Remove operating system #ifdef statements
  - Remove stdout logging from LLaVA
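Why the MAP_POPULATE removal helps: with that flag, mmap() blocks while
the kernel prefaults every page of the weights file; without it, the call
returns immediately and pages fault in lazily on first access. A minimal
POSIX C sketch of the idea (the file name is illustrative, and llamafile's
real loader sits behind its llamafile_open() APIs):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void) {
        int fd = open("model.gguf", O_RDONLY);  // illustrative path
        if (fd == -1) { perror("open"); return 1; }
        struct stat st;
        if (fstat(fd, &st) == -1) { perror("fstat"); return 1; }
        // No MAP_POPULATE: mmap() returns at once instead of blocking
        // while the kernel prefaults the whole file; pages load lazily
        // the first time the inference code touches them.
        void *weights = mmap(0, st.st_size, PROT_READ, MAP_SHARED, fd, 0);
        if (weights == MAP_FAILED) { perror("mmap"); return 1; }
        close(fd);
        printf("mapped %zu bytes at %p\n", (size_t)st.st_size, weights);
        munmap(weights, st.st_size);
    }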
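The pledge() hardening works by promising, up front, the only syscall
families the process will ever need; Cosmopolitan libc polyfills the
OpenBSD pledge() interface on Linux with SECCOMP BPF filters. A minimal
sketch, assuming a libc where pledge() exists (OpenBSD or Cosmopolitan);
the promise string is illustrative, not the exact set llamafile requests:

    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        // Allow only stdio I/O and read-only filesystem access from
        // here on; any other syscall terminates the process rather
        // than being silently permitted. (Illustrative promises.)
        if (pledge("stdio rpath", 0) == -1) {
            perror("pledge");
            return 1;
        }
        puts("sandboxed: stdout still works");
        return 0;
    }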
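The exit()-versus-abort() change matters because abort() raises SIGABRT,
which can produce core dumps and trip crash handlers, while exit()
terminates with an ordinary status code and runs atexit() cleanups. A
hypothetical macro sketch of that policy; GGML's real GGML_ASSERT differs
in detail:

    #include <stdio.h>
    #include <stdlib.h>

    // Hypothetical stand-in for an assertion macro that exits cleanly
    // instead of calling abort().
    #define MY_ASSERT(x)                                             \
        do {                                                         \
            if (!(x)) {                                              \
                fprintf(stderr, "%s:%d: assertion failed: %s\n",     \
                        __FILE__, __LINE__, #x);                     \
                exit(1);  /* clean exit: no SIGABRT, no core dump */ \
            }                                                        \
        } while (0)

    int main(void) {
        MY_ASSERT(1 + 1 == 2);   // passes silently
        MY_ASSERT(0 && "boom");  // prints a diagnostic and exits
    }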