Pull requests: ggerganov/llama.cpp
Pull requests list
Cosine similarity is undefined when any vector is zero.
#10968
opened Dec 24, 2024 by
AndyM3
server : add support for "encoding_format": "base64" to the */embeddings endpoints
examples
python
python script changes
server
#10967
opened Dec 24, 2024 by
elk-cloner
vulkan: im2col and matmul optimizations for stable diffusion
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
Vulkan
Issues specific to the Vulkan backend
#10942
opened Dec 22, 2024 by
jeffbolznv
Allow user to compile with any cuda version using github actions
devops
improvements to build systems and github actions
#10928
opened Dec 21, 2024 by
jianlins
llamafile_sgemm API - INT8 implementation
ggml
changes relating to the ggml tensor library for machine learning
testing
Everything test related
#10912
opened Dec 20, 2024 by
amritahs-ibm
llama : add support for Cohere2ForCausalLM
python
python script changes
#10900
opened Dec 19, 2024 by
dranger003
ASCII/Romanization for OuteTTS Multilingual Processing
demo
Demonstrate some concept or idea, not intended to be merged
examples
#10894
opened Dec 19, 2024 by
edwko
SYCL: Fixes for building SYCL backend for AMD GPUs
documentation
Improvements or additions to documentation
ggml
changes relating to the ggml tensor library for machine learning
SYCL
https://en.wikipedia.org/wiki/SYCL - GPU programming language
#10851
opened Dec 16, 2024 by
lhl
vulkan: multi-row k quants
ggml
changes relating to the ggml tensor library for machine learning
Vulkan
Issues specific to the Vulkan backend
#10846
opened Dec 16, 2024 by
netrunnereve
Fix compilation on Pop!_OS 22.04 LTS CUDA
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
#10835
opened Dec 15, 2024 by
mika314
add ggml_backend_sched_dump_dot
ggml
changes relating to the ggml tensor library for machine learning
#10825
opened Dec 14, 2024 by
foldl
Bamba architecture
Apple Metal
https://en.wikipedia.org/wiki/Metal_(API)
ggml
changes relating to the ggml tensor library for machine learning
python
python script changes
testing
Everything test related
#10810
opened Dec 12, 2024 by
gabe-l-hart
Draft, 3 tasks
server: bench: minor fixes
examples
performance
Speed related topics
python
python script changes
server
add verbosity -1 to log token, so can output only tokens with -lv -1
examples
#10744
opened Dec 10, 2024 by
YannFollet
Cuda build doc
documentation
Improvements or additions to documentation
#10743
opened Dec 10, 2024 by
YannFollet
server: Add timeout to stop the server automatically when idling for too long.
examples
server
#10742
opened Dec 9, 2024 by
Sumandora
Make->CMake
devops
improvements to build systems and github actions
#10663
opened Dec 4, 2024 by
jboero
server: add request aggregation functionality
examples
server
#10660
opened Dec 4, 2024 by
kalabYibeltal
ggml-cpu: replace AArch64 NEON assembly with intrinsics in ggml_gemm_q4_0_4x4_q8_0()
ggml
changes relating to the ggml tensor library for machine learning
#10624
opened Dec 2, 2024 by
angt