Insights: EricLBuehler/mistral.rs
Overview
1 Release published by 1 person

- v0.3.4, published Nov 28, 2024
18 Pull requests merged by 4 people

- Bitsandbytes quantization: loading and kernels (#967, merged Dec 4, 2024)
- Fix example gguf_locally to match chat template requirements (#966, merged Dec 3, 2024)
- Fix Metal FP8 quantization (#962, merged Dec 2, 2024)
- Use llguidance library for constraints (including JSON schemas) (#899, merged Dec 2, 2024)
- Improve test speeds on Windows (#961, merged Dec 1, 2024)
- Ensure support for CUDA cc 5.3 (#960, merged Dec 1, 2024)
- Fix completion API behavior of `best_of` (#959, merged Dec 1, 2024)
- Fix `append_sliding_window` (#958, merged Dec 1, 2024)
- Set minimum rustc version to 1.82 (#957, merged Dec 1, 2024)
- Perplexity calculations with imatrix (#952, merged Dec 1, 2024)
- Support imatrix quantization for vision models (#950, merged Nov 30, 2024)
- Implement imatrix for ISQ (#949, merged Nov 30, 2024)
- Better diffusion interactive mode (#948, merged Nov 29, 2024)
- fix(docs): fix broken link (#945, merged Nov 29, 2024)
- Use CUDA_COMPUTE_CAP if nvidia-smi not found (#944, merged Nov 28, 2024)
- Prepare for v0.3.4 (#942, merged Nov 28, 2024)
- Expose a public tokenization API (#940, merged Nov 28, 2024)
- Implement the Idefics 3 models (Idefics 3, SmolVLM-Instruct) (#939, merged Nov 27, 2024)
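The perplexity work above (#952) refers to the standard metric: the exponential of the mean negative log-likelihood over a token sequence. The following is a minimal sketch of that formula only, not of mistral.rs's actual imatrix-aware implementation:

```rust
/// Perplexity over a sequence, given each token's natural-log probability:
/// PPL = exp(-(1/N) * sum(log p_i)).
fn perplexity(logprobs: &[f64]) -> f64 {
    let n = logprobs.len() as f64;
    (-logprobs.iter().sum::<f64>() / n).exp()
}

fn main() {
    // A model that is uniformly uncertain over 4 tokens has logprob ln(1/4)
    // per token, so its perplexity is exactly 4.
    let lp = (0.25f64).ln();
    println!("{:.1}", perplexity(&[lp; 3])); // prints "4.0"
}
```

A perplexity of 4 reads as "the model is as uncertain as a uniform choice among 4 tokens", which is why lower is better when evaluating quantization quality.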
1 Pull request opened by 1 person

- Bitsandbytes quantization ISQ support (#968, opened Dec 4, 2024)
37 Issues closed by 5 people

- Failed to parse Cargo.toml: [workspace] missing field `package` (#883, closed Dec 2, 2024)
- 0.3.2 #891 build failure on Windows 11 (#896, closed Dec 2, 2024)
- AICI -> llguidance? (#754, closed Dec 2, 2024)
- mistral-server with n>1 only returns one result (#955, closed Dec 2, 2024)
- Phi-3 outputs garbage on master (#956, closed Dec 2, 2024)
- `is_streaming: true` gives unreachable code panic (#953, closed Dec 1, 2024)
- Build failure on Ubuntu 24.04 in src/cast.cu (#951, closed Dec 1, 2024)
- Feature request: Add importance matrix / available RAM calculations to ISQ (#377, closed Nov 30, 2024)
- Container image fails to start with 'Unable to dynamically load the "cuda" shared library' (#478, closed Nov 28, 2024)
- RequestMessage with tool_calls in Assistant message (#793, closed Nov 28, 2024)
- How does M1 performance compare with llama.cpp or Ollama? (#673, closed Nov 28, 2024)
- CUDA error when running Gemma 2 (#715, closed Nov 28, 2024)
- mistral does not support NVIDIA V100 (compute_cap <= 800) (#305, closed Nov 28, 2024)
- Batched & chunked prefill (#216, closed Nov 28, 2024)
- Add a C API and provide shared and static libraries (#258, closed Nov 28, 2024)
- Enabling prefix cache for Llama 3 GGUF (#347, closed Nov 28, 2024)
- Are there any plans to support AWQ and GPTQ in the future? (#418, closed Nov 28, 2024)
- Support loading tokenizer from `sentencepiece` model (#407, closed Nov 28, 2024)
- Compilation failure with `--features flash-attn` (#485, closed Nov 28, 2024)
- Extending AnyMoE to support heterogeneous expert types (#544, closed Nov 28, 2024)
- Installation from PyPI doesn't work (#548, closed Nov 28, 2024)
- /usr/bin/ld: cannot find -lstdc : No such file or directory (#553, closed Nov 28, 2024)
- Python Metal package runs on CPU (#555, closed Nov 28, 2024)
- Streamed inference not as smooth (fast?) as with e.g. Ollama - Llama 3.1 (#630, closed Nov 28, 2024)
- Llama 3.2 on macOS: "Metal contiguous affine I64 not implemented" (#807, closed Nov 28, 2024)
- Unexpected CUDA out of memory for minimal example (#819, closed Nov 28, 2024)
- How to use a UQFF file locally without sending requests to Hugging Face? (#821, closed Nov 28, 2024)
- 0.3.1 #862 new build failure, stops at mistralrs-quant (#866, closed Nov 28, 2024)
- Docker build failure: mistralrs-quant fails with "No such file or directory" error (#893, closed Nov 28, 2024)
- Vision interactive mode for GGUF models (#882, closed Nov 28, 2024)
- CUDA out of memory with a presumed "full" offload to CPU (#751, closed Nov 28, 2024)
- cuda error not found (#654, closed Nov 28, 2024)
- Pre-built binary for macOS Silicon does not seem to use Metal / GPU (#629, closed Nov 28, 2024)
- WSL2 Docker error loading Llama 3.1 GGUF (#679, closed Nov 28, 2024)
- Slow CUDA inference speed (#763, closed Nov 28, 2024)
- CUDA_ERROR_ILLEGAL_ADDRESS when running Llama 3 and Llama 3.1 (#783, closed Nov 28, 2024)
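One of the closed items above, "Batched & chunked prefill" (#216), concerns a common inference-engine technique: instead of running one forward pass over the whole prompt, the prompt is processed in fixed-size chunks so memory stays bounded. A hedged sketch of the chunking loop only, where `process_chunk` is a hypothetical stand-in for the model's forward pass:

```rust
/// Split a prompt into fixed-size chunks and "prefill" each in turn.
/// Returns the number of tokens processed.
fn chunked_prefill(prompt_tokens: &[u32], chunk_size: usize) -> usize {
    let mut processed = 0;
    for chunk in prompt_tokens.chunks(chunk_size) {
        process_chunk(chunk); // a real engine would extend the KV cache here
        processed += chunk.len();
    }
    processed
}

/// Hypothetical placeholder for a model forward pass over one chunk.
fn process_chunk(_chunk: &[u32]) {}

fn main() {
    let tokens: Vec<u32> = (0..10).collect();
    // 10 tokens with chunk_size 4 -> chunks of 4, 4, and 2.
    println!("{}", chunked_prefill(&tokens, 4)); // prints "10"
}
```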
7 Issues opened by 5 people

- Fast-forward tokens with llguidance (#965, opened Dec 2, 2024)
- Parallel computation of mask in constrained sampling (#964, opened Dec 2, 2024)
- Rejection sampling for `top_p` etc. (#963, opened Dec 2, 2024)
- Possible problem with candle 0.8.0: doesn't build on a GTX 1650 (CC 7.5) nor a GTX 1070 (CC 6.1) (#954, opened Dec 1, 2024)
- Create and load standalone quantized UQFF models (#947, opened Nov 29, 2024)
- DiffusionArchitecture not found in Python package (#943, opened Nov 28, 2024)
- Flash Attention not building (#941, opened Nov 27, 2024)
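The sampling issues above (#963, #964) build on top-p (nucleus) truncation: keeping the smallest set of highest-probability tokens whose mass reaches p, and dropping the tail. The sketch below shows only that standard truncation step, not the rejection-sampling variant the issue proposes:

```rust
/// Standard top-p (nucleus) truncation: sort tokens by descending
/// probability and keep them until the cumulative mass reaches `p`.
fn top_p_filter(mut probs: Vec<(u32, f64)>, p: f64) -> Vec<(u32, f64)> {
    probs.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    let mut kept = Vec::new();
    let mut cumulative = 0.0;
    for (id, prob) in probs {
        kept.push((id, prob));
        cumulative += prob;
        if cumulative >= p {
            break; // nucleus reached: the remaining tail is discarded
        }
    }
    kept
}

fn main() {
    let probs = vec![(0, 0.5), (1, 0.3), (2, 0.15), (3, 0.05)];
    // 0.5 + 0.3 = 0.8 < 0.9, then + 0.15 = 0.95 >= 0.9, so 3 tokens survive.
    println!("{}", top_p_filter(probs, 0.9).len()); // prints "3"
}
```

After truncation, the surviving probabilities are renormalized and one token is sampled from them; the rejection-sampling proposal aims to avoid the full sort.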
9 Unresolved conversations

Sometimes conversations happen on old items that aren't yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.

- Phi-3.5-vision-Instruct multiple images loading (#795, commented on Nov 27, 2024 • 0 new comments)
- Error: Unable to run LoRA - adapter files are empty (#929, commented on Nov 28, 2024 • 0 new comments)
- Error: DriverError(CUDA_ERROR_INVALID_PTX, "a PTX JIT compilation failed") when loading utanh_bf16 (#850, commented on Nov 28, 2024 • 0 new comments)
- Integrating Mistral.rs with Swiftide (#843, commented on Nov 28, 2024 • 0 new comments)
- Multi-image and multi-prompt issue using Mistral.rs (#853, commented on Nov 30, 2024 • 0 new comments)
- Model Wishlist (#156, commented on Dec 2, 2024 • 0 new comments)
- Confusion around loading a GGUF locally (#922, commented on Dec 2, 2024 • 0 new comments)
- Couldn't run any vision model (#935, commented on Dec 2, 2024 • 0 new comments)
- Add Parler TTS (#791, commented on Dec 2, 2024 • 0 new comments)