feat: IQ quants support #2631

mr-september · 2024-04-05T13:25:50Z

Problem
GGUF models quantized with IQ quants fail to load.

Success Criteria
IQ-quantized GGUF models load and run as usual.

Additional context
IQ quants: ggerganov/llama.cpp#4773

Example model with both traditional Q and new IQ quants: https://huggingface.co/bartowski/Starling_Monarch_Westlake_Garten-7B-v0.1-GGUF

Van-QA · 2024-04-06T02:53:56Z

Hi @mr-september,

In my testing, the imported GGUF model (Starling_Monarch_Westlake_Garten-7B-v0.1-Q2_K) works. Can you give us more details about the issue you are facing?

If the issue is the broken UI caused by the long model name, we will resolve it soon 🙏
In case you need it, here is the guide on how to import a GGUF model: https://jan.ai/docs/models#import-or-symlink-local-models.

Many thanks

mr-september · 2024-04-06T05:38:21Z

Thanks for the quick reply. I think the models you downloaded use traditional quants. Could you please check with an IQ quant model? For example:

Also, are there any logs or other output I could share that would help with troubleshooting?

Van-QA · 2024-04-06T06:36:44Z

Thanks @mr-september,
We were able to reproduce the issue using Starling_Monarch_Westlake_Garten-7B-v0.1-IQ4_XS.gguf. The dev team will investigate soon.

Van-QA · 2024-04-10T01:34:10Z

Hi @mr-september,

Using Jan v0.4.10-368 ✅, Starling_Monarch_Westlake_Garten-7B-v0.1-IQ4_XS.gguf is able to generate responses. Would you like to try it as well?

Thank you

mr-september · 2024-04-10T08:33:03Z

Beautiful, it's working flawlessly! Very impressive turnaround!

mr-september · 2024-04-14T02:49:51Z

Hi, I think the latest nightly (-376) broke support again. It was a prompted update at startup. Rolling back to -368 still works.

Van-QA · 2024-04-16T04:33:25Z

Hi @mr-september, sorry for the inconvenience. The Nitro version that supports IQ quants is currently facing many issues, so we had to temporarily revert it. We are working on a fix.
cc: @CameronNg @vansangpfiev

Van-QA · 2024-04-17T08:58:40Z

Hi @mr-september, the latest nightly build, Jan v0.4.11-386, resolves the issue with IQ quants. Thanks.

mr-september added the "type: feature request" label Apr 5, 2024

Van-QA added the "type: bug" and "type: feature request" labels and removed the "type: feature request" and "type: bug" labels Apr 6, 2024

Van-QA self-assigned this Apr 6, 2024

Van-QA assigned hahuyhoang411, louis-jan, CameronNg and vansangpfiev and unassigned Van-QA, louis-jan and hahuyhoang411 Apr 6, 2024

This was referenced Apr 9, 2024:

Bump nitro version to 0.3.18 #2652 (Merged)

milestone: Release 0.4.11 #2627 (Closed)

Van-QA added this to the v0.4.11 milestone Apr 9, 2024

mr-september closed this as completed Apr 10, 2024

mr-september reopened this Apr 14, 2024

Van-QA modified the milestones: v0.4.11, v0.4.12 Apr 15, 2024

Van-QA mentioned this issue Apr 15, 2024:

milestone: Release 0.4.12 #2726 (Closed, 6 tasks)

mr-september closed this as completed Apr 18, 2024
