feat: User can use Stop Words presets and add custom #3536

dan-homebrew · 2024-09-03T04:07:00Z

Problem

Many users want to enter multiple stop words.
The "stop" section for configuring the model does not make it clear whether it should be a comma-separated list, a newline-separated list, or something else.
It would be nifty (and I think aligned with how other apps do it) if each token you insert became a standalone entry in the box, something like: <endofstring>, <new_sentence>, <this_is_the_end>.
You could then click to delete any individual "stop" tag you created. Hopefully that is comprehensible.

Solution

From a technical perspective, there are 2 cases:

Models from the cortexso source: the stop tokens are included in the <model_id>.yaml file, so the Jan app just reads them from there and uses them as the default.
Models from other sources: when cortex.cpp reads a model from a GGUF file, the file also contains the stop words list. cortex.cpp reads that list and writes it to <model_id>.yaml, and the Jan app can then read <model_id>.yaml to set the default.
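
A minimal sketch of the "read the defaults from <model_id>.yaml" step, assuming the YAML has already been parsed into a plain object and that the list lives under a `stop` key. The key name, the `ModelConfig` type, and the `defaultStopWords` helper are assumptions for illustration and are not confirmed by this issue.

```typescript
// Hypothetical shape of a parsed <model_id>.yaml; only `stop` matters here.
interface ModelConfig {
  stop?: unknown;
}

// Return the model's default stop words as a de-duplicated list of strings.
// Accepts a single string or a list; anything else yields an empty list.
function defaultStopWords(config: ModelConfig): string[] {
  const raw = config.stop;
  const items: unknown[] =
    typeof raw === "string" ? [raw] : Array.isArray(raw) ? raw : [];
  const seen = new Set<string>();
  const result: string[] = [];
  for (const item of items) {
    // Skip non-strings, empty strings, and repeats.
    if (typeof item !== "string" || item.length === 0 || seen.has(item)) continue;
    seen.add(item);
    result.push(item);
  }
  return result;
}
```

Values are not trimmed, since a stop sequence may legitimately contain whitespace such as a newline.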

Format

Most models define their stop tokens in the format <{content}>. The content differs for each model architecture. So I think we can predefine a map from model architecture to a list of stop words, like this, for users to choose from:

llama3: ["<|eot_id|>", "<|end_of_text|>"]
mistral: ["</s>"]
...

Maybe 5 or 6 popular model architectures are enough, plus another option that lets users input whatever they want (this feature may only be for power users or devs, because normal users might only use the default configuration).
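
The architecture-to-stop-words map proposed above could be sketched as follows. This is only an illustration: `STOP_WORD_PRESETS` and `presetFor` are hypothetical names, only the two entries mentioned in this thread are included, and the mistral entry is written as `</s>` on the assumption that the spaced form in the thread is a rendering artifact.

```typescript
// Illustrative presets keyed by model architecture. Only llama3 and mistral
// come from the discussion; the final list of architectures is undecided.
const STOP_WORD_PRESETS = new Map<string, readonly string[]>([
  ["llama3", ["<|eot_id|>", "<|end_of_text|>"]],
  ["mistral", ["</s>"]],
]);

// Look up the preset for an architecture (case-insensitive). Unknown
// architectures return an empty list, leaving the user free to type
// custom stop words instead.
function presetFor(arch: string): string[] {
  const preset = STOP_WORD_PRESETS.get(arch.trim().toLowerCase());
  return preset ? [...preset] : [];
}
```

A copy of the preset is returned so that edits the user makes to their own list cannot change the shared presets.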

Design

Figma: https://www.figma.com/design/DYfpMhf8qiSReKvYooBgDV/Jan-App-(3rd-version)?node-id=8281-97234&t=qb7yU8r2PAayVdNW-4

Use a tag-like interface where each stop word is in its own removable "pill"
Users can add new tags and remove existing ones

Default Stop Words:

These come from the model's YAML file
Should be visually distinct, and removing them is not recommended: they are usually carefully chosen by the model creators for optimal performance, so removing them could cause issues with the model's behavior or output quality.
Users should understand these are recommended for the model

Predefined Options:

Offer a dropdown preset or quick-select for common stop words based on model architecture

Custom/User-Added Stop Words:

Added by the user & should be removable
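
One possible data model for the three kinds of stop words described in this design, as a rough sketch. All names (`StopWordTag`, `addTag`, `removeTag`, `toStopList`) are hypothetical, and requiring an explicit confirmation before removing a default is an assumption about how "recommended not to be removed" might be enforced, not a decision made in this issue.

```typescript
// Where a stop word came from; drives how the pill is styled.
type StopWordSource = "default" | "preset" | "custom";

interface StopWordTag {
  value: string;
  source: StopWordSource;
}

// Add a tag unless the value is empty or already present.
function addTag(
  tags: readonly StopWordTag[],
  value: string,
  source: StopWordSource,
): StopWordTag[] {
  if (value.length === 0 || tags.some((t) => t.value === value)) return [...tags];
  return [...tags, { value, source }];
}

// Remove a tag by value. A default tag is kept unless `confirmed` is true.
function removeTag(
  tags: readonly StopWordTag[],
  value: string,
  confirmed = false,
): StopWordTag[] {
  return tags.filter(
    (t) => t.value !== value || (t.source === "default" && !confirmed),
  );
}

// Flatten the tags into the plain list of stop strings.
function toStopList(tags: readonly StopWordTag[]): string[] {
  return tags.map((t) => t.value);
}
```

Keeping the source on each tag lets the UI render defaults differently from user-added entries while still producing a single flat list for the model settings.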

Task

Figure out what the stop token format should be @nguyenhoangthuan99
Design @imtuyethan
UI implementation @urmauur


dan-homebrew · 2024-09-03T04:14:08Z

@nguyenhoangthuan99 however, some clarifications are needed from the Inference team. What is the stop token format?

I think the existing "Stop word" (as per the UX above) is incorrect
We need to be technically accurate
Should we prefill the <> for the user?

<|special_token|>
<|end_of_text|>
<|eom_id|>

nguyenhoangthuan99 · 2024-09-04T05:12:23Z

A model's stop words can be a list, so I think we can handle it like this.

From a technical perspective, I think there are 2 cases we can follow:

Models from the cortexso source: the stop tokens are included in the <model_id>.yaml file, so the Jan app just reads them from there and uses them as the default.
Models from other sources: when cortex.cpp reads a model from a GGUF file, the file also contains the stop words list. cortex.cpp reads that list and writes it to <model_id>.yaml, and the Jan app can then read <model_id>.yaml to set the default.

Format
Most models define their stop tokens in the format <{content}>. The content differs for each model architecture. So I think we can predefine a map from model architecture to a list of stop words, like this, for users to choose from:

llama3: ["<|eot_id|>", "<|end_of_text|>"]
mistral: ["</s>"]
...

Maybe 5 or 6 popular model architectures are enough, plus another option that lets users input whatever they want (this feature may only be for power users or devs, because normal users might only use the default configuration).

dan-homebrew mentioned this issue Sep 3, 2024

chore: Minor copy edits for model settings section #3025

Closed

1 task

dan-homebrew changed the title ~~Stop Word Settings~~ ux: Stop Word Settings Sep 3, 2024

dan-homebrew added the feat: model settings label Sep 3, 2024

dan-homebrew assigned nguyenhoangthuan99 and urmauur Sep 3, 2024

dan-homebrew added the needs verification Needs to be verified, unsure if true label Sep 3, 2024

dan-homebrew changed the title ~~ux: Stop Word Settings~~ feat: Stop Word Settings Sep 3, 2024

imtuyethan added the needs designs Needs designs label Sep 3, 2024

imtuyethan self-assigned this Sep 3, 2024

imtuyethan removed needs designs Needs designs needs verification Needs to be verified, unsure if true labels Sep 5, 2024

dan-homebrew changed the title ~~feat: Stop Word Settings~~ feat: User can use Stop Words presets and add custom Sep 11, 2024

0xSage added category: threads & chat Threads & chat UI UX issues P2: nice to have Nice to have feature type: feature request A new feature category: model settings Inference params, presets, templates and removed category: engines labels Oct 14, 2024

imtuyethan mentioned this issue Oct 18, 2024

chore: Structure Icebox in Github Projects #3840

Open
