Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

server: allow filtering llama server response fields #10940

Merged
merged 7 commits into from
Dec 24, 2024

Conversation

nvrxq
Copy link
Contributor

@nvrxq nvrxq commented Dec 21, 2024

This adds a new feature. Allow Filtering LLama Server Response Fields from #10819.
Usage:
Curl:

curl --request POST \
    --url http://localhost:8080/completion \
    --header "Content-Type: application/json" \
    --data '{"prompt": "Building a website can be done in 10 simple steps:","n_predict": 128, "response_fields": ["content", "generation_settings/n_predict"]}'

Response:

{"content":" 1. Choose a domain name...","generation_settings/n_predict":128}
Retrievals from multiple nested lists are also supported, example:
Given the original data: {"key1" : {"key2": 3}}
With response_fields = ["key1/key2"]
Then, the final response will be: {"key1/key2": 3}.

If response does not contain a path, we will return empty json for path.

Copy link
Collaborator

@ngxson ngxson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also include a guide in the /completion endpoint section of server's documentation.

Adding test case is optional, but is highly recommended. See an example in examples/server/tests/unit/test_completion.py

examples/server/utils.hpp Outdated Show resolved Hide resolved
examples/server/utils.hpp Outdated Show resolved Hide resolved
@github-actions github-actions bot added the python python script changes label Dec 22, 2024
examples/server/README.md Outdated Show resolved Hide resolved
@ngxson ngxson merged commit 09fe2e7 into ggerganov:master Dec 24, 2024
50 checks passed
@isaac-mcfadyen
Copy link
Contributor

isaac-mcfadyen commented Dec 24, 2024

When filtering nested response fields, is this new option un-nesting them?

In the example provided:

curl --request POST \
    --url http://localhost:8080/completion \
    --header "Content-Type: application/json" \
    --data '{"prompt": "Building a website can be done in 10 simple steps:","n_predict": 128, "response_fields": ["content", "generation_settings/n_predict"]}'
{"content":" 1. Choose a domain name...","generation_settings/n_predict":128}

It seems like it's changing the previous format of {"generation_settings": {"n_predict": 128}} (nested) to {"generation_settings/n_predict": 128} (un-nested, with the key having the slash instead).

Is this intended or am I misunderstanding somewhere?

If it's intended it should probably be mentioned in the docs for the server, since it's a bit confusing behavior for a "filter" to actually change the response format IMO.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
examples python python script changes server
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants