Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: GPT-4 doesn't support vision & file search in Jan #3520

Open
1 task done
Tracked by #3505
imtuyethan opened this issue Sep 2, 2024 · 3 comments
Open
1 task done
Tracked by #3505

bug: GPT-4 doesn't support vision & file search in Jan #3520

imtuyethan opened this issue Sep 2, 2024 · 3 comments
Assignees
Labels
category: multimodal Vision, audio, video, etc category: providers Local & remote inference providers category: tools RAG, web search, files, function calling type: bug Something isn't working

Comments

@imtuyethan
Copy link
Contributor

imtuyethan commented Sep 2, 2024

  • I have searched the existing issues

Current behavior

https://discord.com/channels/1107178041848909847/1279700963439022090
Users are unable to use file-related features in Jan, including GPT-4 Vision capabilities (image analysis) and document upload for RAG (Retrieval-Augmented Generation), despite having added their OpenAI API key and selecting the GPT-4 model. These functionalities should be available but appear to be non-functional.

Minimum reproduction step

  1. Add OpenAI API key to Jan
  2. Select GPT-4 model
  3. Attempt to send an image for analysis
  4. Attempt to upload a document (e.g., PDF) for RAG

Expected behavior

Jan should be able to:

  • Process images using GPT-4's vision capabilities
  • Allow document uploads for RAG
  • Provide responses or allow for questions about the uploaded content

Screenshots / Logs

Screenshot 2024-09-02 at 5 44 24 PM

Jan version

v0.5.3

In which operating systems have you tested?

Operating System: Pop!_OS 22.04
KDE Plasma Version: 5.24.7
KDE Frameworks Version: 5.92.0
Qt Version: 5.15.3
Kernel Version: 6.9.3-76060903-generic (64-bit)
Graphics Platform: X11
Processors: 16 × 13th Gen Intel® Core™ i5-13500H
Memory: 15.3 GiB of RAM
Graphics Processor: Mesa Intel® Graphics

Btw, I have a dedicated Nvidia 4060 as well.

@imtuyethan imtuyethan added the type: bug Something isn't working label Sep 2, 2024
@imtuyethan imtuyethan changed the title bug: GPT-4 doesn't support have vision & file search in Jan bug: GPT-4 doesn't support vision & file search in Jan Sep 18, 2024
@imtuyethan imtuyethan added the category: providers Local & remote inference providers label Sep 18, 2024
@0xSage
Copy link
Contributor

0xSage commented Oct 13, 2024

related #3505

@0xSage 0xSage added category: tools RAG, web search, files, function calling category: multimodal Vision, audio, video, etc labels Oct 14, 2024
@imtuyethan
Copy link
Contributor Author

Close this ticket as dup?

@0xSage
Copy link
Contributor

0xSage commented Oct 17, 2024

discussions: Remote API Extension #3505

Leave it open bc:

  1. We haven't really impl RAG yet
  2. We don't know RAG will work with remote endpoints

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: multimodal Vision, audio, video, etc category: providers Local & remote inference providers category: tools RAG, web search, files, function calling type: bug Something isn't working
Projects
Status: Investigating
Development

No branches or pull requests

3 participants