Introduction

This repo contains a demo Streamlit application that performs parallel question answering across multiple PDF documents. This functionality is enabled by LlamaIndex and Large Language Models (LLMs). The code has been tested with text-davinci-003 from OpenAI using the configuration in app_config.yaml; any OpenAI model can be selected in app_config.yaml.
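
To make this concrete, below is a minimal sketch of the indexing idea: one vector index is built per PDF with LlamaIndex and then queried. This is not the repository's actual code; the file names, the question string, and the specific llama-index calls (SimpleDirectoryReader, VectorStoreIndex) are assumptions based on common llama-index usage and may differ from the version pinned in pyproject.toml.

from llama_index import SimpleDirectoryReader, VectorStoreIndex

# Build one vector index per CV document (hypothetical file names).
pdf_paths = ["example_cvs/cv_alice.pdf", "example_cvs/cv_bob.pdf"]
indices = {}
for path in pdf_paths:
    documents = SimpleDirectoryReader(input_files=[path]).load_data()
    indices[path] = VectorStoreIndex.from_documents(documents)

# Each index can be queried independently (requires OPENAI_API_KEY to be set).
response = indices[pdf_paths[0]].as_query_engine().query("Which skills are mentioned in this CV?")
print(response)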

General

This repo provides some example PDF documents for indexing which have been generated via ChatGPT. !!! Please note that it is not legally compliant to send personally identifiable information to LLM APIs such as OpenAI. Test the app only with fictional CV documents, anonymize the CV documents, or execute queries against a locally deployed LLM instead of using OpenAI. PTC takes no legal responsibility for the data you send to OpenAI via this application !!!

Getting Started

  1. Download the ESCO dataset version 1.1.0 (Link to ESCO Download)
  • Version: ESCO dataset - v1.1.0
  • Content: classification
  • Language: en
  • File type: csv

1.1. Unzip the .csv file you receive via email and set the path as an environment variable.

  2. Set up your environment variables, e.g. in an .env file; a short loading sketch follows the examples below.

OPENAI_API_KEY = "here comes your openai api key" (example)

ESCO_NER_SEARCHTERMS= "your_path_to_esco_searchterms_skill_ner_csv/ESCO dataset - v1.1.0 - classification - en - csv/searchterms_skill_ner.csv" (example)
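
For orientation, here is a small sketch of how these variables could be consumed at runtime using python-dotenv and pandas. The variable names match the examples above, but the loading code itself is illustrative and not necessarily identical to the app's own.

import os

import pandas as pd
from dotenv import load_dotenv

load_dotenv()  # read the .env file from the current working directory

openai_api_key = os.getenv("OPENAI_API_KEY")
esco_csv_path = os.getenv("ESCO_NER_SEARCHTERMS")

# Load the ESCO skill NER search terms for later use in the app.
searchterms = pd.read_csv(esco_csv_path)
print(searchterms.head())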

  3. Install Poetry (Link to Poetry CLI installation tutorial)

  4. Create the Poetry environment and install the package

Link to Poetry Environment Management

  • In your terminal, confirm that Poetry is available:

poetry --version

  • Start a Poetry shell:

poetry shell

  • Install the package and dependencies via the pyproject.toml:

poetry install

If you cannot use Poetry for your dependency management, you can alternatively install the requirements via:

pip install -r requirements.txt

  5. Launch the Streamlit app from the Poetry shell:
  • streamlit run <your_absolute_path_2_the_app>\multi_index_demo\app.py
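
If you prefer not to enter a Poetry shell, the same command can also be executed through Poetry directly; poetry run runs a command inside the project's virtual environment:

poetry run streamlit run <your_absolute_path_2_the_app>\multi_index_demo\app.py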

Overview of the application architecture

(Diagrams: RAG overview, indexing stage, multi-index queries.)
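
To illustrate the multi-index query stage referenced above, the following sketch fans a single question out over several per-document indexes concurrently. The thread-pool approach and the names (ask_all, indices) are illustrative assumptions, not the repository's exact implementation.

from concurrent.futures import ThreadPoolExecutor

def ask_all(indices, question):
    """Query every per-document index with the same question in parallel."""
    engines = {name: index.as_query_engine() for name, index in indices.items()}
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        futures = {name: pool.submit(engine.query, question) for name, engine in engines.items()}
        return {name: str(future.result()) for name, future in futures.items()}

# Example usage with the indices dict from the indexing sketch above:
# answers = ask_all(indices, "Which programming languages does each candidate know?")
# for doc, answer in answers.items():
#     print(doc, "->", answer)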
