Following a RAG tutorial to get up to speed (this one).
- Use DSPy to organise LLM interaction code
- Move ChromaDB interactions into a DSPy module
- Use DSPy's optimisation feature to improve performance
- Use GPT-4 to assess similarity between the real and predicted answers
- Made a simple set of test examples in data/testing_data.csv
- Books are found in data/books/
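The GPT-4 similarity check can be sketched roughly as below. The function name and prompt are hypothetical, and the `judge` callable stands in for a GPT-4 chat call so the shape of the metric is clear:

```python
def llm_similarity_metric(example, prediction, judge):
    # Hypothetical metric: `judge` is any callable prompt -> str;
    # in this repo it would wrap a GPT-4 chat completion.
    prompt = (
        "Do these two answers convey the same information? Answer yes or no.\n"
        f"Real answer: {example['answer']}\n"
        f"Predicted answer: {prediction['answer']}"
    )
    return judge(prompt).strip().lower().startswith("yes")
```

A metric of roughly this shape (example, prediction -> bool) can drive both evaluation and DSPy's optimisers.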
| Run | Performance |
|---|---|
| gpt-3 | 61% |
| gpt-3 optimised | 74% |
| gpt-3 CoT | 64% |
| gpt-3 CoT optimised | 73% |
| gpt-4o CoT | 73% |
| gpt-4o CoT optimised | 82% |
CoT doesn't seem to matter much here, but prompt optimisation clearly improves performance, even though the training data I created is fairly rough.
```shell
pip install -r requirements.txt
export OPENAI_API_KEY=<your-key>
```
```shell
python -m create_database
python -m main -q "when did the US declare independence?"
```
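The create_database step presumably splits each book into chunks before embedding them into ChromaDB. A minimal sketch of such a chunker — the function name and the size/overlap values are illustrative assumptions, not the repo's actual code:

```python
def chunk_text(text, size=500, overlap=50):
    # Split a document into overlapping character windows,
    # as is typical before embedding chunks into a vector store.
    chunks = []
    step = size - overlap
    for start in range(0, max(len(text) - overlap, 1), step):
        chunks.append(text[start:start + size])
    return chunks
```

Overlapping windows reduce the chance that an answer-bearing sentence is cut in half at a chunk boundary.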
```shell
# currently set to run with gpt-3
python -m optimise
```
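DSPy's optimisers (e.g. BootstrapFewShot) essentially search for few-shot demonstrations that maximise a metric over the training set. A toy, stdlib-only sketch of that idea — every name here is illustrative, not the DSPy API:

```python
import random

def optimise_demos(program, trainset, metric, max_demos=2, trials=8, seed=0):
    # Toy prompt optimisation: try random subsets of training examples
    # as few-shot demos and keep the subset that scores best.
    # `program(question, demos) -> answer` stands in for the LLM pipeline.
    rng = random.Random(seed)
    best_demos, best_score = [], -1.0
    for _ in range(trials):
        k = rng.randint(0, min(max_demos, len(trainset)))
        demos = rng.sample(trainset, k)
        score = sum(
            metric(ex, program(ex["question"], demos)) for ex in trainset
        ) / len(trainset)
        if score > best_score:
            best_demos, best_score = demos, score
    return best_demos, best_score
```

The real optimiser is considerably smarter about which demonstrations it proposes, but the structure — candidate prompts scored by a metric, best candidate kept — is the same, which is why even rough training data can lift the scores in the table above.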