This project automates the evaluation of LLM-generated answers to computer science questions using LangSmith and Google's Gemini model. It reads a dataset of questions and answers, generates predictions, and grades them using custom and LLM-based evaluators.
.env # Environment variables (API keys, config)
data.csv # CSV file with questions and answers
eval.py # Script to run evaluation experiments
main.py # Script to create LangSmith dataset from CSV
prompt.txt # Prompt template for grading
requirements.txt # Python dependencies
- Clone the repository and navigate to the project folder.
- Install dependencies: `pip install -r requirements.txt`
- Configure API keys:
  - Edit `.env` with your LangSmith and Google Gemini API keys.
  - Example:

        LANGCHAIN_API_KEY=your_langsmith_api_key
        GOOGLE_API_KEY=your_google_api_key

- Ensure `data.csv` contains your questions and answers in the format:

      Question,Answer
      What is an algorithm?,A step-by-step set of operations to solve a problem or perform a task.
      ...
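As a rough illustration of the `Question,Answer` layout, the sketch below parses a small in-memory sample with Python's standard `csv` module. The helper name `load_qa_pairs` and the second sample row are assumptions made for this example, not code or data from the project. The second row shows that an answer containing a comma stays in one field when it is wrapped in double quotes.

```python
import csv
import io

# Sample content mirroring the Question,Answer layout described above.
# The second row is a made-up example whose answer contains a comma,
# so it is wrapped in double quotes to keep it in a single field.
SAMPLE_CSV = '''Question,Answer
What is an algorithm?,A step-by-step set of operations to solve a problem or perform a task.
What is a stack?,"A last-in, first-out (LIFO) data structure."
'''


def load_qa_pairs(handle):
    """Return a list of (question, answer) tuples from a CSV file object."""
    reader = csv.DictReader(handle)
    return [(row["Question"].strip(), row["Answer"].strip()) for row in reader]


pairs = load_qa_pairs(io.StringIO(SAMPLE_CSV))
for question, answer in pairs:
    print(f"{question} -> {answer}")
```

To read the real file instead of the sample, the same helper could be called on `open("data.csv", newline="", encoding="utf-8")`.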
Run `main.py` to upload the questions and answers to LangSmith:

    python main.py

This will create a dataset named `test_data` in your LangSmith project.
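The upload step might look roughly like the sketch below, which uses the LangSmith Python SDK's `Client.create_dataset` and `Client.create_examples`. The actual contents of `main.py` are not shown here, so the helper names and the `question`/`answer` field names are assumptions. `upload_dataset` needs a valid `LANGCHAIN_API_KEY` and network access, so only the offline helper is exercised.

```python
def build_examples(pairs):
    """Split (question, answer) pairs into LangSmith-style inputs and outputs.

    The "question" and "answer" keys are assumed names for illustration.
    """
    inputs = [{"question": q} for q, _ in pairs]
    outputs = [{"answer": a} for _, a in pairs]
    return inputs, outputs


def upload_dataset(pairs, dataset_name="test_data"):
    """Create the dataset and upload the examples (requires API key and network)."""
    # Imported lazily so build_examples works without the SDK installed.
    from langsmith import Client

    client = Client()  # picks up the API key from the environment
    dataset = client.create_dataset(dataset_name=dataset_name)
    inputs, outputs = build_examples(pairs)
    client.create_examples(inputs=inputs, outputs=outputs, dataset_id=dataset.id)
    return dataset


sample_pairs = [
    (
        "What is an algorithm?",
        "A step-by-step set of operations to solve a problem or perform a task.",
    )
]
inputs, outputs = build_examples(sample_pairs)
print(inputs)
print(outputs)
```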
Run `eval.py` to evaluate LLM-generated answers:

    python eval.py

- The script uses Gemini to generate answers for each question.
- It grades each answer using:
  - A custom length-based metric.
  - An LLM-based QA evaluator using the prompt in `prompt.txt`.
- Results are stored and viewable in your LangSmith dashboard under the experiment prefix `google-gemini`.
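The exact formula behind the length-based metric is not given above, so the following is only one plausible sketch: it compares word counts of the prediction and the reference and returns a ratio between 0 and 1. The function name `length_score` and the metric key `length_ratio` are assumptions. The returned dictionary uses the `key`/`score` shape that LangSmith custom evaluators commonly return, but wiring it into an evaluation run is left out.

```python
def length_score(prediction: str, reference: str) -> dict:
    """Score how close the prediction's word count is to the reference's.

    Returns 1.0 when both have the same number of words and moves toward
    0.0 as the lengths diverge. An empty prediction or reference scores 0.0.
    """
    pred_len = len(prediction.split())
    ref_len = len(reference.split())
    if pred_len == 0 or ref_len == 0:
        score = 0.0
    else:
        score = min(pred_len, ref_len) / max(pred_len, ref_len)
    return {"key": "length_ratio", "score": score}


result = length_score("a b c d", "a b")
print(result)  # {'key': 'length_ratio', 'score': 0.5}
```

A ratio was chosen here so that answers that are much longer or much shorter than the reference are penalized equally; a simple threshold on character count would be another reasonable design.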
- Prompt: Edit `prompt.txt` or the `template` variable in `eval.py` to change grading instructions.
- Evaluation Metrics: Add or modify evaluators in `eval.py`.
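To make the prompt customization concrete, here is a minimal sketch of filling a grading template with plain `str.format`. The template wording and the placeholder names (`question`, `reference`, `prediction`) are hypothetical; the real text lives in `prompt.txt`, and `eval.py` may use a different templating mechanism.

```python
# Hypothetical template text; the project's actual wording is in prompt.txt.
TEMPLATE_TEXT = (
    "You are grading an answer to a computer science question.\n"
    "Question: {question}\n"
    "Reference answer: {reference}\n"
    "Submitted answer: {prediction}\n"
    "Reply with CORRECT or INCORRECT."
)


def render_grading_prompt(template_text, question, reference, prediction):
    """Fill the named placeholders in the template with one graded example."""
    return template_text.format(
        question=question, reference=reference, prediction=prediction
    )


rendered = render_grading_prompt(
    TEMPLATE_TEXT,
    question="What is an algorithm?",
    reference="A step-by-step set of operations to solve a problem or perform a task.",
    prediction="A finite sequence of instructions for solving a problem.",
)
print(rendered)
```

Changing the grading instructions then amounts to editing the template text while keeping the placeholder names that the calling code expects.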
- Ensure your API keys are correct and active.
- Check that your LangSmith project name matches the value in `.env`.
- For CSV parsing issues, ensure there are no extra commas in questions or answers.
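The first and last checks above can be automated with a small preflight script. The sketch below is an illustration only: the helper names are made up, and it assumes the two variable names from the `.env` example and a two-column CSV. It reports which keys are unset and which CSV lines do not split into exactly two fields, which is the usual symptom of an extra comma.

```python
import csv
import io
import os

# Variable names taken from the .env example; adjust if yours differ.
REQUIRED_KEYS = ("LANGCHAIN_API_KEY", "GOOGLE_API_KEY")


def missing_keys(env=None):
    """Return the required keys that are unset or empty in the environment."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]


def malformed_rows(handle, expected_fields=2):
    """Return (line_number, field_count) for rows with the wrong column count."""
    reader = csv.reader(handle)
    problems = []
    for row in reader:
        if row and len(row) != expected_fields:
            problems.append((reader.line_num, len(row)))
    return problems


# Made-up sample: line 2 has an unquoted comma in the answer, line 3 is quoted.
sample = io.StringIO(
    "Question,Answer\n"
    "What is a stack?,A last-in, first-out data structure.\n"
    'What is a queue?,"A first-in, first-out data structure."\n'
)
bad_rows = malformed_rows(sample)
print("Missing keys:", missing_keys())
print("Malformed rows:", bad_rows)
```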
This project is for educational and research purposes.
Contact: For issues or questions, open an issue