This library provides a collection of classes and functions for evaluating and comparing large language models (LLMs). Its main purpose is to build chatbots and evaluate their responses against given objectives.
The library exposes the following classes:

- `LanguageModelWrapper`: a base class for wrapping different language models.
- `Prompt`: a class for managing prompt templates.
- `BinaryPreference`: a class for managing binary preferences between two different responses.
- `BinaryEvaluator`: a base class for evaluating binary preferences between two different responses.
- `GPT35Evaluator`: a class for evaluating binary preferences using the GPT-3.5 LLM.
- `OpenAIModel`: an enumeration of the available OpenAI LLM models.
- `OpenAIGPTWrapper`: a class for wrapping OpenAI's GPT models.
- `ClaudeWrapper`: a class for wrapping Anthropic's Claude LLM.
- `CohereWrapper`: a class for wrapping Cohere's LLM.
- `GrokWrapper`: a class for wrapping xAI's Grok models.
- `MistralWrapper`: a class for wrapping Mistral's models.
- `DeepSeekWrapper`: a class for wrapping DeepSeek's models.
- `Llama3Wrapper`: a class for wrapping Llama 3 models via DeepInfra.
- `ChatBot`: a class for creating chatbot instances backed by a provided LLM wrapper.
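To illustrate how these pieces fit together, here is a minimal, self-contained sketch of the wrapper-and-chatbot pattern. The class names mirror the list above, but the method names and signatures (`complete_chat`, `fill`, the `EchoWrapper` stand-in) are illustrative assumptions, not the library's actual API.

```python
# Illustrative sketch only: method names (complete_chat, fill) and the
# EchoWrapper stand-in are assumptions, not the library's documented API.

class LanguageModelWrapper:
    """Base class: each concrete wrapper implements complete_chat()."""
    def complete_chat(self, messages):
        raise NotImplementedError

class EchoWrapper(LanguageModelWrapper):
    """Toy stand-in for a real wrapper (OpenAIGPTWrapper, ClaudeWrapper, ...)."""
    def complete_chat(self, messages):
        return f"echo: {messages[-1]['content']}"

class Prompt:
    """Fills a template like 'Summarize: {text}' with keyword values."""
    def __init__(self, template):
        self.template = template
    def fill(self, **kwargs):
        return self.template.format(**kwargs)

class ChatBot:
    """Keeps a message history and delegates generation to a wrapper."""
    def __init__(self, llm):
        self.llm = llm
        self.messages = []
    def chat(self, user_text):
        self.messages.append({"role": "user", "content": user_text})
        reply = self.llm.complete_chat(self.messages)
        self.messages.append({"role": "assistant", "content": reply})
        return reply

bot = ChatBot(EchoWrapper())
prompt = Prompt("Summarize: {text}")
print(bot.chat(prompt.fill(text="LLM evaluation")))
```

The design point this sketch captures is that `ChatBot` and the evaluators depend only on the base-class interface, so any provider wrapper can be swapped in without changing the surrounding code.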
- Install all dependencies from `requirements.txt`:

  ```
  pip install -r requirements.txt
  ```

- Create a `.env` file in the root of the project and add the following API keys:

  ```
  OPENAI_API_KEY=your_openai_api_key
  COHERE_API_KEY=your_cohere_api_key
  GROK_API_KEY=your_grok_api_key
  MISTRAL_API_KEY=your_mistral_api_key
  DEEPSEEK_API_KEY=your_deepseek_api_key
  DEEPINFRA_API_KEY=your_deepinfra_api_key
  ```
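If you want to see what loading such a file involves (for example, when not using a helper package like python-dotenv), a `.env` file of this `KEY=value` shape can be parsed with the standard library alone. This is a generic sketch, not the library's own loading code:

```python
import os

def load_env(path=".env"):
    """Parse simple KEY=value lines into os.environ.

    Skips blank lines, comments, and lines without '='; existing
    environment variables are not overwritten.
    """
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

# Load only if a .env file is actually present.
if os.path.exists(".env"):
    load_env()
```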
The `main.py` script provides an example of how to use the library. It initializes all the supported models, defines an objective, and then runs a series of evaluations comparing each model to GPT-3.5.

To run the example:

```
python main.py
```
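The evaluation flow that `main.py` follows, generating a response per model and then having an evaluator pick a binary preference against a baseline, can be sketched as below. The stub models and the length-based `toy_judge` are illustrative stand-ins; in the real library, `GPT35Evaluator` and the model wrappers would play these roles.

```python
# Generic sketch of a binary-preference evaluation loop with stub models.
# In the library, GPT35Evaluator and the wrappers replace these stand-ins.

OBJECTIVE = "Respond as concisely as possible."

def baseline_model(prompt):
    # Verbose stand-in for the GPT-3.5 baseline.
    return prompt + " " + "filler " * 5

def candidate_model(prompt):
    # Terse stand-in for a competing model.
    return prompt

def toy_judge(objective, response_a, response_b):
    """Return 1 if response_a better meets the objective, else 2.

    For this 'be concise' objective, shorter simply wins.
    """
    return 1 if len(response_a) <= len(response_b) else 2

prompts = ["Name a color.", "Name a fruit."]
wins = {"baseline": 0, "candidate": 0}
for p in prompts:
    a = baseline_model(p)
    b = candidate_model(p)
    winner = toy_judge(OBJECTIVE, a, b)
    wins["baseline" if winner == 1 else "candidate"] += 1

print(wins)  # {'baseline': 0, 'candidate': 2}
```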