A simulation environment for evaluating different ranking methods for ForecastBench.
The simulation uses the July 2024 forecasting round of ForecastBench, in which 111 forecasters (humans and LLMs) answered 471 questions. Every forecaster provided a forecast for every question, yielding a complete forecaster-by-question dataset that is well suited to constructing a simulation environment.
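Because every forecaster answered every question, the data can be pivoted into a dense forecaster-by-question matrix with no missing cells. A minimal sketch of that reshaping, assuming a long-format table with hypothetical column names (not the actual schema of the leaderboard file):

```python
import pandas as pd

# Illustrative only: "forecaster", "question_id", and "forecast" are assumed
# column names, not the file's actual schema.
def to_forecast_matrix(long_df: pd.DataFrame) -> pd.DataFrame:
    """Pivot long-format forecasts into a 111 x 471 forecaster-by-question matrix."""
    matrix = long_df.pivot(index="forecaster", columns="question_id", values="forecast")
    assert not matrix.isna().any().any(), "expected a forecast for every (forecaster, question) pair"
    return matrix
```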
This project currently compares three ranking methods:
- Brier Score: Standard accuracy metric for probabilistic predictions
- Brier Skill Score (BSS): Performance relative to a reference model
- Peer Score: Performance relative to the average of all models
The simulation tests how robust the resulting rankings are across repeated simulation runs; a sketch of the three scoring rules follows.
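The exact formulas used in ranking_sim.py are not reproduced here; the following is a minimal sketch of the three metrics under their common definitions, assuming binary outcomes coded 0/1, a forecast matrix of shape (forecasters, questions), and a peer score oriented so that higher is better. Function names are illustrative.

```python
import numpy as np

def brier_score(forecasts: np.ndarray, outcomes: np.ndarray) -> float:
    """Mean squared error between probabilistic forecasts and 0/1 outcomes."""
    return float(np.mean((forecasts - outcomes) ** 2))

def brier_skill_score(forecasts: np.ndarray, outcomes: np.ndarray,
                      reference_forecasts: np.ndarray) -> float:
    """1 - BS / BS_ref: positive when the forecaster beats the reference model."""
    bs = brier_score(forecasts, outcomes)
    bs_ref = brier_score(reference_forecasts, outcomes)
    return 1.0 - bs / bs_ref

def peer_score(forecast_matrix: np.ndarray, outcomes: np.ndarray) -> np.ndarray:
    """Per-forecaster peer score: the average Brier score of all forecasters on a
    question minus the forecaster's own Brier score, averaged over questions.
    forecast_matrix has shape (n_forecasters, n_questions)."""
    per_question_brier = (forecast_matrix - outcomes) ** 2        # (F, Q)
    mean_brier_per_question = per_question_brier.mean(axis=0)     # (Q,)
    return (mean_brier_per_question - per_question_brier).mean(axis=1)
```

Rankings follow by sorting forecasters on each metric; the simulation then checks how stable those orderings are.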
```
ranking-simulation/
├── src/                      # Main source code
│   └── ranking_sim.py        # Core simulation functions
├── tests/                    # Unit tests
│   └── test_ranking_sim.py   # Test suite
├── notebooks/                # Jupyter notebooks for development
│   └── dev.ipynb             # Development playground
├── data/                     # Data directory (contents not tracked)
│   ├── raw/                  # Input data
│   ├── processed/            # Processed data
│   └── results/              # Simulation outputs
└── run_simulation.py         # Main script to run simulations
```
Clone the GitHub repo, create the expected data directories, and install the dependencies:

```
mkdir -p data/{raw,processed,results}
pip install -r requirements.txt
```

Copy the test data into the raw data directory:

```
cp tests/data/2024_07_21_llm_and_human_leaderboard.pkl data/raw/llm_and_human_leaderboard.pkl
```
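A quick way to sanity-check the copied file before running the simulation, assuming the pickle deserializes to a pandas object (everything except the file path is illustrative):

```python
import pandas as pd

# Assumption: the leaderboard pickle holds a pandas object; adjust if the
# actual file stores a different structure.
df = pd.read_pickle("data/raw/llm_and_human_leaderboard.pkl")
print(type(df))
print(df.head() if hasattr(df, "head") else df)
```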
Run the simulation:

```
python run_simulation.py
```

Run tests:

```
make test
```

Run linter:

```
make lint
```

Requirements:

- GNU Make
- Python 3.7+
- See requirements.txt for required Python packages