Pull requests: EleutherAI/lm-evaluation-harness

Pull requests list

Remove newline that broke unimodal task list table
#3489 opened Jan 9, 2026 by ntenenz
feat: support local directory as dataset_path
#3485 opened Jan 5, 2026 by fanjingxiang
Fix utils.py for MATH500 evaluation
#3478 opened Dec 24, 2025 by sheriyuo
nit: double logging
#3468 opened Dec 16, 2025 by baberabb
Update command for listing tasks in README
#3464 opened Dec 15, 2025 by DavidUlloa6310
Multiple Bangla Benchmark datasets added
#3454 opened Dec 9, 2025 by Ismail-Hossain-1
Add Uncheatable Eval
#3442 opened Dec 2, 2025 by ziqing-huang
Refactor TaskManager
#3432 opened Nov 26, 2025 by baberabb
Fix gsm8k_platinum description
#3411 opened Nov 17, 2025 by fxmarty-amd
feat: refine Chain-of-Thought removal logic
#3386 opened Nov 6, 2025 by Co-Cl2
[feat] Add Countdown Task
#3384 opened Nov 4, 2025 by StephenXie
Math 500
#3381 opened Nov 1, 2025 by seldereyy
Add gsm_symbolic and gsm_symbolic_cot tasks
#3354 opened Oct 19, 2025 by MengAiDev
Added ULQA benchmark
#3340 opened Oct 13, 2025 by keramjan
Support torchrun vllm DP
#3304 opened Sep 19, 2025 by luccafong