SheetAgent

SheetAgent is a general-purpose agent for spreadsheet reasoning and manipulation powered by large language models. This repository contains the official implementation of the paper "SheetAgent: Towards A Generalist Agent for Spreadsheet Reasoning and Manipulation via Large Language Models", which has been accepted as an oral presentation at WWW 2025.

Project Overview

SheetAgent is an autonomous agent that performs spreadsheet reasoning and manipulation through three collaborating modules: Planner, Informer, and Retriever. It targets complex real-world tasks, including those that require multi-step reasoning or involve ambiguous requirements.

Quick Start

Follow these steps to run SheetAgent:

Configure your API key and base URL in openai.yaml.
Set the workbook_path and instruction parameters, then run the following command:

python main.py --workbook_path "example_sheets/BoomerangSales.xlsx" \
--instruction "Count the number of each Product and put the results in a new sheet" \
--output_dir "./output" \
--few_shot_planner --verbose

The processed workbook (named workbook_new) is saved to the directory given by output_dir.

For more parameter settings, please refer to main.py.

We have provided several example spreadsheets in the example_sheets directory. You can try different task instructions as needed.
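To try several instructions against the same workbook, the Quick Start command can be wrapped in a small script. The following is a minimal sketch, not part of the repository: the helper names build_command and run_instructions, the per-task output folders, and the second instruction are hypothetical. It uses only the flags shown in the command above and assumes main.py is in the current working directory.

```python
import shlex
import subprocess
import sys
from pathlib import Path


def build_command(workbook_path, instruction, output_dir="./output",
                  few_shot_planner=True, verbose=True):
    """Assemble the Quick Start invocation of main.py as an argument list."""
    cmd = [
        sys.executable, "main.py",
        "--workbook_path", str(workbook_path),
        "--instruction", instruction,
        "--output_dir", str(output_dir),
    ]
    if few_shot_planner:
        cmd.append("--few_shot_planner")
    if verbose:
        cmd.append("--verbose")
    return cmd


def run_instructions(workbook_path, instructions, output_root="./output",
                     dry_run=False):
    """Run main.py once per instruction, each with its own output folder.

    The task_<index> folder layout is an assumption made for this sketch,
    so that later runs do not overwrite earlier results.
    """
    commands = []
    for index, instruction in enumerate(instructions):
        out_dir = Path(output_root) / f"task_{index}"
        cmd = build_command(workbook_path, instruction, out_dir)
        commands.append(cmd)
        if dry_run:
            # Only show what would be executed.
            print(shlex.join(cmd))
        else:
            out_dir.mkdir(parents=True, exist_ok=True)
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    run_instructions(
        "example_sheets/BoomerangSales.xlsx",
        [
            "Count the number of each Product and put the results in a new sheet",
            # Hypothetical second instruction, for illustration only.
            "Sort the rows by Product in ascending order",
        ],
        dry_run=True,  # switch to False to actually call main.py
    )
```

With dry_run=True the script only prints the commands, which is a cheap way to check quoting before spending API calls.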

To use the Retriever module, you need to complete the configuration in milvus.yaml and install Milvus first (refer to https://github.com/milvus-io/milvus for detailed installation instructions).

SheetRM Dataset

We have released approximately 60% of the SheetRM dataset, including 25 spreadsheets and 180 tasks. The data is stored in the ./sheetrm directory, with tasks.xlsx containing metadata for the 180 tasks, and the spreadsheets stored in the spreadsheets directory. You can try these challenging tasks following the "Quick Start" guide.
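Before running tasks, it can help to confirm that the released files are where the description above says they are. The sketch below is an illustration under stated assumptions, not part of the repository: the helper names are hypothetical, and it assumes the workbooks use an Excel extension (.xlsx, .xlsm, or .xls), as the example sheet in Quick Start does. It does not read tasks.xlsx, whose column layout is not described here.

```python
from pathlib import Path

# Assumed workbook extensions; adjust if the release uses other formats.
EXCEL_SUFFIXES = {".xlsx", ".xlsm", ".xls"}


def list_sheetrm_workbooks(root="./sheetrm"):
    """Return the workbook paths found in <root>/spreadsheets, sorted by name."""
    spreadsheet_dir = Path(root) / "spreadsheets"
    if not spreadsheet_dir.is_dir():
        raise FileNotFoundError(f"Missing directory: {spreadsheet_dir}")
    return sorted(
        path for path in spreadsheet_dir.iterdir()
        if path.is_file() and path.suffix.lower() in EXCEL_SUFFIXES
    )


def summarize_release(root="./sheetrm"):
    """Report whether tasks.xlsx exists and how many workbooks were found."""
    workbooks = list_sheetrm_workbooks(root)
    return {
        "tasks_file_present": (Path(root) / "tasks.xlsx").is_file(),
        "workbook_count": len(workbooks),
    }


if __name__ == "__main__":
    print(summarize_release())
```

Any path returned by list_sheetrm_workbooks can be passed as workbook_path in the Quick Start command.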

This is a partial release made as part of our commitment to open research. We are still organizing and reviewing the remaining data to avoid potential privacy or other issues, and will make it publicly available once it is ready.

Other Datasets

Our experiments also involve the following datasets:

SheetCopilot Benchmark: https://github.com/BraveGroup/SheetCopilot
WikiTableQuestions: https://github.com/ppasupat/WikiTableQuestions
TabFact: https://github.com/wenhuchen/Table-Fact-Checking
FeTaQA: https://github.com/Yale-LILY/FeTaQA

You can download these datasets via the provided links and test SheetAgent on your preferred dataset. We provide implementation details of SheetAgent on these datasets (see Implementation Details section).

Citation

If you use our code or dataset, please cite our paper:

@article{chen2024sheetagent,
  title={SheetAgent: Towards A Generalist Agent for Spreadsheet Reasoning and Manipulation via Large Language Models},
  author={Chen, Yibin and Yuan, Yifu and Zhang, Zeyu and Zheng, Yan and Liu, Jinyi and Ni, Fei and Hao, Jianye and Mao, Hangyu and Zhang, Fuzheng},
  journal={arXiv preprint arXiv:2403.03636},
  year={2024}
}
