Logo SheetAgent: A Generalist Agent for Spreadsheet Reasoning and Manipulation via Large Language Models

Abstract

Spreadsheet manipulation is widely-existing in most daily works and significantly improves the working efficiency. Large language model (LLM) has been recently attempted for automatic spreadsheet manipulation, but not yet been investigated in complicated and realistic tasks where reasoning challenges exist (e.g., long horizon manipulation with multi-step reasoning and ambiguous requirements). To bridge the gap with the real-world requirements, we introduce SheetRM, a benchmark featuring long-horizon and multi-category tasks with reasoning dependent manipulation caused by real-life challenges. To mitigate the above challenges, we further propose SheetAgent, an novel autonomous agent that utilizes the power of LLMs. SheetAgent consists of three collaborative modules: Planner, Informer and Retriever, achieving both advanced reasoning and accurate manipulation over spreadsheets without human interaction through iterative task reasoning and reflection. Extensive experiments demonstrate that SheetAgent delivers 20-30% pass rate improvements on multiple benchmarks over baselines, achieving enhanced precision in spreadsheet manipulation and demonstrating superior table reasoning abilities.

Overview

overview

SheetAgent can handle diverse spreadsheet reasoning and manipulation tasks automatically. Given a large-scale spreadsheet with multiple sheets, SheetAgent showcases its proficiency in visualization (f), achieves accurate manipulation on long horizon and multi-step tasks (a, b) with consistent reasoning capabilities (c, d), even faced with the challenges like unclear requirements (e).

Demo

This video demonstrates the use of SheetAgent to automatically complete the given instruction Fill the "Bookname" column. Plot a column chart showing the prices of different books in a new sheet named "Chart" on a target spreadsheet. SheetAgent decomposes the task into the following 4 subtasks:

Subtask 1: Load the "Order Details" and "Numbering Reference" sheets from the workbook to prepare for filling the "Bookname" column in "Order Details".

# Load the sheets
order_details_sheet = workbook["Order Details"]
numbering_reference_sheet = workbook["Numbering Reference"]

Subtask 2: Create a dictionary using "Numbering Reference" sheet to map the "Bookid" to the corresponding "Bookname".

# Create a dictionary to map Bookid to Bookname from the "Numbering Reference" sheet
bookname_mapping = {}
for row in numbering_reference_sheet.iter_rows(min_row=2, max_row=18, min_col=1, max_col=2, values_only=True):
    book_id, book_name = row
    bookname_mapping[book_id] = book_name

Subtask 3: Fill the "Bookname" column in the "Order Details" sheet using the `bookname_mapping` dictionary created.

# Fill the 'Bookname' column in 'Order Details' using the bookname_mapping
for row in range(2, 636):  # Start from row 2 to skip the header
    book_id = order_details_sheet.cell(row, 4).value
    book_name = bookname_mapping.get(book_id, None)  # Use None for default if book ID is not found
    order_details_sheet.cell(row, 5).value = book_name

Subtask 4: Create and format a column chart showing the prices of different books using data from the "Numbering Reference" sheet in a new sheet named "Chart".

from openpyxl.chart import BarChart, Reference

# Create a new sheet named 'Chart' if it doesn't exist
if "Chart" not in workbook.sheetnames:
    workbook.create_sheet("Chart")
chart_sheet = workbook["Chart"]

# Create a column chart
chart = BarChart()
chart.type = "col"
chart.style = 10
chart.title = "Book Prices"
chart.y_axis.title = "Unit Price"
chart.x_axis.title = "Bookname"

# Data for the column chart
data = Reference(numbering_reference_sheet, min_col=3, min_row=1, max_row=18)  # Unit Prices
categories = Reference(numbering_reference_sheet, min_col=2, min_row=2, max_row=18)  # Booknames

# Add data and categories to the chart
chart.add_data(data, titles_from_data=True)
chart.set_categories(categories)

# Add the chart to the 'Chart' sheet
chart_sheet.add_chart(chart, "A1")