RoboSet — A Diverse Multi-Task Robot Learning Dataset

RoboSet is a large-scale multi-task robot manipulation dataset developed at the University of Texas at Austin RAIL Lab using the Franka Panda robotic arm. Released in 2023 under the CC BY 4.0 license, which permits commercial use with attribution, it contains over 100,000 demonstrations spanning diverse tabletop manipulation tasks, including pick-and-place, stacking, sorting, tool use, and food preparation. The dataset was designed to support multi-task and multi-skill learning: demonstrations are collected across standardised task suites so that generalisation can be evaluated systematically. Data is distributed in HDF5 and LeRobot formats via Hugging Face. RoboSet fills a gap between single-task expert datasets and unconstrained in-the-wild collections, providing structured diversity that lets researchers study task transfer and multi-task policy learning under controlled experimental conditions.

Dataset specifications
Year	2023
Episodes	100,000+
Embodiments	Franka Panda
Modalities	rgb, proprioception
View type	third-person, wrist-cam
Task categories	manipulation, pick-and-place, cooking, inspection
Data format	hdf5, lerobot
License	CC BY 4.0
Access	open — commercial use permitted
Maintainer	UT Austin RAIL Lab
Origin country	US

What is it?

RoboSet is a large-scale multi-task robot manipulation dataset developed at the University of Texas at Austin RAIL Lab using the Franka Panda robotic arm. Released in 2023 under CC BY 4.0, it contains over 100,000 demonstrations organised into structured task suites that enable systematic evaluation of multi-task and multi-skill policy learning. Tasks include pick-and-place, stacking, sorting, tool use, and food preparation across standardised tabletop environments.

Who is it for?

RoboSet is designed for researchers studying multi-task robot learning, task transfer, and skill compositionality. Unlike unconstrained in-the-wild datasets, RoboSet's structured task suites enable controlled experimental evaluation of which tasks transfer to each other and how multi-task training affects individual task performance.
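The question of how multi-task training affects individual task performance can be framed as a per-task difference in success rate. The following is a minimal sketch of that comparison, not a procedure taken from RoboSet itself; the function name `multitask_effect` and the dictionary layout (task name mapped to a success rate between 0 and 1) are assumptions made for illustration.

```python
def multitask_effect(single_task, multi_task):
    """Per-task change in success rate from single-task to multi-task training.

    Both arguments map a task name to a success rate in [0, 1] (hypothetical
    layout). Only tasks present in both mappings are compared. A positive
    value means the multi-task policy did better on that task.
    """
    shared = sorted(set(single_task) & set(multi_task))
    return {task: multi_task[task] - single_task[task] for task in shared}
```

Tasks evaluated under only one of the two training regimes are dropped rather than reported as zero, so a missing evaluation cannot be mistaken for a neutral result.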

Key specifications

Episodes: 100,000+ demonstrations
Embodiment: Franka Panda
Tasks: Pick-and-place, stacking, sorting, tool use, food preparation
Format: HDF5, LeRobot
License: CC BY 4.0 — commercial use permitted
Access: Open — Hugging Face
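Since the internal layout of the HDF5 files is not described here, a reasonable first step after downloading is to list what each file contains. The sketch below walks any nested mapping and reports the shape and dtype of every leaf; `h5py.File` and `h5py.Group` objects behave like such mappings, so the same helper applies to them. The function names are illustrative, and `describe_hdf5` assumes the third-party `h5py` package is installed.

```python
def describe_tree(node, prefix=""):
    """Return one 'path: shape=... dtype=...' line per leaf under `node`.

    Works on anything that behaves like a nested mapping, which includes
    h5py.File and h5py.Group objects as well as plain dicts.
    """
    lines = []
    for name, child in node.items():
        path = f"{prefix}/{name}"
        if hasattr(child, "items"):
            # Group-like object: recurse into it.
            lines.extend(describe_tree(child, path))
        else:
            # Dataset-like object: record its shape and dtype if present.
            shape = getattr(child, "shape", None)
            dtype = getattr(child, "dtype", None)
            lines.append(f"{path}: shape={shape} dtype={dtype}")
    return lines


def describe_hdf5(path):
    """Open an HDF5 file read-only and describe its contents (needs h5py)."""
    import h5py  # imported lazily so describe_tree works without h5py

    with h5py.File(path, "r") as f:
        return describe_tree(f)
```

Calling `describe_hdf5` on a downloaded file and printing the returned lines shows the group and dataset names actually used, which is safer than assuming key names in advance.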

How it compares

With more than 100,000 episodes, RoboSet is larger by episode count than the two datasets it is most often compared with. DROID (76,000 episodes) is collected in the wild and covers more environments. BridgeData V2 (60,096 episodes) provides more environmental diversity but less structured task organisation. RoboSet's value lies in its systematic task design, which enables controlled multi-task learning research.

Limitations and access notes

RoboSet covers standardised tabletop tasks, so it offers less environmental diversity than DROID or BridgeData V2. It includes a single embodiment only, a single-arm Franka Panda. CC BY 4.0 permits commercial use with attribution.

Frequently asked questions

What makes RoboSet different from other large manipulation datasets?

RoboSet is structured around task suites — organised collections of related tasks designed to enable systematic study of multi-task learning and task transfer. Most large datasets collect diverse demonstrations without structured task organisation, making controlled comparisons difficult.
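One way to exploit the suite structure is to group episodes by suite and hold an entire suite out of training, so that transfer to unseen tasks can be measured. This is a sketch under assumed metadata: the per-episode fields `"id"` and `"suite"` are hypothetical and may not match the names used in the released files.

```python
from collections import defaultdict


def group_by_suite(episodes):
    """Map each suite name to the list of episode ids that belong to it.

    `episodes` is an iterable of dicts with hypothetical keys "id" and "suite".
    """
    suites = defaultdict(list)
    for episode in episodes:
        suites[episode["suite"]].append(episode["id"])
    return dict(suites)


def leave_one_suite_out(suites, held_out):
    """Split episode ids into (train, test), with `held_out` as the test suite."""
    train = [
        episode_id
        for name, ids in suites.items()
        if name != held_out
        for episode_id in ids
    ]
    test = list(suites[held_out])
    return train, test
```

Repeating the split once per suite gives one transfer measurement per held-out suite, which is the kind of controlled comparison that is harder to set up when demonstrations carry no task grouping.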

Can RoboSet be used commercially?

Yes. RoboSet is licensed under CC BY 4.0, which permits commercial use with attribution to the UT Austin RAIL Lab.

How do I access RoboSet?

RoboSet is available on Hugging Face at huggingface.co/datasets/jxu124/roboset and via the project GitHub. No registration is required.
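As a minimal sketch of programmatic access, the repository named above can be fetched with the `huggingface_hub` library's `snapshot_download` function. The helper names `dataset_url` and `download_roboset` and the default directory `roboset_data` are illustrative choices, and the repository's file layout is not assumed.

```python
from pathlib import Path

# Repository id taken from the Hugging Face URL quoted above.
REPO_ID = "jxu124/roboset"


def dataset_url():
    """Return the browser URL of the dataset repository."""
    return f"https://huggingface.co/datasets/{REPO_ID}"


def download_roboset(local_dir="roboset_data", allow_patterns=None):
    """Download the dataset repository (or a subset) and return its local path.

    Requires `pip install huggingface_hub`. The import is done lazily so the
    rest of this module can be used without the package installed.
    `allow_patterns` is an optional list of glob patterns restricting which
    files are fetched.
    """
    from huggingface_hub import snapshot_download

    return Path(
        snapshot_download(
            repo_id=REPO_ID,
            repo_type="dataset",
            local_dir=local_dir,
            allow_patterns=allow_patterns,
        )
    )
```

Because the full dataset is large, passing `allow_patterns` to fetch only some files first is usually worthwhile; the right patterns depend on the file names in the repository, which should be checked on the dataset page.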

What food preparation tasks does RoboSet include?

RoboSet includes food preparation operations such as moving produce between containers, assembling simple food items, and organising kitchen objects. These tasks overlap with work done in professions considered at risk of automation, including fast food workers and kitchen staff.

Is RoboSet in LeRobot format?

Yes. RoboSet is available in LeRobot format on Hugging Face, making it directly compatible with the Hugging Face LeRobot library for robot learning without additional data conversion.