PushT — Diffusion Policy Reference Dataset

PushT is a simulated 2D manipulation benchmark dataset developed at Columbia University and distributed by Hugging Face as part of the LeRobot library. Released under Apache 2.0, it contains 206 demonstration episodes in which a circular end-effector pushes a T-shaped block to a target position. The task is contact-rich and demands precise spatial reasoning. Because its evaluation conditions are controlled and reproducible, PushT is widely used as a first validation benchmark for imitation learning methods, especially diffusion-based policies, before they are tested on real-robot datasets.

Dataset specifications
Year: 2023
Episodes: 206
Embodiments: simulated 2D pusher
Modalities: rgb
View type: third-person
Task categories: manipulation, pick-and-place
Data format: lerobot, parquet
License: Apache 2.0
Access: open (commercial use permitted)
Maintainer: Hugging Face, Columbia University
Origin country: US

What is it?

PushT is a simulated 2D manipulation benchmark dataset developed at Columbia University and distributed by Hugging Face as part of the LeRobot library. Released under Apache 2.0, it contains 206 demonstration episodes of a circular end-effector pushing a T-shaped block to a target position in a 2D environment. Despite its small scale, PushT became the canonical benchmark for evaluating diffusion-based robot learning policies following the Diffusion Policy paper (Chi et al., 2023).

Who is it for?

PushT is widely used by robot learning researchers as a first algorithm validation benchmark. Before testing a new policy learning method on expensive real-robot datasets, researchers often validate it on PushT because of its fast evaluation cycle, reproducible environment, and well-established baseline results. It is particularly useful for comparing imitation learning algorithms, including behaviour cloning, DDPM-based diffusion policies, and flow matching approaches.
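The reproducible evaluation cycle described above can be sketched as a fixed-seed loop. This is a minimal illustration, not an official PushT evaluation script: `placeholder_rollout` is a hypothetical stand-in for running a policy in the simulator, the coverage score (overlap between block and target) is an assumed metric, and the 0.95 success threshold is an assumption that should be checked against the environment actually used.

```python
import random
import statistics

# Assumed success threshold on final block/target overlap; verify it
# against the environment you evaluate in.
SUCCESS_COVERAGE = 0.95


def evaluate(rollout_fn, seeds):
    """Run one rollout per seed and summarise the final coverage scores."""
    coverages = [rollout_fn(seed) for seed in seeds]
    success_rate = sum(c >= SUCCESS_COVERAGE for c in coverages) / len(coverages)
    return {
        "mean_coverage": statistics.fmean(coverages),
        "success_rate": success_rate,
    }


def placeholder_rollout(seed):
    """Hypothetical stand-in for a simulator rollout.

    A real rollout would reset the environment with this seed, run the
    policy, and return the final coverage. Here a seeded random number
    keeps the sketch runnable and deterministic.
    """
    rng = random.Random(seed)
    return rng.uniform(0.5, 1.0)


# Reusing the same seed list for every method keeps comparisons like for like.
seeds = list(range(50))
report = evaluate(placeholder_rollout, seeds)
print(report)
```

Because each rollout is driven only by its seed, rerunning the loop with the same seed list gives the same report, which is what makes results comparable across methods.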

How it compares

PushT is not comparable to real-robot datasets in scale or complexity: it is a benchmark, not a pretraining source. Its value is standardisation. Results reported on PushT can be compared directly with those of the many other papers that use the same evaluation. It is to robot learning what MNIST is to computer vision: a simple, universal entry-point benchmark.

Limitations and access notes

PushT is simulated and 2D, so results do not directly translate to real-robot performance. It is intended as an algorithm validation benchmark, not a pretraining dataset. The Apache 2.0 license permits commercial use, subject to its attribution and notice requirements.

Frequently asked questions

Why is PushT important if it only has 206 episodes?

PushT is the canonical benchmark for robot learning algorithms, not a pretraining dataset. Its value is standardisation — hundreds of papers report results on PushT, making it the universal comparison point for imitation learning methods. Small scale makes evaluation fast and reproducible.

What is Diffusion Policy?

Diffusion Policy (Chi et al., 2023) is an influential robot learning algorithm from Columbia University that applies denoising diffusion probabilistic models (DDPMs) to robot action prediction. PushT was one of the benchmarks in its original evaluation, where the method demonstrated strong results on contact-rich manipulation tasks. Diffusion Policy sparked significant follow-on research, and PushT became a standard comparison benchmark for that work.
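To make the DDPM idea concrete, the sketch below shows the forward noising step applied to a short sequence of 2D actions, together with the noise-prediction loss a denoising network would be trained on. It is a generic DDPM illustration under stated assumptions, not the Diffusion Policy implementation: the linear beta schedule, its endpoints, the action values, and all function names are chosen here for illustration, and no network is included.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for an illustrative linear schedule."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(actions, alpha_bar, rng):
    """Forward process: x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * eps."""
    noise = [[rng.gauss(0.0, 1.0) for _ in step] for step in actions]
    noisy = [
        [math.sqrt(alpha_bar) * a + math.sqrt(1.0 - alpha_bar) * e
         for a, e in zip(step, eps_step)]
        for step, eps_step in zip(actions, noise)
    ]
    return noisy, noise


def noise_prediction_loss(predicted_noise, true_noise):
    """Mean squared error between predicted and true noise."""
    sq_errors = [
        (p - t) ** 2
        for p_step, t_step in zip(predicted_noise, true_noise)
        for p, t in zip(p_step, t_step)
    ]
    return sum(sq_errors) / len(sq_errors)


# Made-up sequence of three 2D actions (not taken from PushT).
rng = random.Random(0)
actions = [[0.1, -0.2], [0.3, 0.4], [0.0, 0.5]]
alpha_bars = make_alpha_bars(100)
noisy_actions, true_noise = add_noise(actions, alpha_bars[50], rng)

# A trained network would predict the noise from noisy_actions, the step
# index, and the observation; an all-zero guess stands in for it here.
zero_guess = [[0.0, 0.0] for _ in actions]
print(noise_prediction_loss(zero_guess, true_noise))
```

At inference time the process runs in reverse: starting from pure noise, the predicted noise is removed step by step until an action sequence remains.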

Is PushT a real robot dataset?

No. PushT is a simulated 2D environment — a circular end-effector pushes a T-shaped block on a flat surface. There is no physical robot involved. It is used to quickly validate algorithm correctness before testing on real-robot hardware.

How is PushT related to LeRobot?

PushT is included as an example dataset in the Hugging Face LeRobot library. It is commonly used to check that a LeRobot installation works and appears in LeRobot tutorials and quickstart material.
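LeRobot-format datasets are stored as per-frame rows, so episode-level quantities such as the episode count are recovered by grouping frames. The sketch below assumes each row carries an `episode_index` field. The rows are made-up toy data, not PushT frames, and `episode_lengths` is a hypothetical helper rather than a LeRobot function.

```python
from collections import Counter

# Toy per-frame rows standing in for a LeRobot-style frame table.
# Real rows would also carry observations and actions.
frames = [
    {"episode_index": 0, "frame_index": 0},
    {"episode_index": 0, "frame_index": 1},
    {"episode_index": 0, "frame_index": 2},
    {"episode_index": 1, "frame_index": 0},
    {"episode_index": 1, "frame_index": 1},
]


def episode_lengths(rows):
    """Map each episode index to the number of frames it contains."""
    return Counter(row["episode_index"] for row in rows)


lengths = episode_lengths(frames)
print(f"{len(lengths)} episodes, {sum(lengths.values())} frames")
```

Grouping by episode also matters when splitting data: holding out whole episodes, rather than individual frames, avoids near-duplicate frames from one demonstration landing on both sides of a train/validation split.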

Can PushT be used commercially?

Yes. PushT is licensed under Apache 2.0, which permits commercial use, modification, and redistribution, provided the license's attribution and notice requirements are met.