The Great March 100

By RHOS AI Team

We don't aim to create just another benchmark; instead, GM-100 serves as a foundational task list for evaluating embodied AI systems in real-world settings.

[Figure: Pipeline Diagram]

Comprehensive Task Design Pipeline

"GM-100 comprises 100+ carefully curated tasks derived from a rigorous analysis of human-object interaction primitives and object affordances. This design philosophy ensures coverage of diverse, long-tail, and rare behaviors that critically test the generalization limits of robotic agents."

100+ Tasks: Diverse Skills

High Difficulty: Challenging Cases

3+ Baselines: Feasible to Execute

3+ Views: Multi-camera

For each task, we provide over 130 expert teleoperation demonstrations.

We also showcase benchmark results from at least 3 baseline models, accompanied by evaluation videos to help assess task complexity and scoring standards.

This website is currently in a preview phase. Some data may be missing, but we are working to complete it in the coming days. Thank you for your understanding.

Recent Updates

2026.01.26: Language Prompt (default in data; use as reference)

Added a language prompt for each task in the dataset to make the data easier to understand and use.

Task Showcase

Explore the 100 diverse manipulation tasks included in the benchmark. Click on any card to view detailed performance metrics and download data.

#001 Hitting Ball into Goal (Train / Test)
#002 Slicing the Object (Train / Test)
#003 Stamping with Seal (Train / Test)
#004 Box Building (Train / Test)
#005 Closing Desktop Drawer (Train / Test)
#006 Threading Hawthorn Skewers (Train / Test)
#007 Trash Disposal (Train / Test)
#008 Transferring Test Tubes (Train / Test)
#009 Turning on Desk Lamp (Train / Test)
#010 Sorting Cubes by Size (Train / Test)
#011 Wiping Whiteboard (Train / Test)
#012 Hammering Button (Train / Test)
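
For readers who want to refer to the tasks above programmatically, the following is a minimal sketch of a task catalog built from the twelve cards shown here. The `Task` class, the `find_task` helper, and the lowercase split names `"train"` and `"test"` are hypothetical and only mirror the card labels; they are not part of any official GM-100 tooling or data format.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Task:
    """Hypothetical record for one task card (not an official schema)."""
    task_id: int
    name: str
    # Assumption: the Train and Test labels on each card denote data splits.
    splits: Tuple[str, ...] = ("train", "test")

    @property
    def label(self) -> str:
        # Zero-padded card label, e.g. 6 -> "#006"
        return f"#{self.task_id:03d}"


# Task names copied from the cards listed on this page.
TASK_NAMES = [
    "Hitting Ball into Goal",
    "Slicing the Object",
    "Stamping with Seal",
    "Box Building",
    "Closing Desktop Drawer",
    "Threading Hawthorn Skewers",
    "Trash Disposal",
    "Transferring Test Tubes",
    "Turning on Desk Lamp",
    "Sorting Cubes by Size",
    "Wiping Whiteboard",
    "Hammering Button",
]

TASKS = [Task(i, name) for i, name in enumerate(TASK_NAMES, start=1)]


def find_task(label: str) -> Optional[Task]:
    """Return the task whose card label matches, or None if absent."""
    return next((t for t in TASKS if t.label == label), None)


if __name__ == "__main__":
    for task in TASKS:
        print(task.label, task.name, "/".join(task.splits))
```

Looking up a card by its label, for example `find_task("#006")`, returns the record for Threading Hawthorn Skewers, while an unknown label returns `None`.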

Interactive Leaderboard

Compare model performance across benchmark tasks. Hover for details, click to explore.
