OPEN ROLE / DATA ALGORITHMS

Data Algorithm Engineer, Embodied Intelligence (Human Video)

Build scalable human first-person video data systems for embodied intelligence foundation models.

What You Will Do

Human Ego Video Data System Construction

Build a human first-person (ego) video data system for training embodied intelligence foundation models. Construct high-quality, scalable, continuously improvable data assets covering core scenarios such as human operation behavior, hand interaction, body motion, object manipulation, and scene understanding.

Large-Scale Ego Video Data Management

Own the cleaning, filtering, deduplication, segmentation, retrieval, storage, version management, and distribution of large-scale human ego video data. Support efficient management and training use of multimodal video data at terabyte to petabyte scale.

Ego Video Annotation System Design

Design automated/semi-automated annotation specifications and production workflows for human first-person videos, including hand trajectories, body poses, action phases, object interactions, operation intent, task steps, key frames, failure behaviors, and long-horizon operation processes.

Hand Tracking and Body Tracking Algorithm Optimization

Develop and optimize algorithms for hand tracking, body pose estimation, action trajectory extraction, and hand-object interaction modeling in human ego videos, improving the automated extraction quality of hand, body, and action trajectory data.

Automated Annotation and Data Production Pipeline

Build large-scale automated/semi-automated annotation pipelines for human ego videos. Combine vision models, multimodal models, prompt engineering, rule engines, and human quality inspection workflows to improve annotation efficiency and accuracy.

Data Quality Evaluation and Closed-Loop Iteration

Establish a quality evaluation system for ego video data. Continuously refine data filtering, annotation, and sampling strategies, using model training results and failure cases to drive data quality improvements.

Inference Deployment and Performance Tuning for Data Processing

Own inference deployment and performance optimization for video data processing, tracking algorithms, automatic annotation models, and data quality inspection models, keeping large-scale data production pipelines running efficiently.

Collaboration with Embodied Intelligence Model Training

Work closely with algorithm teams in embodied intelligence, vision-language-action (VLA) foundation models, multimodal foundation models, and robot learning on imitation learning, action prediction, world models, video-action modeling, and policy learning.

What We Expect

Academic Background

Bachelor’s degree or above in computer vision, artificial intelligence, robotics, data science, computer science, automation, electronic information, mathematics, or a related field, with a solid foundation in vision algorithms, data engineering, and machine learning.

Ego Video Data Experience

Familiar with the characteristics of human first-person video data and with the value of ego videos in embodied intelligence, robot learning, imitation learning, action prediction, and vision-language-action modeling. Experience processing ego videos, human operation videos, robot operation videos, or multimodal video data is preferred.

Large-Scale Data Management Ability

Experience managing large-scale video or multimodal data. Familiar with the full process of data collection, cleaning, deduplication, segmentation, indexing, version management, quality evaluation, data distribution, and training set construction. Able to support stable production and iteration of data at terabyte to petabyte scale.

Video Annotation and Data Production Experience

Experience designing or implementing large-scale video data annotation systems. Familiar with annotation tasks such as keypoints, trajectories, action phases, object interaction, task steps, temporal boundaries, and semantic labels. Able to formulate annotation specifications, quality inspection standards, and delivery acceptance workflows.

Hand / Body Tracking Algorithm Ability

Familiar with hand tracking, human pose estimation, keypoint detection, object detection, segmentation, optical flow, trajectory modeling, and temporal modeling. Project experience in hand tracking, body tracking, hand-object interaction, pose estimation, or motion tracking is preferred.

Programming and Algorithm Implementation Ability

Proficient in Python and familiar with PyTorch or other mainstream deep learning frameworks. Skilled in using OpenCV, NumPy, Pandas, and other data/video processing tools. Strong coding ability, able to independently develop algorithm experiments, data processing scripts, automated annotation tools, and evaluation tools.

Inference Deployment and Performance Optimization Ability

Experience in model inference deployment and data processing performance tuning. Familiar with batch inference, distributed inference, GPU inference optimization, memory optimization, multiprocessing/multithreading, task scheduling, and throughput optimization. Experience with TensorRT, ONNX Runtime, Triton, Ray, Spark, or Flink is preferred.

Understanding of Embodied Intelligence

Understands the basic paradigms of embodied intelligence, VLA models, multimodal foundation models, robot learning, or imitation learning. Able to derive data collection, annotation, filtering, and evaluation strategies from model training requirements, rather than only completing isolated data processing tasks.

Apply

Send your resume, project links, or a short note about relevant work.