Rhoda AI

Posted 1 month ago

Open

Research Scientist / Engineer - Post-training & Robot Learning

Palo Alto, On-site, Full-time

AI Summary

Research Scientist/Engineer focused on post-training and robot learning. Designs RL pipelines and adapts video-prediction models to real robot tasks, measuring generalization and real-world performance.

About this role

At Rhoda AI, we're building the full-stack foundation for the next generation of humanoid robots — from high-performance, software-defined hardware to the foundational models and video world models that control it. Our robots are designed to be generalists capable of operating in complex, real-world environments and handling scenarios unseen in training. We work at the intersection of large-scale learning, robotics, and systems, with a research team that includes researchers from Stanford, Berkeley, Harvard, and beyond. We're not building a feature; we're building a new computing platform for physical work — and with over $400M raised, we're investing aggressively in the R&D, hardware development, and manufacturing scale-up to make that a reality.

We're looking for Research Scientists and Research Engineers with deep robotics or autonomous systems domain knowledge to adapt our web-pretrained video model to real robot tasks. Post-training at Rhoda means taking a causal video generation model pretrained on internet-scale data and fine-tuning it on robot-collected demonstrations to produce reliable, generalizable behavior — with as little task-specific data as possible. We hire across levels — from senior to staff.

What You'll Do

  • Design and implement RL training pipelines to improve robot policy performance beyond what imitation learning alone achieves — reward design, online data collection, and policy optimization

  • Develop and apply RL algorithms (PPO, GRPO, or similar) adapted to the video prediction setting, including reward modeling and feedback collection strategies for physical task performance

  • Design and implement broader post-training pipelines: supervised fine-tuning, preference optimization, and behavioral alignment on robot-collected demonstration data

  • Work on the inverse dynamics model that translates video predictions into executable robot actions

  • Build evaluation frameworks for post-trained policies: task success, generalization to novel objects and environments, and failure mode analysis on real hardware

  • Research methods to efficiently adapt models to new tasks with minimal demonstration data, including in-context generalization and few-shot adaptation

  • Identify failure modes and systematic weaknesses in deployed robot policies and drive targeted improvements

  • Iterate quickly between simulation and real robot evaluation to close the feedback loop

  • Collaborate with the pre-training team to surface what capabilities are missing from the base model and need to be addressed upstream

What We're Looking For

  • Hands-on experience with robot systems, robotic policy learning, or autonomous systems in an industry or research setting (robotics, self-driving, or similar physical AI domains)

  • Strong understanding of robot policy learning: imitation learning, behavior cloning, and how RL builds on top of it

  • Practical familiarity with real robot hardware, deployment constraints, and sensor modalities (vision, proprioception)

  • Solid ML skills with hands-on PyTorch experience

  • Ability to diagnose policy failures, reason about distribution shift, and iterate effectively on data and training strategies

  • Comfort with ambiguity and fast-changing research priorities

  • Staff-level candidates are expected to define technical direction and drive research strategy independently; senior candidates execute complex projects with strong fundamentals and growing scope

Nice to Have (But Not Required)

  • Hands-on experience with reinforcement learning — reward design, policy optimization, and online RL training loops — applied to real or near-real environments (robotics, games, simulated physics, or similar); this is a significant plus

  • Prior industry experience in robotics, autonomous driving, or physical AI (e.g., manipulation, mobile robotics, self-driving stacks)

  • Experience with teleoperation systems or robot demonstration collection at scale

  • Familiarity with robot middleware (ROS/ROS2) and real-time control systems

  • Experience with simulation environments for robotics (MuJoCo, Isaac Sim, Genesis)

  • Understanding of video generation models and how they connect to action prediction

  • PhD in Robotics, ML, or a related field

  • Publication record at ICRA, CoRL, RSS, NeurIPS, or related venues

Why This Role

  • Your work is what makes our robots actually perform tasks reliably in the real world — the direct connection between pre-trained capability and deployed behavior

  • Work at a rare intersection: state-of-the-art video generation models applied to real robot hardware, not simulation

  • Fast feedback loop between model changes and real robot performance

  • High ownership on a small team where robotics domain expertise is core to the mission

Skills

Behavior Cloning, Data-driven Debugging, Distribution Shift, Evaluation Frameworks, Few-shot Adaptation, GRPO, Imitation Learning, In-context Learning, Inverse Dynamics Modeling, Online Data Collection, Policy Optimization, PPO, Proprioception, PyTorch, Reinforcement Learning, Reward Design, RL Training Pipelines, Robot Hardware Deployment, Robot Perception, Robot Policies, Simulation-to-real Transfer, Task Generalization, Video Generation Models, Vision Sensors
