Toyota Uses New Generative AI Technique to Quickly Teach Robots New Skills


Toyota Research Institute (TRI) recently unveiled a new generative AI technique that enables robots to quickly and reliably learn complex, dexterous skills. This advancement brings researchers significantly closer to creating versatile assistant robots that can support people in everyday environments.

Most robots are designed to operate in controlled environments, such as factories and warehouses. The programming approach used for these robots is thorough but limiting: it requires precise modeling of the environment and of every situation a robot might encounter. For robots to be useful beyond those settings, they must adapt to a more complex, unpredictable world. They cannot be purely task-specific; they need to be flexible, general-purpose machines that can operate in varied environments and under different conditions.

TRI's innovative approach, called Diffusion Policy, allows robots to acquire new dexterous behaviors, such as peeling vegetables or flipping pancakes, in just a few hours. A human operator teleoperates the robot to provide a set of demonstrations. The robot then autonomously learns from these demonstrations, applying the learned behavior to accomplish tasks on its own. This dramatically cuts down the time required to teach a robot new skills, thereby increasing the efficiency and scalability of the training process.
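To make the demonstration step concrete, here is a minimal sketch of how recorded teleoperation data might be turned into training examples. It is not TRI's pipeline: the `Step` record, the `make_training_pairs` helper, and the `horizon` parameter are hypothetical names chosen for illustration, and the assumption is that each demonstration is a time-ordered list of observations paired with the operator's commands.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Step:
    """One recorded instant of a teleoperated demonstration (hypothetical layout)."""
    observation: List[float]  # e.g. sensor readings at this instant
    action: List[float]       # the operator's command at the same instant


def make_training_pairs(
    demo: List[Step], horizon: int
) -> List[Tuple[List[float], List[List[float]]]]:
    """Pair each observation with the next `horizon` operator actions."""
    pairs = []
    for i in range(len(demo) - horizon + 1):
        future_actions = [demo[j].action for j in range(i, i + horizon)]
        pairs.append((demo[i].observation, future_actions))
    return pairs


# A tiny made-up demonstration with five time steps.
demo = [Step([float(t)], [0.1 * t]) for t in range(5)]
pairs = make_training_pairs(demo, horizon=3)
```

Under this layout, teaching a new skill means recording new `demo` lists and retraining on the resulting pairs, with no skill-specific code.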

The institute has already used the method to teach robots over 60 skills, including pouring liquids, using tools, and manipulating deformable objects. TRI accomplished this without writing any new code; the only change was supplying the robots with new demonstration data.

The Diffusion Policy method generates robot behavior using a generative AI technique known as diffusion, the same approach used in popular image generation tools like Midjourney. The approach offers three key advantages:

  1. Applicability to Multi-modal Demonstrations: Human demonstrators can teach behaviors in a natural manner, including showing more than one way to do the same task, without worrying about confusing the robot.
  2. High-dimensional Action Spaces: Diffusion Policy can plan a sequence of actions forward in time, which helps avoid inconsistent or erratic behavior.
  3. Stable Training: The method is relatively simple to train, eliminating the need for extensive real-world evaluations to fine-tune performance.
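The first two advantages can be illustrated with a toy sketch. It is not TRI's implementation: it uses a DDPM-style denoising loop chosen for illustration, and the trained neural network is replaced by a closed-form noise estimate that is only valid for the two made-up demonstrations in `DEMOS`. All names and constants (`DEMOS`, `T`, `BETAS`, `predict_noise`, `sample_actions`) are assumptions. The sampler starts from random noise and refines a whole action sequence at once.

```python
import math
import random

# Hypothetical toy data: two equally valid demonstrations of one task, each a
# short sequence of lateral offsets (pass an obstacle on the left or the right).
DEMOS = [[1.0, 1.0, 1.0, 1.0], [-1.0, -1.0, -1.0, -1.0]]

T = 100  # number of denoising steps (illustrative choice)
BETAS = [1e-4 + (0.1 - 1e-4) * t / (T - 1) for t in range(T)]  # noise schedule
ALPHAS = [1.0 - b for b in BETAS]
ALPHA_BARS = []
_running = 1.0
for _a in ALPHAS:
    _running *= _a
    ALPHA_BARS.append(_running)


def predict_noise(x, t):
    """Stand-in for the learned network: exact noise estimate for DEMOS."""
    ab = ALPHA_BARS[t]
    var = 1.0 - ab
    # Log-posterior weight of each demonstration given the noisy sequence x.
    logw = [
        -sum((xi - math.sqrt(ab) * mi) ** 2 for xi, mi in zip(x, m)) / (2 * var)
        for m in DEMOS
    ]
    top = max(logw)  # subtract the max so exp() cannot overflow
    w = [math.exp(lw - top) for lw in logw]
    total = sum(w)
    # Expected clean action sequence under that posterior.
    x0 = [sum(wk * m[i] for wk, m in zip(w, DEMOS)) / total for i in range(len(x))]
    return [(xi - math.sqrt(ab) * x0i) / math.sqrt(var) for xi, x0i in zip(x, x0)]


def sample_actions(rng):
    """Generate one action sequence by iteratively denoising Gaussian noise."""
    x = [rng.gauss(0.0, 1.0) for _ in DEMOS[0]]
    for t in reversed(range(T)):
        eps = predict_noise(x, t)
        coef = BETAS[t] / math.sqrt(1.0 - ALPHA_BARS[t])
        x = [(xi - coef * ei) / math.sqrt(ALPHAS[t]) for xi, ei in zip(x, eps)]
        if t > 0:  # no fresh noise is added on the final step
            sigma = math.sqrt(BETAS[t])
            x = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
    return x


rng = random.Random(0)
samples = [sample_actions(rng) for _ in range(20)]
```

Each generated sequence should land close to one demonstration or the other rather than on their average, which would steer straight into the obstacle. That is the sense in which conflicting but valid demonstrations do not confuse this kind of policy, and because the whole sequence is denoised together, the actions within it stay consistent with each other.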

Unlike previous teaching techniques requiring extensive programming or trial-and-error, Diffusion Policy efficiently learns behaviors from teacher demonstrations and goal descriptions. The approach produces reliable, repeatable results across varied scenarios. This dexterity enables robots to interact with the world in rich ways, supporting people in everyday, unpredictable settings.

TRI VP of Robotics Russ Tedrake said the institute's robots can now perform "simply amazing" tasks considered unrealistic a year ago. The reliability and speed of acquiring new skills via Diffusion Policy are particularly promising. The method allows mastery of traditionally challenging areas like deformable objects and cloth manipulation.

TRI's customized dual-arm robot platform focuses on enabling haptic feedback and tactile sensing. TRI uses its open source Drake simulation framework to accelerate development. The institute prioritizes safety, designing control systems that provide guarantees against collisions.

By year's end, TRI aims to teach hundreds more skills using Diffusion Policy. The institute has set an ambitious goal of equipping robots with 1,000 skills by the end of 2024. However, the learning process is still at a developmental stage; the robots sometimes struggle to generalize their skills across varying conditions and environments. TRI acknowledges that a truly general dexterous robot is still a work in progress.

TRI is investing in creating a robust curriculum of behaviors, using both physical robots and simulations. One of its goals is to develop a "Large Behavior Model," which would combine semantic capabilities with a high level of dexterity, similar to how existing large language models can understand and generate text.

With Diffusion Policy unlocking versatile, rapid skill acquisition, TRI is moving closer to flexible machines that can work alongside people in homes and workplaces. The institute's generative AI advance constitutes tangible progress, though much research remains to make such robots an everyday reality.

Chris McKay is the founder and chief editor of Maginative. His thought leadership in AI literacy and strategic AI adoption has been recognized by top academic institutions, media, and global brands.
