Robotics Will Have Its ChatGPT Moment Soon
Back in 2024, when AIM expressed excitement about the upcoming GPT moment for robotics, little did we anticipate how many robotics advancements would converge.
Recently, Vinod Khosla, founder of Khosla Ventures and the initial investor in OpenAI, underscored his perspective on why robotics will soon have its AI breakthrough.
Khosla foresees a future where AI and robotics liberate humanity from mundane tasks.
He said that robotics will reach its ‘GPT moment’ in the next two to five years, when robots transition from programmed systems that merely follow instructions to learning systems that understand real-world physical dynamics, enabling rapid progress in the field.
It’s Already Happening
A few days ago, NVIDIA researchers introduced DrEureka, an LLM-powered agent that automates the simulation-to-reality pipeline, training a robot dog to balance on a yoga ball without any real-world fine-tuning.
Interestingly, DrEureka builds on NVIDIA’s prior work, Eureka, the algorithm that taught a five-fingered robot hand to spin a pen. “It takes one step further in our quest to automate the entire robot learning pipeline with an AI agent system,” said Jim Fan, senior research manager and lead of Embodied AI (GEAR Lab) at NVIDIA.
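The Eureka line of work describes an LLM proposing candidate reward functions that are scored in simulation, with the best candidates kept for the next round. Stripped of the LLM and the physics simulator, the outer search loop can be sketched as below; the candidate rewards and the scoring function here are toy stand-ins, not NVIDIA's actual code:

```python
import random

# Toy stand-ins for LLM-proposed reward functions: each maps a scalar
# "state" (e.g. distance from a balance point) to a scalar reward.
candidate_rewards = [
    lambda s: -abs(s),                        # linear penalty on distance
    lambda s: -s * s,                         # quadratic penalty
    lambda s: 1.0 if abs(s) < 0.1 else 0.0,   # sparse success bonus
]

def evaluate(reward_fn, trials=100):
    """Stand-in for simulation rollouts: average the reward over random states.
    Re-seeding ensures every candidate is scored on the same states."""
    random.seed(42)
    return sum(reward_fn(random.uniform(-1, 1)) for _ in range(trials)) / trials

# Eureka-style outer loop (one generation): score every candidate, keep the best.
scores = [evaluate(fn) for fn in candidate_rewards]
best = candidate_rewards[scores.index(max(scores))]
```

In the real system, the survivors and their simulation feedback are fed back to the LLM, which proposes improved reward code for the next generation.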
The OpenAI-powered Figure 01 has also advanced significantly in visual reasoning. Recently, it was able to differentiate between healthy options like oranges and less desirable choices like chips, with its in-house-trained neural network mapping camera input to robot actions at a rapid 10 Hz rate.
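Figure has not published its model, but the basic pattern it describes (a neural network consuming camera frames and emitting actions at a fixed rate) can be sketched as a simple control step. The policy, action dimension, and frame size below are illustrative placeholders, not Figure 01's actual architecture:

```python
import numpy as np

ACTION_DIM = 24   # illustrative: number of normalized joint commands
CONTROL_HZ = 10   # rate at which new camera frames are consumed

def policy(frame: np.ndarray) -> np.ndarray:
    """Placeholder visuomotor policy mapping an RGB frame to joint actions.
    A real system would run a trained network here."""
    features = frame.mean(axis=(0, 1))               # crude per-channel features
    rng = np.random.default_rng(int(features.sum()) % 2**32)
    return rng.uniform(-1.0, 1.0, ACTION_DIM)        # stand-in for network output

def control_step(frame: np.ndarray) -> np.ndarray:
    """One tick of the control loop: frame in, clamped action out."""
    action = policy(frame)
    return np.clip(action, -1.0, 1.0)                # safety clamp before actuation

# One simulated 10 Hz tick on a dummy 224x224 RGB frame.
frame = np.zeros((224, 224, 3), dtype=np.uint8)
action = control_step(frame)
print(action.shape)
```

In a deployed robot this step would run in a timed loop at `CONTROL_HZ`, with a lower-level controller interpolating between the 10 Hz action updates.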
Brett Adcock, founder of Figure AI, believes that “everyone will own a robot in the future, similar to owning a car or phone today.”
Tesla is not far behind. Recently, Optimus was shown working in a factory, sorting battery cells in real time using Tesla’s Full Self-Driving (FSD) computer. It inserted cells precisely despite tight tolerances and automatically targeted the next available slot.
Google DeepMind also released three robotics research systems earlier this year—AutoRT, SARA-RT and RT-Trajectory—that help robots make faster decisions and better understand and navigate their environments, improving data collection, speed, and generalisation.
Additionally, Stanford University introduced Mobile ALOHA, a system designed to imitate bimanual mobile manipulation tasks requiring whole-body control.
Google DeepMind supported the project, which addresses the limitations of traditional imitation learning from human demonstrations. The system has been demonstrated assisting with various tasks such as cooking, cleaning, lifting weights, and other manual activities.
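At its core, imitation learning of the kind Mobile ALOHA uses reduces to supervised learning from recorded (observation, action) pairs: fit a policy that reproduces what the human teleoperator did. A minimal behavior-cloning sketch on toy linear data (not the ALOHA codebase, which trains transformer policies on real teleoperation data) looks like:

```python
import numpy as np

# Toy demonstration data: observations paired with the actions a "human"
# teleoperator took in each state.
rng = np.random.default_rng(0)
obs = rng.normal(size=(500, 8))                          # 8-D observations
true_w = rng.normal(size=(8, 2))
acts = obs @ true_w + 0.01 * rng.normal(size=(500, 2))   # 2-D demonstrated actions

# Behavior cloning: fit the policy by least squares on (obs, action) pairs.
w_hat, *_ = np.linalg.lstsq(obs, acts, rcond=None)

def bc_policy(o: np.ndarray) -> np.ndarray:
    """Cloned policy: predicts the action the demonstrator would have taken."""
    return o @ w_hat

# The cloned policy should closely reproduce the demonstrator's actions.
err = np.abs(bc_policy(obs) - acts).mean()
print(round(err, 3))
```

Real systems replace the linear map with a deep network and the toy arrays with camera images and joint trajectories, but the training objective is the same regression.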
What’s Next?
While breakthroughs in AI research have become routine, companies are now racing toward the next big one in robotics. NVIDIA, for instance, released Project GR00T a month ago and followed it with DrEureka, as mentioned earlier, and more companies are investing heavily in robotics.
Major players like Google DeepMind, Tesla, and NVIDIA are making robotics a priority, so major breakthroughs will likely come soon. Significant progress has also been made in open-source research, with Hugging Face launching LeRobot, an open-source robotics data library, just a couple of days ago.
As NVIDIA CEO Jensen Huang rightly said, “The enabling technologies are coming together for leading roboticists around the world to take giant leaps towards artificial general robotics.”
Clearly, the ChatGPT moment in robotics is no longer a question of when; it is already here.