I have barely figured out how to get Gemini going on my Android device, and already, Google has announced it’s putting Gemini 2.0 into real-life robots. The company announced two new AI models that “lay the foundation for a new generation of helpful robots,” as it writes in a blog post. In the demonstrations, the robots look like people!
Gemini Robotics is an advanced vision-language-action (VLA) model built on Gemini 2.0, the same model I’ve been feeding PDFs to and asking for help with horoscopes. This version of Gemini 2.0 adds physical actions as a new kind of output. On a Pixel phone, for example, Gemini’s “response” to a query is to answer a question or perform an on-screen action; Gemini in a robot would instead treat that command as something it should physically act on.
The second AI model is Gemini Robotics-ER, a vision-language model (VLM) with “advanced spatial understanding.” This is where Gemini gets its “embodied reasoning,” which helps the artificial intelligence navigate its environment even as it changes in real time. In an example video Google showed in a closed session with journalists, the robot could discern between bowls of varying finishes and colors on a table. It could also differentiate between fake fruits, like grapes and a banana, and then place each into the appropriate bowl. In another example, Google showed a robot handling the nuance of granola in a Tupperware container that needed to be packed into a lunch bag.

Google DeepMind shows how robot arms can grab grapes from a container and place them on the counter.
At the core of this announcement is Google lauding DeepMind’s efforts in making Gemini the kind of “brain” it can drop into the robotic space. But it is wild to think that the AI branding on the smartphone in your hand will, in some capacity, be powering a humanoid robot. “We look forward to exploring our models’ capabilities and continuing to develop them on the path to real-world applications,” writes Carolina Parada, Senior Director and head of robotics at Google DeepMind.
Google is partnering with companies like Apptronik to “build the next generation of humanoid robots.” The Gemini Robotics-ER model will also become available to partners for testing, including Agile Robots, Agility Robotics, Boston Dynamics, and Enchanted Tools. The robots are coming, but there’s no timeline. You can temper your reaction for now.
Google is also preparing itself for the onslaught of questions it will inevitably receive about Gemini’s safeguards. I asked what protections are in place so that a robot doesn’t go awry and cause physical harm to a human. “We enable Gemini Robotics-ER models to understand whether or not a potential action is safe to perform in a given context,” Google explains, grounding that judgment in frameworks like the ASIMOV dataset, which has helped “researchers to rigorously measure the safety implications of robotic actions in real-world scenarios.” Google says it’s also collaborating with experts in the field to “ensure we develop AI applications responsibly.”