From Silicon Valley Dreams to National Imperative: Inside Google’s Bold Quest to Empower AI with Robotics
Embarking on Google's 7-Year Journey to Equip AI with a Physical Form
In early January 2016, I joined Google X, Alphabet's secretive innovation lab. My job was to figure out what to do with the employees and technology left over from Google's acquisition of nine robotics companies. Confusion reigned. Andy Rubin, known as "the father of Android" and the previous leader, had left under mysterious circumstances. Larry Page and Sergey Brin kept trying to offer guidance in the odd moments they had free. Astro Teller, the head of Google X, had agreed a few months earlier to fold the robotics teams into the lab, affectionately known as the moonshot factory.
I took the job because Astro had convinced me that Google X, or simply X, as it later became known, would be a different kind of corporate innovation lab. The founders aspired to grand visions, and they had the patient, long-term capital to realize them. Having started and sold several technology companies, I found this the right fit. X seemed like exactly the kind of bet Google should be making. I knew from my own experience how hard it is to build a company that, in Steve Jobs' famous words, could put a dent in the universe, and I believed Google was the right place to make certain big bets. AI-powered robots that could one day live and work alongside us were one of them.
More than eight years later, and a year and a half after Google decided to walk away from its largest bet on robots and AI, it seems as though a new robotics startup appears every week. I am more convinced than ever that the robots need to come. Yet I worry that Silicon Valley, with its focus on launching "minimum viable products" and venture capitalists' customary aversion to funding hardware, won't be able to lead the global race to pair AI with robotics. And much of the money now flowing in seems aimed at the wrong priorities. Here's why.
The Concept Behind "Moonshot"
Google X, the lab that would later be home to Everyday Robots, was born in 2010 from an ambitious conviction: Google had the capacity to tackle some of the world's hardest problems. X was deliberately located in its own building a few miles from the main Google campus, so it could develop its own culture and think well beyond conventional boundaries. Much was done to encourage people at X to take big risks, run rapid experiments, and treat failure as proof that the goal had been set extraordinarily high. By the time I arrived, the lab had already hatched projects like Waymo and Google Glass, as well as ventures that sounded like science fiction, such as flying energy-generating wind turbines and stratospheric balloons delivering internet to remote areas.
What set X projects apart from typical Silicon Valley startups was the scale and long-term ambition they were encouraged to embrace. To qualify as a moonshot, a project had to meet three criteria. First, it had to address a problem affecting hundreds of millions, or even billions, of people. Second, there had to be a breakthrough technology that opened up a new way to attack the problem. Third, there had to be a radical business or product proposal that sounded just this side of crazy.
An Everyday Robots robot sorting and throwing out trash.
The Challenge of Artificial Intelligence Embodiment
There may be no one better suited to running X than Astro Teller, whose chosen title is, literally, Captain of Moonshots. Inside the Google X building, a huge three-story converted department store, Astro was impossible to miss on his ever-present rollerblades. Add the ponytail, the invariably warm grin, and the fact that his name really is Astro, and you could be forgiven for thinking you had walked onto the set of HBO's Silicon Valley.
When Astro and I first sat down to talk about what to do with the robot companies Google had acquired, we agreed that something had to be done. But what? Historically, robots had been big, dumb, and dangerous, confined to industrial settings where they required heavy supervision or cages to keep workers safe. The challenge was to devise a plan for robots that would be both useful and safe in everyday life, and that demanded a new approach. The huge problem: worldwide demographic shifts toward aging societies, shrinking workforces, and labor shortages. The breakthrough technology, which we believed in even back in 2016, was AI. The radical solution: fully autonomous robots that could help with an ever-growing list of everyday tasks.
In effect, we were setting out to give AI a body in the physical world, and I was certain that if any place could make such a monumental idea real, it was X. It would take a long time, a lot of patience, and a willingness to try crazy ideas and fail at many of them. It would require major breakthroughs in AI and robotics, and it would cost billions of dollars. (Yes, billions.) Yet we held the firm conviction that if you looked just a bit beyond the horizon, the convergence of AI and robotics was inevitable. What had long belonged to science fiction was about to become real.
An Everyday Robots robot delivering flowers on Valentine's Day.
Checking In with Mom
Roughly once a week, I'd catch up with my mom over a call. She'd immediately dive in with her usual inquiry, skipping the pleasantries: "When will the robots arrive?" She was eager to find out when we'd send a robot to assist her. My reply was always, "Not for some time, Mom," to which she'd quickly retort, "Well, they need to hurry up!"
My mother lived in Oslo, Norway, and had excellent public health services; health aides came to her home three times a day to help with a range of tasks, mostly necessitated by her advanced Parkinson's disease. The aides made it possible for her to keep living independently at home, but she longed for robots to help with the many small, difficult, and sometimes humiliating obstacles of daily life, or simply to lend her a steadying arm.
An Everyday Robots robot wiping café tables after lunch.
It's Quite Challenging
"Do you understand that robotics involves multiple integrated components, correct?" Jeff inquired, gazing at me intently. It appears every team has their own version of "Jeff"; for us, it was Jeff Bingham. A slender, sincere man holding a doctorate in bioengineering, Jeff was raised on a farm and was well-known as a fountain of wisdom, offering profound understanding on virtually every topic. Even now, ask me about robotics, and one of the initial points I'll highlight is that, indeed, it involves the complexity of systems.
Jeff's point was that a robot is only as good as its weakest link. If its vision system struggles in direct sunlight, the robot may glitch or stop dead when a sunbeam comes through a window. If it can't handle stairs, it may fall and hurt itself or someone nearby. Building robots that can live and work alongside humans, in short, is an enormously hard problem.
For decades, people have tried to teach robots to perform even simple actions, like grasping a cup on a table or opening a door, and the attempts keep producing brittle systems that fail at the slightest change in conditions or surroundings. Why? Because the real world, with its unexpected beams of sunlight, refuses to be predictable. And that is only the beginning: the messy, crowded spaces where we live and work remain a frontier robots have yet to conquer.
Think about it for a moment: unless you tightly control the environment, keeping everything stationary and in its assigned place and the lighting constant and just right, even a seemingly simple task such as picking up a green apple and placing it in a glass bowl on a kitchen counter becomes an extraordinarily hard problem. That is why industrial robots live in cages: the conditions they work in, including the lighting and the position of everything they handle, can be kept exactly the same, and nobody gets whacked on the head.
Sign alerting Google employees and visitors that robots may be roaming nearby.
The Essentials of Educating Autonomous Machines
"All you need is 17 machine-learning engineers," Larry Page told me. It was one of his typically cryptic pieces of advice. I protested that there was no way such a small team could build both the hardware and the software robots would need to help people, but he just waved off my concerns. "17 is all you need," he insisted. I was puzzled. Why 17? Why not more, or fewer? Clearly, I wasn't getting something.
In essence, there are two main methods for integrating artificial intelligence (AI) into robotics. The first method combines AI with conventional programming to form a hybrid system. In this setup, AI is used for specific tasks within the system, which are then connected via standard programming techniques. For instance, the vision component might employ AI to identify and classify objects it perceives. After identifying these objects, it compiles them into a list which is then processed by the robot's program. This program, based on pre-defined rules coded into it, decides on actions. For example, if the robot is programmed to pick an apple from a table, the AI-enhanced vision component would spot the apple and add it to the list. The program would then select the item labeled as “type: apple” from the list and instruct the robot to pick it up using its regular control mechanisms.
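To make the hybrid approach concrete, here is a minimal sketch in Python. Every name in it is hypothetical: the `Detection` record, the `detect_objects` stub standing in for the learned vision component, and the `pick_up` glue code illustrate the pattern described above, not Everyday Robots' actual software.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str         # e.g. "apple" or "cup", assigned by the vision model
    confidence: float  # the model's confidence in that label
    position: tuple    # (x, y, z) estimate in the robot's frame

def detect_objects(camera_image) -> list[Detection]:
    """Stand-in for the AI vision component."""
    # A real system would run a neural network on the image; canned
    # output here keeps the rule-based logic below runnable.
    return [Detection("cup", 0.91, (0.2, 0.4, 0.8)),
            Detection("apple", 0.88, (0.5, 0.1, 0.8))]

def pick_up(target_label: str, camera_image) -> bool:
    """Conventional, rule-based glue: scan the list, command a grasp."""
    for det in detect_objects(camera_image):
        if det.label == target_label and det.confidence > 0.8:
            # Stand-in for the robot's traditional motion controls.
            print(f"Grasping {det.label} at {det.position}")
            return True
    return False

pick_up("apple", camera_image=None)
```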
Another method, known as end-to-end learning or simply e2e, focuses on teaching robots to carry out complete actions such as “grabbing an item” or broader tasks like “cleaning a table.” This technique involves training the robots by exposing them to vast quantities of data, similar to how humans acquire physical skills. Take, for example, instructing a young child to pick up a cup. Depending on their age, they might first need to understand what a cup is, recognize that it can hold liquids, and through trial and error, such as spilling or tipping the cup over multiple times, they gradually learn. Through watching others, mimicking their actions, and plenty of playful experimenting, they master the task, eventually performing it effortlessly without consciously thinking through the steps.
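For contrast, here is a toy illustration of what "end to end" means in code: one learned function maps raw pixels directly to motor commands, with no object lists or hand-written rules in between. The tiny linear "network," the 64x64 image, and the seven joint velocities are invented for illustration; a real e2e policy is a deep network trained on vast amounts of experience.

```python
import numpy as np

rng = np.random.default_rng(0)

# One weight matrix standing in for an entire policy network:
# 64x64 grayscale pixels in, 7 joint-velocity commands out.
W = rng.normal(scale=0.01, size=(7, 64 * 64))

def policy(camera_pixels: np.ndarray) -> np.ndarray:
    """Map a raw image directly to motor commands: no labels, no rules."""
    return W @ camera_pixels.reshape(-1)

# Training (not shown) would adjust W so that, over many examples, the
# emitted commands match ones that succeeded. Nowhere in the loop does
# an explicit "this is a cup" symbol appear.
action = policy(rng.random((64, 64)))
print(action.shape)  # (7,)
```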
Larry's point, I came to understand, was that what really mattered was demonstrating that robots could learn to perform complete tasks end to end, on their own. If they could, it would be a crucial first step toward robots that work reliably amid the messiness and unpredictability of the real world, the thing that would make our project a true moonshot. The number 17 was beside the point; the point was that big breakthroughs come from small, nimble teams, not armies of engineers. Since AI is obviously not a robot's only component, I kept our various engineering efforts to design and build the robots' physical form moving forward. But it was clear that a robot successfully completing an end-to-end task would give us confidence that, in moonshot parlance, the biggest risks could be retired. For Larry, everything else was just "implementation details."
Robot looking for work! (Team members having some fun after the January 2023 announcement that Everyday Robots would be shut down.)
At the Arm-Farm
Peter Pastor is a German roboticist who earned his PhD in robotics at the University of Southern California. On the rare occasions he wasn't working, Peter was trying to keep up with his girlfriend on a kiteboard. In the lab, he spent much of his time wrangling 14 custom-built robotic arms, later replaced by seven industrial-grade Kuka arms, in a setup we affectionately called "the arm-farm."
Around the clock, the arms tried to pick up objects, such as sponges, Lego blocks, rubber ducks, and plastic bananas, from a bin. At first they were programmed to lower a pincer-like gripper into the bin from a random position above it, close the gripper, lift, and check whether they had caught anything. A camera above the bin recorded the bin's contents, the arm's movements, and each success or failure. This went on for months.
At first the robots succeeded only 7 percent of the time. But each success generated positive feedback: the "weights" in the robot's neural network were adjusted to reinforce behaviors that worked and discourage ones that didn't. Over time, the arms learned to grasp successfully more than 70 percent of the time. One day Peter showed me a video that marked a real milestone: a robot arm not just reaching for a yellow Lego block but nudging other objects aside to get a clean grip on it. No rule-based program had specified that move; the robot had learned it on its own.
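That feedback loop can be caricatured in a few lines of Python. In this sketch, a grid of running success estimates stands in for the neural network's weights, and a hidden "pile of objects" stands in for the physics of the bin. The real system learned from camera images with deep networks; only the structure (try, observe the reward, nudge the estimates) is the same.

```python
import numpy as np

rng = np.random.default_rng(1)

def attempt_grasp(point: np.ndarray) -> bool:
    """Stand-in for a physical trial: grasps succeed more often near a
    hidden pile of objects centered at (0.7, 0.3) in the bin."""
    distance = np.linalg.norm(point - np.array([0.7, 0.3]))
    return rng.random() < max(0.05, 1.0 - 2.0 * distance)

# "Weights": an estimated success probability for each cell of a grid.
success_estimate = np.full((10, 10), 0.5)
counts = np.ones((10, 10))

for trial in range(5000):
    if rng.random() < 0.2:  # sometimes explore a random grasp point
        cell = (int(rng.integers(10)), int(rng.integers(10)))
    else:                   # otherwise exploit the most promising cell
        cell = np.unravel_index(np.argmax(success_estimate), (10, 10))
    point = (np.array(cell) + 0.5) / 10.0
    reward = attempt_grasp(point)
    # The "positive feedback": shift the estimate toward the outcome.
    counts[cell] += 1
    success_estimate[cell] += (reward - success_estimate[cell]) / counts[cell]

best = np.unravel_index(np.argmax(success_estimate), (10, 10))
print(f"best cell: {best}, estimated success: {success_estimate[best]:.2f}")
```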
Still, seven robots spending months trying to pick up a rubber duck was nowhere near enough. Even hundreds of robots practicing for years wouldn't accumulate the experience needed for their first useful tasks in the real world. So we built a cloud-based simulator and, in 2021, created more than 240 million robot instances inside it.
Think of the simulator as a vast video game with a model of real-world physics accurate enough to simulate the weight of an object or the friction of a surface. Thousands of virtual robots would use their simulated camera feeds and simulated bodies, modeled on the real robots, to attempt tasks such as picking up a mug from a table. Running in parallel, they could try and fail at enormous scale, amassing data to refine the AI algorithms. Once the virtual robots got reasonably good at a task, the algorithms were transferred to physical robots for final tuning in the real world, where they perfected their new skills. I always pictured the simulation as robots dreaming all night, then waking up having learned something new.
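A schematic of that pipeline, simplified to the point of caricature: many virtual robots attempt a task in parallel, their pooled outcomes nudge a shared skill level upward, and the simulator's imperfection is modeled as a flat discount on success. The numbers and the update rule are invented for illustration; only the shape of the recipe (simulate at massive scale, then fine-tune on hardware) comes from the text.

```python
import random

def simulated_episode(policy_skill: float) -> float:
    """One virtual attempt at a task; returns a small learning signal.
    The simulator only approximates reality, modeled here as a 0.8 factor."""
    success = random.random() < policy_skill * 0.8
    return 0.01 if success else -0.0001

def train_in_simulation(n_virtual_robots: int, episodes: int) -> float:
    skill = 0.05  # starting success rate, like the arm-farm's early days
    for _ in range(episodes):
        # In the real system the virtual robots run concurrently in the
        # cloud; here we simply pool their signals each round.
        signals = [simulated_episode(skill) for _ in range(n_virtual_robots)]
        skill = min(1.0, max(0.0, skill + sum(signals) / n_virtual_robots))
    return skill

skill = train_in_simulation(n_virtual_robots=1000, episodes=500)
print(f"skill after simulation: {skill:.2f}")
# Final step (not modeled here): copy the learned policy onto physical
# robots and keep training on real trials to close the sim-to-real gap.
```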
An early prototype robot learning to sort trash.
The Revelation of Data's Power
Then we all became aware of ChatGPT, and it felt like encountering magic: an AI that could write whole paragraphs, answer complicated questions, and sustain a conversation. But we also quickly grasped its fundamental constraint: getting there had required staggering amounts of data.
Robots already use large language models to understand what we say and vision models to make sense of what they see, which makes for impressive YouTube demos. But getting robots to operate autonomously alongside people is a far larger data problem. Despite simulation and other techniques for generating training data, it is very unlikely that robots will simply "wake up" one day highly capable, running on a foundation model that controls the whole system.
The jury is still out on how complex the tasks are that robots can learn with AI alone. I have come to believe it will take hundreds of thousands, perhaps millions, of robots doing real work in the real world to collect enough data to train e2e models that let robots do anything beyond narrowly defined tasks. For the foreseeable future, robots that deliver real value, clearing and wiping every table in a restaurant, say, or making the beds in a hotel, will rely on a combination of AI and conventional programming. In other words, don't expect robots to slip outside their programming and act beyond our control anytime soon.
An early prototype robot learning to open doors and clean restrooms.
Do They Need to Resemble Us?
Horses are superb at walking on four legs, yet we built the car on wheels. The human brain is a marvelously efficient biological computer, yet our silicon computers resemble it not at all. So why don't cars have legs, and why didn't we model computers on biology? The point I want to make is that building robots should be about more than imitation.
The lesson crystallized for me during a meeting with the technical leads at Everyday Robots. Around a conference table, we were debating whether our robots should have legs or wheels. Such discussions tended to slide into ideological standoffs rather than rest on data or science. Some people feel strongly that robots should look like humans, and the logic isn't crazy: we have designed our homes and workplaces for ourselves, and we walk on two legs, so maybe robots should too.
About 30 minutes in, Vincent Dureau, our most senior engineering leader, finally spoke up. "If I can get there," he said simply, "the robots should be able to get there too." Vincent was sitting in his wheelchair. The room fell silent, and the debate was over.
The truth is that robot legs are mechanically and electronically very complex. They are slow, they tend to make a robot unstable, and they are far less energy-efficient than wheels. These days, when I see companies trying to build humanoid robots, machines that mimic human form and behavior, I see a failure of imagination. There is a vast space of designs that could complement human abilities; why insist on copying us so closely? At Everyday Robots we worked to make the robots as simple as possible, because the sooner robots can perform real tasks in the real world, the sooner we can gather the data we need. Vincent's remark was a reminder to attack the hardest, most important problems first.
Office Task
I was sitting at my desk when a one-armed robot with a rectangular, softly rounded head rolled up, addressed me by name, and asked whether it could tidy up. I said yes and stepped aside. A few minutes later it had collected several empty paper cups, a Starbucks iced-tea cup, and a Kind-bar wrapper, depositing them in its onboard trash compartment. Then it nodded at me and moved on to the next desk.
The launch of this tidy-desk service was a milestone: evidence that we were making headway on one of robotics' hard problems. The robots were using AI to reliably detect both people and objects. Benjie Holson, a software engineer and former puppeteer who led the team that built the service, was a proponent of the hybrid approach. He had nothing against end-to-end learned tasks; his stance was simply to put whatever worked now to practical use. If the machine-learning researchers produced a learned, end-to-end skill that outperformed his team's hand-coded version, they would happily swap the new algorithms into their stack.
I had grown used to our robots moving around the office, tidying desks. Every so often, though, I would notice a visitor or a newly hired engineer watching one at work, wide-eyed with wonder and delight, and be reminded through their eyes how unusual our situation was. As a robot rolled past one day, our head designer, Rhys Newman, remarked in his Welsh lilt, "It's become normal. Odd, isn't it?"
Catie Cuan, the artist in residence at Everyday Robots, performs a dance alongside a robot.
Dancing Together
At Everyday Robots, our advisory panel included a philosopher, an anthropologist, a former union leader, a historian, and an economist. We debated hard economic, societal, and philosophical questions: How would the economy change if robots lived among us? What would happen to jobs, immediately and over time? What does it mean to be human in an age of intelligent machines? And how do we design these machines so the people around them feel safe and included?
In 2019, while telling my team that I wanted to hire an artist in residence who could bring an unconventional sensibility to our robots, I met Catie Cuan. Catie was then working on her PhD in robotics and AI at Stanford University. What caught my attention was that she had been a professional dancer, performing on stages including the Metropolitan Opera Ballet in New York City.
You have probably seen YouTube videos of robots dancing: meticulously programmed sequences of moves timed to music. They are fun to watch, but not fundamentally different from an animatronic ride at Disneyland. I asked Catie whether robots could instead improvise, creating movement and responding to one another the way humans do, or the way flocks of birds and schools of fish do. To explore the idea, she and a few engineers built an AI algorithm shaped by the artistic preferences of a choreographer: Catie herself.
In the evenings and on some weekends, when the robots were off duty, Catie and her makeshift team would gather a dozen or so of them in a large atrium in the middle of X. The robots began to move together, a bit haltingly at times, in flocking patterns that felt curious, and at moments graceful and beautiful. Tom Engbersen, a Dutch roboticist who paints replicas of famous masterpieces in his spare time, began collaborating with Catie on how robots might respond to music or even play instruments. Then he had a breakthrough idea: What if the robots themselves became instruments? In the experiment that followed, every motion a robot made produced a sound. The rolling base generated a deep bass note; a gripper opening and closing rang like a chime. With music mode switched on, the robots composed original scores as they moved. Whether they were rolling down a hallway, sorting trash, wiping tables, or "dancing," the robots moved and sounded like a new kind of approachable creature, unlike anything I had experienced before.
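At its core, the instrument idea is a mapping from movement events to pitches. Here is a minimal sketch; the event names and note choices are invented (the only details above are that the base produced a bass note and the gripper a chime), and real synthesis would go through an audio engine rather than a print statement.

```python
# Hypothetical mapping from robot movement events to pitches (Hz).
MOVEMENT_TO_NOTE = {
    "base_translate": 55.0,    # deep bass (A1) for the rolling base
    "arm_raise": 220.0,        # mid tone (A3) for the arm
    "gripper_open": 880.0,     # chime-like high tone (A5)
    "gripper_close": 1760.0,   # (A6)
}

def sonify(events: list[str]) -> list[float]:
    """Turn a stream of movement events into a sequence of frequencies."""
    return [MOVEMENT_TO_NOTE[e] for e in events if e in MOVEMENT_TO_NOTE]

# A robot wiping a table might emit something like:
print(sonify(["base_translate", "arm_raise", "gripper_close",
              "base_translate", "gripper_open"]))
```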
The Journey Has Just Begun
As 2022 drew to a close, the debate over purely end-to-end models versus hybrid systems was still very much alive. Peter and his team, along with colleagues at Google Brain, applied reinforcement learning, imitation learning, and transformers (the architecture behind large language models) to a range of robot tasks, with promising results: robots were learning tasks in ways that were general, reliable, and robust. Meanwhile, Benjie's team was pairing AI models with conventional programming to prototype robot services that could operate effectively and safely among people.
Project Starling, as Catie's ensemble of robots came to be called, changed how I felt about these machines. I watched the curiosity, joy, and delight they sparked in people, and I realized that how robots move among us, and how they sound, will deeply influence how we feel about them, and ultimately whether we welcome them into our daily lives.
In short, we were on the verge of cashing in on our biggest bet: robots powered by AI. AI gave them the ability to understand what they heard, turning spoken and written language into actionable tasks, and to make sense of what they saw, turning camera images into scenes and objects they could act on. As Peter's team had demonstrated, they had learned to pick things up. After more than seven years, we were deploying fleets of robots across several Google buildings: a single type of robot performing a growing range of services, autonomously wiping cafeteria tables, inspecting conference rooms, sorting trash, and more.
In January 2023, two months after OpenAI launched ChatGPT, Google shut down Everyday Robots, citing overall cost concerns. The robots, along with a small number of people, eventually landed at Google DeepMind to continue the research. Despite the high cost and the long timeline, the decision took everyone involved by surprise.
A Crucial National Goal
In 1970, there were about ten working-age people for every person over 64 in the world; by 2050, there will be fewer than four. We are running out of workers. Who will care for the elderly? Who will work in factories, hospitals, and restaurants? Who will drive trucks and taxis? Countries such as Japan, China, and South Korea understand the urgency of this problem. For them, robots are not optional but a national imperative, and they are investing accordingly.
Putting AI into physical robots is both a matter of national security and a vast economic opportunity. If a technology giant like Google backs away from moonshots like AI-powered robots, robots that will augment and support tomorrow's workforce, who will step up? Can Silicon Valley and other innovation hubs rise to the occasion, and will they find the patient, long-term capital required? I am skeptical. We called Everyday Robots a moonshot because building highly complex systems at this scale outlasts the patience of the typical venture-funded startup. The US leads in AI, but building the physical robots to embody it demands skills and infrastructure in which other nations, China above all, already have a head start.
The robots didn't arrive in time to help my mom. She passed away in early 2021. Our many conversations toward the end of her life only strengthened my conviction that a future version of what we started at Everyday Robots is coming, and that it is needed more urgently than ever. So we are left to ask: How does this kind of change and this kind of future come about? I remain curious, and concerned.