I Witnessed the Future of AI, and It’s a Broken Toy

April 27, 2024

This story was supposed to have a different beginning. You were supposed to hear about how, earlier this week, I attended a splashy launch party for a new AI gadget—the Rabbit R1—in New York City, and then, standing on a windy curb outside the venue, pressed a button on the device to summon an Uber home. Instead, after maybe an hour of getting it set up and fidgeting with it, the connection failed.

The R1 is a bright-orange chunk of a device, with a camera, a mic, and a small screen. Press and hold its single button, ask it a question or give it a command using your voice, and the cute bouncing rabbit on screen will perk up its ears, then talk back to you. It’s theoretically like communicating with ChatGPT through a walkie-talkie. You could ask it to identify a given flower through its camera or play a song based on half-remembered lyrics; you could ask it for an Uber, but it might get hung up on the last step and leave you stranded in Queens.

When I finally got back to my hotel room, I turned on the R1’s camera and held up a cold slice of pizza. “What am I looking at?” I asked. “You are looking at a slice of pizza,” the voice told me. (Correct!) “It looks appetizing and freshly baked.” (Well, no.) I decided to try something else. “What are top 10 …” I stumbled, letting go of the button. I tried again: “What are the top 10 best use cases for AI for a normal person?” The device, perhaps confused by our previous interaction, started listing pizza toppings, beginning with No. 2. “2. Sausage. 3. Mushrooms. 4. Extra Cheese.”

Until now, consumer AI has largely been defined by software: chatbots such as ChatGPT or the iPhone’s souped-up autocorrect. Now we are experiencing a thingification: Companies are launching and manufacturing actual bits of metal and plastic that are entirely dedicated to AI features. These devices are distinguished from previous AI gadgets, such as the Amazon Echo, in that they incorporate the more advanced generative-AI technology that has recently been in vogue, allowing users more natural interactions. There are pins and pendants and a whole new round of smart glasses.

Yet for all its promise, this new era is not going very well. Take Humane, a Rabbit competitor that launched a wearable “AI Pin” earlier this month. That device has been positioned as a smartphone replacement, with a price to match: It costs $699 and requires a $24 monthly subscription fee. Reviewers brutalized the pin, saying it is slow, overheats, and struggles to answer basic queries. “I’m hard-pressed to name a single thing it’s genuinely good at,” The Verge wrote.

By comparison, the R1 is satisfyingly small in its ambition and (relatively) affordable in its price ($199, no subscription). The device itself is fun and retro-chic: Jesse Lyu, Rabbit’s founder and CEO, reportedly bought every member of his team a Tamagotchi for inspiration. And, in fairness, the R1 does some interesting things. Onstage, Lyu showed how the device can interpret a handwritten table and convert it into a working digital spreadsheet. It managed to speak a summary of a handwritten page when I asked, though only with about 65 percent accuracy. I was able to use the gadget to order an acai bowl on DoorDash, although it couldn’t handle any customizations. (I wanted peanut butter.) And I never got Uber to work. (Though at one point, the device told me the request had failed when it in fact hadn’t, leaving me on the hook for a $9 ride I didn’t even take.)

One of the big selling points of the R1 is that it supposedly runs something called a large action model, or LAM—a spin on the phrase large language model, which is the technology powering recent chatbots. Whereas ChatGPT can answer questions and draft you a mediocre essay, the R1 can, in theory, complete actions that you might take on different apps (Venmo-ing your friend $20, for example). Rabbit has said the device will be able to learn any app, if you teach it. Lyu compared the technology to a Tesla: When on autopilot, a Tesla car can in theory recognize a stop sign not because engineers tell it how a stop sign looks but because it has been trained on countless hours of footage to recognize the sign’s physical attributes. Likewise, R1 will be able to accomplish tasks on your phone without having to be taught each app.

The problem is, none of this is actually real. At least not yet. As with so many AI products, the R1 is fueled more by hype than by a persuasive use case. (So many of its functions could, after all, be done on a smartphone.) Back in February, Lyu said Rabbit was training its model on 800 apps. This week, the R1 launched with the ability to use just four: Spotify, DoorDash, Uber, and Midjourney (a popular AI art generator). The company says LAM is in “very early stages.”

Onstage before an audience of reporters and Rabbit fanboys on Tuesday night, Lyu seemed nervous at times, at one point encouraging people to laugh in order to ease his nerves. Prior to the event, a user had posted on GitHub accusing Rabbit of misrepresenting its technology. “For those with a technical background, it’s painfully clear that there’s no artificial intelligence or large action model in sight,” the anonymous post, which has since been deleted, read. On X, Lyu characterized the post as “all false claims.” Lyu promised to fix any bugs that might crop up in R1 devices. Before demoing DoorDash onstage, he admitted that the feature doesn’t yet work as fast as they’d like it to: “But I want to show you, and I want to be frank with you guys.”

Yet Lyu also breathlessly announced a number of new initiatives, including a high-concept system that would allow people to someday merge the physical and the digital, so people could point at various smart items in their home and control them through Rabbit’s AI. (Never mind that the R1 has launched without many of its promised features.) Toward the end of the presentation, the words Be Humble appeared on the giant screen behind him. “We are a really, really humble team,” Lyu told the crowd. Those words were still displayed when, a few moments later, the curtains on either side of the stage dramatically dropped to reveal conveyor belts loaded with boxes of R1s. Music started blasting, and people started lining up to snatch theirs.

The R1 is a reminder of the disconnect, for better and for worse, between a Silicon Valley culture that often prioritizes speed over quality and the high expectations consumers have for the products they use. And to be fair, those expectations are high at least in part because of the extraordinary products that have emerged from that same competitive and iterative culture over the years.

As the party wound down, news of the first bug arrived: There was no way to change the time zone on the devices, many of which were programmed by default to the West Coast. Turns out the future is stuck three hours behind.
