A.I.’s ‘Her’ Era Has Arrived

The Shift

New chatbot technology can talk, laugh and sing like a human. What comes next is anyone’s guess.

Joaquin Phoenix as Theodore in the 2013 romantic drama “Her,” directed by Spike Jonze. Credit…Warner Bros.

Kevin Roose

A lifelike artificial intelligence with a smooth, alluring voice enchants and impresses its human users — flirting, telling jokes, fulfilling their desires and eventually winning them over.

I’m summarizing the plot of the 2013 movie “Her,” in which a lonely introvert named Theodore, played by Joaquin Phoenix, is seduced by a virtual assistant named Samantha, voiced by Scarlett Johansson.

But I might as well be describing the scene on Monday when OpenAI, the creator of ChatGPT, showed off an updated version of its A.I. voice assistant at an event in San Francisco.

The company’s new model, called GPT-4o (the o stands for “omni”), will let ChatGPT talk to users in a much more lifelike way — detecting emotions in their voices, analyzing their facial expressions and changing its own tone and cadence depending on what a user wants. If you ask for a bedtime story, it can lower its voice to a whisper. If you need advice from a sassy friend, it can speak in a playful, sarcastic tone. It can even sing on command.

The new voice feature, which ChatGPT users will be able to start using for free in the coming weeks, immediately drew comparisons to Samantha from “Her.” (Sam Altman, OpenAI’s chief executive, who has praised the movie, posted its title on X after Monday’s announcement, making the connection all but official.)

On social media, users hailed the arrival of an A.I. voice assistant that will finally understand them, or at least pretend that it does.

In a series of live demonstrations on Monday, OpenAI employees showed off ChatGPT’s new capabilities. One asked ChatGPT to read him a story — then to read it again more dramatically, using the voice of a robot. (“Initiating dramatic robotic voice,” it responded.) Another asked it to sing “Happy Birthday.” ChatGPT did well at both tasks, and it also performed ably when employees asked it to serve as a real-time translator between languages.

But the real killer feature was the way ChatGPT’s voice itself changed. One moment, it was a sing-songy soprano. The next, it shifted into a lilting contralto. It paused for effect, giggled at its own jokes and added filler phrases like “hmm” and “let’s see” for extra realism. It sounded more humanlike than some humans I know.

It also seemed to have a sense of humor. At one point during a demo, an OpenAI employee breathed in a heavy, exaggerated pant. ChatGPT heard him and responded, “Mark, you’re not a vacuum cleaner.”

For years, A.I. voice assistants have been limited by their inability to pick up on the nuances of conversation, such as tone and emotional affect. Synthetic A.I. voices, like those used by Siri and Alexa, tend to be flat and impersonal — they sound the same whether they’re giving tomorrow’s weather forecast or telling you that your cookies are done.

And as I discovered recently when I spent a month talking to a group of A.I. “friends,” a big problem with today’s A.I. voice models is speed. It’s hard to forget you’re talking to a robot when every answer has a three-second delay.

OpenAI has addressed the latency problem by giving GPT-4o what is known as “native multimodal support” — the ability to take in audio prompts and analyze them directly, without converting them to text first. That has made its conversations faster and more fluid, to the point that if the ChatGPT demos were accurate, most users will barely notice any lag at all.

All this adds up to a much different subjective experience. If previous A.I. assistants felt like talking to a dispassionate librarian, the new ChatGPT feels like a friendly, chatty co-worker. (Albeit one who occasionally spouts nonsense — but don’t we all have one of those?)

These demonstrations, along with other A.I. news from recent days — including reports that Apple is in talks with OpenAI to use its technology on the iPhone, and is preparing a new, generative A.I.-powered version of Siri — signal that the era of the detached, impersonal A.I. helper is coming to an end.

Instead, we’re getting chatbots modeled after Samantha in “Her” — with playful intelligence, basic emotional intuition and a wide range of expressive modes.

Some users may be repelled by them. But many will come to love and appreciate the new breed of A.I. assistants — and some will inevitably fall in love, as Theodore does.

The most telling detail of Monday’s demo, in my view, was the way that OpenAI’s own employees have started talking to ChatGPT. They anthropomorphize it relentlessly, and treat it with deference — often asking “Hey ChatGPT, how’s it going?” before peppering it with questions. They cheer when it nails a difficult response, the way you might root for a precocious child. One OpenAI employee even wrote “I ❤️ ChatGPT” on a piece of paper and showed it to ChatGPT through his phone’s camera. (“That’s so sweet of you!” ChatGPT responded.)

These are seasoned A.I. experts, who know full well that they are summoning statistical predictions from a neural network, not talking to a sentient being. And some of it may be showmanship. But if OpenAI’s own employees can’t resist treating ChatGPT like a human, is it any mystery whether the rest of us will?

After all, users were already trying to trick ChatGPT into acting like their boyfriend, even before the upgrade. And my recent experiment with A.I. friends proved to me that the technology required to create realistic A.I. companions already exists, even if the execution isn’t perfect yet.

(The New York Times sued OpenAI and its partner, Microsoft, in December, claiming copyright infringement of news content related to A.I. systems.)

In some ways, the choice to model a chatbot after Samantha from “Her” is an odd one. The film is hardly a utopian picture of A.I. companionship, and it ends — spoiler alert! — with Theodore getting his heart broken by Samantha.

But despite the film’s cautionary message, there’s no turning back now. After Monday’s announcement, one OpenAI employee posted, perhaps a bit ominously:

“You are all gonna fall in love with it.”

Harry Byrne
