OpenAI is gradually rolling out access to ChatGPT Advanced Voice, its advanced voice assistant, to a “select group” of ChatGPT Plus subscribers. All paying users will be able to hold a natural-language conversation with the AI by the end of the year, but for now only a lucky few will benefit.
Late last week, my account was flagged as one of those able to chat with an emotionally sensitive, hyperactive, Yoda-like artificial voice. After a full weekend with Advanced Voice, I can say it's better and more expressive than the demos suggested.
One of the most notable features is the ability to interrupt the AI and have it react immediately to a change in direction. For example, I had it tell a story about London's Paddington Station in a Yoda voice, then interrupted it and had it count quickly to 100 instead.
What stands out, though, is how “human” Advanced Voice feels compared to all the other AI voice assistants I’ve tried. Talking to it feels natural, and its voice reacts and adapts in pitch and even speed to your voice as you speak to it.
I understand why OpenAI is concerned about people developing an emotional attachment to the AI voice. The voice, combined with natural language and GPT-4o's knowledge, makes for a great experience.
Advanced Voice is a great storyteller
ChatGPT with GPT-4o is already very good at writing stories. With the addition of the speech capability in Advanced Voice, it has also become a brilliant storyteller, able to adapt stories on the fly and even add multiple voices and energy levels.
I started by asking it to tell me a story about an AI that gains sentience, and it began well, sounding a lot like an audiobook. I then asked it to add things like space travel and math equations and real science. Then I had it “talk like the vampire Yoda,” and that’s exactly what it did. It sounds exactly like you’d imagine.
Then I had it tell me a story about the first humans on Mars discovering something unexpected, and asked it to include sound effects, which it did, though sparingly.
I also had to ask it to be more dramatic in its reading, but it did so perfectly. It can also create a “do it yourself” story in which you direct the plot. In mine, I asked it to find a human skeleton.
Advanced Voice as a city guide
When I have to go to London for work, sometimes it's nice to take a walk and explore the surroundings. My office is near Paddington Station, so I asked ChatGPT Advanced Voice to provide me with information about different sites and places.
This feature will become more useful as OpenAI integrates SearchGPT and other live data features into Voice.
Even without live data, its training dataset is recent enough to tell me about the Paddington Bear statue, the history of the station, and even details of its unique architecture.
Advanced Voice as a personal trainer
After decades of avoiding any form of exercise, I finally decided to get back in shape. I have a personal trainer and go to the gym regularly. I also traded my cherry addiction for water and eat healthier in general.
I was doing an intense workout, so I asked Advanced Voice for advice. It guided me through a stretching exercise, even counting down from 10 to show me how long I should hold a specific position or stretch.
It also introduced me to different healthy recipe ideas. It motivated me while I was on the treadmill, offering regular words of encouragement and adapting its tone and energy level from gentle persuasion to full-blown drill sergeant.
Final thoughts
I don't think I've even scratched the surface of what's possible with Advanced Voice. It will become much more useful once I can access it simply by saying “Hey, ChatGPT” or pressing a button on my phone. I hope Apple comes up with alternatives to Siri in the future.
When I first got access, I did all the silly things you'd expect, including having it try different voices, talk like Yoda, and count quickly. I also had it try singing, speaking in different languages, and doing a short stand-up routine about space. I'm not getting a Netflix special.
But as I used it more, I found that it had become my default way of searching for information or interacting with my phone. At the grocery store, I used it to track what I was buying and even to suggest alternative ingredients.
When I was walking around and curious about a building, I found myself asking Advanced Voice rather than typing into Google or ChatGPT.
An interface this natural and responsive, one you can easily interrupt and redirect mid-conversation, is a huge leap forward in computer interaction, and one that has been expected for decades. It is comparable to the arrival of the mouse and the touch screen.