Baidu’s Deep Voice 2 text-to-speech engine can imitate hundreds of human accents

Baidu, the Beijing-based juggernaut that commands 80 percent of the Chinese internet search market, is investing heavily in artificial intelligence. In 2013, it opened the Institute of Deep Learning, an R&D center focused on machine learning. And in May, it took the wraps off the newest version of Deep Voice, its AI-powered text-to-speech engine.

Deep Voice 2, which follows on the heels of Deep Voice’s public debut earlier this year, can produce real-time speech that’s nearly indistinguishable from a human voice. All the more impressive, it needs just thirty minutes of audio to build a working model, and can imitate the regional accents of hundreds of different speakers.

That’s leaps and bounds better than early versions of Deep Voice, which took multiple hours to learn one voice.

The key is Deep Voice 2’s ability to identify similarities between hundreds of different speakers and build a shared working model of a human voice. It then derives unique voices from that model autonomously. Voice assistants like Apple’s Siri require a human to record thousands of hours of speech, which engineers then tune by hand; Deep Voice 2 needs no such guidance or manual intervention.

“Give it the right data, and it can learn on [its] own what sort of features are important,” Andrew Gibiansky, a research scientist at Baidu’s Silicon Valley AI Lab, told The Verge.

Baidu isn’t the only company investing in high-quality text-to-speech tech. Google’s WaveNet, a product of the company’s DeepMind division, generates voices by sampling real human speech and independently creating its own sounds in a variety of voices. Adobe’s Project VoCo transcribes human speech to editable text in real time. And Lyrebird, a Canadian AI startup, licenses algorithms that can imitate any voice from a single minute of sample audio, generate one thousand sentences in less than half a second, and infuse the speech they create with emotions like anger, sympathy, and stress.

But don’t expect Deep Voice 2 or WaveNet to replace Siri, the Google Assistant, or Amazon’s Alexa anytime soon: AI-powered speech synthesis requires more resources than today’s phones can reasonably supply. Still, Baidu sees potential in applications like text-to-speech apps and voice-based assistants: “The ability to quickly synthesize multiple human voices will have a huge effect on products such as personal assistants and eBook readers in the future. For example, each character of your eBook could have a unique voice when you listen to the eBook.”

Kyle Wiggers