
Synthetic voices: a brief history (to help your marketing strategy)

Imagine a machine that can talk just like a human being, understanding our questions and replying with meaningful information. Until not too long ago, this was pure fantasy, no more real than HAL 9000 from 2001: A Space Odyssey.

Most likely, nobody would have imagined that, in such a short time, we would be using similar technology in our everyday lives, with Siri and Google Voice being just the most famous examples. In fact, having a conversation with our phone is now quite an ordinary activity.

The dream of building machines that can talk to humans is anything but recent. The first attempts at creating devices able to fulfill this dream date back to around the year 1000, and tradition credits them to the visionary genius of Gerbert of Aurillac and, in the 13th century, Albertus Magnus and Roger Bacon.

There was also a Danish scientist, Christian Kratzenstein, who in the 18th century built a machine able to reproduce the five vowel sounds. During the same century, the ‘Talking Machine’ came to life, built by Wolfgang von Kempelen, an inventor born in Pressburg (today’s Bratislava, Slovakia). It could recite up to thirty different meaningful sentences.

In this centuries-long history, there is also room for a bizarre illusionist born in Vienna in 1873: Joseffy the magician, who built an extremely realistic copper skull that was able to ‘reply’ to basic questions.

In the 1930s, when Joseffy was still alive, the first vocoder was developed at Bell Labs in the United States. A vocoder is a type of voice codec that analyzes and synthesizes the human voice. The sounds that came out were metallic and inhuman, but the sentences were perfectly understandable.

Finally, the era of the first real computers arrived, and with it the earliest research on text-to-speech synthesis. Today, many decades later, text-to-speech remains the most advanced technology for producing artificial human speech. It’s the technology behind most of the synthetic voices you hear around you.

Today, artificial voices are everywhere, but only recently have they become truly difficult to distinguish from human ones. Marketers cannot ignore this potential, which has yet to be fully explored but looks more and more like the next big thing.

But synthetic voices today are not just the ones in our smartphones. The ability to add a pleasing and engaging voice to audio or video messages on a large scale represents immense potential for marketing and communication professionals.

We have long been used to listening to voiceovers in radio and television shows, movie trailers, and advertising, so we are very familiar with the impact that the right voice can have in terms of persuasion and engagement. Now, for the first time, you can take advantage of the same power without needing an expensive professional voiceover artist. An artificial voice with the right tone, cadence, and mood can make any message more powerful and entertaining.

Here is a concrete example: Canadian researchers manipulated political leaders’ voices and monitored how the alteration affected voters. Guess what? Tone of voice alone, without any change to the message’s content, significantly affected the audience’s reactions and opinions.

Just imagine transferring this potential to the world of marketing (which, as we have been told, can have a lot in common with politics), and it becomes clear how crucial the right voice is to an effective message.

Images matter. Words matter. But do not underestimate what a fascinating voice can do.

But how best to leverage this innovative, ever-changing digital scenario? Today, you don’t need to be a bizarre, brilliant inventor to take advantage of synthetic voices. You just need to find the right platform for your “digital” voiceovers.

To help you, we have created Pvideo, a platform that can generate personalized videos with synthetic voices based on the latest text-to-speech technology. Starting from your “rough databases”, full of numbers, addresses, names, and all kinds of figures, our software can automatically generate clear, sharp audio. You are free to choose the voice’s characteristics, including gender, age, tone, and mood.
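For the more technically curious, here is a rough idea of what "from raw data to personalized voiceover" can look like in practice. This is only an illustrative sketch in Python, not Pvideo's actual code or API: the `VoiceProfile` class, the `SCRIPT` template, the field names, and the sample records are all hypothetical, and the step that hands each script to a text-to-speech engine is deliberately left out.

```python
from dataclasses import dataclass
from string import Template


@dataclass(frozen=True)
class VoiceProfile:
    """Hypothetical voice settings, named after the characteristics listed above."""
    gender: str
    age: str
    tone: str
    mood: str


# Hypothetical script template; $-placeholders are filled from one database row.
SCRIPT = Template("Hello $name, your bill of $amount euros is due on $due_date.")


def build_voiceover_jobs(rows, voice):
    """Turn raw records into one personalized script per customer.

    Each job pairs the text to be spoken with the chosen voice profile.
    Sending the job to a text-to-speech engine is out of scope here.
    """
    jobs = []
    for row in rows:
        text = SCRIPT.substitute(row)  # raises KeyError if a field is missing
        jobs.append({"text": text, "voice": voice})
    return jobs


# Made-up sample records standing in for a "rough database".
rows = [
    {"name": "Anna", "amount": "42.50", "due_date": "March 31"},
    {"name": "Luca", "amount": "18.00", "due_date": "April 15"},
]
voice = VoiceProfile(gender="female", age="adult", tone="warm", mood="friendly")

for job in build_voiceover_jobs(rows, voice):
    print(job["text"])
```

Every customer gets the same template and the same voice, but hears their own name, amount, and date. That is the basic idea behind personalization at scale.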

This allows our customers to easily create unique, localized, and personalized content for their end users, all at affordable costs and without the hassle. Take a look at the TARI project here.

And this is only a small part of the magic that Pvideo can deliver. In comparison, Joseffy the magician was just a beginner…



