Traveling can be intimidating if you don’t speak the local language, especially when a lot of translation apps seem to butcher everything you’re trying to say. Now, Google is seeking to improve AI translation with its new model, Translatotron.
In the company’s announcement, Google describes Translatotron as an end-to-end, speech-to-speech translation model. Essentially, Google is hoping to ditch the usual cascade of speech-to-text, text translation, and then text-to-speech, which is what Google Translate currently does.
Instead, Translatotron uses a single neural network to translate speech directly into speech, skipping the intermediate step of converting audio to text. What makes the program even cooler is that it can retain the voice of the original speaker after translation.
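To make the difference concrete, here is a minimal toy sketch in Python. It is not Google's code: the `Audio` class, the one-word dictionary, and every function name are hypothetical stand-ins, and the "models" are simple lookups. It only illustrates the shape of the two approaches, with the cascade passing through text and the direct path going from speech to speech in one step.

```python
from dataclasses import dataclass

# Hypothetical toy types and data; none of this is a real Google API.
@dataclass(frozen=True)
class Audio:
    language: str  # language being spoken
    words: str     # stand-in for the spoken content
    voice: str     # who is speaking

TOY_DICTIONARY = {"hola": "hello"}

# --- Cascade: three separate stages with text in the middle ---
def speech_to_text(audio: Audio) -> str:
    return audio.words

def translate_text(text: str) -> str:
    # Unknown words (for example, names) pass through unchanged
    return TOY_DICTIONARY.get(text, text)

def text_to_speech(text: str) -> Audio:
    # The speaker's identity was lost at the text stage, so a generic voice is used
    return Audio(language="en", words=text, voice="synthetic")

def cascade_translate(audio: Audio) -> Audio:
    return text_to_speech(translate_text(speech_to_text(audio)))

# --- Direct: one step from speech to speech (assumed toy behavior) ---
def direct_translate(audio: Audio) -> Audio:
    translated = TOY_DICTIONARY.get(audio.words, audio.words)
    return Audio(language="en", words=translated, voice=audio.voice)

source = Audio(language="es", words="hola", voice="Ana")
print(cascade_translate(source))  # voice becomes "synthetic"
print(direct_translate(source))   # voice stays "Ana"
```

In this toy version, the cascade loses the speaker's voice because text carries no voice information, while the direct path can carry it through.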
Ye Jia and Ron Weiss, software engineers at Google AI, wrote in the announcement:
“To the best of our knowledge, Translatotron is the first end-to-end model that can directly translate speech from one language into speech in another language. It is also able to retain the source speaker’s voice in the translated speech. We hope that this work can serve as a starting point for future research on end-to-end speech-to-speech translation systems.”
There are other benefits to Google’s new AI model as well. The company claims it offers faster inference, avoids the errors that compound between speech recognition and translation, and is much better at handling words that don’t need translation, such as names and proper nouns.
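As a rough illustration of what "compounding errors" means, the Python sketch below multiplies per-stage success rates in a multi-stage pipeline. The 95% figures are made up for the example, the function name is hypothetical, and the calculation assumes each stage fails independently of the others. It says nothing about how accurate any real system is.

```python
import math

def cascade_accuracy(stage_accuracies):
    """Chance that every stage succeeds, assuming independent stage errors."""
    return math.prod(stage_accuracies)

# Made-up numbers: three stages that are each right 95% of the time
stages = [0.95, 0.95, 0.95]
overall = cascade_accuracy(stages)
print(f"{overall:.3f}")  # 0.857
```

Even when each stage looks reliable on its own, a mistake in an early stage is passed on to the later ones, so the chain as a whole is less reliable than any single link.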