Meta AI researchers announced that they have developed a new suite of artificial intelligence models called Seamless Communication that aim to enable more natural and authentic communication across languages — essentially making the concept of a Universal Speech Translator a reality. The models were publicly released this week along with research papers and accompanying data.
The flagship model, called Seamless, merges capabilities from three other models — SeamlessExpressive, SeamlessStreaming, and SeamlessM4T v2 — into one unified system. According to the research paper, Seamless is “the first publicly available system that unlocks expressive cross-lingual communication in real-time.”
The Seamless translator represents a new frontier in the use of AI for communication across languages. It combines three sophisticated neural network models to enable real-time translation between more than 100 spoken and written languages while preserving the vocal style, emotion, and prosody of the speaker’s voice.
The models’ capabilities could enable new voice-based communication experiences, from real-time multilingual conversations using smart glasses to automatically dubbed videos and podcasts. The researchers suggest the models could also help break down language barriers for immigrants and others who struggle with communication.