SeamlessM4T is a multilingual, multitask model from Meta that handles translation and transcription across both speech and text.
The internet, mobile devices, social media, and communication platforms have made multilingual content more accessible than ever. SeamlessM4T aims to realise the vision of seamless communication and comprehension across languages.
The model supports:
- Automatic speech recognition for nearly 100 languages
- Speech-to-text translation supporting nearly 100 input and output languages
- Speech-to-speech translation for nearly 100 input languages and 35 (including English) output languages
- Text-to-text translation for nearly 100 languages
- Text-to-speech translation for nearly 100 input languages and 35 (including English) output languages
Meta has also released the metadata of SeamlessAlign, the largest multimodal translation dataset compiled to date, comprising 270,000 hours of mined speech and text alignments. The release lets the community carry out independent data mining and further research. To ensure the accuracy and safety of the system, Meta adheres to a responsible AI framework.
As the world becomes more connected, SeamlessM4T's ability to work across language barriers brings us closer to a future in which people can understand each other regardless of the language they speak.
Source: AI NEWS