Meta’s new MMS model wants to save world’s languages with AI – Times of India

By Andrew McCollum On May 23, 2023

Like most other big tech companies, Meta has been betting big on artificial intelligence (AI), though its approach has been a bit different from that of Google and Microsoft. The company has now unveiled a new AI speech model with the aim of preserving the world’s languages. Called Massively Multilingual Speech (MMS), the model expands “text-to-speech and speech-to-text technology from around 100 languages to more than 1,100 — more than 10 times as many as before — and can also identify more than 4,000 spoken languages, 40 times more than before.”
How will the model be used?
According to a blog post by Meta, many of the world’s languages are in danger of disappearing, and the limitations of current speech recognition and generation technology will only accelerate this trend. “We want to make it easier for people to access information and use devices in their preferred language, and today we’re announcing a series of artificial intelligence (AI) models that could help them do just that,” the company said in the blog post.
How will the model work?
Meta said that the biggest challenge was collecting audio data for thousands of languages. The largest existing speech datasets cover 100 languages at most, the company explained. To get around this, Meta turned to religious texts, such as the Bible, which have been translated into many different languages and whose translations have been widely studied for text-based language translation research. These translations also have publicly available audio recordings of people reading the texts in different languages. “As part of the MMS project, we created a dataset of readings of the New Testament in more than 1,100 languages, which provided on average 32 hours of data per language,” the company said.
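To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above (more than 1,100 languages, an average of 32 hours each, and around 100 languages covered previously). The variable and function names are illustrative, and because the quoted figures are a lower bound and an average, the result is an approximation and not an official total from Meta.

```python
# Figures quoted in the article; treated as a lower bound and an average.
LABELED_LANGUAGES = 1_100       # "more than 1,100 languages"
AVG_HOURS_PER_LANGUAGE = 32     # "on average 32 hours of data per language"
PRIOR_LANGUAGES = 100           # "around 100 languages" covered before


def estimate_total_hours(languages: int, avg_hours: float) -> float:
    """Approximate total audio as language count times average hours."""
    return languages * avg_hours


total_hours = estimate_total_hours(LABELED_LANGUAGES, AVG_HOURS_PER_LANGUAGE)
coverage_ratio = LABELED_LANGUAGES / PRIOR_LANGUAGES

print(f"Approximate labeled audio: {total_hours:,.0f} hours")
print(f"Language coverage vs. before: {coverage_ratio:.0f}x")
```

On these assumptions, the labeled New Testament readings come to roughly 35,000 hours of audio, and the 11-fold jump in coverage is consistent with Meta’s “more than 10 times as many as before.”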
Meta then considered unlabeled recordings of various other Christian religious readings and increased the number of languages available to more than 4,000. “While this data is from a specific domain and is often read by male speakers, our analysis shows that our models perform equally well for male and female voices. And while the content of the audio recordings is religious, our analysis shows that this doesn’t bias the model to produce more religious language,” the company said.
