Meta’s new AI models can recognize and produce speech for more than 1,000 languages

There are around 7,000 languages in the world, but existing speech recognition models cover only about 100 of them comprehensively. This is because these kinds of models tend to require huge amounts of labeled training data, which is available for only a small number of languages, including English, Spanish, and Chinese.

Meta researchers got around this problem by retraining an existing AI model, developed by the company in 2020, that can learn speech patterns from audio without requiring large amounts of labeled data, such as transcripts.

They trained it on two new data sets: one that contains audio recordings of the New Testament and its corresponding text, taken from the internet, in 1,107 languages, and another containing unlabeled New Testament audio recordings in 3,809 languages. The team processed the speech audio and the text data to improve their quality before running an algorithm designed to align audio recordings with the accompanying text. They then repeated this process with a second algorithm trained on the newly aligned data. With this method, the researchers were able to teach the algorithm to learn a new language more easily, even without the accompanying text.

“We can use what that model learned to then quickly build speech systems with very, very little data,” says Michael Auli, a research scientist at Meta who worked on the project.

“For English, we have lots and lots of good data sets, and we have that for a few more languages, but we just don’t have that for languages that are spoken by, say, 1,000 people.” 

The researchers say their models can converse in over 1,000 languages but recognize more than 4,000.

They compared the models with those from rival companies, including OpenAI’s Whisper, and claim theirs had half the error rate, despite covering 11 times more languages.

However, the team warns that the model is still at risk of mistranscribing certain words or phrases, which could result in inaccurate or potentially offensive labels. They also acknowledge that their speech recognition models yielded more biased words than other models, albeit only 0.7% more.

While the scope of the research is impressive, the use of religious texts to train AI models can be controversial, says Chris Emezue, a researcher at Masakhane, an organization working on natural-language processing for African languages, who was not involved in the project.

“The Bible has a lot of bias and misrepresentations,” he says.
