At an AI keynote event in New York City today, Google announced plans to expand its already extensive language portfolio tenfold using artificial intelligence.
The 1,000 Languages Initiative is Google’s commitment to build an AI model that supports the 1,000 most spoken languages across the globe to make information more accessible.
“Language is fundamental to how people communicate and make sense of the world,” said Jeff Dean, Google Senior Fellow. “But more than 7,000 languages are spoken around the world, and only a few are well represented online today.”
Because the undertaking is extremely ambitious, the project will likely take many years to come to fruition. However, Google has already begun working toward the goal.
The tech giant has developed a Universal Speech Model (USM) trained on more than 400 languages, giving it the largest language coverage of any speech model to date, according to a blog post. Google is also partnering with communities around the world to source speech data.
Google’s attention to expanding its language capabilities is nothing new. Recently, Google added 24 more languages to its Google Translate platform and enabled voice typing for nine more African languages on Gboard.
Google is also working with local governments, NGOs and academic institutions in South Asia to collect audio samples of different dialects throughout the region.
Other major tech companies are also building large language models. In July, Meta announced an AI model called No Language Left Behind, which can translate across 200 languages.
Meta’s effort was likewise intended to bring content to communities that are otherwise not represented on the web. Its AI model includes translations for 55 African languages, a significant advance, since fewer than 25 African languages are supported by widely used translation tools.