How can technology benefit language diversity?
A quick look at the ways new technologies like AI and machine learning can open up access to knowledge in more languages.
When we celebrate International Mother Language Day, we celebrate the huge diversity of the world’s (roughly) 7,000 languages. Every one of these languages is important and offers its speakers a unique world viewpoint and cultural identity.
But in the interconnected, digitalized world of today, not all languages are created equal. Only a handful of so-called ‘mainstream’ languages dominate the internet, and it is often difficult for speakers of minority languages to access content in their mother tongue.
Happily, as technology evolves, so does its ability to serve a greater number of languages. As more languages are brought into the world of AI and machine learning, the goal of language equality gets nearer.
The start of the AI revolution
Although artificial intelligence has been creeping into our lives for a while, it is only in the last 18 months – since the arrival of ChatGPT – that the wider public has sat up and taken notice of it. The remarkable abilities of this large language model have reached people’s laptops, desks and projects, and the world is still discovering what it, and other similar models, can do.
Of course, the drawbacks have been well documented and we now understand that generative AI has a long way to go before it can replace the subtle abilities of the human brain. However, the potential in the sphere of language is enormous.
AI and language learning
Chatbots like Bard and ChatGPT can be great language-learning tools and are now being used as companions for extra practice, like a modern-day pen pal. Large language models can provide endless and varied conversational scenarios and don’t have to stick to a pre-defined script like previous online methods.
Many language learning apps have now been built on top of these AI models, and even the best-known name in the sector, Duolingo, has partnered with OpenAI to incorporate GPT-4 technology into its products.
For indigenous languages the situation is more complex because so little data is available to train AI on. There is, however, encouraging progress as researchers and indigenous language activists develop new ways of training algorithms to learn minority tongues. Duolingo also now offers courses in Navajo, Yiddish and Hawaiian, with te reo Māori promised soon.
In the classroom, teachers have been understandably reluctant to use AI technology, but attitudes are slowly changing as education professionals begin to see the possibilities of this innovative tech. Language models can, for example, help non-native students with grammar and sentence structure or with translating unfamiliar phrases, and learning can be personalized to suit the individual. Excitingly, the possibility of using multilingual virtual and augmented reality devices in teaching is also on the horizon.
Machine translation progress
New AI technologies are giving a big boost to existing machine translation tech, which has itself improved significantly over recent years. Language service providers now routinely offer machine translation for certain types of text and have been absorbing it into their workflows for a while. As with language learning, though, the number of languages on offer has historically been limited to a handful of widely used tongues. AI is changing this.
Big Tech has been quick to harness artificial intelligence for translation. Meta says that the research models from its Massively Multilingual Speech project can now identify over 4,000 spoken languages, and that one of the project’s core objectives is ‘to make it easier for people to access information and use devices in their preferred language’. Because Meta’s model is a text-to-speech and speech-to-text platform, it has many potential benefits for languages that only exist in oral form.
Google and Microsoft have also declared that increasing access to minority languages is one of their targets. Google has, for example, developed a dataset of over 400 languages, designed to bridge some of the gap between available training data and machine translation models. Microsoft has been working on providing more Indian languages for its Translator platform so that it reaches a greater proportion of the subcontinent’s population.
Other projects also exist. Clear Global, which encompasses the Translators Without Borders organization, is combining its technology and non-profit expertise to bring language resources to the ‘4 billion people who can’t connect and get information in their language’. Masakhane is a project aiming ‘to spur NLP research in African languages, for Africans, by Africans’ and to bring digital power to more of the 2,000 languages of the African continent.
AI can help more languages go digital
The digital language divide may still be wide, but it is starting to close. As technology progresses, so do the possibilities for including a much greater number of the world’s languages in AI and machine learning programs.
What AI and technology could do for language equality is thrilling. There is still a long way to go, but the hope is that the happy coupling of language and technology will soon start to have a positive impact on the lives of more people across the planet.
t-works continues to embrace language technology wherever it improves our workflows and gives our customers the most value. Talk to us today about how machine translation could help your language projects.