Ouma Katrina may not be a household name across the African continent, but for those in Southern Africa and among language enthusiasts, her name is a poignant reminder of the urgent need to preserve the Njuu language, South Africa’s oldest indigenous tongue. Like Njuu, many other African languages are critically endangered and face the threat of extinction.
African languages are a treasure of history, culture, and traditional knowledge. However, many of these languages are on the verge of disappearing due to globalization, cross-cultural marriages, urbanization, and the dominance of widely spoken languages such as English and French. According to Kapodgroup, Africa has over 2,000 languages, and losing any one of them would erase a unique cultural identity and worldview.
The most endangered languages are often minority languages, spoken by less than 50% of the population in specific areas. This is why Ouma Katrina has been regarded as a national treasure in South Africa; she was the last remaining native speaker of Njuu following the deaths of her two sisters, Hanna Koper and Griet Seekoei.
Vambo Academy in South Africa was established to provide a platform for people to learn indigenous languages. It initially focused on Zulu, Shona, and Afrikaans. However, the COVID-19 pandemic prompted a shift from physical classes to online learning through Zoom, a transition that proved successful.
This transition laid the foundation for Vambo AI, born out of the increasing demand to digitize African languages. During this year’s EdTech Summit, Linah Kitala, Community Manager for Vambo AI, captured our attention when she reassured a delegate that AI could support his native language.
The conversation was sparked by questions from María José Ogando of AI-for-Education.org, who walked us through the current landscape of AI EdTech products in Sub-Saharan Africa, highlighting the opportunities and challenges as technology continues to evolve. María noted that AI educational products are expanding rapidly, but that there is a gap in quality assurance and in AI benchmark datasets targeting the teaching and learning process.
Few EdTech startups have ventured down the path Vambo has chosen: preserving African culture through technology.
According to African World Initiatives, African languages lag in natural language processing (NLP), large language models (LLMs), and AI research, largely due to a lack of quality datasets. María acknowledged this as one of the challenges they face, pointing again to the shortage of quality assurance and benchmark datasets for teaching and learning.
Documentation is an effective method for preserving endangered languages. AI can play a pivotal role in recording and transcribing spoken languages, making it easier to create dictionaries, grammar guides, and other linguistic resources.
Kabodgroup recommends that written records, transcriptions, and recordings be created for every minority or endangered language to aid in their preservation. NLP tools can then be used to automatically transcribe audio recordings of native speakers, converting spoken words into text with high accuracy.
Another approach involves creating and maintaining large-scale digital archives of African languages, including audio recordings, texts, and cultural artifacts. These archives can be made accessible online, allowing people worldwide to explore and learn about these languages.
Many Africans have moved away from their native homes for socio-political reasons or in search of better opportunities. In such situations, people may lose contact with native speakers and, consequently, fail to learn their ancestral languages. Digital archives can bridge this gap, making it easier for displaced people and others to learn these languages, thereby helping to prevent their extinction.
AI helps to curate and organize these archives, making it easier to search for specific words, phrases, or cultural contexts. Machine learning algorithms can continuously update and expand these archives as new data becomes available, ensuring that languages are preserved for future generations.
This is how Vambo is building its indigenous language platform, continually updating existing content. As Linah said, “Building is a process.”
With many African languages at risk as older generations pass away, Linah emphasized the importance of embracing digital tools to preserve our linguistic heritage. Without these efforts, we risk creating an irreparable void once a language becomes extinct.
To ensure that Africa’s cultural legacy endures, we must pass on our languages to future generations. According to Kapodgroup, a language’s longevity largely depends on the number of its speakers. Preserving and growing our languages digitally is crucial to keeping them alive and relevant for the future.
While it is important to preserve African languages digitally, it is equally important to ensure that LLMs are trained on good-quality local educational content. Digital companies should source their content from local people to avoid passing on incorrect information and losing the true meaning of native African languages. María suggested developing benchmarks that will give guidance on how AI should be used not only to teach but also to preserve African languages.