Although 2,000 of the world’s languages are African, their representation in technology remains minimal. This gap can be linked to the diversity of African languages, limited written resources, and colonialism. The result is inadequate representation in scientific research and technological advances, as well as algorithmic bias.
Masakhane, a pan-African initiative, addresses this challenge by creating open-source AI models for low-resource African languages such as Amharic, IsiXhosa, and Yoruba. Masakhane Web, which features a neural machine translation (NMT) system for African languages, produces reasonably accurate translations. The initiative fosters collaboration among African researchers, linguists, and data scientists, enabling them to develop, test, and refine the models and to keep them culturally grounded.
Having published more than 49 translation results for more than 38 African languages, Masakhane stands as one of the largest efforts to expand AI capabilities for African languages. The initiative exemplifies AI’s potential to preserve African languages and ensure their representation in technology.