Artificial Empires: The Race for Swahili AI in East Africa

    As East Africa experiences a digital awakening, a new geopolitical fault line is emerging – not over ports or pipelines, but language models. The rapid proliferation of artificial intelligence (AI) systems trained in Swahili, a lingua franca for over 100 million people, signals not just a linguistic milestone, but a profound struggle for influence in the region. Beneath the promise of digital inclusion lies a strategic contest involving the world’s leading tech powers – each seeking to shape the algorithms that will mediate knowledge, commerce, and governance in Swahili-speaking Africa.

    The Linguistic Frontier of AI

    Swahili is one of the few African languages to receive consistent attention from global tech firms, primarily due to its wide geographic spread across Kenya, Tanzania, Uganda, Rwanda, and parts of the DRC and Mozambique. Yet, until recently, most large language models (LLMs) failed to provide meaningful support for Swahili or other African languages, perpetuating what scholars have called “digital neo-colonialism” – a system where African users remain consumers of foreign technologies, with little influence over the rules that govern their digital lives.

    This is beginning to change. Google’s latest language model, PaLM 2, was trained on over 100 languages, including Swahili, and was designed with a focus on multilingual understanding. Meta has similarly invested in open-source models capable of handling low-resource African languages, while Huawei and Alibaba are quietly developing Swahili-compatible AI interfaces as part of China’s Digital Silk Road. But these efforts are not merely about language support – they are about influence. The race to build Swahili-fluent AI is fast becoming a proxy for deeper geopolitical competition over East Africa’s digital future. As EM Lewis-Jong of Mozilla has noted, “The Anglo‑centrism of the internet is a sort of colonialism in another form.” This critique underscores the urgency of developing language models that reflect Africa’s linguistic plurality – not just as a matter of equity, but of epistemic sovereignty.

    Big Tech’s Expanding Footprint

    As AI becomes embedded in government services, banking, education, and health care, those who control its development gain disproportionate power. According to a Brookings report, emerging technologies in Africa could increase GDP by up to $1.5 trillion by 2030, with AI playing a critical role. However, this economic promise hinges on who sets the ethical, regulatory, and technological standards. The AI systems being rolled out in East Africa are not neutral. As Timnit Gebru has warned, large-scale models developed by US or Chinese companies often replicate the political, economic, and cultural assumptions of their creators. From data collection practices to content moderation policies, these models may inadvertently (or deliberately) marginalize local epistemologies and value systems.

    Moreover, data is infrastructure – and Swahili-speaking populations are rapidly becoming a lucrative source of it. Mobile penetration in Kenya stands above 90%, and internet usage across the region is expanding year-on-year. As Philani Mdingi of Tech for Good has argued, “data is power. And power must be localized… digital colonialism is a new form of extraction.” Without local control over data flows and processing, the promise of AI inclusion risks becoming a new form of dependence.

    China’s Digital Sovereignty Model

    Nowhere is this asymmetry more evident than in China’s approach. Beijing’s state-backed firms are exporting an AI model grounded in “cyber sovereignty” – a principle that aligns digital infrastructure with regime security. In Tanzania and Ethiopia, Huawei has built extensive surveillance systems under the guise of “smart city” projects. These developments echo China’s broader strategy of exporting its techno-authoritarian governance model under the Digital Silk Road. As the African Human Rights Law Journal notes, this model is now being replicated across the continent, threatening freedom of expression and civil liberties. In East Africa, where democratic institutions remain fragile, the risk is acute: Swahili-language AI could be weaponized to monitor dissent, filter information, and manipulate electoral outcomes.

    Europe and the Ethics Race

    In contrast, the European Union is attempting to position itself as a normative power in AI regulation. The EU AI Act categorizes AI applications by risk and mandates transparency, accountability, and non-discrimination – principles that resonate with many African civil society groups. However, the EU lacks the technological muscle and infrastructure investment to match China or the U.S. in Africa. As Deremi Atanda, Managing Director of Remita Payment Services Ltd., emphasizes: “Africa must shed its digital dark back.” This metaphor highlights the pressing need for digital infrastructure development, which is fundamental for economic growth and innovation on the continent. Still, partnerships are emerging. UNESCO, under the UN’s umbrella, has launched the first AI Needs Assessment Survey in Africa, identifying language gaps, regulatory deficits, and opportunities for South-South collaboration. These efforts may lay the groundwork for more inclusive AI ecosystems, but they remain nascent.

    Indigenous Innovation or Algorithmic Dependency?

    The stakes are not merely symbolic. As the Portulans Institute argues, the adoption of foreign AI systems in the Global South risks deepening digital dependency. Without investment in local computing power, datasets, and research institutions, East African countries may find themselves permanently positioned as algorithmic peripheries – consumers of foreign logic embedded in black-box systems. Initiatives like Deep Learning Indaba offer a glimmer of resistance. Based in Africa and led by African researchers, Indaba promotes open, equitable AI grounded in local knowledge systems. Other grassroots efforts such as Masakhane, an open-source NLP collective, and Makerere University’s AI lab in Uganda, show that homegrown talent is not in short supply – but it is under-resourced. These initiatives face persistent obstacles: limited infrastructure, lack of computer access, and dependence on foreign grants that often dictate research priorities. Without sovereign investment strategies – like those now under consideration by Rwanda’s Ministry of Innovation and ICT or Ghana’s AI policy task force – African countries risk reinforcing the very dependencies they seek to escape. As Lorna Omondi of Google Research Africa observed at the 2025 Africa Soft Power Summit, “When we buy language data from the ecosystem, we partner with local universities, local startups, local governments… AI must solve the problems that matter most to African users.” This signals a growing awareness within global firms of the need to embed local perspectives. Such efforts remain the exception rather than the norm, however, which underscores the urgency of shifting power toward African-led AI ecosystems.

    Strategic Implications

    The geopolitical implications of Swahili-language AI are manifold. First, whoever builds the dominant Swahili model will shape not just speech recognition and translation tools, but how East Africans access knowledge, conduct business, and interact with the state. Second, control over these systems enables subtle forms of influence – from narrative shaping to surveillance – without deploying troops or installing bases. As Ambassador Philip Thigo of Kenya warned, “If Africa does not define its own AI future, others will, and they are already trying to define it for us.” In this light, the struggle over Swahili-language AI becomes more than technological – it is a contest over self-determination in the digital age. In short, the race to dominate Swahili AI is a microcosm of broader shifts in digital power. As in previous colonial eras, language remains a vector of control – only this time, it is mediated through neural networks, data lakes, and machine learning algorithms.

    East Africa stands at an inflection point – the decisions made now will determine whether the region achieves true digital sovereignty or deepens its technological dependence. This will require more than just regulating foreign AI providers. It demands investment in the technical and institutional foundations for building indigenous systems – by Africans, for Africans.

    Governments – such as Kenya’s Ministry of ICT, which promotes AI localization in its Digital Economy Blueprint – must demand transparency in AI procurement, mandate Swahili and other local language support, and create open data infrastructures that serve public, not private, interests. Swahili-language AI must not become the next frontier for artificial empires. If left unchecked, the very tools meant to empower could be used to surveil, censor, and control. But if reimagined with local agency, these systems can become vessels of cultural expression and democratic resilience. In the end, it’s not just about who writes the code – but whose realities it reflects.

    The opinions expressed and the arguments put forward in this article remain the sole responsibility of the author and do not necessarily reflect those of CETRI.
