LATAM-GPT is a groundbreaking large language model developed by the National Center for Artificial Intelligence (CENIA) in Chile, in partnership with over thirty institutions and twelve Latin American countries. The initiative aims to create an open-source AI model that reflects the region’s diverse cultures, languages—including Spanish, Portuguese, and Indigenous tongues—and social realities. Using ethically sourced, regionally contributed data, LATAM-GPT seeks to overcome the limitations and biases of global AI models predominantly trained on English data. The project is designed to empower local communities, preserve linguistic diversity, and support applications in education, public services, and beyond. Its development highlights the importance of ethical data practices, regional collaboration, and policy frameworks that foster inclusive, representative AI.
A video version of this case study is available below.
1. Background: Digital Inequality and AI in Latin America
Latin America faces unique challenges in digital inclusion, with significant linguistic, cultural, and infrastructural diversity across the region. Global AI models often fail to capture local nuances, leading to inaccuracies and reinforcing stereotypes. Many Indigenous and minority languages are underrepresented in mainstream technology, exacerbating digital exclusion. LATAM-GPT addresses these gaps by building a model tailored to the region’s needs, aiming to democratize access to advanced AI and promote technological sovereignty.
2. Technology and Approach
LATAM-GPT is based on Llama 3, Meta’s openly released family of large language models. The model is trained on more than 8 terabytes of regionally sourced text, encompassing Spanish, Portuguese, and Indigenous languages such as Rapa Nui. Training is conducted on a distributed network of computers across Latin America, including facilities at the University of Tarapacá in Chile, as well as on cloud-based platforms. Because the project is open source, local developers and researchers can inspect the model, adapt it, and take part in its development.
3. Project Overview
The project is coordinated by CENIA with support from the Chilean government, the regional development bank CAF, Amazon Web Services, and over thirty regional organizations. LATAM-GPT’s primary objective is to serve as a foundation for culturally relevant AI applications—such as chatbots, virtual public service assistants, and educational tools—rather than directly competing with global consumer products like ChatGPT. A key focus is the preservation and revitalization of Indigenous languages, with the first translation tools already developed for Rapa Nui and plans to expand to other languages.
4. Data Sources and Key Resources
LATAM-GPT uses ethically sourced data contributed by governments, universities, libraries, archives, and community organizations across Latin America. This includes official documents, public records, literature, historical materials, and Indigenous language texts. All data is curated with attention to privacy, contributor consent, and cultural sensitivity. Unlike many global AI models, LATAM-GPT publishes its list of data sources, emphasizing transparency and ethical data governance.
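To make the idea of a published source list concrete, the following is a minimal sketch of how such a manifest could be assembled. It is not LATAM-GPT's actual format: the `SourceRecord` fields, the `build_manifest` function, and the example entries are all hypothetical and chosen only for illustration.

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SourceRecord:
    """One contributed data source (hypothetical schema, not the project's)."""
    name: str
    contributor_type: str  # e.g. "government", "university", "library"
    country: str
    language: str


def build_manifest(records):
    """Summarize the sources and list them alphabetically by name."""
    by_type = Counter(r.contributor_type for r in records)
    return {
        "source_count": len(records),
        "by_contributor_type": dict(sorted(by_type.items())),
        "sources": [asdict(r) for r in sorted(records, key=lambda r: r.name)],
    }


def manifest_to_json(records):
    """Serialize the manifest; ensure_ascii=False keeps accented text readable."""
    return json.dumps(build_manifest(records), ensure_ascii=False, indent=2)


if __name__ == "__main__":
    # Invented example entries, for illustration only.
    example = [
        SourceRecord("Example national archive", "archive", "Chile", "Spanish"),
        SourceRecord("Example university press", "university", "Brazil", "Portuguese"),
    ]
    print(manifest_to_json(example))
```

Publishing a machine-readable list of this kind would let outside researchers check which institutions and languages are represented without needing access to the underlying texts.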
5. Legal and Ethical Challenges
Copyright and Licensing:
The project relies on open-access and properly licensed materials, with explicit permissions from data contributors. This approach avoids the legal uncertainties faced by models that scrape data indiscriminately from the internet.
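The licensing rule described above can be sketched as a simple corpus filter. This is an illustration under stated assumptions, not the project's pipeline: the `Document` fields, the `ALLOWED_LICENSES` set, and the rule that a document qualifies if it carries an open license or a recorded contributor permission are all hypothetical.

```python
from dataclasses import dataclass

# Illustrative set of open licenses (SPDX-style identifiers); an assumption,
# not the list the project actually uses.
ALLOWED_LICENSES = {"CC0-1.0", "CC-BY-4.0", "CC-BY-SA-4.0"}


@dataclass(frozen=True)
class Document:
    """A candidate training document with its rights metadata (hypothetical)."""
    source: str
    license: str
    has_contributor_permission: bool
    text: str


def is_usable(doc):
    """Assumed rule: keep a document if it is openly licensed OR the
    contributor has given explicit permission."""
    return doc.license in ALLOWED_LICENSES or doc.has_contributor_permission


def filter_corpus(docs):
    """Return only the documents that pass the rights check, in order."""
    return [d for d in docs if is_usable(d)]
```

The point of the design is that rights metadata travels with each document, so anything without a clear license or permission is excluded by default rather than included by default.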
Data Privacy and Consent:
CENIA and its partners ensure that sensitive personal information is anonymized or excluded, and that data collection respects the rights and wishes of contributors, especially Indigenous communities.
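As a rough illustration of what anonymizing personal information can involve, the sketch below replaces email addresses and phone-like number sequences with placeholder tokens. The patterns and the `redact_pii` function are hypothetical and deliberately crude; they are not the method CENIA uses, and a production pipeline would need locale-specific rules, coverage of names and national ID numbers, and human review.

```python
import re

# Crude, illustrative patterns (assumptions, not the project's rules).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Matches a digit run of at least 9 characters with optional separators.
# It will also catch some non-phone numbers, such as ISO dates.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact_pii(text: str) -> str:
    """Replace emails and phone-like sequences with placeholder tokens."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


if __name__ == "__main__":
    # Invented contact details, for illustration only.
    print(redact_pii("Contacto: ana@example.cl, +56 9 1234 5678."))
```

Pattern matching of this kind only covers the easy cases. Respecting the wishes of contributors, especially Indigenous communities, is a governance question that code alone does not settle.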
Inclusivity and Bias:
By prioritizing local languages and cultural contexts, LATAM-GPT aims to reduce biases inherent in global models. Ongoing community engagement and feedback are integral to the model’s development and evaluation.
6. International and Regional Collaboration
LATAM-GPT exemplifies pan-regional cooperation, with twelve countries and over thirty institutions contributing expertise, data, and infrastructure. The project has also engaged international partners and multilateral organizations, such as the Organization of American States and the Inter-American Development Bank, to support its mission of technological empowerment and digital inclusion.
7. Emerging Technology and Policy Issues
LATAM-GPT’s open-source model sets a precedent for responsible AI development, emphasizing transparency, ethical data use, and regional self-determination. The project also highlights the need for robust digital infrastructure and continued investment to ensure equitable access to AI across Latin America. As with all large language models, ongoing attention to potential biases, data privacy, and the impact on local labor and education systems is essential.
8. National and Regional Legal Frameworks
While LATAM-GPT’s ethical sourcing and licensing practices minimize legal risks, the project underscores the importance of harmonized copyright and data protection laws across Latin America. Policymakers are encouraged to develop frameworks that facilitate data sharing for socially beneficial AI, protect Indigenous knowledge, and promote open science.
9. Contractual or Policy Barriers
Some challenges remain in securing permissions for certain data sources, particularly from private publishers or institutions with restrictive contracts. The project’s commitment to open licensing and community engagement helps mitigate these barriers, but continued advocacy is needed to expand access to valuable regional content.
10. Conclusions
LATAM-GPT represents a major step forward in creating culturally sensitive, inclusive AI for Latin America. By centering ethical data practices, regional collaboration, and linguistic diversity, the project offers a model for other regions seeking to decolonize AI and ensure technology serves local needs. Continued investment, policy reform, and community participation will be crucial to realizing the full potential of LATAM-GPT and similar initiatives.
Video Version
Hear from the researchers themselves. Watch the video of this case study below.