TL;DR
Portugal’s government-funded AMÁLIA large language model is operational and outperforms previous open models on Portuguese benchmarks. However, experts are raising three fundamental questions about its openness, its native-language data, and its optimization goals. These questions point to broader concerns in Europe’s sovereign AI efforts.
Portugal’s €5.5 million government-funded large language model, AMÁLIA, is now operational, demonstrating superior performance in Portuguese benchmarks compared to previous open models. However, experts are raising three critical questions about its openness, native-language data, and strategic goals, which could influence the future of European sovereign AI initiatives.
AMÁLIA, developed by a consortium of around 60 researchers across Portugal’s top institutions, was officially released in September 2025. It continues pre-training from the EuroLLM multilingual foundation; the clearly European Portuguese portion of that training data amounts to approximately 5.8 billion tokens drawn from the Portuguese web archive, Arquivo.pt. The model currently outperforms previous fully open models on European Portuguese benchmarks and beats Qwen 3-8B on most Portuguese tasks, though it still trails on some benchmarks such as ALBA.
Despite its promising technical performance, questions have emerged from experts such as Duarte O.Carmo about the model’s openness, the sufficiency of native-language data, and the strategic priorities guiding its development. These questions are part of a broader debate about the structural approach of European sovereign LLM projects, which often treat individual launches as isolated events rather than components of a larger pattern.
The core issues revolve around how transparent and accessible AMÁLIA truly is, whether the amount of native Portuguese data used is adequate, and what the model is ultimately optimized to achieve. These concerns are not accusations but are seen as essential for evaluating the long-term viability and strategic direction of national AI efforts in Europe.
AMÁLIA: The three hard questions.
Portugal spent €5.5M to build a European Portuguese LLM. The base version is operational, the benchmarks beat Qwen 3-8B on most pt-PT tasks. So why are the most important questions still unanswered?
Last month, Duarte O.Carmo published the sharpest public analysis of AMÁLIA — Portugal’s state-funded European Portuguese large language model. He prefaces his critique with the necessary diplomatic apparatus before doing what almost nobody else in the European-sovereign-LLM discourse has been willing to do publicly: asking hard questions about whether the work, as released, actually does what it set out to do. This piece is a structural extension of his analysis. The AMÁLIA case study exposes three hard questions every national LLM effort needs to answer publicly — and the broader European sovereign-LLM movement has been operating without explicit answers to any of them.
Three questions every national LLM effort needs to answer publicly.
Duarte O.Carmo’s framing maps cleanly onto the structural argument, and each question lands specifically on AMÁLIA.
The three questions are linked in a chain: Q3 (optimization target) determines Q2 (data volume needed), which in turn conditions Q1 (whether the release is open enough for community contribution). The European sovereign-LLM movement collectively benefits when these questions become standard methodology disclosure rather than exceptional critique.
107 billion tokens. 5.8 billion clearly pt-PT.
The structurally tractable question with a structurally surprising answer. For a model whose entire stated purpose is European Portuguese prioritization, the native-language share of extended pre-training is 5.5%. The implications cascade into every other question.
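The share can be checked directly from the two token counts quoted above. The sketch below is only that arithmetic: the constant and function names are illustrative, and both inputs are the rounded figures from this article, not values taken from the AMÁLIA project's own documentation. With these rounded inputs the ratio comes out near 5.4%, a shade under the stated 5.5%; the gap is presumably due to rounding in the published counts, which is an assumption on our part.

```python
# Token counts as quoted in the article, in billions. Both are rounded
# figures, so the computed share is approximate.
TOTAL_TOKENS_B = 107.0  # extended pre-training corpus
PT_PT_TOKENS_B = 5.8    # tokens identified as clearly European Portuguese


def native_share(native_b: float, total_b: float) -> float:
    """Return the native-language share of the corpus as a percentage."""
    if total_b <= 0:
        raise ValueError("total token count must be positive")
    return 100.0 * native_b / total_b


share = native_share(PT_PT_TOKENS_B, TOTAL_TOKENS_B)
one_in_n = TOTAL_TOKENS_B / PT_PT_TOKENS_B

print(f"pt-PT share of extended pre-training: {share:.1f}%")
print(f"Roughly 1 token in {one_in_n:.0f} is clearly pt-PT")
```

Put differently, roughly one token in eighteen of the extended pre-training mix is clearly European Portuguese, which is why the data question feeds into the other two.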
The Olmo standard. AMÁLIA’s current state.
The Allen Institute for AI’s Olmo project defines what “fully open” requires in operational terms. Olmo doesn’t lead frontier benchmarks. That’s not the point. The point is to be the structural reference for openness. AMÁLIA’s “fully open source” claim should be measured against that operational standard.

Four strategic positions. AMÁLIA between two and three.
More than €100M in publicly disclosed European sovereign-LLM funding is spread across the major initiatives. The structural question every project faces: what is the actual competitive position you’re staking? There are four options, and they are not mutually exclusive, but each requires different commitments.
Three standards. For AMÁLIA and the movement.
The structural critique generalizes beyond AMÁLIA. Italy, France, Germany, Switzerland, the OpenEuroLLM consortium, and every subsequent national project benefit from public discourse holding national LLM efforts to operational standards on openness, data accounting, and strategic positioning.
The European sovereign-AI agenda is a serious strategic project that deserves serious public discourse. O.Carmo’s analysis is what serious public discourse looks like. Appropriately diplomatic. Structurally rigorous. Willing to ask the hard questions in public when the public investment justifies it. More of this is needed — across every European sovereign-LLM project, not just AMÁLIA.
Implications for Europe’s Sovereign AI Strategies
The questions raised about AMÁLIA reflect broader concerns about transparency, data sufficiency, and strategic focus in Europe’s national AI projects. How open the model truly is impacts trust and collaboration across the continent, while the amount of native-language data influences the model’s cultural and linguistic authenticity. The answers to these questions will shape future investments, policy decisions, and the competitiveness of European AI initiatives in a global landscape.
As the first major Portuguese effort with public funding, AMÁLIA serves as a case study for other European countries considering similar projects. Its development trajectory and the responses to these questions could determine whether Europe can build truly sovereign, multilingual LLMs capable of competing internationally.
European Sovereign LLMs and the Structural Challenge
European large language model development is spread across multiple national initiatives and companies, including Italy’s Minerva, Germany’s Aleph Alpha, France’s Mistral, and others across Scandinavia. These projects face similar structural questions: how open are the model’s code and data, how much native-language data is enough, and what should the model be optimized for?
Portugal’s AMÁLIA exemplifies this pattern, being publicly funded, built on a multilingual foundation, and subject to scrutiny regarding its openness and data choices. The broader European effort is still in an early stage, with many projects sharing common strategic dilemmas and uncertainties about long-term goals and transparency.
While some models like Minerva were trained from scratch on native data, others like AMÁLIA extend existing multilingual models, raising debates over technical merit versus strategic priorities. The public discourse has yet to fully address these structural questions at a continental level, making AMÁLIA a key case for understanding Europe’s approach to sovereign AI development.
“The AMÁLIA project demonstrates impressive technical performance, but it prompts essential questions about openness and data sufficiency that the community must address.”
— Duarte O.Carmo
Unanswered Questions About Openness and Strategy
It remains unclear how open the AMÁLIA model will be by its final release, including access to training data and source code. The sufficiency of native Portuguese data for long-term performance and cultural relevance is also still under debate. Additionally, the ultimate strategic goals—whether to prioritize openness, performance, or strategic sovereignty—are not yet publicly clarified.
Further developments in transparency policies or technical adjustments could alter the current understanding, but these remain to be seen as the project progresses toward its June 2026 release.
Next Milestones and Policy Discussions
The final version of AMÁLIA is scheduled for release in June 2026, which will likely provide more clarity on its openness and capabilities. Meanwhile, ongoing discussions among European policymakers, researchers, and industry leaders will shape the strategic framework guiding future projects. Transparency initiatives and data-sharing agreements may also influence the model’s accessibility and trustworthiness.
In the coming months, expect detailed evaluations from independent experts and potential policy statements from the Portuguese government addressing the structural questions raised by Duarte O.Carmo and others.
Key Questions
What are the main concerns about AMÁLIA’s openness?
Experts are questioning how transparent the model’s training data and source code will be upon final release, which impacts trust, collaboration, and strategic sovereignty.
Is the amount of native Portuguese data used sufficient?
While the model performs well on benchmarks, there is debate about whether the 5.8 billion tokens from Arquivo.pt are enough for deep cultural and linguistic fidelity in the long term.
What are the strategic goals behind AMÁLIA?
It is not yet clear whether the project aims primarily at technical excellence, national sovereignty, or fostering open collaboration, as these priorities may influence future development and transparency policies.
How does AMÁLIA compare to other European models?
AMÁLIA outperforms previous fully open models on European Portuguese benchmarks and beats Qwen 3-8B on most Portuguese tasks, but it still trails on some benchmarks such as ALBA, reflecting ongoing trade-offs between openness, data, and performance.
Source: ThorstenMeyerAI.com