TL;DR
Portugal’s AMÁLIA, a €5.5M European Portuguese LLM, is operational and outperforms many models on benchmarks. However, critical questions about its openness, native data sufficiency, and optimization goals remain unresolved, raising concerns about the broader European sovereign-LLM efforts.
Portugal’s €5.5 million investment in the AMÁLIA large language model has produced an operational system that outperforms earlier open models on many European Portuguese benchmarks, yet critical questions about its openness, native data, and strategic goals remain unanswered.
AMÁLIA, developed by a consortium of approximately 60 researchers from Portugal’s top research institutions, was officially launched in October 2025. It is built as a continuation of the EuroLLM multilingual foundation, and the base version was completed by September 2025. The model is currently accessible to 450,000 academic users via the FCT’s IAedu platform, with a knowledge cutoff at the end of 2023. Benchmarks show AMÁLIA outperforming previous open models and beating Qwen 3-8B on most Portuguese-specific tasks, although it still trails Qwen on certain benchmarks such as ALBA.
Despite these achievements, the project faces scrutiny from experts like Duarte O.Carmo, who publicly questioned whether the model truly embodies openness, whether the native Portuguese data used is sufficient, and what the primary objectives of the model should be. These questions are not just technical but relate to national policy and the broader European sovereign-LLM movement, which faces similar structural uncertainties.
AMÁLIA: The three hard questions.
Portugal spent €5.5M to build a European Portuguese LLM. The base version is operational, and its benchmark results beat Qwen 3-8B on most pt-PT tasks. So why are the most important questions still unanswered?
Last month, Duarte O.Carmo published the sharpest public analysis of AMÁLIA — Portugal’s state-funded European Portuguese large language model. He prefaces his critique with the necessary diplomatic apparatus before doing what almost nobody else in the European-sovereign-LLM discourse has been willing to do publicly: asking hard questions about whether the work, as released, actually does what it set out to do. This piece is a structural extension of his analysis. The AMÁLIA case study exposes three hard questions every national LLM effort needs to answer publicly — and the broader European sovereign-LLM movement has been operating without explicit answers to any of them.
Three questions every national LLM effort needs to answer publicly.
Duarte O.Carmo’s framing maps cleanly onto the structural argument, and each of his questions lands specifically in AMÁLIA.
The three questions form a structural feedback loop: Q3 (the optimization target) determines Q2 (the data volume needed), which in turn conditions Q1 (whether the openness is sufficient for community contribution). The European sovereign-LLM movement collectively benefits when these questions become standard methodology disclosure rather than exceptional critique.

107 billion tokens. 5.8 billion clearly pt-PT.
This is the most tractable of the three questions, and its answer is surprising. For a model whose entire stated purpose is to prioritize European Portuguese, the clearly pt-PT share of extended pre-training is about 5.5%. The implications cascade into every other question.
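The share can be checked directly from the two headline figures. The sketch below is illustrative only: the function name `native_share` is hypothetical, and the inputs are the rounded token counts quoted in this section, not the project's exact accounting. With those rounded inputs the share comes out at about 5.4%, in line with the roughly 5.5% quoted; the small gap presumably reflects rounding in the published counts, which is an assumption.

```python
# Rounded headline token counts quoted in this section, in billions.
TOTAL_TOKENS_B = 107.0  # extended pre-training corpus
PT_PT_TOKENS_B = 5.8    # tokens clearly identified as European Portuguese


def native_share(native_b: float, total_b: float) -> float:
    """Return the native-language fraction of a corpus (hypothetical helper)."""
    if total_b <= 0:
        raise ValueError("total token count must be positive")
    return native_b / total_b


share = native_share(PT_PT_TOKENS_B, TOTAL_TOKENS_B)

# Tokens not counted as clearly pt-PT, per clearly pt-PT token.
other_per_native = (TOTAL_TOKENS_B - PT_PT_TOKENS_B) / PT_PT_TOKENS_B

print(f"pt-PT share of extended pre-training: {share:.1%}")            # 5.4%
print(f"other tokens per clearly pt-PT token: {other_per_native:.1f}")  # 17.4
```

Put differently, for every token clearly identified as pt-PT, the extended pre-training mix contains roughly 17 tokens that are not.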

The Olmo standard. AMÁLIA’s current state.
The Allen Institute for AI’s Olmo project defines what “fully open” requires operationally. Olmo doesn’t lead frontier benchmarks, and that’s not the point. The point is to be the structural reference for openness. AMÁLIA’s “fully open source” claim should be measured against that operational standard.
Four strategic positions. AMÁLIA between two and three.
Publicly disclosed European sovereign-LLM funding across the major initiatives totals more than €100M. The structural question every project faces is what competitive position it is actually staking. There are four options, none mutually exclusive, but each requires different commitments.
Three standards. For AMÁLIA and the movement.
The structural critique generalizes beyond AMÁLIA. Italy, France, Germany, Switzerland, the OpenEuroLLM consortium, and every subsequent national project benefit from public discourse holding national LLM efforts to operational standards on openness, data accounting, and strategic positioning.
The European sovereign-AI agenda is a serious strategic project that deserves serious public discourse. O.Carmo’s analysis is what serious public discourse looks like. Appropriately diplomatic. Structurally rigorous. Willing to ask the hard questions in public when the public investment justifies it. More of this is needed — across every European sovereign-LLM project, not just AMÁLIA.
Implications for Portugal’s AI Strategy and European Sovereign Models
The development of AMÁLIA exemplifies Portugal’s commitment to building a sovereign-language LLM, but the unresolved questions about openness, native data, and strategic goals highlight broader challenges facing European efforts to develop independent AI models. These issues impact national AI policies, research priorities, and Europe’s position in the global AI landscape. How Portugal addresses these questions could influence other countries’ approaches to sovereignty and AI development, shaping the future of European AI autonomy.
European Sovereign-Language Model Initiatives and Structural Challenges
Across Europe, countries like Italy, Germany, France, and Norway have launched or announced their own sovereign-language LLM projects, often with public funding and academic partnerships. These initiatives share common challenges: defining what “fully open” means, determining how much native-language data is enough, and setting clear objectives for model deployment. The European sovereign-LLM movement is still in a formative phase, with many projects in progress and little consensus on these foundational questions. Portugal’s AMÁLIA is a prominent case due to its public funding and national scope, making its structural questions particularly relevant.
“The questions about openness, native data, and goals are not just technical; they are strategic questions that every national effort must answer publicly.”
— Duarte O.Carmo
Unanswered Questions About AMÁLIA’s Openness and Strategy
It is not yet clear how open the AMÁLIA model truly is, especially regarding access to training data and model weights. The long-term strategic goals—whether the model will be further open-sourced or used primarily for national policy—remain uncertain. Additionally, the sufficiency of native-language data and how it impacts model performance continue to be debated among experts.
Upcoming Milestones and Policy Discussions for AMÁLIA
The final version of AMÁLIA is expected in June 2026, which will likely address some of the current gaps. In the coming months, Portugal’s government and research institutions are expected to clarify their openness policies, data strategies, and deployment goals. Broader European discussions on sovereignty, openness, and data sharing are also anticipated to influence the future direction of AMÁLIA and similar models, with potential policy adjustments and increased transparency efforts.
Key Questions
What makes AMÁLIA different from other European language models?
AMÁLIA is a state-funded project built as a continuation of the EuroLLM multilingual foundation, with a focus on European Portuguese. It is accessible to academic users through the FCT’s IAedu platform and outperforms previous open models on Portuguese benchmarks, but questions about its openness and native data usage remain.
Why are questions about openness and native data important?
They determine how transparent, accessible, and strategically autonomous the model is, affecting national policies, research integrity, and Europe’s AI sovereignty efforts.
What are the main concerns experts have about AMÁLIA?
Experts like Duarte O.Carmo question whether the model is truly open, whether the native Portuguese data used is sufficient for high-quality performance, and what the long-term goals of the project are.
When will the final version of AMÁLIA be released?
The final version is expected in June 2026 and may address some of the current uncertainties.
How does this development affect Europe’s AI landscape?
It highlights the structural challenges European countries face in building sovereign-language models and the importance of clear policies on openness and data use, shaping the continent’s AI independence trajectory.
Source: ThorstenMeyerAI.com