Annex XIII: Criteria for Classification of GPAI Models with Systemic Risk
Annex XIII lists the criteria for classifying a GPAI model as having systemic risk under Article 51. It includes both quantitative indicators (notably the 10^25 FLOPs cumulative compute threshold that creates a rebuttable presumption) and qualitative criteria the AI Office considers when assessing high-impact capabilities. The Commission may update these criteria via delegated acts under Article 97 as technology evolves.
Who does this apply to?
- Providers of GPAI models assessing whether they meet systemic-risk thresholds
- The AI Office and scientific panel for AI (applying and monitoring the criteria)
- Downstream providers integrating GPAI models who need to know systemic-risk status
- Compliance teams tracking threshold changes via Commission delegated acts
Scenarios
A new frontier model is trained with cumulative compute exceeding 10^25 floating-point operations.
A model is below 10^25 FLOPs but achieves state-of-the-art scores on reasoning and code generation benchmarks with broad deployment across the EU.
The Commission adopts a delegated act lowering the FLOPs threshold to 10^24 after advances in training efficiency.
What Annex XIII covers (in plain terms)
Annex XIII provides the assessment framework the AI Office uses to determine whether a GPAI model has high-impact capabilities and should be classified as systemic risk. The criteria include:
- Number of parameters of the model
- Quality and size of the dataset used for training
- Amount of computation used for training the model (measured in FLOPs) — including the 10^25 FLOPs presumption threshold
- Input and output modalities of the model (text, image, video, code, etc.)
- Benchmarks and evaluations of the model, including state-of-the-art performance
- Number of registered users or reach
- Any other indicator of high-impact capabilities
The 10^25 FLOPs threshold creates a rebuttable presumption: models above it are presumed systemic risk, but providers may argue otherwise. Models below it can still be designated if other criteria demonstrate equivalent capabilities.
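The two routes to classification described above can be sketched as a simple decision rule. This is an illustrative simplification only, not legal logic from the Act: the function name, inputs, and return strings are all hypothetical, and a real assessment also involves notification, rebuttal, and AI Office review under Article 52.

```python
THRESHOLD_FLOPS = 1e25  # Article 51(2) presumption threshold (cumulative training compute)

def classification_status(cumulative_flops: float, designated_by_ai_office: bool) -> str:
    """Illustrative sketch of the two routes to systemic-risk classification.

    Route 1: crossing the compute threshold creates a rebuttable presumption.
    Route 2: the AI Office may designate a below-threshold model on
             qualitative Annex XIII criteria (Article 51(1)(b)).
    """
    if cumulative_flops >= THRESHOLD_FLOPS:
        return "presumed systemic risk (rebuttable under Article 52)"
    if designated_by_ai_office:
        return "designated systemic risk (qualitative Annex XIII criteria)"
    return "not classified as systemic risk"

# Hypothetical examples:
print(classification_status(2e25, False))  # above threshold: presumption applies
print(classification_status(5e24, True))   # below threshold but designated
```

Note that even in the first case the presumption is rebuttable, so the output is a starting point for the Article 52 procedure, not a final status.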
The 10^25 FLOPs threshold — context
The 10^25 floating-point operations threshold was calibrated to frontier models at the time of legislative negotiations (roughly GPT-4-class training compute). Key considerations:
- It is a rebuttable presumption, not a hard boundary
- The Commission can update the threshold via delegated act as training efficiency evolves
- Distillation, data quality improvements, and architecture advances may reduce the compute needed for equivalent capabilities—the threshold may under-capture risk over time
- The AI Office can designate models below the threshold based on qualitative criteria
Providers should track both their absolute FLOPs and benchmark performance to assess classification risk.
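A minimal sketch of FLOPs tracking, using the common rule of thumb for transformer training compute (roughly 6 floating-point operations per parameter per training token). The function name and the model figures below are hypothetical; real estimates should follow a documented methodology and include all training runs that count toward cumulative compute.

```python
THRESHOLD_FLOPS = 1e25  # Article 51(2) presumption threshold

def estimate_training_flops(n_params: float, n_tokens: float) -> float:
    """Common transformer approximation: ~6 FLOPs per parameter per token.

    This covers forward and backward passes for dense training; actual
    compute depends on architecture, precision, and training procedure.
    """
    return 6.0 * n_params * n_tokens

# Hypothetical model: 70B parameters trained on 15T tokens
flops = estimate_training_flops(7e10, 1.5e13)
print(f"Estimated compute: {flops:.2e} FLOPs")
print(f"Above 1e25 presumption threshold: {flops >= THRESHOLD_FLOPS}")
```

Because the threshold is cumulative, providers running multiple training phases should sum estimates across runs and re-check after each release.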
How Annex XIII connects to the rest of the Act
- Article 51 — Uses Annex XIII criteria to define systemic risk; paragraph (2) establishes the FLOPs presumption.
- Article 52 — Procedure for classification (notification, designation, rebuttal) based on Annex XIII assessment.
- Article 55 — Additional obligations triggered by systemic-risk classification.
- Annex XI Section 2 — Documentation requirements triggered by classification (evaluation strategies, red teaming, architecture).
- Article 97 — Delegated acts allowing the Commission to update Annex XIII criteria and thresholds.
- Article 90 — Scientific panel that may issue qualified alerts based on Annex XIII analysis.
- Article 113 — Application dates (Chapter V from 2 August 2025).
Recitals (preamble) on EUR-Lex
The recitals in the same consolidated AI Act on EUR-Lex contextualise the 10^25 FLOPs calibration, the rebuttable presumption design, and the Commission's power to evolve criteria. Use the official preamble on EUR-Lex—do not rely on unofficial recital lists without checking sequence and wording against the authentic text.
Compliance checklist
- Calculate and document cumulative training compute (FLOPs) for each GPAI model release.
- Track benchmark performance against state-of-the-art metrics across modalities.
- Monitor Commission delegated acts for threshold updates to Annex XIII.
- If above 10^25 FLOPs: prepare notification to the AI Office under Article 52.
- If below threshold but with broad deployment: assess qualitative criteria proactively.
- Document rebuttal arguments if you believe systemic-risk classification is not warranted despite crossing the threshold.
- Track the AI Office's published list of systemic-risk models for upstream dependencies.
Related Articles
Article 51: Classification of GPAI Models with Systemic Risk
Article 52: Procedure for Systemic Risk Classification of GPAI Models
Article 55: Obligations for Providers of GPAI Models with Systemic Risk
Article 56: Codes of Practice for GPAI Models
Article 90: Alerts of Systemic Risks by the Scientific Panel
Article 97: Exercise of the Delegation
Article 101: Fines for Providers of General-Purpose AI Models
Article 113: Entry into Force and Application Dates
Annex XI: Technical Documentation for Providers of General-Purpose AI Models
Related annexes
- Annex XI — GPAI technical documentation (Section 2 triggered by systemic-risk classification)
Frequently asked questions
Is the 10^25 FLOPs threshold permanent?
No. The Commission can update it via delegated act under Article 97 based on evolving technological benchmarks and state of the art.
Can a model below 10^25 FLOPs still be systemic risk?
Yes. Article 51(1)(b) allows the AI Office to designate based on equivalent capabilities or impact using qualitative Annex XIII criteria, even if the compute threshold is not crossed.
How do I calculate FLOPs?
FLOPs here means the cumulative number of floating-point operations used during training. For transformer models, a common rule of thumb is roughly 6 × parameter count × number of training tokens, though the right estimate depends on architecture and training procedure. Whatever method you use, document your methodology.
Does fine-tuning compute count?
The Annex refers to 'cumulative amount of computation used for training.' Whether fine-tuning adds to the base model's FLOPs depends on interpretation—document your position and monitor AI Office guidance.