What is General Purpose AI (GPAI) in the EU AI Act?

This post discusses the definition and distinctive features of GPAI systems in the EU AI Act. With the EU GPAI Code of Practice entering into force in August 2025, GPAI systems need to fulfil special trustworthiness requirements. But distinguishing them from other AI systems is non-trivial.

Blueprints showing gears on a wooden table. Image generated with Copilot.

written by Tobias Leemann


The European Union’s Artificial Intelligence Act (EU AI Act) classifies AI systems into different categories. Most notably, there are AI systems posing unacceptable risk, which are prohibited in the EU (Article 5, AI Act), and high-risk AI systems (Article 6, AI Act), for which special measures are required to ensure the safety and compliance of their operation. Besides this fundamental risk-based classification, there is an additional distinction based on scope and capabilities, giving rise to the regulatory term General Purpose AI (GPAI).

Origins. The notion of GPAI models was introduced into the AI Act in response to the rise of Large Language Models (LLMs) such as ChatGPT, Gemini, or the Llama models. Because the risk-based categorization of AI systems largely depends on their specific use case, LLMs and other foundation models are hard to map onto it. The notion of GPAI models was created to address this issue and to provide specific regulation for this emerging AI paradigm.

Definition. The formal definition of a GPAI model is given in Article 3(63) of the AI Act:

‘general-purpose AI model’ means an AI model, including where such an AI model is trained with a large amount of data using self-supervision at scale, that displays significant generality and is capable of competently performing a wide range of distinct tasks regardless of the way the model is placed on the market and that can be integrated into a variety of downstream systems or applications

EU AI Act, Article 3(63)

Recitals (98)-(100) help clarify these requirements a bit. Specifically,

  • GPAI models are typically large, with at least a billion parameters, and are trained on large and diverse datasets using methods such as reinforcement learning (RL) or self-supervised learning. This includes large-scale pretraining through masked language modeling or next-token prediction.
  • Large generative AI models (e.g., those generating text, audio, images, or video) are clear examples of GPAI.
  • GPAI models require the addition of further components, for example a user interface, to become GPAI systems.
  • GPAI systems can either be used directly or embedded into other applications. Thanks to their broad capabilities, this integration typically requires little adaptation.
  • When a GPAI model is integrated into another application, the entire system may itself be considered a GPAI system if it retains the capability to serve a variety of purposes.

In summary, GPAI models are large and not limited to a specific task. LLMs and other foundation models are typical examples of GPAI models.

GPAI with systemic risks. The AI Act further introduces a subcategory covering the most capable state-of-the-art models: GPAI models with systemic risk. Recital 110 lays out the reasoning behind this distinction: the lawmakers note that increasing model capabilities also increases the potential for negative side effects. Examples mentioned include lowering the access barriers to the development of weapons and offensive cybersecurity tools, and the ability to interfere with critical infrastructure. However, drawing the line between vanilla GPAI and GPAI with systemic risk is hard. The final definition in Article 51 allows models to be classified as GPAI with systemic risk for several reasons:

  • high-impact capabilities that match or exceed those of the most powerful models currently available. Unless the provider presents reasonable evidence to the contrary, models whose training required more than 10^25 floating-point operations (FLOPs) are presumed to have high-impact capabilities by default. For reference, GPT-3, a 175B-parameter model whose successor was the basis for ChatGPT, took about 3×10^23 FLOPs to train, while GPT-4 was estimated to require 2×10^25 FLOPs, making it a model with presumed high-impact capabilities (see the back-of-the-envelope sketch after this list).
  • based on a decision by the Commission. Annex XIII lists criteria beyond pure capabilities and size that can lead to a model being designated as posing systemic risk. These include the model’s autonomy, the tools it can use, and its impact, for instance measured through the number of users.
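
To make the compute criterion more tangible, here is a back-of-the-envelope sketch. It uses the widely cited ~6 × parameters × training tokens heuristic to estimate training FLOPs; that heuristic and the example numbers are illustrative assumptions of my own, not something the AI Act prescribes.

```python
# Back-of-the-envelope sketch (not part of the AI Act): estimate training compute
# with the common ~6 * parameters * tokens heuristic and compare it against the
# 10^25 FLOP presumption threshold from Article 51.

SYSTEMIC_RISK_THRESHOLD_FLOPS = 1e25  # presumption threshold, Article 51 AI Act


def estimate_training_flops(num_parameters: float, num_tokens: float) -> float:
    """Rough training-compute estimate using the ~6*N*D scaling heuristic."""
    return 6 * num_parameters * num_tokens


def presumed_high_impact(training_flops: float) -> bool:
    """Above the threshold, high-impact capabilities are presumed by default."""
    return training_flops > SYSTEMIC_RISK_THRESHOLD_FLOPS


# Illustrative numbers, roughly GPT-3 scale: 175B parameters, ~300B training tokens
flops = estimate_training_flops(175e9, 300e9)
print(f"Estimated training compute: {flops:.1e} FLOPs")       # about 3e23
print("Presumed high-impact capabilities:", presumed_high_impact(flops))  # False
```

With roughly GPT-3-scale numbers this lands at about 3×10^23 FLOPs, well below the presumption threshold, which is consistent with the figures quoted above.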

The distinction remains somewhat vague and is expected to be specified and amended with practical benchmarks and metrics in the future.

Implications. The definition of GPAI models in the EU AI Act targets foundation models that are developed without a single purpose in mind and are therefore not directly covered by the use-case-specific classification in the AI Act. As foundation models like LLMs form the basis of many AI applications, this raises the question of whether the rules laid out for GPAI apply to all of them as well.

Is my LLM-based AI application GPAI? The recitals of the AI Act make clear that the class of GPAI models mainly targets raw foundation models, which are usually built by large research and technology firms. For AI systems that leverage GPAI models, e.g., through an API, classification as GPAI depends on whether the system can still be used for a variety of purposes or is limited to specific tasks (Recital 100, AI Act). For instance, a customer support chatbot in a tool store that is constrained to answering questions about specific products is unlikely to be a GPAI system, as it has no general capabilities and cannot readily be embedded in other applications. However, systems built on top of GPAI models may still fall into other categories, such as high-risk AI systems, or even be prohibited, depending on their application. For instance, chatbot-based systems used in law enforcement or education fall under the high-risk application areas listed in Annex III of the AI Act.
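
As a rough, non-authoritative illustration of this classification logic, the following sketch encodes the questions from Article 3(63) and Recital 100 as a simple checklist. The class and function names are hypothetical, and the checklist is a simplification, not legal advice.

```python
# Hypothetical triage checklist (a simplification, not legal advice): does an
# LLM-based application plausibly qualify as a GPAI system under the AI Act?
from dataclasses import dataclass


@dataclass
class AIApplication:
    uses_gpai_model: bool             # builds on a GPAI model, e.g., an LLM via an API
    serves_variety_of_purposes: bool  # general capabilities, not locked to one task
    integrable_downstream: bool       # can be embedded into other applications


def might_be_gpai_system(app: AIApplication) -> bool:
    """First-pass triage following Article 3(63) / Recital 100: a system built on a
    GPAI model is likely a GPAI system only if it keeps its general-purpose nature."""
    return (app.uses_gpai_model
            and app.serves_variety_of_purposes
            and app.integrable_downstream)


# Example from the post: a tool-store support chatbot restricted to product questions
chatbot = AIApplication(uses_gpai_model=True,
                        serves_variety_of_purposes=False,
                        integrable_downstream=False)
print(might_be_gpai_system(chatbot))  # False: likely not a GPAI system
```

This is only a heuristic reading of the recitals; the actual assessment depends on the concrete deployment and, as noted above, a negative answer here does not rule out the system being high-risk or prohibited.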

Next Steps. The AI Act establishes broad guidelines on documentation, copyright protection, and safety standards for GPAI systems. These principles will soon be further specified in the GPAI Code of Practice, set to take effect in August 2025. The Code will also clarify the obligations of GPAI providers towards businesses that deploy GPAI models for specific tasks. Additionally, it may offer clearer guidance on liability in the AI value chain, for instance when chatbot-driven systems cause harm. A recent example was an Air Canada customer service chatbot that provided incorrect information, resulting in a lawsuit in which the airline was ultimately held accountable.

I will cover the specific implications of the Code of Practice for such LLM-based systems in my next post! Stay tuned.

Disclaimer: This post is a summary of notions from the AI Act. It is not exhaustive and does not constitute legal advice.