EU Commission releases draft Code of Practice for AI regulation

The European Commission’s AI Office has published the first draft of a Code of Practice aimed at defining the technical measures and policies necessary for compliance with the EU’s Artificial Intelligence Act (AI Act).

The AI Act, which came into effect on 1 August 2024, sets out far-reaching obligations for providers of general-purpose AI (GPAI) models, with a particular focus on transparency and copyright. These requirements are especially relevant to the cultural and creative sectors, where AI is increasingly influencing artistic processes and the use of cultural data.

To help GPAI providers meet these obligations, the AI Office is leading the development of the Code of Practice. Scheduled for completion by May 2025, the Code will serve as a comprehensive guide to best practices and compliance measures.

The drafting process is collaborative, engaging close to 1,000 stakeholders from industry, academia, civil society, and rightsholder organisations. Renowned experts, acting as Chairs, are synthesising this input to create successive iterations of the document.

Culture Action Europe has published an overview (below) of the main measures related to transparency and copyright. Additionally, as members of a Code of Practice working group, Culture Action Europe and the Michael Culture Association have prepared considerations regarding the implementation of the AI Act, developed through their Action Group on AI & Digital. This paper forms the basis of the feedback they are providing in the Code of Practice drafting process.

Earlier this year, NEMO published recommendations for policymakers on AI and museums, for the European Parliament’s consideration as it continues its work on AI.

Measure 3: Internal Copyright Policy
  • Providers of GPAI models must implement an internal policy ensuring compliance with EU copyright laws across the entire lifecycle of their models. They should also assign clear responsibilities within their organisations to oversee this policy.
  • Providers of GPAI models must perform copyright due diligence on upstream parties before contracting them and ensure that these entities have respected rights reservations. In the context of AI model development, ‘upstream’ refers to the process of collecting and preparing the datasets used to train the model.
  • Providers of GPAI models should take steps to mitigate the risk that downstream systems produce copyright-infringing outputs. ‘Downstream’ refers to later stages where the AI model, being essentially a statistical model, is integrated into tools or applications for real-world use. Providers are urged to avoid overfitting their models (when the model learns the training data too closely, including its noise or specific details) and should require downstream entities to prevent repeated generation of outputs identical or recognisably similar to protected works. This measure does not apply to SMEs.
Measure 4: Providers should identify and comply with rights reservations
  • Providers should only use crawlers that respect the robots.txt protocol (see the sketch after this list).
  • Providers should ensure that rights reservations expressed through robots.txt do not negatively affect the findability of that content in their search engines.
  • Providers should respect other appropriate machine-readable means to express a rights reservation at the source and/or work level according to widely used industry standards.
  • Providers, excluding SMEs, should collaborate to develop and adopt interoperable machine-readable standards for expressing rights reservations.
  • Crawling activities must exclude pirated sources, such as those listed on the European Commission’s Counterfeit and Piracy Watch List or national equivalents.
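
The first bullet above can be made concrete with a short sketch. The following minimal Python example, using only the standard-library urllib.robotparser module, shows how a training crawler can check robots.txt before fetching a page; the crawler name ExampleGPAIBot, the sample robots.txt, and the URLs are invented for illustration and are not prescribed by the draft Code.

```python
# Minimal sketch: a GPAI training crawler honouring robots.txt
# rights reservations. "ExampleGPAIBot" and all URLs are hypothetical.
from urllib import robotparser

# A site's robots.txt that reserves rights against one named
# AI-training crawler while allowing all other agents.
ROBOTS_TXT = """\
User-agent: ExampleGPAIBot
Disallow: /

User-agent: *
Allow: /
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())  # in production: rp.set_url(...); rp.read()

candidate_urls = [
    "https://example.org/collection/artwork-123",
    "https://example.org/essays/ai-and-culture",
]

for url in candidate_urls:
    if rp.can_fetch("ExampleGPAIBot", url):
        print("allowed :", url)  # may be included in the training crawl
    else:
        print("reserved:", url)  # rights reserved: must be skipped
```

The ‘other appropriate machine-readable means’ mentioned above could include, for example, the W3C community-developed TDM Reservation Protocol, although the draft Code does not mandate any single standard.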
Measure 5: Transparency
  • Providers will publish information on their websites about the measures they adopt to identify and comply with rights reservations, written in clear and understandable language.
  • This information should include the names of all crawlers used for GPAI model training and their relevant robots.txt features (a hypothetical sketch of such a disclosure follows this list).
  • Providers are encouraged to designate a single point of contact so that rightsholders can communicate with them directly and promptly lodge complaints regarding the use of protected works in GPAI model development.
  • Providers will draw up, keep up-to-date and provide the AI Office upon its request with information about data sources used for training, testing and validation and about authorisations to access and use protected content for the development of a GPAI model. 
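
To illustrate the disclosure obligation, here is a purely hypothetical sketch of the kind of machine-readable summary a provider might publish; the provider name, crawler name, and field names are illustrative assumptions, as the draft Code prescribes the content of the disclosure, not its format.

```python
# Hypothetical crawler disclosure under Measure 5. The structure and
# all names are illustrative; the draft Code does not define a schema.
import json

crawler_disclosure = {
    "provider": "Example AI Lab",              # hypothetical provider
    "point_of_contact": "rights@example.org",  # single contact for rightsholders
    "crawlers": [
        {
            "name": "ExampleGPAIBot/1.0",      # hypothetical training crawler
            "purpose": "GPAI model training",
            "respects_robots_txt": True,
            # The user-agent token rightsholders can target in robots.txt:
            "robots_txt_user_agent": "ExampleGPAIBot",
        }
    ],
}

print(json.dumps(crawler_disclosure, indent=2))
```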

Transparency and copyright

On 21 November 2024, the first meeting of the Working Group on Transparency and Copyright, co-chaired by Nuria Oliver and Alexander Peukert, took place. Pre-selected participants, representing both rightsholders and tech companies, briefly presented their positions on the first draft of the Code of Practice. Culture Action Europe provides generalised feedback from the meeting below (in line with the Chatham House Rule, the names of organisations are not disclosed).

  1. Providers’ copyright policies should go beyond merely respecting opt-outs, even though this is a crucial aspect. They should also incorporate measures to establish robust licensing frameworks and encourage collaboration with Collective Management Organisations and key rightsholders.
  2. Many rightsholders argued that relying solely on the robots.txt protocol for opting out is insufficient and risks being misapplied to AI-training permissions. Rightsholders should be able to use other machine-readable mechanisms, such as opt-outs expressed in a website’s terms and conditions, public repositories of rights reservations, public declarations, or Automated Content Recognition (ACR) technology to remove protected content from datasets.
  3. Some participants suggested establishing an official public registry to explicitly record rights reservations. This registry would provide legal certainty for all stakeholders and enable tracking the dates of rights reservations, facilitating the removal of protected data from datasets as needed. However, one participant opposed the proposal, arguing that it could place an undue burden on rightsholders.
  4. Regarding upstream copyright compliance, rightsholders argued that it should not be limited to a simple pre-check of datasets: GPAI model providers should require third parties to provide full traceability of the data they supply and details about their collection methods. The concept of ‘reasonable due diligence’ needs further elaboration.
  5. Ensuring downstream copyright compliance requires GPAI model providers to share detailed information about the data used for training with the AI Office and downstream entities. This is the only way to ensure that AI outputs are not generated using illegal or infringing content.
    However, others noted that downstream providers are often the only entities capable of properly assessing and managing copyright compliance within their specific operational context: they may handle their own protected content or hold licences that fall outside the control of GPAI providers.
  6. Authors and rightsholders must be compensated for the prior unauthorised and illegal use of copyrighted works by GPAI providers. The Code of Practice should include a provision requiring AI providers to commit, through their copyright policies, to compensating for such unauthorised use. The Code should also establish a framework for sanctions and measures to address non-compliance.
    At the same time, tech company representatives stressed the need to stay within the scope of the AI Act, avoiding additional obligations: ‘We’re here to finish the rules under the AI Act, nothing more, nothing less.’ They questioned the AI Office’s role, arguing it is ‘not a copyright enforcer’ and that its responsibilities in verifying copyright compliance are unclear.
    They also pointed to technical challenges, including the unfeasibility of work-level rights reservations and the difficulty of downstream compliance. Predicting infringing outputs, they argued, is nearly impossible with current technology, and imposing copyright compliance on downstream providers lies outside the AI Act’s scope. 

Both the next meeting and the publication of the second iteration of the Code of Practice are expected to take place in January 2025.