On 28 October 2024, sixteen international data protection regulators reaffirmed their joint views about the risks and expected standards of compliance associated with data scraping from a privacy and data protection perspective.
Following their engagement with social media companies (SMCs) and other key stakeholders, the regulators issued a concluding joint statement (Concluding Statement).
The Concluding Statement builds on the initial joint statement published in August 2023 (see our previous article here). It highlights additional measures that organisations, particularly SMCs, should implement to safeguard personal data against unlawful data scraping. The Concluding Statement is the result of engagement between the signatory data protection authorities and certain SMCs who voluntarily participated in the consultation.
Key Takeaways from the Concluding Statement
1. Regular Review and Update of Safeguarding Measures
Organisations are urged to deploy a combination of ongoing safeguarding measures that are regularly reviewed and updated to keep pace with advances in unlawful scraping techniques and technologies. This approach is crucial to effectively mitigate the risks associated with unlawful data scraping. The Concluding Statement does not give specific examples of the safeguards expected but can be interpreted as including contractual, organisational and technical measures.
2. Leveraging AI for Protection
The Concluding Statement acknowledges that data scrapers often use AI to evade detection, but confirms that organisations can also use AI tools to strengthen their protections against unlawful scraping, making it harder for unauthorised entities to extract personal data.
3. Obligations for Small and Medium Enterprises (SMEs)
The obligation to protect against unlawful scraping applies to both large corporations and SMEs. The Concluding Statement acknowledges that SMEs may have limited resources but emphasises that lower-cost measures are available that SMEs can implement. These include bot detection, rate limiting, and the use of CAPTCHAs, which can be implemented with the assistance of third-party service providers.
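By way of illustration only, rate limiting can be as simple as counting each client's recent requests and refusing (or challenging) those above a threshold. The sketch below is a minimal sliding-window limiter; the class name, the thresholds and the idea of identifying a client by a string such as an IP address or account ID are our own assumptions, not measures prescribed by the Concluding Statement.

```python
import time
from collections import defaultdict, deque


class SlidingWindowRateLimiter:
    """Hypothetical helper: allow at most max_requests per client per window."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without waiting
        self._hits = defaultdict(deque)  # client id -> timestamps of recent requests

    def allow(self, client_id):
        now = self.clock()
        hits = self._hits[client_id]
        # Discard timestamps that have fallen outside the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit: reject, or escalate to a CAPTCHA
        hits.append(now)
        return True


# Illustrative values only: 100 requests per client per minute.
limiter = SlidingWindowRateLimiter(max_requests=100, window_seconds=60)
if not limiter.allow("203.0.113.7"):
    print("Too many requests")
```

In practice, a third-party service provider or web application firewall would typically supply this kind of control, alongside bot detection, rather than an organisation building it in-house.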
4. Lawful and Transparent Data Scraping
Where data scraping is contractually authorised, the Concluding Statement is clear that contractual terms alone do not render such scraping lawful. Organisations must ensure that: (i) they have a lawful basis to permit data scraping (including under applicable privacy, data protection and other laws); (ii) they are transparent about data scraping activities; and (iii) they obtain consent where required by law.
5. Use of an Application Programming Interface (API) for Controlled Access
When granting third parties lawful permission to collect publicly accessible data from their platform, organisations can gain greater control over the data by providing access via an API. APIs may also facilitate the detection and mitigation of unauthorised scraping, helping to ensure that data access is monitored and regulated.
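To make the idea of controlled access concrete, the following is a minimal sketch of an API layer that releases only the fields each approved client has been granted and records every request for later monitoring. All names here (ApiClient, ProfileApi, the sample key and the field names) are hypothetical and chosen for illustration; the Concluding Statement does not prescribe any particular design.

```python
from dataclasses import dataclass, field


@dataclass
class ApiClient:
    name: str
    allowed_fields: frozenset  # fields this client has been granted access to


@dataclass
class ProfileApi:
    clients: dict   # API key -> ApiClient
    profiles: dict  # user id -> profile record
    audit_log: list = field(default_factory=list)

    def get_profile(self, api_key, user_id):
        client = self.clients.get(api_key)
        if client is None:
            # Unknown keys are refused and logged so that abuse can be spotted.
            self.audit_log.append(("denied", api_key, user_id))
            raise PermissionError("unknown API key")
        record = self.profiles[user_id]
        # Return only the fields this client is permitted to receive.
        result = {k: v for k, v in record.items() if k in client.allowed_fields}
        self.audit_log.append(("served", client.name, user_id))
        return result


api = ProfileApi(
    clients={"key-1": ApiClient("research-partner", frozenset({"display_name"}))},
    profiles={"u1": {"display_name": "Ada", "email": "ada@example.com"}},
)
print(api.get_profile("key-1", "u1"))  # the email field is withheld
```

The audit log is what supports the monitoring point above: unusual volumes or patterns of requests from a given key can be reviewed and, if necessary, that key's access can be revoked.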
6. Compliance with Data Protection Laws in AI (Artificial Intelligence) Development
Organisations that use scraped data sets or data from their own platforms to train AI models must comply with data protection and privacy laws and any AI-specific regulations. This includes adhering to guidelines and principles on the ethical development and use of AI and ensuring that personal data are handled lawfully and transparently.
Practical implications – risk assessments and compliance cannot be siloed into one area of law
The Concluding Statement is not legally binding, but it is a helpful guide for organisations engaging in data scraping (including those using or sharing data under a licence), particularly in mitigating the risk of unlawful scraping. It is evident from the Concluding Statement that the risks of data scraping and the associated compliance obligations go beyond privacy and data protection law. Other laws and contractual obligations must also factor into risk assessments and compliance frameworks.
From a data protection perspective, the key expectations from the signatory regulators are:
- increased accountability and transparency from organisations (such as SMCs) when using or otherwise processing personal data which has been web scraped or licensed from a platform under a data sharing arrangement;
- the use of APIs where possible to protect user data;
- the implementation of data security measures through the use of AI;
- the implementation of protections beyond contractual terms when sharing data (e.g. purpose limitation obligations); and
- compliance with applicable laws (not only privacy and data protection laws).
Next steps: what can we expect from European regulators?
The Concluding Statement is timely from a General Data Protection Regulation (GDPR) perspective given that the European Data Protection Board will imminently issue its Article 64(2) Opinion on two key issues: (1) the extent to which AI models process personal data; and (2) the legal basis of legitimate interests to process personal data in the pre- and post-training phases of AI models. This follows a request from the Data Protection Commission of Ireland for clarification on such matters. Like the Concluding Statement, the Opinion will not be legally binding, but the views expressed will undoubtedly be considered by organisations and the European data protection authorities.
For more information about the laws surrounding data scraping, including the interplay between data scraping and GDPR, please contact Leo Moore, Rachel Hayes, or your usual William Fry contact.
Contributed by Ronan Shaughnessy.