Published on 21 January 2022

Tensions over privacy in Europe

Credit: Victorgrigas on Wikimedia Commons (CC BY-SA 3.0)

Europol, the European criminal police agency, probably had other plans for 2022 than sorting through its datasets and deleting those older than 6 months. Yet that is what the European Data Protection Supervisor (EDPS) ordered it to do on 3 January, following an investigation opened in April 2019.

Big data

Europol coordinates the police activities of the 27 Member States of the European Union against criminal and terrorist networks. It plays a central role in analysing and sharing the data it receives from the various Member States.

For several years, the widespread adoption of IT tools has generated ever more digital data. When police forces collect this data in the course of an investigation, they no longer limit themselves to narrowly targeted data but retrieve larger datasets. Inevitably, the volume of datasets that Member States ask Europol to analyse is increasing, and the formats of the data they contain are becoming more varied.

Europol therefore has to use big data processing techniques to analyse these new datasets. These techniques involve resource-intensive operations and, consequently, the storage of many intermediate datasets.

Europol Regulation

The agency soon realised that these processing operations raised compliance issues with the 'Europol rules' on personal data and turned to the EDPS, who ensures compliance with these rules.

Europol has to define several requirements before each analysis project, including the categories of personal data, the categories of data subjects, the retention period and the conditions of access. Once these requirements are defined, Europol may no longer process data that do not comply with them. Furthermore, the only categories of persons whose data the agency can analyse are suspects, potential future criminals, contacts and associates, victims, witnesses and informants.


These datasets are so large that their content is often unknown before the data is extracted, in which case the agency cannot confirm that all the information complies with the requirements of the Europol rules.

Finally, so that the accuracy and reliability of the information can be verified, these datasets may be kept for several years after the elements needed for an investigation have been extracted.

In September 2020, the EDPS warned Europol that storing and processing such datasets was not in line with the Europol rules. The Supervisor recalled that compliance with these rules reduces the risk of wrongly associating an innocent individual with criminal activities, and avoids the consequences that such an association would have on that person's fundamental rights and freedoms.

Data Subject Categorisation

Following the Supervisor's warning, Europol implemented an architectural solution to isolate datasets concerning persons who have not yet been categorised, that is, datasets lacking a Data Subject Categorisation (DSC), in order to minimise the risk of their being integrated into its analysis work.

The agency also limited access rights to these datasets to the technical teams responsible for establishing the DSC, and implemented a tagging system to indicate whether the DSC has been completed. It further assured the EDPS that data without a DSC will not be analysed further, will not be included in general searches and will not be shared with any Member State.
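The safeguards described above can be illustrated with a minimal sketch. It is not Europol's actual system: the `Dataset` class, the role names and the helper functions are all hypothetical, and only the three rules themselves (restricted access, a completion tag, exclusion from general searches) come from the article.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Hypothetical role names, chosen only for this illustration.
DSC_TEAM = "dsc_team"
ANALYST = "analyst"


@dataclass(frozen=True)
class Dataset:
    name: str
    dsc_completed: bool  # the tag indicating whether the DSC is completed


def can_access(dataset: Dataset, role: str) -> bool:
    """Datasets without a DSC are reserved for the dedicated technical team."""
    if dataset.dsc_completed:
        return role in {ANALYST, DSC_TEAM}
    return role == DSC_TEAM


def general_search_scope(datasets: Iterable[Dataset]) -> List[Dataset]:
    """Only categorised datasets are eligible for general searches."""
    return [d for d in datasets if d.dsc_completed]


datasets = [Dataset("case_a", True), Dataset("raw_dump_b", False)]
print([d.name for d in general_search_scope(datasets)])
```

In this sketch the tag is the single source of truth: both the access check and the search scope read the same `dsc_completed` flag, so an uncategorised dataset cannot become searchable without first being tagged as categorised.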

The problem is the retention period of these personal data which, despite secure storage, still do not comply with the Europol rules. The agency proposed a quarterly review to decide whether to delete each dataset without a DSC, keeping them "as long as necessary and proportionate to support the relevant investigation" - but without detailing the conditions of necessity and proportionality. The EDPS, on the other hand, considers that storing such data without a time limit continues to pose a significant risk to privacy, and asked Europol to set a maximum retention period for datasets without a DSC.

Towards mass surveillance?

Europol replied to the EDPS that the proposed timeframe of 6 months would be insufficient to carry out the detailed analysis of the largest and most complex datasets that is needed to determine their DSC. In addition, it stressed that some investigations evolve over several years, and that new information might reveal links within datasets without a DSC that had not been identified initially.

Europol's insistence on retaining this data in breach of its own rules raises concerns about its true aims and interests. Several experts fear that the agency has ambitions to conduct mass surveillance operations - starting with the European Data Protection Supervisor himself, Wojciech Wiewiórowski, who compared the agency to the NSA over its new data retention practices.

It further appears that in 2020, before the EDPS issued its warning, Europol had begun developing machine learning programmes to analyse its growing datasets. These algorithms would inevitably process sensitive personal data, and the agency asked the EDPS whether it could dispense with the Supervisor's oversight in the initial phases of its AI training, since it had already received the warning about the retention of the data that would be used for training.


The EDPS did not allow Europol to continue this development, but the agency went ahead regardless. After an intervention by the EDPS, Europol finally shelved its machine learning programme in February 2021, without having trained it or used it for operational analysis.

EDPS decision

The Europol rules did not anticipate these big data issues. They provide no exemption for analysing whether personal data correspond to persons linked to criminal activities, nor do they define a maximum retention period for data without a DSC.

The EDPS considers that the article that comes closest to this new need is the one governing the temporary analysis of a dataset to determine whether the data are relevant for Europol. As the maximum retention period for such data is 6 months, the EDPS decided to impose the same limit on the storage of datasets without a DSC.
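As a rough illustration of what such a limit means in practice, the sketch below flags datasets without a DSC once they are more than 6 months old. The function names and the way the deadline is counted (calendar months from the date of receipt) are assumptions made for this example; the decision itself, not this code, defines how the period is calculated.

```python
import calendar
from datetime import date

RETENTION_MONTHS = 6  # limit imposed by the EDPS on datasets without a DSC


def add_months(d: date, months: int) -> date:
    """Shift a date by whole calendar months, clamping to the month's last day."""
    index = d.month - 1 + months
    year = d.year + index // 12
    month = index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def must_delete(received: date, today: date, dsc_completed: bool) -> bool:
    """True if an uncategorised dataset has exceeded the 6-month limit.

    Categorised datasets are not covered by this particular limit, so the
    function returns False for them (assumption for this sketch).
    """
    if dsc_completed:
        return False
    return today > add_months(received, RETENTION_MONTHS)


# A dataset received on 1 June 2021, still uncategorised on 3 January 2022.
print(must_delete(date(2021, 6, 1), date(2022, 1, 3), dsc_completed=False))
```

A fixed deadline of this kind differs from the quarterly review Europol had proposed: the outcome depends only on the date of receipt and the categorisation tag, not on a case-by-case assessment of necessity and proportionality.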

Author: Paul Guillier

Article written by
La Minute Cyber