Machine Learning meets GDPR

Serengeti

30.09.2021.

The impact of GDPR on Machine Learning and Artificial Intelligence

Many AI applications process personal data that can be used to analyze, predict, and affect human behavior. Thanks to AI, we can turn such data and the results of its processing into valuable commodities.

Machine Learning algorithms learn from large amounts of data. The way this data is used is a critical component in determining fairness. Personal data can be used purely for research purposes, such as detecting patterns and correlations. On the other hand, it can also be used to make decisions that affect individuals. Even in sectors where complicated decisions depend on many factors and on criteria that are not defined in advance, the combination of AI and Big Data enables automated decision-making. In recent years, there has been much discussion of the benefits and drawbacks of algorithmic assessments and decisions affecting people.

For example, displaying a specific online advertisement to a person based on their social media and web activity may not be considered intrusive and may even be appreciated if it suits their interests. However, in some cases, such automated processing can be perceived as discrimination. A recent example involves a Facebook AI system that labelled a video of Black men as “primates”. The video, which showed recordings of Black men in altercations with police and white bystanders, automatically prompted viewers to “keep seeing videos about Primates.”

Another example shows a bias against women: a female doctor was shut out of a women's gym locker area because the automated safety system associated the title “Dr.” only with men and therefore mistook her for a man.

The Connection Between AI and GDPR

GDPR requires companies to protect consumers' personal data. But as the AI and machine learning market grows, questions have been raised about how the regulation addresses automation in analytics.

As stated by the European Data Protection Board, “Any processing of personal data through an algorithm falls within the scope of the GDPR.” Therefore, whenever a Machine Learning algorithm or AI system uses personal data, GDPR may apply.

GDPR addresses machine learning in two significant ways. First, it strengthens data protection: companies that collect and process personal data, including in AI systems, are subject to stringent responsibilities under the law. Second, the regulation specifically targets “automated individual decision-making” and profiling.

According to Article 22 of GDPR, an individual has the right not to be subject to a decision based solely on automated processing, including profiling, if that decision produces legal effects concerning them or similarly significantly affects them. While Article 22 is a general restriction, Article 15 sets out more specific requirements linked to automated decision-making and profiling. Under it, individuals have the right to be informed about:

The “existence” of automated decision-making, including profiling.
“Meaningful information about the logic involved.”
“The significance and the envisaged consequences of such processing” for the individual.

Legal Challenges Related to AI

GDPR outlines six data protection principles, and according to the Norwegian Data Protection Authority, four of them pose particular challenges for AI:

Fairness and bias

As the examples at the beginning of this post show, algorithmic decisions can be discriminatory or mistaken. There are steps you can take to minimize bias: choosing the correct learning model (supervised or unsupervised), using the right training data, and monitoring performance should all help.
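One way to make the monitoring step concrete is to compare a model's results across groups of people. The following is a minimal Python sketch, not a complete fairness audit. The function names, the (group, true label, predicted label) layout, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict


def rates_by_group(records):
    """Return per-group accuracy and positive-prediction rate.

    `records` is an iterable of (group, y_true, y_pred) tuples with
    binary labels (0 or 1). This layout is an assumption of the sketch.
    """
    counts = defaultdict(lambda: {"n": 0, "correct": 0, "positive": 0})
    for group, y_true, y_pred in records:
        c = counts[group]
        c["n"] += 1
        c["correct"] += int(y_true == y_pred)
        c["positive"] += int(y_pred == 1)
    return {
        group: {
            "accuracy": c["correct"] / c["n"],
            "positive_rate": c["positive"] / c["n"],
        }
        for group, c in counts.items()
    }


def positive_rate_gap(records):
    """Largest difference in positive-prediction rate between any two groups."""
    rates = [m["positive_rate"] for m in rates_by_group(records).values()]
    return max(rates) - min(rates)


# Hypothetical predictions: (group, true label, predicted label).
sample = [
    ("A", 1, 1), ("A", 0, 1), ("A", 1, 1), ("A", 0, 0),
    ("B", 1, 0), ("B", 0, 0), ("B", 1, 1), ("B", 0, 0),
]
report = rates_by_group(sample)
gap = positive_rate_gap(sample)
```

In this made-up sample both groups have the same accuracy, yet group A receives positive predictions far more often than group B. A gap like that is a signal to look more closely at the training data and the model, not proof of discrimination on its own.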

Data minimization

Companies must minimize the quantity of data they gather and analyze. They must ensure that personal information is adequate, meaning sufficient to properly fulfill the specified purpose; relevant, meaning it has a reasonable relationship to that purpose; and limited to what is necessary, meaning they do not hold more than they need for that purpose.
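In a data pipeline, minimization can be approximated by keeping an explicit list of the fields each purpose needs and discarding everything else before storage or training. The sketch below is in Python. The purpose name, the field names, and the `minimize` helper are hypothetical and chosen only for illustration.

```python
# Hypothetical mapping from a processing purpose to the fields it needs.
REQUIRED_FIELDS = {
    "churn_model": {"customer_id", "plan", "monthly_usage"},
}


def minimize(record, purpose):
    """Return a copy of `record` limited to the fields needed for `purpose`."""
    allowed = REQUIRED_FIELDS[purpose]
    missing = allowed - set(record)
    if missing:
        # Adequacy: the data must be sufficient for the purpose.
        raise ValueError(f"record lacks required fields: {sorted(missing)}")
    # Necessity: drop everything the purpose does not need.
    return {key: value for key, value in record.items() if key in allowed}


# Made-up customer record containing more than the purpose requires.
raw = {
    "customer_id": 17,
    "plan": "basic",
    "monthly_usage": 42.5,
    "date_of_birth": "1990-01-01",
    "home_address": "Example Street 1",
}
minimal = minimize(raw, "churn_model")
```

Here `minimal` keeps only the customer ID, plan, and usage; the date of birth and address never reach the model. Deciding which fields are truly necessary remains a legal and business judgment that code cannot make.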

Purpose limitation

In addition to being minimized, personal data may only be collected for a specific and precisely defined purpose, and must not be reused in ways incompatible with that purpose.
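One way to support this rule technically is to store the declared purposes alongside the data and check them on every access. The Python sketch below assumes such a design. `PersonalRecord`, `PurposeError`, `access`, and the purpose names are hypothetical and are not part of GDPR or of any particular library.

```python
from dataclasses import dataclass


class PurposeError(Exception):
    """Raised when data is requested for a purpose it was not collected for."""


@dataclass(frozen=True)
class PersonalRecord:
    data: dict
    purposes: frozenset  # purposes declared at collection time


def access(record, purpose):
    """Return the data only if `purpose` was declared when it was collected."""
    if purpose not in record.purposes:
        raise PurposeError(f"data was not collected for purpose {purpose!r}")
    return record.data


# Hypothetical record collected only to confirm an order.
record = PersonalRecord(
    data={"email": "user@example.com"},
    purposes=frozenset({"order_confirmation"}),
)
```

Calling `access(record, "order_confirmation")` returns the data, while `access(record, "ad_targeting")` raises `PurposeError`. A check like this does not replace a legal assessment of compatible purposes, but it makes unplanned reuse visible.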

Transparency

Transparency entails providing individuals with clear information on how their personal data is processed using AI, as well as any possible impact on their privacy. People must be aware that their data is being gathered and how it is being processed.

Building a Trustworthy AI

According to the EU's High-Level Expert Group on AI, seven requirements must be met to develop and deploy trustworthy AI:

Human agency and oversight, including fundamental rights.
Technical robustness and safety, including attack resilience and security, a backup plan, general safety, accuracy, dependability, and repeatability.
Privacy and data governance.
Transparency, which includes traceability, explainability, and communication.
Diversity and non-discrimination.
Societal and environmental wellbeing.
Accountability, including auditability, minimization and reporting of negative impacts, trade-offs, and redress.

For companies that are purchasing or building IT solutions based on AI, there are several recommendations for the protection of personal data:

Before you buy a solution, perform a risk assessment.
Require that the system you build or buy meets privacy requirements by design.
Conduct regular tests for compliance with regulatory requirements.
Use robust systems to protect data subjects’ personal data.

If you are looking for a business consultant, or even an entire team of experts who are already familiar with the best practices and who have seen what strategies work in different places and projects, feel free to reach out.


The project was co-financed by the European Union from the European Regional Development Fund. The content of the site is the sole responsibility of Serengeti ltd.
