The EU AI Act has just been passed by the European Parliament. You might think, "I'm not in the EU, whatever," but trust me, this is actually more important for data scientists and machine learning practitioners around the world than you might think. The EU AI Act is a major move to regulate and manage the use of certain machine learning models in the EU, or models that affect EU citizens, and it contains some strict rules and severe penalties for violations.

This law has generated a lot of discussion about risk, and here that means risk to the health, safety, and fundamental rights of EU citizens. It's not just the risk of some theoretical AI apocalypse; it's about the day-to-day risk that people's lives are actually made worse in some way by the model you are building or the product you are selling. If you're familiar with the many debates on AI ethics today, this should sound familiar. Discrimination and violation of people's rights, as well as harm to people's health and safety, are serious issues with the current crop of AI products and companies, and this law is the EU's first major effort to protect people.

Defining AI

Regular readers know that I always want "AI" to be clearly defined, and I find it frustrating when the term is used too vaguely. In this case, the law defines AI as follows:

A machine-based system designed to operate with varying levels of autonomy that may exhibit adaptiveness after deployment and that, for explicit or implicit objectives, infers, from the input it receives, how to generate outputs such as predictions, content, recommendations, or decisions that can influence physical or virtual environments.

So, what does this really mean? My interpretation is that machine learning models producing outputs that are used to influence the world (especially people's physical or digital conditions) fall under this definition. The model doesn't have to adapt live or retrain automatically, although it's covered if it does.

So, if you are building ML models that are used to do things like...

  • Assess people's risk levels, such as credit risk, rule-breaking or legal risk, etc.
  • Determine what content people are shown online, in a feed or in ads
  • Differentiate the prices shown to different people for the same products
  • Recommend the best treatment, care, or services for people
  • Recommend whether or not people should take certain actions

All of these will be covered by this law if your model affects anyone who is an EU citizen - and that's just to name a few examples.

Classifying AI Applications

However, not all AI is the same, and the law acknowledges that. Some AI applications will be banned entirely, while others will be subject to much higher scrutiny and transparency requirements.

Unacceptable Risk AI Systems

These kinds of systems are classified as posing unacceptable risk and are not allowed at all. This part of the law takes effect first, six months from now.

  • Behavioral manipulation or deceptive techniques to get people to do things they otherwise wouldn't
  • Targeting people based on things like age or disability to change their behavior and/or exploit them
  • Biometric categorization systems that try to classify people according to highly sensitive traits
  • Personality-characteristic assessments leading to social scoring or differential treatment
  • Real-time biometric identification for law enforcement outside of a select set of targeted use cases
  • Predictive policing (predicting that people will commit crimes in the future)
  • Broad facial recognition/biometric scanning or data scraping
  • Emotion-inferring systems in education or work without a medical or safety purpose

This means, for example, that you can't build (or be forced to submit to) a screening tool meant to determine whether you're "happy" enough to get a retail job. Facial recognition is being restricted to only select, targeted situations. (Clearview AI is definitely an example of that.) Predictive policing, something I worked on in academia early in my career and now regret, is on its way out.

Biometric categorization as a risk refers to models that group people according to risky or highly sensitive traits such as political, religious, or philosophical beliefs, sexual orientation, race, and so on. Using AI to try to label people according to these categories is prohibited under the law.

High-Risk AI Systems

This list, on the other hand, covers systems that are not banned but are subject to a high level of scrutiny. There are specific rules and regulations that will apply to all of these systems, described below.

  • AI in medical devices
  • AI in vehicles
  • AI in emotion recognition systems
  • AI in policing

This list excludes the specific banned use cases described above. So, emotion recognition systems may be allowed, just not in the workplace or in education. AI in medical devices and in vehicles is deemed to pose serious risk or potential risk to health and safety, rightly so, and should only be pursued with great care.

Other

The other two categories are low-risk AI systems and general-purpose AI models. General-purpose models are things like GPT-4, Claude, or Gemini: systems with extremely broad use cases that are often employed within other downstream products. So, GPT-4 itself is not high-risk or banned, but the ways you can embed it for use are constrained by the other rules described here. You couldn't use GPT-4 to build a predictive policing tool, for example, but GPT-4 can be used for low-risk purposes.

Transparency and Scrutiny

So, suppose you are working with a high-risk AI application and you want to comply with all the rules and be approved to deploy it. How do you get started?

For high-risk AI systems, you will be responsible for the following:

  • Maintain and ensure data quality: You are responsible for the data that goes into your model, so you need to curate it carefully.
  • Provide documentation and traceability: Where did your data come from, and can you prove it? Can you show your work, including any changes or edits that were made?
  • Provide transparency: If the public is using your model (think of a chatbot) or a model is part of your product, you have to tell users that this is the case. No pretending the model is just a real person on the customer service hotline or chat system. This will actually apply to all models, even the low-risk ones.
  • Use human oversight: Just saying "the model says so" isn't going to cut it. Humans will be responsible for what the model's outputs say and, most importantly, how those outputs are used.
  • Protect cybersecurity and robustness: You need to take care to make your model secure against cyberattacks, breaches, and unintentional privacy violations. Your model failing because of code bugs or being hacked through vulnerabilities you didn't fix is going to be on you.
  • Conduct impact assessments: If you're building a high-risk model, you need to do a rigorous assessment of the potential impact (even if unintended) on the health, safety, and rights of users or the public.
  • For public entities, register in the EU's public database: This registry is being created as part of the new law, and the filing requirement will apply to public authorities, agencies, and bodies - so mostly governmental organizations, not private businesses.

Testing

Another thing the law notes is that if you're working on building a high-risk AI solution, you need a way to test it to ensure you're following the guidelines, so there are provisions for testing on regular people once you've gotten informed consent. Those of us from the social sciences will find this quite familiar - it's a lot like getting Institutional Review Board approval to run a study.

Effective Dates

The law has a staggered implementation:

  • Within 6 months, the bans on unacceptable risk AI take effect
  • Within 12 months, governance of general-purpose AI takes effect
  • Within 24 months, all remaining rules in the law take effect

Note: The law does not cover entirely personal, non-professional activities unless they fall into the prohibited categories listed earlier, so your small open-source project is unlikely to be at risk.

Penalties

So, what happens if your company does not comply with the law and an EU citizen is affected? There are clear penalties in the law.

If you engage in any of the prohibited forms of AI described above:

  • Fines of up to 35 million euros or, if you are a business, 7% of your global revenue from last year (whichever is higher)

Other violations not included in the prohibited list:

  • Fines of up to 15 million euros or, if you are a business, 3% of your global revenue from last year (whichever is higher)

For lying to the authorities about any of this:

  • Fines of up to 7.5 million euros or, if you are a business, 1% of your global revenue from last year (whichever is higher)

Note: For small and medium-sized enterprises, including startups, the fine amount is whichever is lower, not higher.
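
To make the "whichever is higher or lower" logic concrete, here is a minimal sketch in Python. The tier names and the max_fine function are my own illustration of the text above, not anything defined in the law itself, and this is obviously not legal advice.

```python
def max_fine(tier: str, global_revenue_eur: float, is_sme: bool = False) -> float:
    """Illustrative sketch of the fine caps described above."""
    # (flat cap in euros, share of prior-year global revenue) for each tier
    tiers = {
        "prohibited_ai": (35_000_000, 0.07),          # banned forms of AI
        "other_violation": (15_000_000, 0.03),        # other non-compliance
        "misleading_authorities": (7_500_000, 0.01),  # supplying incorrect information
    }
    flat_cap, pct = tiers[tier]
    revenue_cap = pct * global_revenue_eur
    # General rule: whichever is higher; for SMEs and startups: whichever is lower.
    return min(flat_cap, revenue_cap) if is_sme else max(flat_cap, revenue_cap)

# A company with 100M EUR global revenue that used a prohibited form of AI:
print(max_fine("prohibited_ai", 100_000_000))               # 35,000,000 (flat cap is higher)
print(max_fine("prohibited_ai", 100_000_000, is_sme=True))  # 7,000,000 (7% of revenue is lower)
```

The point of the SME carve-out is visible in the example: the same violation that caps out at 35 million euros for a large company caps out at 7% of revenue for a small one, because the lower of the two limits applies.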

What Should Data Scientists Do?

If you are building models and products using AI as it's defined in the Act, you should first familiarize yourself with the law and what it requires. Even if you aren't affecting EU citizens today, this is likely to have a major impact on the field, and you should be aware of it.

Then, watch out for potential violations in your own business or organization. You have some time to find and fix issues, but the bans on prohibited forms of AI take effect sooner. In large businesses, you probably have a legal team, but don't assume they are going to take care of all of this for you. You are the machine learning expert, and therefore you're a crucial part of how the business can detect and avoid violations. You can use the compliance checker tool on the EU AI Act website to help you.

There are many forms of AI in use today in businesses and organizations that are not allowed under this new law. I mentioned Clearview AI above, as well as predictive policing. Emotional testing is also a very real thing that people are subjected to during job interview processes (I invite you to google "emotional testing for jobs" and see how many companies offer such services), as is large-scale biometric collection. It's going to be extremely interesting and important for all of us to keep up with this and to see how enforcement plays out once the law is fully in effect.

I want to take a moment here to say a few words about a dear friend of mine who passed away this week after a hard fight with cancer. Ed Visel, known online as alistaire, was an outstanding data scientist who gave a great deal of his time and talent to the broader data science community. If you asked an R question on StackOverflow over the past decade, there's a good chance he helped you. He was always patient and kind, because as a self-taught data scientist like me, he knew what it was like to learn this stuff the hard way, and he never lost that empathy.

I was incredibly fortunate to work with Ed for a few years, and to be his friend for a few more. We lost him far too soon, and my ask is that you help a friend or colleague solve a technical problem in his memory. The data science community will be a slightly less friendly place without him.

Also, if you knew Ed, online or in person, his family has requested donations to the Severson Dells Nature Center, a place that was special to him.

Read more of my content at www.stephaniekirmer.com.

References and Further Reading

https://www.theverge.com/23919134/kashmir-hill-your-face-belongs-to-us-clearview-facial-recognition-privacy-decoder
