
AI Criteria: Review

The most important piece of any privacy program handling or investigating the use of artificial intelligence is the review process. There are three areas to review: the algorithm itself, the training data, and the outputs. Together these act as a filter of increasing scrutiny as the review progresses.

The Algorithm

We have a few questions when it comes to the algorithm, but luckily this is the most straightforward part of the process. We want to know how the AI works and what it is intended to do, that is, its purpose. These are high-level questions, but understanding what data is processed, where it came from, and why it is used answers a lot of them. Thinking of the review as a filter, this is its coarsest stage: we are asking broad questions at a high level. If we find that the purpose is inappropriate or that the data used is highly sensitive, we can stop further progress of the AI until the issues are resolved. We can do this early on, in some cases before programming even begins, through a Privacy by Design-inspired approach. Once we approve the algorithm itself, we can start looking at the training data.

The Training Data

Training data is the information you use to teach your AI how to act; a portion of it is typically held back to test the AI and make sure it works correctly. If you want an AI to be able to identify a crosswalk or a bicycle, you feed it a massive number of labeled pictures of those things. That is why you have to identify pictures for CAPTCHAs: you are actually helping to train an AI, most likely for self-driving cars.
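As a concrete illustration, labeled training data for an image classifier is often just examples paired with the answers we want the AI to learn. The sketch below is minimal and hypothetical; the file names and labels are purely illustrative.

    # A minimal, hypothetical sketch of labeled training data for an image
    # classifier; file names and labels are illustrative, not from any real system.
    training_data = [
        ("img_0001.jpg", "crosswalk"),
        ("img_0002.jpg", "bicycle"),
        ("img_0003.jpg", "crosswalk"),
        # ...thousands more labeled examples would follow in practice
    ]

    # A held-back test set, never shown during training, is used to check
    # that the trained model actually works correctly.
    test_data = [
        ("img_9001.jpg", "bicycle"),
        ("img_9002.jpg", "crosswalk"),
    ]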

The reason we want to review this information is that training data can have serious implications for processing. Bias is the issue we are most concerned about here. Bias is often framed as an issue of bigotry or racism, but that is not the only concern. Any data set can disproportionately represent a given population, and that can bias an AI. If you ask 10 people their favorite ice cream flavor from a few choices, 4 or 5 might say vanilla. Apply that data set of 10 answers to thousands or millions of people and it will most likely be inaccurate: an AI that uses it to suggest an order will tell a lot of people to order vanilla, because vanilla is overrepresented in the sample, and the AI is thus biased.
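To make the sampling problem concrete, here is a minimal Python sketch. The flavor names, the population percentages, and the naive "recommend the most common answer" model are all illustrative assumptions, not a real system.

    import random
    from collections import Counter

    random.seed(42)

    # Hypothetical "true" preferences of the wider population (illustrative numbers).
    population_prefs = {"vanilla": 0.30, "chocolate": 0.35, "strawberry": 0.35}

    # A tiny survey of 10 people, as in the example above. Small samples can
    # easily over-represent one answer purely by chance.
    survey = random.choices(
        list(population_prefs), weights=list(population_prefs.values()), k=10
    )
    print("Survey of 10:", Counter(survey))

    # A naive "AI" trained only on the survey recommends the single most
    # common answer to everyone.
    recommendation = Counter(survey).most_common(1)[0][0]
    print("Recommended to a million customers:", recommendation)

    # The skew of the 10-person sample is now amplified across every user,
    # regardless of the true population distribution above.

Running the sketch with different seeds shows how unstable a 10-person sample is: the "most common" flavor changes from run to run, which is exactly the kind of skew a training data review should catch.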

Outputs

After the training set is reviewed, we will need to review the actual outputs of the AI. Outputs are any data, actions, or recommendations presented by the AI. We want to be sure we are getting the outcomes we intended: outputs should be accurate and should meet our expectations. An AI that provides quotes or estimates but regularly produces exceedingly high quotes or very low interest rates is an issue that needs to be addressed immediately. Unexpected outcomes should likewise be investigated and addressed as needed.
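One simple way a review team might operationalize this is an automated sanity check on outputs. The sketch below is hypothetical: the expected ranges and the review_output function are illustrative assumptions, not part of any particular product.

    # A minimal sketch of an output sanity check; the bounds below are
    # illustrative placeholders a review team would set for their own system.
    EXPECTED_RATE_RANGE = (0.02, 0.15)    # plausible interest rates (illustrative)
    EXPECTED_QUOTE_RANGE = (500, 50_000)  # plausible quote amounts (illustrative)

    def review_output(quote: float, rate: float) -> list[str]:
        """Flag outputs that fall outside the ranges we expect."""
        issues = []
        if not EXPECTED_QUOTE_RANGE[0] <= quote <= EXPECTED_QUOTE_RANGE[1]:
            issues.append(f"quote {quote} outside expected range")
        if not EXPECTED_RATE_RANGE[0] <= rate <= EXPECTED_RATE_RANGE[1]:
            issues.append(f"rate {rate} outside expected range")
        return issues

    # Example: an exceedingly high quote and a very low rate both get
    # flagged for human review rather than being sent to a customer.
    print(review_output(quote=250_000, rate=0.004))

A check like this does not replace human review; it simply routes suspicious outputs to a person before they reach anyone affected by the decision.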

Recently, the FTC ordered Rite Aid to stop using facial recognition because reasonable safeguards were not in place. Notably, minority shoppers were being identified as shoplifters disproportionately compared to white, male shoppers. A review of the system, as well as of the training data, would most likely have revealed these issues and would have served as a safeguard in itself.

In Practice

An important question is who will actually do the review. Obviously, we want the privacy team involved; however, they cover only the privacy concerns. Issues around bias, or around an AI being overly invasive or objectionable, as we discussed in earlier blogs, can be avoided with a varied group of reviewers. Different people with different backgrounds, cultures, and experiences will help to identify problematic areas. Once you get into the area of ethics, you may want to find someone with expertise in philosophy or ethics to provide insight.

Whoever is involved, set a regular cadence for reviews as well as policies for recording them. These reviews could even be included alongside the conformity assessments that the EU AI Act will require for high-risk processing. Most important, though, is that you actually do the reviews.


Reach out to Privacy Ref with all your organizational privacy concerns: email us at info@privacyref.com or call us at 1-888-470-1528. If you are looking to master your privacy skills, check out our training schedule and register today to get trained by the top-attended IAPP Official Training Partner.