During a webinar in May 2023, I stated that the main problem with artificial intelligence isn’t AI itself but the people who use it. I still hold to this idea. Most applications I have seen that make me scratch my head or cringe in disbelief are bad ideas from people, not mistakes or errors from the AI itself. To that end, deciding who is involved with the AI, or who gets access to it, is an important aspect of an AI privacy program.
Anyone who will make decisions about how AI uses personal information should strictly follow the policies and procedures set by the organization. They should also hold credentials that qualify them for the position; typically, these are engineers and programmers with expertise in artificial intelligence or a related field. Privacy or legal professionals should also be given access so they can assess compliance with the organization's policies for AI use. The goal is to ensure that no one is using personal information with artificial intelligence in a way that introduces unneeded risk.
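One way to make this concrete is a simple role-based access check. The roles and resource names below are purely illustrative assumptions (not from any standard or from the original text); this is a minimal sketch of the idea that only qualified groups, plus privacy and legal reviewers, get access to AI assets.

```python
# Hypothetical sketch of role-based access to AI resources.
# Role and resource names are illustrative assumptions, not a standard.

ALLOWED_ROLES = {
    "training_data": {"ml_engineer", "data_scientist"},
    "model_config": {"ml_engineer"},
    # Privacy and legal professionals get access to assess compliance.
    "audit_logs": {"privacy_officer", "legal_counsel", "ml_engineer"},
}

def can_access(role: str, resource: str) -> bool:
    """Return True if the given role may access the named AI resource."""
    return role in ALLOWED_ROLES.get(resource, set())

print(can_access("privacy_officer", "audit_logs"))      # compliance review: allowed
print(can_access("marketing_analyst", "training_data")) # not a qualified role: denied
```

In practice this logic would live in your identity and access management system rather than in application code, but the principle is the same: access is granted by role, and roles are granted by qualification.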
Training Data Comes First
The reason we care so much about who can and cannot access this information is risk: we limit access to minimize it as much as possible. Starting with the training data, too many cooks in the kitchen can result in bad data sets. Remember that training data is used to teach the AI, and a data set can go bad by being either too limited or too large. Training data can also be poisoned with bad or misleading information. Good training data is critical for AI, and we must make sure we protect it.
A great example of bad training data that I regularly cite during classes involves an AI that distinguishes pictures of dogs from pictures of wolves. I heard about this during the World Information Governance Conference last year in San Diego. The issue was that almost all the pictures of wolves were taken in snowy environments, so the AI learned that snow in the picture meant wolf. This is bad data because it biases the AI toward an assumption based on something that isn't the subject itself.
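This kind of problem (a background feature standing in for the label) can often be caught before training by checking image metadata. The sketch below uses a made-up, illustrative data set and field names to show the idea: if "snow" appears in nearly all wolf images but few dog images, the model is likely to learn snow as a shortcut for wolf.

```python
# Hypothetical sketch: spotting a spurious correlation between a background
# attribute and a label in training metadata. The data is illustrative only.

samples = [
    {"label": "wolf", "background": "snow"},
    {"label": "wolf", "background": "snow"},
    {"label": "wolf", "background": "snow"},
    {"label": "dog", "background": "grass"},
    {"label": "dog", "background": "indoor"},
    {"label": "dog", "background": "snow"},
]

def background_rate(label: str, background: str) -> float:
    """Fraction of images with this label that share the given background."""
    matching = [s for s in samples if s["label"] == label]
    hits = sum(1 for s in matching if s["background"] == background)
    return hits / len(matching)

# A large gap between these two rates flags a biased data set.
print(background_rate("wolf", "snow"))  # 1.0 — every wolf photo has snow
print(background_rate("dog", "snow"))
```

A review step like this is exactly the kind of oversight that qualified personnel should perform before a data set is approved for training.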
Getting into the (algo)Rhythm
The algorithm itself also requires protection. Both in development and in use, we must make sure qualified personnel provide oversight. When building the AI, we are in the same position as with the training data: bad inputs produce bad outputs. Garbage in, garbage out. As for the use of the AI, we need to make sure it is only used as intended. That means not processing personal data that has been restricted from certain types of processing. Combining data sets is a large risk as well; we may accidentally reidentify information, which opens up all sorts of risk.
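The reidentification risk from combining data sets can be screened with a k-anonymity check: if any combination of quasi-identifiers (fields like ZIP code and age band) appears fewer than k times, those records stand out and could be matched against another data set. The field names, records, and threshold below are illustrative assumptions, not a prescribed method.

```python
# Hypothetical sketch: a k-anonymity screen before combining data sets.
# Field names and the k threshold are illustrative assumptions.
from collections import Counter

def violates_k_anonymity(records, quasi_identifiers, k=5):
    """Return True if any quasi-identifier combination appears fewer than
    k times, signalling a reidentification risk."""
    counts = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return any(c < k for c in counts.values())

records = [
    {"zip": "33401", "age_band": "30-39"},
    {"zip": "33401", "age_band": "30-39"},
    {"zip": "90210", "age_band": "60-69"},  # unique combination: risky
]
print(violates_k_anonymity(records, ["zip", "age_band"], k=2))  # True
```

A check like this does not eliminate reidentification risk, but it gives reviewers a concrete, repeatable test to run before two data sets are ever joined.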
Overall, controlling access is about controlling the quality of the AI. Individuals who do not have the knowledge or experience to handle AI are more likely to harm the training data, algorithm, or outputs, so we want to limit access to specific groups.
Reach out to Privacy Ref with all your organizational privacy concerns: email us at firstname.lastname@example.org or call us at 1-888-470-1528. If you are looking to master your privacy skills, check out our training schedule and register today to get trained by the top-attended IAPP Official Training Partner.