This is a squishy area.
“Use Common Sense” - The authors of our textbook.
Handling Personal Data
Be cautious when feeding information to AI - personal data can be reused by models
Consider the information you collect - only collect the information you need
The five Ps terminology comes from the Harvard Business Review.
Provenance
Purpose
Protection
Privacy
Preparation
Sarah Silverman, Mona Awad and Paul Tremblay sue OpenAI for direct copyright infringement.
The claim: OpenAI used copyrighted materials to train its models.
Where does the data originate?
Was it acquired legally?
Microsoft (owner of GitHub) uses code in repos to train Copilot.
Is that ethical for public repositories?
For private repositories?
Copilot costs $20 a month or is free.
Would the source of the data agree with how it is being used?
What if the data is re-purposed?
User information scraped from OkCupid
If that data is scraped from the site by a third party and made available, is that ethical?
The scraped OkCupid data included usernames, age, gender, location, what kind of relationship (or sex) users were interested in, and personality traits.
“The data is already public.” -Emil Kirkegaard
How is the data being protected?
Who is responsible for destroying it?

There is data that HCC collected about you.
Everything you do or make in this class exists on Canvas.
How is the data cleaned?
Are data sets being combined in a way that preserves anonymity?
Is the accuracy of the data verified?
Who will have access to data that can be used to ID a person?
How will individuals be anonymized?
Who has access to that anonymized data?
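One way to make the anonymization question concrete is a minimal Python sketch of pseudonymization: drop the fields that directly name a person and replace the ID with a keyed hash. The field names (`student_id`, `name`, `email`, `grade`), the sample record, and the key are all hypothetical, made up for illustration; this is a sketch under those assumptions, not a complete anonymization method.

```python
import hashlib
import hmac

# Hypothetical field names, chosen only for this illustration.
DIRECT_IDENTIFIERS = {"name", "email"}


def pseudonymize(record, secret_key, id_field="student_id"):
    """Return a copy of record without direct identifiers and with the ID
    replaced by a keyed hash (HMAC-SHA256), shortened to 16 hex characters."""
    # Drop fields that name the person outright.
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # A keyed hash means someone without the key cannot recompute the
    # pseudonym by simply hashing every possible ID.
    digest = hmac.new(
        secret_key, str(record[id_field]).encode("utf-8"), hashlib.sha256
    ).hexdigest()
    cleaned[id_field] = digest[:16]
    return cleaned


# Made-up sample record.
record = {
    "student_id": 12345,
    "name": "Ada Example",
    "email": "ada@example.edu",
    "grade": "A",
}
key = b"keep-this-key-secret"  # assumption: stored separately from the data
safe = pseudonymize(record, key)
print(safe)
```

Whoever holds the key can still link pseudonyms back to people, so "who has access" applies to the key as well as the data. The fields that remain (here, `grade`) can also re-identify someone when combined with other data sets, which is why the question about combining data sets matters.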
A citizen of a country files taxes with their government. Paying taxes is rarely optional. The filing contains data on residence, wages, age, gender, taxes, and government ID numbers for this citizen and their family.
Discuss:
Provenance
Purpose
Protection
Preparation
Privacy
Who should have access to this data?
How should it be secured?
What rights to privacy does this citizen have?
How long should the data be held?
At the beginning of last year DOGE requested access to taxpayer returns to root out fraud. [1]
DOGE wanted a person without the proper clearance (Gavin Kliger) to be able to access taxpayer returns.
He will be given access, if needed, in anonymized form. [2]



Unless otherwise cited or linked, here are the credits:
The five Ps seem to come from the Harvard Business Review: https://hbr.org/2023/07/the-ethics-of-managing-peoples-data (9/30/2024)
Suing ChatGPT: https://www.theverge.com/2024/2/13/24072131/sarah-silverman-paul-tremblay-openai-chatgpt-copyright-lawsuit