This is a squishy area.
“Use Common Sense” - The authors of our textbook.
The five Ps terminology comes from the Harvard Business Review.
Sarah Silverman, Mona Awad and Paul Tremblay sue OpenAI for direct copyright infringement.
Using copyrighted materials to train their models.
Where does the data originate?
Was it acquired legally?
Microsoft (owner of GitHub) uses code in repos to train Copilot.
Ethical on public repositories?
Private repositories?
Copilot costs $20 a month or is free.
Would the source of the data agree with how it is being used?
What if the data is re-purposed?
If that data is scraped from the site by a third party and made available, is that ethical?
Example: OkCupid profile data, including usernames, age, gender, location, what kind of relationship (or sex) they’re interested in, and personality traits.
“The data is already public.” - Emil Kirkegaard
How is the data being protected?
Who is responsible for destroying it?
There is data that Smith collected from you.
The repos in this class are data that you are creating.
How is the data cleaned?
Are data sets being combined in a way that preserves anonymity?
Is the accuracy of the data verified?
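One way to make the question about combining data sets concrete is to count how many records share the same combination of indirectly identifying fields; a group of size 1 means that combination singles out one person even with names removed. This is a minimal sketch of that idea (sometimes called a k-anonymity check), not a complete privacy audit. The function name `smallest_group_size`, the column names, and the sample records are all made up for illustration.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same values for every column listed in quasi_identifiers.

    A result of 1 means at least one record is unique on those columns.
    Returns 0 when there are no records.
    """
    groups = Counter(
        tuple(record[column] for column in quasi_identifiers)
        for record in records
    )
    if not groups:
        return 0
    return min(groups.values())


# Hypothetical combined data set: no names, but age + zip code
# together may still point at a single person.
records = [
    {"age": 34, "zip": "01063", "interest": "hiking"},
    {"age": 34, "zip": "01063", "interest": "chess"},
    {"age": 51, "zip": "01060", "interest": "hiking"},
]

k = smallest_group_size(records, ["age", "zip"])
print(f"Smallest group on (age, zip): {k}")
```

Here the 51-year-old in zip code 01060 is alone in their group, so that record could be re-identified by anyone who knows those two facts about the person.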
Who will have access to data that can be used to ID a person?
How will individuals be anonymized?
Who has access to that anonymized data?
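As one illustration of how individuals might be anonymized, the sketch below replaces a username with a keyed hash and coarsens an exact age into a decade. It is an assumption-laden example, not a recommended standard: the function names, field names, and key are hypothetical, and the result is better described as pseudonymized than anonymous, because whoever holds the key can recompute the mapping. That is why the access questions above still matter.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed SHA-256 hash (HMAC).

    The same identifier and key always give the same token, so records
    can still be linked to each other without exposing the original name.
    """
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def anonymize_record(record: dict, secret_key: bytes) -> dict:
    """Return a copy of the record with the username replaced by a token
    and the exact age coarsened to a decade (for example 34 -> "30s")."""
    return {
        "user": pseudonymize(record["username"], secret_key),
        "age_range": f"{record['age'] // 10 * 10}s",
        "location": record["location"],
    }


# Hypothetical key and record, for illustration only.
# A real key would be stored separately from the data.
key = b"example-key-kept-separately"
original = {"username": "example_user", "age": 34, "location": "MA"}

safe = anonymize_record(original, key)
print(safe["age_range"])
```

Keeping the key away from the published data is the design choice that matters here: without it, the tokens cannot be recomputed from a list of known usernames.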
A piece of data itself has no positive or negative moral value, but the way we manipulate it does. It’s hard to imagine a more contentious project than programming ethics into our algorithms; to do otherwise, however, and allow algorithms to monitor themselves, is to invite the quicksand of moral equivalence.
The five Ps seem to come from the Harvard Business Review: https://hbr.org/2023/07/the-ethics-of-managing-peoples-data (9/30/2024)
Suing ChatGPT: https://www.theverge.com/2024/2/13/24072131/sarah-silverman-paul-tremblay-openai-chatgpt-copyright-lawsuit