Data Ethics

Author

Schwab

Data Ethics

This is a squishy area.

“Use Common Sense” - The authors of our textbook.

The Five Ps of Ethical Data Handling.

The five Ps terminology comes from the Harvard Business Review.

Here’s the link: https://hbr.org/2023/07/the-ethics-of-managing-peoples-data

Copyright and ChatGPT

Sarah Silverman, Mona Awad, and Paul Tremblay sue OpenAI for direct infringement.
The claim: OpenAI used copyrighted materials to train its models.
The list of suits is getting longer.

Provenance

Where does the data originate?

Was it acquired legally?

GitHub

Microsoft (owner of GitHub) uses code in repos to train Copilot.

Is this ethical for public repositories?
What about private repositories?
Copilot costs $20 a month or is free.

Purpose

Would the source of the data agree with how it is being used?

What if the data is re-purposed?

If that data is scraped from the site by a third party and made available, is that ethical?
Example: OkCupid data was scraped, including usernames, age, gender, location, what kind of relationship (or sex) users were interested in, and personality traits.
“The data is already public.” - Emil Kirkegaard

Protection

How is the data being protected?

Who is responsible for destroying it?

College Students

There is data that Smith collected from you.

The repos in this class are data that you are creating.

Preparation

How is the data cleaned?

Are data sets being combined in a way that preserves anonymity?

Is the accuracy of the data verified?
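
One way to make the anonymity question concrete is to count how many records share the same combination of identifying fields after data sets are combined. The sketch below is only an illustration: the records are made up, and the field names (zip, age, gender) and the helper smallest_group are hypothetical. A smallest group of 1 means at least one person is unique in the data and could potentially be re-identified.

```python
from collections import Counter

# Made-up records for illustration; the field names are hypothetical.
records = [
    {"zip": "01063", "age": 20, "gender": "F"},
    {"zip": "01063", "age": 20, "gender": "F"},
    {"zip": "01063", "age": 21, "gender": "F"},
    {"zip": "01060", "age": 35, "gender": "M"},
    {"zip": "01060", "age": 36, "gender": "M"},
]


def smallest_group(rows, fields):
    """Return the size of the smallest group of rows that share the same
    values for every field in `fields` (the 'k' in k-anonymity)."""
    counts = Counter(tuple(row[f] for f in fields) for row in rows)
    return min(counts.values())


# Keeping exact age leaves some people unique in the data.
k_with_age = smallest_group(records, ["zip", "age", "gender"])

# Dropping age makes every person share a group with at least one other.
k_without_age = smallest_group(records, ["zip", "gender"])

print(k_with_age, k_without_age)
```

Dropping or coarsening a field trades detail for anonymity, which is a preparation decision someone has to make on purpose.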

Lab 5ish

Privacy

Who will have access to data that can be used to ID a person?

How will individuals be anonymized?

Who has access to that anonymized data?
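
As one possible answer to the "how" question, direct identifiers can be replaced with a keyed hash before the data is shared. This is a minimal sketch, not a complete anonymization method: the student records, the SECRET_KEY value, and the pseudonymize helper are all made up for illustration. It uses Python's standard hmac and hashlib modules.

```python
import hashlib
import hmac

# Hypothetical secret key. In practice it would be stored apart from the data,
# because anyone holding it can re-link pseudonyms to known identifiers.
SECRET_KEY = b"replace-with-a-real-secret"


def pseudonymize(identifier, key=SECRET_KEY):
    """Replace an identifier with a keyed hash (HMAC-SHA256, hex encoded)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Made-up records for illustration.
students = [
    {"id": "990001", "grade": "A"},
    {"id": "990002", "grade": "B"},
]

# Same shape as the original data, but the ID column no longer shows the real ID.
anonymized = [{"id": pseudonymize(s["id"]), "grade": s["grade"]} for s in students]

for row in anonymized:
    print(row["id"][:12], row["grade"])
```

The same input always maps to the same pseudonym, so records can still be joined across tables. That is also why this counts as pseudonymization rather than full anonymization: the remaining columns, or access to the key, may still identify a person.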

Scenario

A citizen of a country files taxes with their government. Paying taxes is rarely optional. In filing there are data related to residence, wages, age, gender, taxes and government ID numbers pertaining to this citizen and their family.

Discuss:

Provenance
Purpose
Protection
Preparation
Privacy

Government’s responsibility

Who should have access to this data?

How should it be secured?

What rights to privacy does this citizen have?

How long should the data be held?

Current Events

DOGE requested access to taxpayer returns to root out fraud.¹

DOGE wanted a person without the proper clearance (Gavin Kliger) to be able to access taxpayer returns.

Will be given access, if needed, in anonymized form.²

Books

Credits

The five Ps seem to come from the Harvard Business Review: https://hbr.org/2023/07/the-ethics-of-managing-peoples-data (9/30/2024)
Suing ChatGPT: https://www.theverge.com/2024/2/13/24072131/sarah-silverman-paul-tremblay-openai-chatgpt-copyright-lawsuit
