Data Ethics

Author

Schwab

Data Ethics

This is a squishy area.

“Use Common Sense” - The authors of our textbook.

The Five Ps of Ethical Data Handling.

The five Ps terminology comes from the Harvard Business Review.

Here’s the link.

Provenance

Where does the data originate?

Was it acquired legally?

GitHub

Microsoft (owner of GitHub) uses code in repos to train Copilot.

  • Is this ethical for public repositories?

  • Private repositories?

  • Copilot costs $20 a month or is free.

Purpose

Would the source of the data agree with how it is being used?

What if the data is re-purposed?

  • If data is scraped from a site by a third party and made publicly available, is that ethical?

  • Example: OkCupid user profiles were scraped, including usernames, age, gender, location, what kind of relationship (or sex) they’re interested in, and personality traits.

  • “The data is already public.” -Emil Kirkegaard

Protection

How is the data being protected?

Who is responsible for destroying it?

College Students

Smith College

There is data that Smith collected from you.

The repos in this class are data that you are creating.

Preparation

How is the data cleaned?

Are data sets being combined in a way that preserves anonymity?

Is the accuracy of the data verified?
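One way to make the anonymity question concrete: count how many records share the same combination of identifying columns. The sketch below is only an illustration. The records, the column names, and the `smallest_group` helper are all made up for this example. A smallest group of 1 means at least one person is uniquely identifiable from those columns alone.

```python
from collections import Counter


def smallest_group(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same values in the given columns (hypothetical helper)."""
    groups = Counter(
        tuple(row[col] for col in quasi_identifiers) for row in records
    )
    return min(groups.values())


# Made-up records for illustration only; not real data.
records = [
    {"zip": "01063", "age": 19, "major": "SDS"},
    {"zip": "01063", "age": 19, "major": "CSC"},
    {"zip": "01063", "age": 20, "major": "SDS"},
    {"zip": "01063", "age": 20, "major": "SDS"},
]

# With one column, everyone blends into a group of four.
k_zip = smallest_group(records, ["zip"])

# With three columns combined, some people are in a group of one.
k_all = smallest_group(records, ["zip", "age", "major"])

print(k_zip, k_all)
```

Each column looks harmless on its own. The risk appears when columns, or whole data sets, are combined.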

Lab 5ish

Privacy

Who will have access to data that can be used to ID a person?

How will individuals be anonymized?

Who has access to that anonymized data?
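A minimal sketch of one common approach to the anonymization question: replace direct identifiers with a keyed hash before sharing. This is an assumption about how it could be done, not a method from the textbook. The key, the `student_id` field, and the `pseudonymize` helper are hypothetical. Note that this is pseudonymization, not full anonymization: whoever holds the key can re-link the records, and the remaining columns may still identify someone.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256).
    Hypothetical helper for illustration."""
    return hmac.new(
        secret_key, identifier.encode("utf-8"), hashlib.sha256
    ).hexdigest()


# Hypothetical key; in practice it would be stored separately and securely.
SECRET_KEY = b"replace-with-a-real-secret"

# Made-up records for illustration only.
rows = [
    {"student_id": "990001", "grade": "A"},
    {"student_id": "990002", "grade": "B"},
    {"student_id": "990001", "grade": "A-"},
]

# Drop the raw ID and keep only the pseudonym alongside the other fields.
anonymized = [
    {"pid": pseudonymize(r["student_id"], SECRET_KEY), "grade": r["grade"]}
    for r in rows
]

for row in anonymized:
    print(row["pid"][:12], row["grade"])
```

The same ID always maps to the same pseudonym, so records can still be linked for analysis without exposing the original ID. That also answers part of the access question: the person who holds the key is the person who can undo the protection.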

Scenario

A citizen of a country files taxes with their government. Paying taxes is rarely optional. The filing includes data on residence, wages, age, gender, taxes, and government ID numbers for this citizen and their family.

Discuss:

  • Provenance

  • Purpose

  • Protection

  • Preparation

  • Privacy

Government’s responsibility

Who should have access to this data?

How should it be secured?

What rights to privacy does this citizen have?

How long should the data be held?

Current Events

DOGE requested access to taxpayer returns to root out fraud. 1

DOGE wanted a person without the proper clearance (Gavin Kliger) to be able to access taxpayer returns.

Access will be given, if needed, in anonymized form. 2

Books

Credits