Data Ethics

Schwab


This is a squishy area.

“Use Common Sense” - The authors of our textbook.

The Five Ps of Ethical Data Handling.

The five Ps terminology comes from the Harvard Business Review.

Here’s the link.

Provenance

Where does the data originate?

Was it acquired legally?

GitHub

Microsoft (owner of GitHub) uses code in repos to train Copilot.

  • Is this ethical for public repositories?

  • What about private repositories?

  • Copilot costs $20 a month or is free.

Purpose

Would the source of the data agree with how it is being used?

What if the data is re-purposed?

  • If data is scraped from a site by a third party and made publicly available, is that ethical?

  • Example: scraped OkCupid profile data was released publicly, including usernames, age, gender, location, what kind of relationship (or sex) users were interested in, and personality traits.

  • “The data is already public.” -Emil Kirkegaard

Protection

How is the data being protected?

Who is responsible for destroying it?

College Students

Smith College

There is data that Smith collected from you.

The repos in this class are data that you are creating.

Preparation

How is the data cleaned?

Are data sets being combined in a way that preserves anonymity?

Is the accuracy of the data verified?
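
One rough way to check whether combining data sets still preserves anonymity is to count how many rows share the same combination of identifying columns. If any combination appears only once, that person can be singled out. The sketch below is a minimal illustration, not a full anonymity audit. The records, the column names (`zip`, `age`), and the helper `smallest_group_size` are all made up for this example.

```python
from collections import Counter


def smallest_group_size(rows, quasi_identifiers):
    """Return the size of the smallest group of rows that share the same
    values for the given columns. A result of 1 means at least one row
    is unique on those columns. Returns 0 for an empty data set."""
    counts = Counter(
        tuple(row[column] for column in quasi_identifiers) for row in rows
    )
    if not counts:
        return 0
    return min(counts.values())


# Hypothetical records, as they might look after joining two data sets.
combined = [
    {"zip": "01063", "age": 20},
    {"zip": "01063", "age": 20},
    {"zip": "01063", "age": 21},
    {"zip": "01060", "age": 20},
    {"zip": "01060", "age": 20},
]

# With only the zip code, every person hides in a group of at least 2.
k_zip = smallest_group_size(combined, ["zip"])

# Adding age makes one row unique, so that person is easier to re-identify.
k_zip_age = smallest_group_size(combined, ["zip", "age"])

print(k_zip, k_zip_age)
```

Each column added by a join can shrink the smallest group, which is why combining data sets deserves its own check.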

Lab 7ish

Privacy

Who will have access to data that can be used to ID a person?

How will individuals be anonymized?

Who has access to that anonymized data?
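
One common starting point for these questions is to replace direct identifiers with keyed hashes before sharing the data. The sketch below assumes hypothetical records with a `username` field and uses a made-up helper, `pseudonymize`. Strictly speaking this is pseudonymization, not full anonymization: anyone who holds the key can re-link the IDs to people, and the remaining columns may still identify someone.

```python
import hashlib
import hmac
import secrets


def pseudonymize(identifier: str, key: bytes) -> str:
    """Map an identifier to a stable pseudonym using HMAC-SHA256.
    The same identifier and key always give the same pseudonym."""
    digest = hmac.new(key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


# The secret key decides who can re-link pseudonyms to people.
# It should be stored separately from the shared data.
key = secrets.token_bytes(32)

# Hypothetical records containing a direct identifier.
records = [
    {"username": "student_a", "minutes_listened": 42},
    {"username": "student_b", "minutes_listened": 17},
    {"username": "student_a", "minutes_listened": 8},
]

# Drop the username and keep only the pseudonym plus the measurement.
shared = [
    {
        "user_id": pseudonymize(record["username"], key),
        "minutes_listened": record["minutes_listened"],
    }
    for record in records
]

for row in shared:
    print(row)
```

Rows from the same person still share a `user_id`, so analysis per person remains possible without exposing the username.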

Lab 3

Spotify

Algorithms Reflect the Bias of Their Creators

A piece of data itself has no positive or negative moral value, but the way we manipulate it does. It’s hard to imagine a more contentious project than programming ethics into our algorithms; to do otherwise, however, and allow algorithms to monitor themselves, is to invite the quicksand of moral equivalence.

Books

Credits