# Data Ethics

### Will Styler - CSS Bootcamp

---

### This is a discussion

- Please be kind and respectful
- We can have different views

---

## Some Principles of Data Ethics

---

### [According to Wikipedia...](https://en.wikipedia.org/wiki/Big_data_ethics)

- Ownership
- Transparency
- Consent
- Privacy
- Currency
- Openness

---

### [According to Harvard...](https://online.hbs.edu/blog/post/data-ethics)

- Ownership
- Transparency
- Privacy
- Intention
- Outcomes

---

### [According to Caltech...](https://pg-p.ctme.caltech.edu/blog/data-analytics/what-is-data-ethics-principles-examples-benefits-best-practices)

- Transparency
- Accountability
- Individual Agency
- Privacy
- Fairness and Impartiality
- Data Minimization
- Limited Storage

---

(... and so on)

---

### There are many interpretations of what 'data ethics' means

- So, rather than telling you, let's ask questions about data about people
- We're discussing ethics here, not laws and regulations
- We'll put aside risks that come directly from the act of collecting data (e.g. experiments)

---

### Why should we collect human data at all?

- What are good reasons to do so?
- What are some unethical reasons to collect data?
- Where are the edge cases?

---

### What kind of data is ethical to collect about people?

- What kinds of data are never OK to collect?
- What kinds of data require extra care to collect?

---

### Does 'identifiable' change things?

- Are there things you can collect without identification but not with?
- Who gets to identify data?

---

### Does your intent matter?

- Is there data that can be collected for good, but not evil?
- How do you manage intent if data changes hands?
- What about dual-use data?

---

### Who owns data about people?

- The collector? The people who are in it? Somebody else?
- Do people have a right to sell the data of others without their consent?
- How do we handle data previously collected without consent?
- Do communities have different rights than individuals?
---

### How is data about people stored and managed?

- What are the ethical aspects of data storage?
- Who gets to control when data about people gets deleted?
- What countries regulate international data?
- Who is responsible if the data is leaked or stolen?

---

### What data should be made open?

- What data should not be publicly available?
- Is public release of data a good idea?
- Is creating public data a better thing?

---

### How does informed consent work for giving data?

- What do people need to do to consent to data collection?
- Who can't consent to data collection?
- Is consent to service the same as consent to data?

---

### How can and should people be compensated for their data?

- Does it have to be money?
- Does providing a service count?
- Is there such a thing as too much compensation? Or things that can't be paid for?

---

### Who should know about data about people?

- Do people have to know that their data was collected?
- How are we obligated to tell them?
- In what circumstances should data collection be secret?

---

### When is imbalance or poor sampling an ethical problem?

- Not a problem in terms of bad stats, but in terms of ethics?