Ethics in AI: When Smart Machines Make Dumb Decisions
- Muxin Li
- Sep 17
- 9 min read
The Big Myth: AI Isn't Actually Neutral
AI doesn't remove bias - it automates the biases we already have. It's bias at scale!
There's a common misconception that AI systems are neutral or unbiased because a computer model is making decisions rather than a biased human. But AI systems can contain all sorts of biases - some inherited from their human creators, others from how the system is trained or evaluated.
The problem is that bias in AI systems is hard to detect. And while there are no explicit laws telling us how to avoid it, we can establish North Star goals for fair, accountable, and transparent AI, and define strategies to mitigate ethical risks and identify sources of bias.
Two Main Types of Harm: Allocative and Representational
Allocative Harm: AI Discrimination
One of the main categories of risk in biased AI systems is allocative harm. This is very close to discrimination in the traditional sense - opportunities or resources are withheld from certain people.
For instance, an automated résumé review system that ends up favoring male candidates for technical roles. This actually happened - Amazon scrapped its secret AI recruiting tool after it showed bias against women.
Another example is the Apple credit card. When Apple released its credit card, people including Apple co-founder Steve Wozniak charged that the algorithm was biased against women. Men and women with identical financial backgrounds - sometimes even married couples - received different credit limits, with men generally getting the higher ones.
Representational Harm: AI Stereotypes
The other main category of risk is representational harm, which is very similar to what we think of as stereotypes. For instance, an AI trained on images of people in different occupations might end up identifying every woman in a hospital setting as a nurse instead of a doctor.
This is where AI either propagates existing stereotypes that it learns from training data, or even creates new stereotypes based on patterns it finds.
The Three Goals of Ethical AI: Fair, Accountable, Transparent
Fairness: Harder Than It Sounds
Fairness is difficult enough for people to agree on among themselves, so it's no surprise that it's hard to measure fairness in AI when there is no single, universally agreed-upon definition of it.
There are two primary definitions of fairness that relate to AI systems: individual fairness and group fairness.
Individual Fairness: Think of individual fairness as assessing people on their individual characteristics. If two people have very similar backgrounds and skills, a fair outcome treats them equally as candidates for a job or a school application.
Group Fairness: The idea is that different groups should experience similar error rates - the AI system should perform roughly the same way for different groups of people. In reality, though, facial recognition systems have been shown to work better for white men than for Black women.
That's not too surprising: the training data is likely skewed heavily toward white men, in part because the engineering teams at the companies developing facial recognition software are themselves disproportionately white and male.
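To make group fairness concrete, here's a minimal sketch in plain Python - the labels, predictions, and group sizes are made up for illustration - that compares error rates across two demographic groups. If the rates diverge sharply, the system fails the group fairness test described above.

```python
from collections import defaultdict

def error_rate_by_group(y_true, y_pred, groups):
    """Compute the misclassification rate separately for each demographic group."""
    errors, counts = defaultdict(int), defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        counts[group] += 1
        errors[group] += int(truth != pred)
    return {g: errors[g] / counts[g] for g in counts}

# Hypothetical results from a face-matching model (1 = correct identity match).
y_true = [1, 1, 1, 1, 1, 1, 1, 1]
y_pred = [1, 1, 1, 1, 1, 0, 0, 1]
groups = ["white_male"] * 4 + ["black_female"] * 4

print(error_rate_by_group(y_true, y_pred, groups))
# {'white_male': 0.0, 'black_female': 0.5} -- a large group-fairness gap
```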
Accountability: Who's Responsible When AI Screws Up?
AI is not a person and yet we trust it to make decisions - so who holds the AI accountable? Or who should be responsible for the AI system's performance? If something goes wrong, then what can users do?
There's a famous self-driving car thought experiment: who is responsible if the car's AI decides it's better to run into one pedestrian than to cause an accident that could lead to many more fatalities?
Key questions for accountability include: Who is responsible for the performance of the system? On what set of values and laws is the system based? What recourse do users have if they believe the system isn't behaving properly?
Transparency: Explaining the Black Box
Transparency is a topic we already covered quite a bit in the last module on privacy: the need to explain to users what data we're collecting from them and for what purposes.
Four Popular Methods of Providing Transparency:
Using Simpler Models: A linear regression is a lot easier to explain than a neural network.
Feature Importance: Explaining what features were important in the AI model's decision-making.
Simplified Approximations: If we do use complicated models like neural networks, we can offer simplified approximations - simpler ways of explaining what the AI is doing without going into all the details and the math. For instance, you've probably noticed that chatbots like Perplexity and ChatGPT share explanations of how they arrive at a conclusion. Given a task, the chatbot breaks down the steps it takes to analyze and synthesize a response, so you can see its logical thought process and how it reaches its conclusion.
This is far more abstract and simplified than what is really running in the background. Users don't need to understand transformer architecture, or how the AI crunches the numbers to handle long-range context, to feel confident in how it's working through a problem.
Counterfactual Explanations: The last popular method is counterfactual explanations - telling the user what change in the input would have changed the outcome. For instance, when assigning credit limits to applicants for a new credit card, the AI might explain that "if your credit score was 10 points higher, we would have increased your limit by $2000."
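To make the idea concrete, here's a minimal sketch of a counterfactual explanation for a credit-limit decision. The threshold, limit bump, and wording are invented for illustration - no real lender's logic is implied.

```python
def counterfactual_explanation(credit_score, score_threshold=700, limit_bump=2000):
    """Return a decision plus a counterfactual explanation for a credit-limit request.

    The threshold and bump amount are made up purely for illustration.
    """
    if credit_score >= score_threshold:
        return f"Approved for the higher limit (score {credit_score} >= {score_threshold})."
    gap = score_threshold - credit_score
    return (f"Not approved for the higher limit. "
            f"If your credit score was {gap} points higher, "
            f"we would have increased your limit by ${limit_bump}.")

print(counterfactual_explanation(690))
# "Not approved ... If your credit score was 10 points higher, we would have
#  increased your limit by $2000."
```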
Where Bias Comes From: The AI Development Pipeline
Bias can enter AI systems at multiple stages of development. Let me walk through the main sources:
Data Collection: The Foundation Problems
Historical Bias: The data we're collecting reflects existing biases in the world around us. For example, if we're using large-scale text data to create word embeddings, certain occupational words like "nurse" or "engineer" end up being more strongly associated with women or men respectively, because that's what existed in the historical data.
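One way to see historical bias for yourself is to measure how close occupation words sit to gendered words in an embedding space. The sketch below uses tiny made-up vectors just to show the mechanics - with real pretrained embeddings you would look the vectors up instead of hard-coding them.

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 3-d embeddings invented for illustration; real embeddings have hundreds of dimensions.
emb = {
    "she":      np.array([0.9, 0.1, 0.0]),
    "he":       np.array([0.1, 0.9, 0.0]),
    "nurse":    np.array([0.8, 0.2, 0.3]),
    "engineer": np.array([0.2, 0.8, 0.3]),
}

for occupation in ("nurse", "engineer"):
    gap = cosine(emb[occupation], emb["she"]) - cosine(emb[occupation], emb["he"])
    print(f"{occupation}: similarity(she) - similarity(he) = {gap:+.2f}")
# A positive gap means the occupation word sits closer to "she" than to "he".
```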
Representation Bias: Our training data set isn't representative of the entire target population we're trying to model. This can happen because certain groups are naturally underrepresented, or because of our sampling method.
A great example is the city of Boston's pothole app. The city released a smartphone app so citizens could flag potholes for repair. When they analyzed the data, they found way more potholes reported in areas with younger, more affluent citizens. It wasn't that these areas had more potholes - it was because younger people with money were more likely to have smartphones and actually use the app.
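A quick way to check for this kind of representation bias is to compare who shows up in your collected data against a known baseline for the population you actually care about. Here's a minimal sketch with made-up numbers inspired by the pothole example:

```python
# Share of pothole reports by age bracket in the collected data (hypothetical numbers)
reports = {"18-34": 0.55, "35-54": 0.30, "55+": 0.15}
# Share of the city's actual population in each bracket (also hypothetical)
population = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}

for bracket in population:
    ratio = reports[bracket] / population[bracket]
    flag = "over-represented" if ratio > 1.2 else "under-represented" if ratio < 0.8 else "ok"
    print(f"{bracket}: reports/population = {ratio:.2f} ({flag})")
# The 18-34 bracket shows up at ~1.8x its population share -- a sign the
# sampling method (smartphone app) is skewing the data.
```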
Feature and Label Definition: Measurement Problems
Measurement Bias: The features or labels we choose are poor representations of what we're actually trying to measure. For example, using GPA to represent student learning, when GPA doesn't always fully represent student success. Or building an anomaly detection model for manufacturing, but finding that each site has different definitions of what counts as an "anomaly."
Training and Evaluation: Learning Problems
Learning Bias: The choices we make in building a model amplify performance disparities across groups. For example, we might optimize for overall aggregate performance, but this comes at the expense of consistency across groups - meaning wildly different error rates for different groups.
One example is using demographic data in a model designed to predict the likelihood that someone convicted of a crime will re-offend. Using race or age might improve overall performance, but it could produce hugely different error rates when we look at performance group by group.
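The recidivism debate is usually framed in terms of false positive rates - people incorrectly flagged as high risk. Here's a minimal sketch, with hypothetical labels and predictions, of how you might compare false positive rates across groups:

```python
def false_positive_rate(y_true, y_pred):
    """Fraction of actual negatives (did not re-offend) that were flagged as high risk."""
    negatives = [(t, p) for t, p in zip(y_true, y_pred) if t == 0]
    if not negatives:
        return 0.0
    return sum(p for _, p in negatives) / len(negatives)

# Hypothetical outcomes (1 = re-offended / flagged high risk), split by group.
group_a = {"y_true": [0, 0, 0, 0, 1, 1], "y_pred": [1, 1, 0, 0, 1, 1]}
group_b = {"y_true": [0, 0, 0, 0, 1, 1], "y_pred": [0, 0, 0, 0, 1, 1]}

for name, data in {"group_a": group_a, "group_b": group_b}.items():
    print(name, false_positive_rate(data["y_true"], data["y_pred"]))
# group_a 0.5 vs group_b 0.0 -- one group bears all of the false alarms.
```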
Deployment: When Models Meet Reality
Deployment Bias: A mismatch between how a tool was intended to be used and how it actually gets used.
Example: Houston Independent School District contracted with a tech company to build an automated teacher evaluation tool. The original intent was to complement other evaluation methods. But the tool ended up being used as the primary basis for firing teachers, resulting in large-scale terminations.
Feedback Loops: The Self-Perpetuating Problem
Feedback Loop Bias: The design of a system with a feedback loop influences the training data and therefore the model outputs, creating a self-perpetuating cycle.
Example: A product recommendation engine that orders items by number of positive reviews. Items that show up higher get purchased more and receive more reviews, so they continue to stay at the top. The popular stuff stays popular not because it's actually better, but because of where it appears in the ranking.
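A tiny simulation makes the feedback loop easy to see: if rank drives purchases and purchases drive reviews, an early lead compounds regardless of quality. The probabilities and starting counts below are invented for illustration.

```python
import random

random.seed(0)

# Two equally good products; product A happens to start with a few more reviews.
reviews = {"A": 5, "B": 1}

for step in range(1000):
    # Rank by review count; the top-ranked item is far more likely to be clicked and bought.
    ranked = sorted(reviews, key=reviews.get, reverse=True)
    click_prob = {ranked[0]: 0.7, ranked[1]: 0.3}
    for item, p in click_prob.items():
        if random.random() < p:
            reviews[item] += 1  # each purchase tends to produce another review

print(reviews)
# Product A ends up with far more reviews than B, even though both are equally good --
# the head start compounds because rank drives purchases and purchases drive rank.
```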
Fighting Back: Three Key Strategies
Since there are no explicit laws about how to build ethical AI systems, strategies and approaches have developed over time. Three key ones include data sheets for datasets, checklists, and ethical pre-mortems.
Data Sheets for Datasets: Documentation That Actually Matters
Data powers AI - yet there is no standardized way to document how and why a dataset was created, what information it contains, what tasks it should and shouldn't be used for, and whether it might raise any ethical or legal concerns.
This approach is inspired by the electronics industry, where it's standard practice to accompany every component with a datasheet that provides operating characteristics, test results, recommended usage, and other information.
What Data Sheets Should Include:
- How the data was collected, and any identified sampling issues that might cause bias
- Who collected the dataset
- How the data might be used - and how it should NOT be used - in building models
- The process for maintaining and updating the dataset over time
- Who to contact if there are any issues with the data
Benefits of Data Sheets:
- Encourage best practices in collecting data by requiring teams to document the process
- Help teams think through potential issues that can come from the way data was collected
- Give data creators a chance to reflect on potential risks of using the dataset
- Provide transparency to support decisions about whether to use a particular dataset for a certain model
- Give consumers and users of models access to information that explains how the model works
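There's no single required format, but a datasheet can be as simple as a structured record that lives in the repo next to the data. Here's a minimal sketch - the fields loosely follow the items above, and the dataset name and values are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Datasheet:
    name: str
    collected_by: str
    collection_method: str
    known_sampling_issues: list = field(default_factory=list)
    intended_uses: list = field(default_factory=list)
    out_of_scope_uses: list = field(default_factory=list)
    maintenance_plan: str = ""
    contact: str = ""

pothole_reports = Datasheet(
    name="citizen_pothole_reports_2023",          # hypothetical dataset name
    collected_by="City public works app team",
    collection_method="Self-reported via smartphone app",
    known_sampling_issues=["Skews toward younger, smartphone-owning residents"],
    intended_uses=["Prioritizing road repair crews"],
    out_of_scope_uses=["Estimating true pothole density per neighborhood"],
    maintenance_plan="Refreshed monthly; resolved reports archived after repair",
    contact="data-team@example.gov",
)
print(pothole_reports.known_sampling_issues)
```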
Ethical Checklists: Questions That Force Reflection
An ethical checklist is a series of questions that requires AI creators to reflect on potential ethical issues that might arise from their work.
There are lots of types of ethical checklists. One version is based on Alistair Fritz's work for data science and includes questions about:
Project Selection and Scoping: Is the problem we're solving a root cause or just a symptom? Is AI the right tool for this problem?
Team Composition: Does the team include users as stakeholders? Does the team reflect diversity of opinions and backgrounds?
Data Collection: Are we protecting user privacy? Do we have user consent? Have we introduced sampling bias? Can we identify bias in our data collection?
Modeling Work: Have we introduced bias in variable selection? Are we including discriminatory features? Can we provide sufficient transparency and explainability?
Model Evaluation: Have we tested for disparate error rates among different user groups?
Implementation: Have we provided transparency into what the model should and shouldn't be used for? Do we have accountability and redress mechanisms?
Ethical Pre-mortems: Anticipating Problems Before They Happen
Many people are familiar with post-mortems - reviewing a problem that happened to understand how it was handled and what to do differently. The idea behind a pre-mortem is doing this BEFORE problems arise.
You gather a diverse group of stakeholders to anticipate potential ethical issues so you can develop mitigation strategies before problems occur. It's critical to involve not just technical developers, but others from the organization like product management and customer service. You should also involve representatives from your user base or people who are impacted by your system.
Detecting and Fixing Fairness Issues
Fairness-auditing tools exist to simplify the process of evaluating fairness, though there is still no universally accepted definition of it. Even if you use tools to help automate fairness audits on your data, you'll most likely need access to demographic attributes - and you'll need to decide which ones matter for your application.
For instance, if you were a car insurance provider, would you care about gender or race or age?
Even if you end up not using demographic attributes as features in your model, it's still important to collect them so that you can segment your users into demographic groups and monitor the model for consistent performance across those groups.
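As a sketch of what that monitoring can look like, here's an example that assumes the open-source fairlearn library's MetricFrame API (one of the fairness tools mentioned above); the predictions and demographic labels are made up.

```python
# pip install fairlearn scikit-learn
from fairlearn.metrics import MetricFrame
from sklearn.metrics import accuracy_score

# Hypothetical predictions logged from a deployed model, with the demographic
# attribute collected separately for monitoring (not used as a model feature).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
gender = ["f", "f", "m", "m", "f", "f", "m", "m"]

frame = MetricFrame(
    metrics={"accuracy": accuracy_score},
    y_true=y_true,
    y_pred=y_pred,
    sensitive_features=gender,
)
print(frame.by_group)       # accuracy broken out per group
print(frame.difference())   # largest gap between groups -- a simple fairness alarm
```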
Beyond Tools: Human Feedback Loops
Don't just rely on automated systems to do the work - also employ your users and implement feedback loops to identify emerging ethical issues over time. Many risks take time to materialize, and the environment around AI systems changes constantly.
You can implement feedback mechanisms where you invite user feedback, triage this feedback to separate individual instances from broader systemic issues, then regularly review feedback to identify ethical issues over time.
Three Ways to Fix Fairness Issues
When you detect fairness issues, there are three primary approaches:
Change the Data: A lot of ethical issues arise from how data is collected. You might collect additional data on under-represented groups, or re-collect data differently to mitigate historical or sampling bias (see the reweighting sketch after this list for one lightweight version).
Change the Model: Use a different type of model (like a more explainable one), or make different optimization decisions - for example, treating fairness as an optimization criterion.
Change the System: If users are using your system differently than intended, you might change the entire system to account for the various ways users actually use it.
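As one lightweight, data-side fix, you can reweight training examples so under-represented groups count proportionally more - a rough stand-in for collecting more of their data. A minimal sketch, with an invented training set:

```python
from collections import Counter

def balancing_weights(groups):
    """Weight each example inversely to its group's frequency, so every
    group contributes roughly equally to the training loss."""
    counts = Counter(groups)
    n, k = len(groups), len(counts)
    return [n / (k * counts[g]) for g in groups]

# Hypothetical training set where one group is badly under-represented.
groups = ["majority"] * 8 + ["minority"] * 2
weights = balancing_weights(groups)
print(weights[:1], weights[-1:])  # majority examples ~0.62, minority examples 2.5
# Many training APIs accept these directly, e.g. model.fit(X, y, sample_weight=weights)
# in scikit-learn-style estimators.
```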
Key Takeaways: Be Proactive, Not Reactive
The key to designing ethical AI systems is anticipation of potential fairness issues. Be proactive rather than reactive - don't wait for issues to occur and then try to correct them.
Joy Buolamwini's work on "Fighting the Coded Gaze" highlights three critical points: Who codes matters (diverse teams help check blind spots). How we code matters (ensure we're factoring in fairness). Why we code matters (use technology to unlock equality, making social change a requirement, not an afterthought).
There are many sources of bias that can arise when building AI systems - from data collection to model training to actual usage. Our objective is designing AI that is fair, accountable, and transparent. The most important thing is involving diverse groups of users and stakeholders to help anticipate issues before they become problems.


