Data ethics is your job
All too often in data science projects, data ethics is treated as someone else’s responsibility. It sits with legal, governance, product, or a committee that meets occasionally and far away from the day-to-day work. If it comes up at all, it’s often well after the key decisions have been made about what gets built.
This post argues that data ethics is not an optional extra. Data scientists are often uniquely positioned to identify ethical risks and potential harms, and that position comes with an obligation to take them seriously.
Most data science work affects people
Depending on the sector you work in, it might not seem like your work has an ethical dimension at all. You may not be working in an obviously sensitive domain, or on anything that looks high stakes on the surface. But decisions about how people are treated are inherently ethical in nature, and most data science work at least aims to inform or automate those kinds of decisions.
If your work affects who is prioritised, delayed, scrutinised, supported, excluded, or ignored, it already has ethical implications. Even when models operate at an aggregate level, their outputs usually feed into processes that affect real people.
A useful default assumption is that any process or decision that can affect people has ethical considerations until proven otherwise, not the other way around. Once you accept that people are affected, it becomes much harder to pretend that the technical choices we make are ethically neutral.
Technical decisions embed values
It’s easy to lose sight of this when you’re deep in the technical work. Data science offers plenty of interesting decisions to make: how to clean the data, who to exclude, which variables to include, what target to optimise, which metric to prioritise. These decisions are usually framed as technical, but they often embed value judgements.
When we choose a response variable, we’re deciding what outcome matters. When we set thresholds, we’re deciding which kinds of errors are more acceptable. When we drop records because they’re messy or incomplete, we’re deciding whose experiences count.
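Take the last of those as an example. The sketch below is a minimal, hypothetical illustration (the group labels, incomes, and missing values are all invented) of how a routine "drop incomplete records" step can quietly remove far more of one group than another:

```python
import pandas as pd

# Hypothetical records; group labels and missing values are invented
# purely to illustrate the point.
df = pd.DataFrame({
    "group": ["a", "a", "a", "a", "b", "b", "b", "b"],
    "income": [32_000, 41_000, 55_000, 38_000, None, 27_000, None, 30_000],
})

# A routine cleaning step: keep only complete records.
complete = df.dropna()

# Share of each group that survives the cleaning step.
retained = complete["group"].value_counts() / df["group"].value_counts()
print(retained)
# Group "a" keeps all of its records; group "b" keeps only half.
# A seemingly technical choice has decided whose experiences count.
```

None of this means you should never drop records, only that the choice deserves the same scrutiny as any other decision about who is represented.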
One way to make the error trade-offs concrete is to talk through what a confusion matrix actually represents. An evaluation metric can look impressive until you translate it back into real-world terms.
Imagine a model that flags people as “high risk” so they can be prioritised for follow-up or additional support. A false negative is someone who needs help but isn’t flagged, and therefore doesn’t receive the help they need. A false positive is someone who is flagged despite not needing support, consuming limited resources and potentially crowding out others. Improving overall accuracy doesn’t remove this trade-off; it just changes who misses out.
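As a minimal sketch of that translation, with entirely made-up counts for a hypothetical “high risk” flagging model:

```python
# Hypothetical confusion-matrix counts for a model that flags people as
# "high risk" so they can be prioritised for follow-up support.
# All numbers are invented for illustration.
true_positives = 180   # flagged, and did need support
false_positives = 70   # flagged, but did not need support
false_negatives = 45   # needed support, but never flagged
true_negatives = 705   # not flagged, and did not need support

total = true_positives + false_positives + false_negatives + true_negatives
accuracy = (true_positives + true_negatives) / total

print(f"Accuracy: {accuracy:.1%}")  # 88.5% -- sounds impressive on its own
print(f"{false_negatives} people needed help and were never flagged.")
print(f"{false_positives} people were followed up unnecessarily, using")
print("limited resources that could have gone to someone else.")
```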
This kind of translation is useful for stakeholders, but it’s also important for us. If we don’t explicitly connect technical performance to human consequences, we can end up making decisions that cause harm without even noticing.
Data scientists have unique visibility
In practice, technical, ethical, and organisational issues are tightly entangled. Data scientists often sit at the intersection of these layers in a way few others do.
We understand the data: who is represented, who isn’t, where the gaps are, and what biases are likely to be present. We understand the modelling choices and their limitations. We also usually understand the context in which model outputs will be used and what decisions they will inform, as well as the potential for them to be used in ways the model wasn’t designed for.
Decision makers may see a score or a recommendation without understanding how fragile it is. Legal teams may focus on compliance rather than system behaviour. Product teams may care about outcomes without visibility into how those outcomes are produced. The modelling decisions themselves are often invisible outside the data science team, and even when they are visible, their implications usually aren’t.
This means data scientists are often the first, and sometimes the only, people who can see when a system might work exactly as designed and still cause harm.
Asking questions is part of the job
Because of this visibility, part of the job is learning how to interrogate our own work even when a project appears to be progressing smoothly.
Some questions I find useful include:
Harm and impact
- Could this system work exactly as intended and still cause harm?
- What kinds of errors does this system make, and who bears the cost of those errors?
- Who is most exposed if this goes wrong, and how serious would the consequences be?
- What would harm look like in practice, and do we have any way of noticing it?
Data and assumptions
- Who is represented in the data, and who is missing or under-represented?
- Are there people affected by this system who never appear in the data at all?
- What assumptions are embedded in the choice of features, labels, and proxies?
- Are we treating something as “ground truth” that is actually a judgement or convenience?
- If the data reflects past decisions, are we at risk of reproducing them uncritically?
Evaluation and trade-offs
- What are we optimising for, and what are we implicitly deprioritising?
- Which kinds of mistakes are we more willing to tolerate, and why?
- Do our evaluation metrics reflect the real-world consequences of errors?
- Who decided that these trade-offs were acceptable?
Use, misuse, and context
- How is this model intended to be used, and who actually controls its use?
- Could it reasonably be repurposed or combined with other systems in harmful ways?
- What decisions will people defer to the model, even if they’re told not to?
- What behaviours does this model reward or punish over time?
Accountability and transparency
- If someone is harmed, would it be possible to explain how this system contributed?
- Who would be expected to justify the model’s behaviour, and with what information?
- Would I be comfortable explaining these design choices to someone affected by them?
- Would the people whose data is being used expect this use, or find it surprising?
These are design and deployment questions, not abstract philosophical questions. Simply looking at your work through this lens will often surface issues that would otherwise remain hidden.
Ethical concerns need translation
In many organisations, speaking up about ethical concerns carries risk. Using explicitly moral language can backfire by making people feel accused or defensive rather than reflective. That doesn’t mean you’re wrong, but it does affect whether your concerns are heard.
In practice, it’s often more effective to frame ethical risks in terms organisations already understand: reputational damage, loss of trust, regulatory scrutiny, or strategic risk. Saying “I’m worried how this would look if it became public” often lands better than “this is unethical,” even when you believe both are true.
This can feel like a cop-out, but the alternative is not being heard. Organisations are often more afraid of being seen to do something unethical than they are of doing it. That isn’t because people don’t care, but because organisational decision making tends to prioritise visible and reputational risks. Translating potential harms into the language of organisational risk can be enough to change the trajectory of a project.
Not all work is ethically salvageable
A willingness to ask questions and raise concerns is necessary, but not always sufficient. There are situations where the risks and potential harms are so great that no amount of careful implementation, mitigation, or reframing makes a project acceptable.
Everyone’s personal boundaries will be different, shaped by their values, their sector, and the power they have to refuse work. One of mine is facial recognition. Even when a specific use case is presented as benign or beneficial, I believe the technology itself is too prone to misuse and function creep for me to be comfortable working on it.
If you don’t know where your lines in the sand are, that should give you pause. You may draw them in a different place to me, but knowing where they are matters. Otherwise, you’re likely to discover them only when you’ve already crossed them.
Tools and metrics are no substitute for judgement
When data scientists engage with ethics at all, it’s often reduced to a fairness metric on a dashboard. Metrics can be useful, but they don’t answer the hardest questions: whether a system should exist, how it might be repurposed, or what happens to responsibility once decisions are delegated to a model.
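To illustrate how limited that dashboard view is, here is a hedged sketch of a typical group-wise error rate check, with invented group names and counts. It can surface a disparity, but it cannot tell you whether the system should exist or how it might be repurposed:

```python
# Hypothetical per-group counts; group names and numbers are invented.
groups = {
    # group: (false_negatives, people_who_actually_needed_support)
    "group_a": (12, 120),
    "group_b": (30, 115),
}

for name, (false_negatives, needed_support) in groups.items():
    false_negative_rate = false_negatives / needed_support
    print(f"{name}: false negative rate = {false_negative_rate:.1%}")

# A gap like this can prompt questions, but it cannot answer them: it says
# nothing about whether the system should be deployed at all, how its outputs
# might be repurposed, or who is accountable when things go wrong.
```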
Responsibility may become diffuse when decisions are delegated to models, but it doesn’t disappear. If you’re building or deploying the system, you are shaping how people are treated. Data ethics may not feel like your job, but it is rarely anyone else’s.