Nobody’s responsible: inside the data broker supply chain
Data brokers hold detailed information about most Australians without their knowledge. When I spent several months making access requests to trace my own data through the supply chain, I found something more interesting than the data itself: nobody was responsible for it.
Here is some of the information a company I had never directly interacted with held about me: I’m male. I’m a new parent. I’m in-market for personal loans, wedding planning, insurance, and a new home. I’m an affluent consumer and a frequent luxury shopper. I visit Chemist Warehouse, Chin Chin, Vue de Monde, Dior, Medicare, the TAB, and dozens of other named businesses. I’m interested in vegan diets, law, healthcare, running, and video gaming.
They also had 275 precise GPS coordinates with timestamps and IP addresses, almost all of them identifying my home address.
Nearly all of this information is wrong. But the inaccuracy is almost beside the point. The more important issue is that this data exists at all: it was collected without my knowledge, traded through a supply chain I didn’t know about, and held by companies I’d never heard of until I spent months wheedling the information out of them. It took a series of formal data access requests to even begin to piece this together, and even now I still don’t know where the data ultimately came from or who else it’s been provided to.
This post is about what I found when I tried to trace that data supply chain, how accountability disappears when every party can point to someone else, and how tenuous the legal positions data brokers rely on actually are.
Tracing the data supply chain
I pieced this together by making data access requests under Australian Privacy Principle 12 (APP 12). I recently wrote about the experience of making 60 such requests to a variety of organisations, but by far the most difficult interactions were with data brokers. In this post I’ll use a case study of a downstream licensee, an intermediary, and an overseas supplier to show what those requests revealed, what remains unknown, and what the experience exposes about how this industry operates in practice.
I initially made a request to an Australian data broker whose website claimed to hold data on the majority of Australians. They were a downstream licensee of my data. Because data brokers often store information linked to device identifiers rather than conventional identifiers like name or address, I specifically asked them to search for my mobile advertising ID (MAID), a unique identifier assigned to my phone. The company acknowledged holding a record containing my MAID but refused to provide it, claiming the associated data was not my personal information. They did, however, reveal that they had obtained the data from another company, and suggested I contact them instead.
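For readers who haven’t encountered this kind of data, here is a toy sketch of what a MAID-keyed broker record might look like. Every field name and value is invented for illustration (the identifier is just a UUID-format string, and the IP address sits in a reserved documentation range); real schemas vary by vendor. The structural point, though, is the real one: the record is organised around a device identifier, not a name.

```python
# A toy, purely illustrative broker-side record keyed on a mobile
# advertising ID (MAID). Every field name and value here is invented.
record = {
    # MAIDs are UUID-format strings assigned per device (Android's AAID,
    # Apple's IDFA). This value is a made-up example.
    "maid": "38400000-8cf0-11bd-b23e-10b96e40000d",
    # Inferred demographic and "in-market" segments, sold as audience data.
    "segments": ["male", "new_parent", "affluent_consumer",
                 "in_market_personal_loans"],
    # Observed events. Note what is absent: no name, no email, no street
    # address. That absence is the basis for claiming the record is not
    # "personal information".
    "location_events": [
        {"lat": -37.81361, "lon": 144.96331,
         "ts": "2024-03-02T07:41:09Z", "ip": "203.0.113.42"},
    ],
}
```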
Things got off to a bad start with the intermediary company when an email to the privacy contact address listed in their privacy policy immediately bounced. When I eventually got through, they initially provided only a limited extract of data linked to my MAID. After I lodged a formal complaint, they provided the much more detailed information I described at the top of this post. They also told me that they had purchased this data from a third company based outside of Australia.
Unlike the first two companies, the overseas supplier has no apparent Australian presence. Emails to their privacy contact address received no response, and I was later told they had been caught by a spam filter. After many follow-ups, a representative eventually acknowledged that the company held my MAID but provided no information about what data was linked to it. The company has since stopped responding to my emails. This leaves the entity that apparently collected my data permanently out of reach.
We don’t have a working email address
Two of the three companies described here didn’t even clear the very first hurdle: maintaining a functioning, monitored privacy contact address to which individuals can direct access requests. In one case the published address bounced immediately. In the other, emails sat unread in a spam folder for over a month.
If your business operates in what is at best a grey area of the Privacy Act, you might at least try to get the easy parts right. Failing at something that is both so simple and so visible tells you something about how seriously a company takes its privacy obligations. I should not have to guess an organisation’s email address to exercise a statutory right.
We don’t know who you are
Generally speaking, data brokers take a somewhat idiosyncratic position on personal information, and therefore on their privacy obligations: the data they trade, they insist, is not about identifiable individuals. But if that were really true, it’s hard to see how the data could have any value. And if it is personal information, that creates inconvenient restrictions on how it can be collected, used, and disclosed. It’s Schrödinger’s data: personal enough to sell, but not personal enough to give individuals any rights over it. In practice, data brokers want the benefits of personal data without the obligations that attach to it.
This dissonance came through very clearly in my interactions with data brokers. The downstream licensee simultaneously advanced three positions:
- Device identifiers can be personal information;
- The information linked to my device identifier is not personal information, so they are not required to give me access to it under APP 12; and
- They are prohibited from giving me access to the data linked to my device identifier under APPs 6 and 11.
The Australian Privacy Principles, famously, apply only to personal information: APPs 6 and 11 cannot prohibit the disclosure of data that, on the company’s own account, isn’t personal information at all. The second and third positions cannot both be true.
They stated that the data held was in a “pseudonymised and unassociated state,” apparently a technical condition that resolves logical contradictions as well as privacy obligations (perhaps using similar magic to data clean rooms).
The intermediary similarly described my MAID as “a unique, anonymous identifier.” A unique identifier is, by definition, not anonymous. They also rather theatrically announced that they had suppressed and removed my data after I contacted them, on the basis that once I identified myself as the person behind the identifier the data became personal information. Their position, in other words, is that the data is not personal information right up until someone exercises their rights, and at that point it must be immediately destroyed. (Somewhat undercutting their claim, they didn’t actually delete the data.)
Identity graph and identity resolution products exist specifically to link device IDs, cookies, hashed emails, and other pseudonymous identifiers back to known individuals. The companies involved could readily access such products, which means re-identification isn’t just theoretically possible: it’s a commercial service. The OAIC has stated that where there is uncertainty, entities should err on the side of treating data as personal information. These companies chose to err in the other direction.
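To make concrete how thin the “anonymous” label is, here is a minimal sketch of an identity-resolution join, with entirely invented data. The two dictionaries stand in for what are, commercially, enormous linked tables; the point is that once a linking table exists, re-identifying a “pseudonymous” record is a single lookup.

```python
# Minimal sketch of identity resolution, using entirely invented data.
# Commercial identity graphs perform this join at population scale.

# What a broker holds: audience segments keyed on a MAID, no name attached.
device_data = {
    "38400000-8cf0-11bd-b23e-10b96e40000d": ["new_parent", "in_market_loans"],
}

# What an identity-resolution product supplies: device identifiers linked
# to stable identifiers (names, hashed emails, postal addresses) gathered
# from logins, loyalty programs, purchases, and similar sources.
identity_graph = {
    "38400000-8cf0-11bd-b23e-10b96e40000d": {
        "name": "J. Citizen",
        "hashed_email": "5d41402abc4b2a76b9719d911017c592",
    },
}

# Re-identifying the "anonymous" record is a dictionary lookup.
for maid, segments in device_data.items():
    person = identity_graph.get(maid)
    if person:
        print(f"{person['name']} -> {segments}")
```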
We didn’t collect it
I asked both the downstream licensee and the intermediary how they came to acquire my data and the legal basis for holding it, given I had never interacted with either of them. Neither could answer questions about consent. Both directed me to the company they had purchased the data from, implying that ensuring a lawful basis for the data they held was someone else’s problem. In both cases, that company had a non-functional or unmonitored privacy contact email. The overseas supplier at the top of the chain eventually acknowledged holding my MAID but claimed to hold no other personal information. They have not responded to questions about how the data was collected, from which apps, or on what basis.
Under the Privacy Act, collection must be by lawful and fair means, and individuals must be notified at or before the time of collection. None of that seems to have happened here. But there is no practical way to enforce these obligations when the collecting entity is offshore and unresponsive, and every downstream entity maintains that collection is not their concern.
It’s not our responsibility
The pattern of upstream deflection extended well beyond questions about consent. The information held about me contained multiple errors, including about things as basic as my gender. When I raised concerns about the accuracy of the data, the companies again simply pointed me to whoever had supplied it to them. The inaccuracies show how little responsibility anyone in the chain feels to know what they are holding, or to stand behind it.
More troublingly, the accounts each company gave of what they held were impossible to reconcile with each other. The overseas supplier claimed to hold only my MAID, with no associated data. It is hard to see why anyone would hold only an identifier with nothing attached to it, and harder still to square this with the hundreds of data points the intermediary said they had obtained from them. The downstream licensee told me they held only “a rough geolocation with a wide radius” covering “hundreds of different households.” The actual location data, when I eventually obtained it from the intermediary, included hundreds of high-precision GPS coordinates clustered on my home address.
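For a sense of scale: one degree of latitude is about 111 km, so the fourth decimal place of a coordinate resolves to roughly 11 metres and the fifth to about a metre. With hundreds of timestamped fixes, inferring where a device spends its nights takes a few lines. The sketch below uses invented coordinates; the clustering approach is the standard one.

```python
from collections import Counter

# Minimal sketch of home-location inference from GPS fixes.
# All coordinates are invented.
points = [
    (-37.81361, 144.96331),  # repeated fixes at one address...
    (-37.81359, 144.96329),
    (-37.81362, 144.96333),
    (-37.81491, 144.97010),  # ...and an occasional fix elsewhere
]

# Rounding to four decimal places (an ~11 m cell) groups nearby fixes;
# the most frequent cell is usually where the device sleeps.
cells = Counter((round(lat, 4), round(lon, 4)) for lat, lon in points)
home, count = cells.most_common(1)[0]
print(f"likely home cell: {home} ({count} of {len(points)} fixes)")
```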
It is simply impossible for me to know what happened to my data at each step in this supply chain: what was collected, what was derived, what was passed on, and what was lost or transformed along the way. And that, I think, is exactly the intention.
You can’t tell anyone
The intermediary data broker eventually disclosed the name of the overseas supplier they had purchased my data from, having initially refused on the grounds that it was “commercial-in-confidence.” This is not a valid exemption under APP 12.3, and in any case I question whether information that arguably should be in a privacy policy can simultaneously be too commercially sensitive to share with individuals.
Their response was also marked “Without Prejudice” throughout. This is a legal convention normally reserved for settlement negotiations, signalling that the contents cannot be used against the sender in court. It has no obvious application to a response to a statutory data access request, but it does tell you something about how the company viewed the interaction. They also warned that if I publicly disclosed any of the information provided and it resulted in reputational harm, the company “reserves its rights to seek legal advice or pursue appropriate action.”
To be clear, this was not confidential commercial information obtained through a leak or a breach. It was information disclosed to me in response to a statutory access request about my own data. Threatening legal action against someone for exercising their statutory rights is an unorthodox way to demonstrate good privacy practice. If transparency is reputationally dangerous for you, there’s perhaps a problem with your business model. There’s also something ironic about a company that trades in people’s information without their knowledge threatening legal action against someone for sharing information about them.
This is not a grey area
Reading all this, you might conclude that the Privacy Act simply isn’t equipped to deal with data brokers. There are real structural challenges: responsibility is diffuse, the personal information boundary is contested, and notice and consent mean little when individuals have no relationship with the entity collecting their data.
But I don’t think that’s quite right. The OAIC has been explicit about how the Act applies to data brokers: pseudonymised identifiers are personal information, de-identification is not a fixed state, and covert collection without the individual’s knowledge would usually be unfair. They also note, with exceptional understatement, that there is “an underlying tension between the data minimisation requirements of APP 3 and the activities of data brokers, whose business model is reliant on maximising the amount of data that they collect.” None of the companies I dealt with acted in accordance with any of these positions.
The problem isn’t that the law is silent. It’s that these interpretations haven’t been tested, and this allows data brokers to operate according to a conveniently imagined version of the Privacy Act. Many of the positions I encountered would not survive scrutiny, particularly given how internally incoherent they are. Data brokers are not so much operating in a grey area as in the space between the law as written and the law as enforced.
What makes this especially difficult to fix is that data brokering is invisible to the people it affects. Most people will never know these companies exist, let alone that they hold information about them. This makes individual complaints an almost absurd enforcement mechanism: you cannot complain about a practice you don’t know is happening, to a company you’ve never heard of, about data you didn’t know was collected.
Final thoughts
What was most striking to me about all this was how responsibility simply went missing across the data supply chain. While each company was eager to deflect me to the upstream supplier, ultimately no one acknowledged any responsibility for collecting or using my data. The involvement of multiple companies meant that each could somewhat plausibly claim ignorance of collection practices, accuracy, or downstream use. Even when I reached the company that appeared to have collected the data, they were effectively unreachable.
That opacity may serve more than just legal convenience. Data brokers sell products, and individual access at scale would create a real risk of revealing that those products are riddled with errors. Even when data brokers are criticised on privacy grounds, there’s often an assumption that the data itself at least does what it claims to do. My experience calls that assumption into question. Much of the information held about me was simply wrong, and far less impressive than the confidence of its marketing suggests. The sector may depend not just on people being unaware that this data exists, but on the data never being shown to the people it supposedly describes.
This relocation of accountability, so that it is always someone else’s problem, is not accidental. It’s what allows the practice of buying and selling other people’s data to continue.