The Untold Worker Exploitation Behind Large Language Models

A few months ago, 28-year-old Scale AI CEO Alexandr Wang made headlines after his company closed a wildly successful funding round involving Meta, Amazon and Microsoft.
The World of AI Data
Tech companies have been profiting off data almost since their inception, yet the newfound prevalence of AI tools has created an unprecedented demand for it. To explain this requirement, consider the analogy of a pen and ink: an inkless pen cannot express itself on paper, just as a bottle of ink is useless in the absence of a pen. Likewise, an AI product like ChatGPT, Claude or Gemini is the careful combination of training data and an architecture – usually a Transformer-style model – that allows the data to be expressed in a usable form. Setting aside the mathematical and programmatic formalities for the sake of accessibility, the process results in a program which is able to produce a unique output based on the training data in the context of an input, usually a prompt or a question.
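To make the "architecture plus training data" idea concrete, here is a minimal sketch that loads a small, openly available Transformer (GPT-2) through the Hugging Face transformers library and produces an output from a prompt. The model choice and prompt are illustrative stand-ins only; commercial systems like ChatGPT, Claude and Gemini are vastly larger and trained on far more data.

```python
# A minimal sketch: a trained Transformer "expressing" its training data
# in response to a prompt. GPT-2 is used here only because it is small
# and freely downloadable; the prompt below is purely illustrative.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("The data behind modern AI comes from", max_new_tokens=40)
print(result[0]["generated_text"])
```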
A great quantity of data is necessary to train modern Large Language Models (GPT-4, a prominent LLM, is estimated to have roughly 1.8 trillion parameters), and thus general-use AI companies turned to the single largest open repository of language data in human history – the internet. In addition to being trained on samples of text and images scraped from every website imaginable, state-of-the-art LLMs are constantly being fed more data from ongoing web crawls and from users' own interactions with the models.
Why Humans?
Unsurprisingly, in large datasets, a variety of harmful content running the gamut from sexually explicit material to hate speech is juxtaposed with healthy, educational training data, as was the case with the subset of Internet data used to train most modern LLM chatbots. In a bid to filter out the expression of these blots in the training set (manually scouring billions of scraped documents being doubtlessly impractical), AI companies turned to a technique known as Reinforcement Learning from Human Feedback, or RLHF.
In technical terms, RLHF is not so different from standard Reinforcement Learning: a deep network – the reward model – is trained on real, human-labeled data to assign a score to an AI-generated response, and the AI is repeatedly made to generate responses that are fed into this network, which effectively grades each response on how harmful it is. An output deemed similar to the human-labeled “harmful” data will reduce the model’s tendency to generate anything like it in the future, while an output deemed dissimilar to that data will make the model more likely to produce such outputs when prompted under similar conditions (if you are interested, I highly recommend reading further on how these reward models are trained).
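As a rough illustration of the reward-modeling step described above, the sketch below trains a tiny scoring network on pairs of “preferred” versus “rejected” examples – the standard pairwise formulation used for RLHF reward models. All names and the random stand-in data are hypothetical; a real pipeline would use a pretrained language-model backbone and the actual human-labeled comparisons produced by data workers.

```python
# A toy sketch of reward modeling for RLHF, assuming PyTorch is available.
# The random tensors below stand in for embedded, human-labeled responses.
import torch
import torch.nn as nn

class ToyRewardModel(nn.Module):
    """Assigns a scalar harmlessness score to an embedded response."""
    def __init__(self, embed_dim: int = 32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(embed_dim, 64),
            nn.ReLU(),
            nn.Linear(64, 1),  # single scalar score per response
        )

    def forward(self, response_embedding: torch.Tensor) -> torch.Tensor:
        return self.net(response_embedding).squeeze(-1)

def preference_loss(score_preferred: torch.Tensor, score_rejected: torch.Tensor) -> torch.Tensor:
    # Pairwise loss: the human-preferred (less harmful) response should score higher.
    return -torch.nn.functional.logsigmoid(score_preferred - score_rejected).mean()

model = ToyRewardModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for _ in range(100):
    preferred = torch.randn(16, 32)  # stand-in for human-approved responses
    rejected = torch.randn(16, 32)   # stand-in for human-flagged harmful responses
    loss = preference_loss(model(preferred), model(rejected))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# During RLHF proper, the chatbot's outputs are scored by a reward model like
# this one, and low-scoring (harmful) outputs are discouraged in future generations.
```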
The Issue?
Human-labeled data does not materialize out of thin air. Instead, the RLHF pipeline that purportedly makes AI safe and benign depends heavily on hundreds of thousands of underpaid workers, most of whom contribute invaluable data-labeling services through online “cloudwork” platforms. Most of these platforms follow a “requester and contractor” format, in which multitudes of online workers are assigned to one task and receive pay upon completing whatever the task stipulates, from labeling the content of an image or video to scouring walls of text for possible keywords or contexts. These workers are invaluable contributors to the safety of AI as we know it – the reason publicly released AI chatbots can serve as thriving sources of information largely free of the occasional vitriol and crudeness of the internet. Yet while the concept of cloudwork platforms for AI data labeling is innocuous, even beneficial, the reality behind these digital workshops is far from it. Consider what researchers have found about the major cloudwork platforms:
- None of these platforms have mechanisms or policies that ensure contractors receive payment from requesters for completed tasks.
- Only one of these platforms (Appen) has policies that protect workers’ well-being by mitigating overwork.
- None of these platforms have clear-cut, easily interpreted contracts that stipulate conditions for payment and work.
- None of these platforms have shown that they take contractor feedback into account when making executive decisions.
- Only one of these platforms (Appen) has policies acknowledging workers’ right of association (unionizing).
Real People, Real Impact
The dismal lack of rights afforded to these data workers has created an unrecognized and underappreciated “subclass” of AI laborers. These people, often desperate to earn a bit of extra revenue to support their families, spend hours upon hours in front of their computers every day, grinding through menial and repetitive labeling tasks and often coming across the worst forms of human depravity. Workers on these platforms have reported lasting psychological harm from the content they are required to review.
As contractors, workers are not protected by wage laws and, in many cases, are not paid for overtime work. Remotasks caps the number of hours of work a participant is paid for, and the platform assigns tasks only after a lengthy “qualification” process for which a prospective tasker is not paid. Contracts are poorly and often ambiguously written, allowing many mistreatments of workers to slip by unnoticed. Apart from the trauma that many content-moderation data labelers experience, they are paid strikingly little for it.
Hundreds of thousands of people work in AI data-labeling every day, yet their frustration is casually smothered by poor customer support and the lack of any platform for voicing it. When interviewed by Fairwork Institute researchers, taskers said that their experience with customer support – where they had any at all – was overwhelmingly negative.
Notably, Remotasks, one of the largest such AI data-labeling operations, is a key subsidiary of Scale AI, providing labeled training data as part of the company’s AI-building services. Officially established to “preserve user confidentiality”, this separation attempts to mask some of the less tasteful business practices of the parent company from the public eye; indeed, when one considers that Remotasks scored only a 1/10 on Fairwork’s equitable work scale, with a 10/10 being the “minimum requirement for a fair work environment”, there is certainly much to be discussed about Scale’s mistreatment of the very workers who support its AI products through their hard work.
This article is brought to you by Our AI, a student-founded and student-led AI Ethics organization seeking to diversify perspectives in AI beyond what is typically discussed in modern media. If you enjoyed this article, please check out our monthly publications and exclusive articles at https://www.our-ai.org/ai-nexus/read!
Raise Awareness
RLHF is a good thing done all the wrong ways. As a strong proponent of human-aligned AI development, I contend that current standards and policies are insufficient to ensure that our technologies do not adversely and surreptitiously affect our humanity. If you care about the responsible and safe development of AI – not just for Silicon Valley but for humanity as a whole – I strongly encourage you to spread awareness of the worker rights violations in the AI industry, whether by sharing this article or doing your own research, and to support policy decisions protecting workers’ right to a fair wage and to workplace representation.
The world deserves to know – and you should play your part.
Written by Thomas Yin