Title: Detecting Human Trafficking: Automated Classification of Online Customer Reviews of Massage Businesses
Problem definition: Approximately 11,000 alleged illicit massage businesses (IMBs) exist across the United States hidden in plain sight among legitimate businesses. These illicit businesses frequently exploit workers, many of whom are victims of human trafficking, forced or coerced to provide commercial sex.
Academic/practical relevance: Although IMB review boards like Rubmaps.ch can provide first-hand information to identify IMBs, these sites are likely to be closed by law enforcement. Open websites like Yelp.com provide more accessible and detailed information about a larger set of massage businesses. Reviews from these sites can be screened for risk factors of trafficking.
Methodology: We develop a natural language processing approach to detect online customer reviews that indicate a massage business is likely engaged in human trafficking. We label data sets of Yelp reviews using knowledge of known IMBs. We develop a lexicon of key words/phrases related to human trafficking and commercial sex acts. We then build two classification models based on this lexicon. We also train two classification models using embeddings from the bidirectional encoder representations from transformers (BERT) model and the Doc2Vec model.
Results: We evaluate the performance of these classification models and various ensemble models. The lexicon-based models achieve high precision, whereas the embedding-based models have relatively high recall. The ensemble models provide a compromise and achieve the best performance on the out-of-sample test. Our results verify the usefulness of ensemble methods for building robust models to detect risk factors of human trafficking in reviews on open websites like Yelp.
Managerial implications: The proposed models can save countless hours in IMB investigations by automatically sorting through large quantities of data to flag potential illicit activity, eliminating the need for manual screening of these reviews by law enforcement and other stakeholders.
Funding: This work was supported by the National Science Foundation [Grant 1936331].
Supplemental Material: The online appendix is available at https://doi.org/10.1287/msom.2023.1196.
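The pipeline described in the abstract (lexicon matching, several classifiers, an ensemble, and evaluation by precision and recall) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the lexicon entries are neutral placeholders rather than the paper's lexicon, the majority-vote rule is only one possible ensemble scheme and is not confirmed as the authors' method, and the embedding-based models (BERT, Doc2Vec) are represented only by their 0/1 predictions.

```python
from typing import Iterable, Sequence, Tuple

# Hypothetical placeholder lexicon; the paper's actual key words/phrases
# are not reproduced here.
LEXICON = {"placeholder phrase", "flagword"}


def lexicon_score(review: str, lexicon: Iterable[str] = LEXICON) -> int:
    """Count how many lexicon terms occur in the review (case-insensitive)."""
    text = review.lower()
    return sum(1 for term in lexicon if term in text)


def lexicon_classifier(review: str, threshold: int = 1) -> int:
    """Flag a review (1) when at least `threshold` lexicon terms appear."""
    return int(lexicon_score(review) >= threshold)


def ensemble_vote(predictions: Sequence[int]) -> int:
    """Assumed ensemble rule: strict majority vote over 0/1 predictions.

    A tie counts as 'not flagged' (0).
    """
    return int(2 * sum(predictions) > len(predictions))


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Compute precision and recall for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


if __name__ == "__main__":
    review = "This review contains a flagword."
    lex_pred = lexicon_classifier(review)
    # Stand-ins for the predictions of two embedding-based models.
    embedding_preds = [1, 0]
    print(ensemble_vote([lex_pred, *embedding_preds]))
```

A strict majority vote is one way to trade the high precision of lexicon-based models against the higher recall of embedding-based models; weighted voting or stacking would be equally plausible alternatives.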
Award ID(s):
1936331
PAR ID:
10481238
Author(s) / Creator(s):
; ; ; ;
Publisher / Repository:
INFORMS
Date Published:
Journal Name:
Manufacturing & Service Operations Management
Volume:
25
Issue:
3
ISSN:
1523-4614
Page Range / eLocation ID:
1051 to 1065
Subject(s) / Keyword(s):
human trafficking; massage businesses; online customer reviews; natural language processing; ensemble learning
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Law enforcement interventions continue to be the primary mechanism used to identify offenders and illicit businesses involved in human trafficking, yet trafficking continues to be a thriving international operation. We explore alternative mechanisms to disrupt illicit operations and reduce victimization through labor trafficking supply chains using supply chain disruption theory. Using a case study approach to examine one federally prosecuted labor trafficking case in the agricultural sector, we (1) extend criminological concepts of disruption by identifying sources and methods of disruption and (2) inform criminal justice system responses by presenting novel methods of assessing effectiveness of anti-human trafficking policies and programs. 
  2. Online sex advertisements (sex ads) have been linked to many U.S. sex trafficking cases. However, since the closure of the dominant website, Backpage.com (Backpage), many competing sites have emerged that are hosted in countries where U.S. law enforcement organizations have no jurisdiction. Although the online ecosystem has changed significantly, very little research uses data from sites other than Backpage, and even less uses data from multiple sites. This paper presents an anonymized dataset derived from the text and image artifacts of more than 10 million sex ads. By making this dataset publicly available, we aim to reduce barriers to entry for researchers interested in conducting data-driven counter-trafficking research. The dataset can be used to test hypotheses related to sex ads and intersite connectivity, understand the posting processes employed by prominent sites in the current online sex ad ecosystem, and develop multidisciplinary approaches for estimating ad legitimacy. Progress in any of these areas can result in potentially lifesaving interventions for sex trafficking victims. 
  3. Legitimate companies are key facilitators of human trafficking. These corporate facilitators include not only websites providing advertisements for commercial sex services but also hotels and motels. Analysis of all active federal criminal sex trafficking cases in 2018 and 2019 reveals that in approximately 80% of these cases, victims were exploited at either hotels or motels. This paper studies the prevalence of the hospitality industry in the crime of sex trafficking and the failure of this industry to address this problem until recent civil suits were filed by victims against individual hotels and chains. Drawing on the civil cases filed in federal courts by victims of human trafficking between 2015 and 2021 along the East Coast of the United States, this paper assesses the characteristics of these hotels and the conditions in the hotels that facilitated sex trafficking. The paper then explores the moral and ethical problems posed by the facilitating role of hotel owners/operators in sex trafficking either through collusion or failure to act on and/or report evidence of individual abuse. Suggestions on how to address the problem are provided. 
  4. There is ongoing debate regarding the merits of decriminalization or outright legalization of commercial sex work in the United States. A few municipalities have officially legalized both the selling and purchasing of sex, while others unofficially criminalize purchasing sex but have decriminalized its sale. In addition, many other locales have no official guidance on the subject but have unofficially decriminalized sex work by designating specific areas in an urban landscape safe from law enforcement for commercial sex, by quietly ceasing to arrest sex sellers, or by declining to prosecute anyone selling or attempting to sell sex. Despite these efforts, it remains crucial to understand where in an urban area commercial sex exchanges occur: legalization and decriminalization may result in fewer arrests but are likely to increase the overall size of the sex market. This growth could result in an increase in sex trafficking victimization, whose victims make up the majority of commercial sex sellers in any domestic market. Given the distribution of prostitution activities in most communities, it is possible to use high-fidelity predictive models to identify intervention opportunities related to sex trafficking victimization. In this research, we construct several machine learning models and inform them with a range of known criminogenic factors to predict locations hosting high levels of prostitution. We demonstrate these methods in the city of Chicago, Illinois. The results of this exploratory analysis identified a range of explanatory factors driving prostitution activity throughout Chicago, and the best-performing model correctly predicted prostitution frequency with 94% accuracy. We conclude by exploring specific areas of under- and over-prediction throughout Chicago and discuss the implications of these results for allocating social support efforts. 
  5. Illicit Wildlife Trade (IWT) is a serious global crime that negatively impacts biodiversity, human health, national security, and economic development. Many flora and fauna are trafficked in different product forms. We investigate a network interdiction problem for wildlife trafficking and introduce a new model to tackle key challenges associated with IWT. Our model captures the interdiction problem faced by law enforcement impeding IWT on flight networks, though it can be extended to other types of transportation networks. We incorporate vital issues unique to IWT, including the need for training and difficulty recognizing illicit wildlife products, the impact of charismatic species and geopolitical differences, and the varying amounts of information and objectives traffickers may use when choosing transit routes. Additionally, we incorporate different detection probabilities at nodes and along arcs depending on law enforcement’s interdiction and training actions. We present solutions for several key IWT supply chains using realistic data from conservation research, seizure databases, and international reports. We compare our model to two benchmark models and highlight key features of the interdiction strategy. We discuss the implications of our models for combating IWT in practice and highlight critical areas of concern for stakeholders. 