Title: Improving Internet Advertising Using Click-Through Rate Prediction
Online advertising is a billion-dollar industry, with many companies choosing websites and social media platforms to promote their products. The primary concerns in online marketing are to optimize the performance of a digital advert, reach the right audience, and maximize revenue, all of which can be achieved by accurately predicting the probability that a given ad is clicked, called the Click-Through Rate (CTR). A high CTR is assumed to indicate that the ad is reaching its target customers, while a low CTR suggests that it is not reaching its desired audience, which may result in a low return on investment (ROI). We propose a data-science-driven approach to help businesses improve their internet advertising campaigns: we build several machine learning models to predict the CTR and select the best-performing one. To build our classification models, we use the Avazu dataset, publicly available on the Kaggle website. Insight into this metric will allow companies to compete in real-time bidding, gauge how relevant their keywords are in search engine queries, and mitigate unexpected losses in spending budget. In this paper, we strive to use modern machine learning tools and techniques to improve CTR prediction for online advertisements and bring change to the industry.
Award ID(s):
1832536
PAR ID:
10487922
Author(s) / Creator(s):
; ; ;
Editor(s):
Waldemar Karwowski 
Publisher / Repository:
AHFE Open Access
Date Published:
Journal Name:
Applied Human Factors and Ergonomics International
ISSN:
2771-0718
Subject(s) / Keyword(s):
Click-through rate, Online advertising, Machine learning, Random forest
Format(s):
Medium: X
Location:
New York, USA
Sponsoring Org:
National Science Foundation
More Like this
  1. Bringmann, Karl; Grohe, Martin; Puppis, Gabriele; Svensson, Ola (Ed.)
    We study information design in click-through auctions, in which the bidders/advertisers bid for winning an opportunity to show their ads but only pay for realized clicks. The payment may or may not happen, and its probability is called the click-through rate (CTR). This auction format is widely used in the industry of online advertising. Bidders have private values, whereas the seller has private information about each bidder’s CTRs. We are interested in the seller’s problem of partially revealing CTR information to maximize revenue. Information design in click-through auctions turns out to be intriguingly different from almost all previous studies in this space since any revealed information about CTRs will never affect bidders' bidding behaviors - they will always bid their true value per click - but only affect the auction’s allocation and payment rule. In some sense, this makes information design effectively a constrained mechanism design problem. Our first result is an FPTAS to compute an approximately optimal mechanism under a constant number of bidders. The design of this algorithm leverages Bayesian bidder values which help to "smooth" the seller’s revenue function and lead to better tractability. The design of this FPTAS is complex and primarily algorithmic. Our second main result pursues the design of "simple" mechanisms that are approximately optimal yet more practical. We primarily focus on the two-bidder situation, which is already notoriously challenging as demonstrated in recent works. When bidders' CTR distribution is symmetric, we develop a simple prior-free signaling scheme, whose construction relies on a parameter termed optimal signal ratio. The constructed scheme provably obtains a good approximation as long as the maximum and minimum of bidders' value density functions do not differ much. 
  2.
    Monetizing websites and web apps through online advertising is widespread in the web ecosystem, creating a billion-dollar market. This has led to the emergence of a vast network of tertiary ad providers and ad syndication to facilitate this growing market. Nowadays, the online advertising ecosystem forces publishers to integrate ads from these third-party domains. On the one hand, this raises several privacy and security concerns that are actively being studied in recent years. On the other hand, the ability of today's browsers to load dynamic web pages with complex animations and Javascript has also transformed online advertising. This can have a significant impact on webpage performance. The latter is a critical metric for optimization since it ultimately impacts user satisfaction. Unfortunately, there are limited literature studies on understanding the performance impacts of online advertising which we argue is as important as privacy and security. In this paper, we apply an in-depth and first-of-a-kind performance evaluation of web ads. Unlike prior efforts that rely primarily on adblockers, we perform a fine-grained analysis on the web browser's page loading process to demystify the performance cost of web ads. We aim to characterize the cost by every component of an ad, so the publisher, ad syndicate, and advertiser can improve the ad's performance with detailed guidance. For this purpose, we develop a tool, adPerf, for the Chrome browser that classifies page loading workloads into ad-related and main-content at the granularity of browser activities. Our evaluations show that online advertising entails more than 15% of browser page loading workload and approximately 88% of that is spent on JavaScript. On smartphones, this additional cost of ads is 7% lower since mobile pages include fewer and well-optimized ads. We also track the sources and delivery chain of web ads and analyze performance considering the origin of the ad contents. 
    We observe that two of the well-known third-party ad domains contribute 35% of the ads' performance cost and, surprisingly, top news websites implicitly include unknown third-party ads which in some cases build up to more than 37% of the ads' performance cost.
  3.
    Online advertising, as a vast market, has gained significant attention in various platforms ranging from search engines, third-party websites, social media, and mobile apps. The prosperity of online campaigns is a challenge in online marketing and is usually evaluated by user response through different metrics, such as clicks on advertisement (ad) creatives, subscriptions to products, purchases of items, or explicit user feedback through online surveys. Recent years have witnessed a significant increase in the number of studies using computational approaches, including machine learning methods, for user response prediction. However, existing literature mainly focuses on algorithmic-driven designs to solve specific challenges, and no comprehensive review exists to answer many important questions. What are the parties involved in the online digital advertising eco-systems? What type of data are available for user response prediction? How do we predict user response in a reliable and/or transparent way? In this survey, we provide a comprehensive review of user response prediction in online advertising and related recommender applications. Our essential goal is to provide a thorough understanding of online advertising platforms, stakeholders, data availability, and typical ways of user response prediction. We propose a taxonomy to categorize state-of-the-art user response prediction methods, primarily focusing on the current progress of machine learning methods used in different online platforms. In addition, we also review applications of user response prediction, benchmark datasets, and open source codes in the field. 
  4. Targeted advertising remains an important part of the free web browsing experience, where advertisers' targeting and personalization algorithms together find the most relevant audience for millions of ads every day. However, given the wide use of advertising, this also enables using ads as a vehicle for problematic content, such as scams or clickbait. Recent work that explores people's sentiments toward online ads, and the impacts of these ads on people's online experiences, has found evidence that online ads can indeed be problematic. Further, there is the potential for personalization to aid the delivery of such ads, even when the advertiser targets with low specificity. In this paper, we study Facebook--one of the internet's largest ad platforms--and investigate key gaps in our understanding of problematic online advertising: (a) What categories of ads do people find problematic? (b) Are there disparities in the distribution of problematic ads to viewers? and if so, (c) Who is responsible--advertisers or advertising platforms? To answer these questions, we empirically measure a diverse sample of user experiences with Facebook ads via a 3-month longitudinal panel. We categorize over 32,000 ads collected from this panel (n = 132); and survey participants' sentiments toward their own ads to identify four categories of problematic ads. Statistically modeling the distribution of problematic ads across demographics, we find that older people and minority groups are especially likely to be shown such ads. Further, given that 22% of problematic ads had no specific targeting from advertisers, we infer that ad delivery algorithms (advertising platforms themselves) played a significant role in the biased distribution of these ads. 
  5. The rise of ad-blockers is viewed as an economic threat by online publishers who primarily rely on online advertising to monetize their services. To address this threat, publishers have started to retaliate by employing anti ad-blockers, which scout for ad-block users and react to them by pushing users to whitelist the website or disable ad-blockers altogether. The clash between ad-blockers and anti ad-blockers has resulted in a new arms race on the Web. In this paper, we present an automated machine learning based approach to identify anti ad-blockers that detect and react to ad-block users. The approach is promising, with a precision of 94.8% and a recall of 93.1%. Our automated approach allows us to conduct a large-scale measurement study of anti ad-blockers on Alexa top-100K websites. We identify 686 websites that make visible changes to their page content in response to ad-block detection. We characterize the spectrum of different strategies used by anti ad-blockers. We find that a majority of publishers use fairly simple first-party anti ad-block scripts. However, we also note the use of third-party anti ad-block services that use more sophisticated tactics to detect and respond to ad-blockers.