NSF PAR Search | NSF Public Access Repository

Robustly Improving Bandit Algorithms with Confounded and Selection Biased Offline Data: A Causal Approach

https://doi.org/10.1609/aaai.v38i18.30027

Huang, Wen; Wu, Xintao (March 2024, Proceedings of the AAAI Conference on Artificial Intelligence)

This paper studies bandit problems where an agent has access to offline data that might be utilized to potentially improve the estimation of each arm’s reward distribution. A major obstacle in this setting is the existence of compound biases from the observational data. Ignoring these biases and blindly fitting a model with the biased data could even negatively affect the online learning phase. In this work, we formulate this problem from a causal perspective. First, we categorize the biases into confounding bias and selection bias based on the causal structure they imply. Next, we extract the causal bound for each arm that is robust towards compound biases from biased observational data. The derived bounds contain the ground truth mean reward and can effectively guide the bandit agent to learn a nearly-optimal decision policy. We also conduct regret analysis in both contextual and non-contextual bandit settings and show that prior causal bounds could help consistently reduce the asymptotic regret.
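To illustrate how prior bounds on each arm's mean reward can guide a bandit agent, here is a minimal sketch. It is not the paper's algorithm: it is plain UCB1 on Bernoulli arms with each index clipped to a prior interval, and the function name `clipped_ucb`, the bounds, and the reward means are all hypothetical values chosen for illustration. It assumes every prior interval really does contain the arm's true mean.

```python
import math
import random


def clipped_ucb(true_means, prior_bounds, horizon, seed=0):
    """Run UCB1 on Bernoulli arms, clipping each index to a prior interval.

    prior_bounds[a] = (lo, hi) is assumed to contain arm a's true mean.
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(1, horizon + 1):
        indices = []
        for a in range(k):
            lo, hi = prior_bounds[a]
            if counts[a] == 0:
                # No online data yet: fall back to the prior upper bound.
                ucb = hi
            else:
                mean = sums[a] / counts[a]
                bonus = math.sqrt(2.0 * math.log(t) / counts[a])
                # Clip the usual UCB1 index to the prior interval.
                ucb = min(max(mean + bonus, lo), hi)
            indices.append(ucb)
        arm = max(range(k), key=lambda a: indices[a])
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


# Arm 0's upper bound (0.3) lies below arm 1's lower bound (0.6),
# so the clipped index never selects arm 0.
print(clipped_ucb([0.2, 0.8], [(0.1, 0.3), (0.6, 0.9)], horizon=200))
```

When an arm's prior upper bound falls below another arm's lower bound, as in the usage line, the clipped index rules that arm out without any exploration. With uninformative bounds such as (0, 1), the sketch behaves much like ordinary UCB1.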
Full Text Available
Achieving Counterfactual Fairness for Causal Bandit

https://doi.org/10.1609/aaai.v36i6.20653

Huang, Wen; Zhang, Lu; Wu, Xintao (June 2022, Proceedings of the AAAI Conference on Artificial Intelligence)

In online recommendation, customers arrive in a sequential and stochastic manner from an underlying distribution and the online decision model recommends a chosen item for each arriving individual based on some strategy. We study how to recommend an item at each step to maximize the expected reward while achieving user-side fairness for customers, i.e., customers who share similar profiles will receive a similar reward regardless of their sensitive attributes and items being recommended. By incorporating causal inference into bandits and adopting soft intervention to model the arm selection strategy, we first propose the d-separation based UCB algorithm (D-UCB) to explore the utilization of the d-separation set in reducing the amount of exploration needed to achieve low cumulative regret. Based on that, we then propose the fair causal bandit (F-UCB) for achieving counterfactual individual fairness. Both theoretical analysis and empirical evaluation demonstrate the effectiveness of our algorithms.
Full Text Available
Achieving User-Side Fairness in Contextual Bandits

https://doi.org/10.1007/s44230-022-00008-w

Huang, Wen; Labille, Kevin; Wu, Xintao; Lee, Dongwon; Heffernan, Neil (January 2022, Human-Centric Intelligent Systems)

Personalized recommendation based on multi-arm bandit (MAB) algorithms has been shown to lead to high utility and efficiency as it can dynamically adapt the recommendation strategy based on feedback. However, unfairness can arise in personalized recommendation. In this paper, we study how to achieve user-side fairness in personalized recommendation. We formulate our fair personalized recommendation as a modified contextual bandit and focus on achieving fairness for the individual who is being recommended an item as opposed to achieving fairness for the items that are being recommended. We introduce and define a metric that captures the fairness in terms of rewards received for both the privileged and protected groups. We develop a fair contextual bandit algorithm, Fair-LinUCB, that improves upon the traditional LinUCB algorithm to achieve group-level fairness of users. Our algorithm detects and monitors unfairness while it learns to recommend personalized videos to students to achieve high efficiency. We provide a theoretical regret analysis and show that our algorithm has a slightly higher regret bound than LinUCB. We conduct numerous experimental evaluations to compare the performance of our fair contextual bandit to that of LinUCB and show that our approach achieves group-level fairness while maintaining a high utility.
Full Text Available
Fairness-aware Bandit-based Recommendation

https://doi.org/10.1109/BigData52589.2021.9671959

Huang, Wen; Labille, Kevin; Wu, Xintao; Lee, Dongwon; Heffernan, Neil (December 2021, 2021 IEEE International Conference on Big Data (Big Data))

Full Text Available
Fair and Robust Classification Under Sample Selection Bias

https://doi.org/10.1145/3459637.3482104

Du, Wei; Wu, Xintao (October 2021, 30th ACM International Conference on Information & Knowledge Management)

To address the sample selection bias between the training and test data, previous research works focus on reweighing biased training data to match the test data and then building classification models on the reweighed training data. However, how to achieve fairness in the built classification models is under-explored. In this paper, we propose a framework for robust and fair learning under sample selection bias. Our framework adopts the reweighing estimation approach for bias correction and the minimax robust estimation approach for achieving robustness on prediction accuracy. Moreover, during the minimax optimization, the fairness is achieved under the worst case, which guarantees the model’s fairness on test data. We further develop two algorithms to handle sample selection bias, one for when test data is available and one for when it is unavailable.
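The reweighing idea mentioned in this abstract can be sketched as inverse-probability weighting. This is a generic illustration only, not the paper's framework: it leaves out the minimax robust estimation and the fairness constraint, and it assumes the selection probability of each sample is known. The function name `ipw_mean` and the numbers are hypothetical.

```python
def ipw_mean(values, selection_probs):
    """Estimate a population mean from a selection-biased sample.

    Each observed value is weighted by 1 / P(selected), so samples that were
    unlikely to be selected count more. Assumes the probabilities are known.
    """
    weights = [1.0 / p for p in selection_probs]
    weighted_sum = sum(w * v for w, v in zip(weights, values))
    return weighted_sum / sum(weights)


# Hypothetical population: half of the labels are 1 and half are 0, but
# label-1 rows are kept with probability 0.8 and label-0 rows with 0.2.
values = [1] * 8 + [0] * 2
probs = [0.8] * 8 + [0.2] * 2

naive = sum(values) / len(values)   # biased toward the over-selected label
corrected = ipw_mean(values, probs)  # reweighed estimate
print(naive, corrected)
```

In this toy sample the naive mean is pulled toward the over-selected label, while the reweighed estimate moves back toward the balanced population rate. In practice the selection probabilities would have to be estimated, which is where robustness concerns of the kind the abstract describes come in.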
Full Text Available
Enhancing personalized modeling via weighted and adversarial learning

https://doi.org/10.1007/s41060-021-00263-3

Du, Wei; Wu, Xintao (June 2021, International Journal of Data Science and Analytics)
Full Text Available
Classifying Math Knowledge Components via Task-Adaptive Pre-Trained BERT.

https://doi.org/10.1007/978-3-030-78292-4_33

Shen, J.T.; Yamashita, M.; Prihar, E.; Heffernan, N.; Wu, X.; McGrew, S.; Lee, D. (January 2021, Artificial Intelligence in Education)
Educational content labeled with proper knowledge components (KCs) is particularly useful to teachers or content organizers. However, manually labeling educational content is labor-intensive and error-prone. To address this challenge, prior research proposed machine learning based solutions to auto-label educational content with limited success. In this work, we significantly improve on prior research by (1) expanding the input types to include KC descriptions, instructional video titles, and problem descriptions (i.e., three types of prediction task), (2) doubling the granularity of the prediction from 198 to 385 KC labels (i.e., a more practical setting but a much harder multinomial classification problem), (3) improving the prediction accuracies by 0.5–2.3% using Task-adaptive Pre-trained BERT, outperforming six baselines, and (4) proposing a simple evaluation measure by which we can recover 56–73% of mispredicted KC labels. All code and data sets in the experiments are available at: https://github.com/tbs17/TAPT-BERT
Full Text Available
Transferable Contextual Bandits with Prior Observations

https://doi.org/10.1007/978-3-030-75765-6_32

Labille, Kevin; Huang, Wen; Wu, Xintao (January 2021, Proceedings of 25th Pacific-Asia Conference on Advances in Knowledge Discovery and Data Mining)

Full Text Available
AdvPL: Adversarial Personalized Learning

https://doi.org/10.1109/dsaa49011.2020.00021

Du, Wei; Wu, Xintao (November 2020, 2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA))
Full Text Available
Context-Aware Attentive Knowledge Tracing

https://doi.org/10.1145/3394486.3403282

Ghosh, Aritra; Heffernan, Neil; Lan, Andrew S (August 2020, ACM SIGKDD Conference on Knowledge Discovery and Data Mining)

Full Text Available
