NSF PAR Search | NSF Public Access Repository


Improving lung cancer diagnosis and survival prediction with deep learning and CT imaging

https://doi.org/10.1371/journal.pone.0323174

Wang, Xiawei; Sharpnack, James; Lee, Thomas CM (June 2025, PLOS One)
Yanwu, Xu (Ed.)
Lung cancer is a major cause of cancer-related deaths, and early diagnosis and treatment are crucial for improving patients’ survival outcomes. In this paper, we propose to employ convolutional neural networks to model the non-linear relationship between the risk of lung cancer and the lungs’ morphology revealed in the CT images. We apply a mini-batched loss that extends the Cox proportional hazards model to handle the non-convexity induced by neural networks, which also enables training on large data sets. Additionally, we propose to combine the mini-batched loss with binary cross-entropy to predict both lung cancer occurrence and the risk of mortality. Simulation results demonstrate the effectiveness of the mini-batched loss both with and without the censoring mechanism, as well as of its combination with binary cross-entropy. We evaluate our approach on the National Lung Screening Trial data set with several 3D convolutional neural network architectures, achieving high AUC and C-index scores for lung cancer classification and survival prediction. These results, obtained from simulations and real data experiments, highlight the potential of our approach to improve the diagnosis and treatment of lung cancer.
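As a rough illustration of what a Cox-type loss restricted to a mini-batch can look like, the sketch below computes the negative log partial likelihood using only the subjects in one batch. It is a minimal sketch under stated assumptions, not the loss defined in the paper: the function name `cox_batch_loss`, the averaging over observed events, and the handling of ties are all hypothetical choices made for this example.

```python
import math

def cox_batch_loss(risk_scores, times, events):
    """Negative log Cox partial likelihood computed within one mini-batch.

    risk_scores: predicted log-risk per subject (e.g. a network output)
    times:       observed follow-up time per subject
    events:      1 if the event was observed, 0 if the subject was censored
    (Hypothetical helper; the paper's exact loss may differ.)
    """
    n_events = sum(events)
    if n_events == 0:
        return 0.0  # a batch with only censored subjects contributes nothing
    total = 0.0
    for i in range(len(times)):
        if events[i] != 1:
            continue  # censored subjects appear only inside risk sets
        # Risk set: batch members still under observation at times[i].
        at_risk = [risk_scores[j] for j in range(len(times)) if times[j] >= times[i]]
        # Numerically stable log-sum-exp over the risk set.
        m = max(at_risk)
        log_sum_exp = m + math.log(sum(math.exp(s - m) for s in at_risk))
        total += risk_scores[i] - log_sum_exp
    return -total / n_events
```

Because the risk sets are formed inside the batch, the cost per step depends on the batch size rather than on the full data set, which is the property that makes this kind of loss usable with stochastic gradient training.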
Free, publicly-accessible full text available June 11, 2026
Optimizing machine learning methods to discover strong gravitational lenses in the deep lens survey

https://doi.org/10.1093/mnras/stad1709

Keerthi Vasan, G C; Sheng, Stephen; Jones, Tucker; Choi, Chi Po; Sharpnack, James (June 2023, Monthly Notices of the Royal Astronomical Society)

Machine learning models can greatly improve the search for strong gravitational lenses in imaging surveys by reducing the amount of human inspection required. In this work, we test the performance of supervised, semi-supervised, and unsupervised learning algorithms trained with the ResNetV2 neural network architecture on their ability to efficiently find strong gravitational lenses in the Deep Lens Survey (DLS). We use galaxy images from the survey, combined with simulated lensed sources, as labeled data in our training data sets. We find that models using semi-supervised learning along with data augmentations (transformations applied to an image during training, e.g. rotation) and Generative Adversarial Network (GAN) generated images yield the best performance. They offer 5-10 times better precision across all recall values compared to supervised algorithms. Applying the best performing models to the full 20 deg² DLS survey, we find 3 Grade-A lens candidates within the top 17 image predictions from the model. This increases to 9 Grade-A and 13 Grade-B candidates when 1 per cent (∼2500 images) of the model predictions are visually inspected. This is ≳10× the sky density of lens candidates compared to current shallower wide-area surveys (such as the Dark Energy Survey), indicating a trove of lenses awaiting discovery in upcoming deeper all-sky surveys. These results suggest that pipelines tasked with finding strong lens systems can be highly efficient, minimizing human effort. We additionally report spectroscopic confirmation of the lensing nature of two Grade-A candidates identified by our model, further validating our methods.
Impact of sensor data pre-processing strategies and selection of machine learning algorithm on the prediction of metritis events in dairy cattle

https://doi.org/10.1016/j.prevetmed.2023.105903

Vidal, Gema; Sharpnack, James; Pinedo, Pablo; Tsai, I Ching; Lee, Amanda Renee; Martínez-López, Beatriz (June 2023, Preventive Veterinary Medicine)

Full Text Available
RLSbench: Domain Adaptation Under Relaxed Label Shift

Garg, Saurabh; Erickson, Nick; Sharpnack, James; Smola, Alex; Balakrishnan, Sivaraman; Lipton, Zachary (January 2023, International Conference on Machine Learning)

Despite the emergence of principled methods for domain adaptation under label shift, their sensitivity to shifts in class-conditional distributions is precariously underexplored. Meanwhile, popular deep domain adaptation heuristics tend to falter when faced with label proportion shifts. While several papers modify these heuristics in attempts to handle label proportion shifts, inconsistencies in evaluation standards, datasets, and baselines make it difficult to gauge the current best practices. In this paper, we introduce RLSbench, a large-scale benchmark for relaxed label shift, consisting of >500 distribution shift pairs spanning vision, tabular, and language modalities, with varying label proportions. Unlike existing benchmarks, which primarily focus on shifts in class-conditional p(x|y), our benchmark also focuses on label marginal shifts. First, we assess 13 popular domain adaptation methods, demonstrating more widespread failures under label proportion shifts than were previously known. Next, we develop an effective two-step meta-algorithm that is compatible with most domain adaptation heuristics: (i) pseudo-balance the data at each epoch; and (ii) adjust the final classifier with an estimate of the target label distribution. The meta-algorithm improves existing domain adaptation heuristics under large label proportion shifts, often by 2-10% accuracy points, while conferring minimal effect (<0.5%) when label proportions do not shift. We hope that these findings and the availability of RLSbench will encourage researchers to rigorously evaluate proposed methods in relaxed label shift settings.
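Step (ii) of the meta-algorithm, adjusting a classifier with a target label distribution estimate, is commonly done by re-weighting predicted class probabilities by the ratio of target to source class priors. The sketch below shows that standard correction only; it is an assumption that this matches the paper's adjustment, the function name `adjust_for_label_shift` is hypothetical, and the target prior is taken as given rather than estimated.

```python
def adjust_for_label_shift(probs, source_prior, target_prior):
    """Re-weight class probabilities from a source-trained classifier.

    probs:        predicted class probabilities for one example
    source_prior: class proportions in the (possibly re-balanced) training data
    target_prior: estimated class proportions in the target domain
    (Hypothetical helper illustrating a standard prior-ratio correction.)
    """
    # Scale each class by how much more (or less) common it is in the target.
    weighted = [p * t / s for p, s, t in zip(probs, source_prior, target_prior)]
    total = sum(weighted)
    # Renormalize so the adjusted scores sum to one.
    return [w / total for w in weighted]
```

When the source and target priors coincide, the weights cancel and the predictions are returned unchanged, which is consistent with the abstract's report of minimal effect when label proportions do not shift.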
Full Text Available
Syndicated Bandits: A Framework for Auto Tuning Hyper-parameters in Contextual Bandit Algorithms

Ding, Qin; Kang, Yue; Liu, Yi-Wei; Lee, Thomas; Hsieh, Cho-Jui; Sharpnack, James (January 2022, Advances in Neural Information Processing Systems)

Full Text Available
Syndicated Bandits: A Framework for Auto Tuning Hyper-parameters in Contextual Bandit Algorithms

Ding, Qin; Kang, Yue; Liu, Yi-Wei; Lee, Thomas C.; Hsieh, Cho-Jui; Sharpnack, James (January 2022, 36th Conference on Neural Information Processing Systems (NeurIPS 2022))

Full Text Available
An efficient algorithm for generalized linear bandit: Online stochastic gradient descent and Thompson sampling

Ding, Qin; Hsieh, Cho-Jui; Sharpnack, James (January 2021, Proceedings of Machine Learning Research)
Banerjee, Arindam; Fukumizu, Kenji (Ed.)
We consider the contextual bandit problem, where a player sequentially makes decisions based on past observations to maximize the cumulative reward. Although many algorithms have been proposed for contextual bandit, most of them rely on finding the maximum likelihood estimator at each iteration, which requires O(t) time at the t-th iteration and is memory inefficient. A natural way to resolve this problem is to apply online stochastic gradient descent (SGD) so that the per-step time and memory complexity can be reduced to constant with respect to t, but a contextual bandit policy based on online SGD updates that balances exploration and exploitation has remained elusive. In this work, we show that online SGD can be applied to the generalized linear bandit problem. The proposed SGD-TS algorithm, which uses a single-step SGD update to exploit past information and uses Thompson Sampling for exploration, achieves Õ(√T) regret with a total time complexity that scales linearly in T and d, where T is the total number of rounds and d is the number of features. Experimental results show that SGD-TS consistently outperforms existing algorithms on both synthetic and real datasets.
Full Text Available
An Efficient Algorithm For Generalized Linear Bandit: Online Stochastic Gradient Descent and Thompson Sampling

Ding, Qin; Hsieh, Cho-Jui; Sharpnack, James (January 2021, International Conference on Artificial Intelligence and Statistics (AISTATS))
Full Text Available
Fused density estimation: theory and methods

https://doi.org/10.1111/rssb.12338

Bassett, Robert; Sharpnack, James (November 2019, Journal of the Royal Statistical Society: Series B (Statistical Methodology))

Full Text Available
Adaptive nonparametric regression with the K-nearest neighbour fused lasso

https://doi.org/10.1093/biomet/asz071

Madrid Padilla, Oscar Hernan; Sharpnack, James; Chen, Yanzhen; Witten, Daniela M (January 2020, Biometrika)

The fused lasso, also known as total-variation denoising, is a locally adaptive function estimator over a regular grid of design points. In this article, we extend the fused lasso to settings in which the points do not occur on a regular grid, leading to a method for nonparametric regression. This approach, which we call the K-nearest-neighbours fused lasso, involves computing the K-nearest-neighbours graph of the design points and then performing the fused lasso over this graph. We show that this procedure has a number of theoretical advantages over competing methods: specifically, it inherits local adaptivity from its connection to the fused lasso, and it inherits manifold adaptivity from its connection to the K-nearest-neighbours approach. In a simulation study and an application to flu data, we show that excellent results are obtained. For completeness, we also study an estimator that makes use of an ε-graph rather than a K-nearest-neighbours graph and contrast it with the K-nearest-neighbours fused lasso.
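The two stages described in the abstract, building a K-nearest-neighbours graph and then penalizing differences across its edges, can be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the rule that an edge exists if either point is among the other's K nearest, the 1/2 scaling of the squared-error term, and index-based tie-breaking are choices made for this example and are not taken from the article. The code only evaluates the objective; it does not include a solver.

```python
def knn_edges(points, k):
    """Undirected edges of a K-nearest-neighbours graph (Euclidean distance).

    An edge (i, j) is included if j is among the k nearest points to i
    or i is among the k nearest points to j (an assumed convention).
    """
    edges = set()
    for i, p in enumerate(points):
        # Squared distances to every other point; ties are broken by index.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points)
            if j != i
        )
        for _, j in dists[:k]:
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)

def fused_lasso_objective(theta, y, edges, lam):
    """Squared-error fit plus a total-variation penalty over the graph edges."""
    fit = 0.5 * sum((yi - ti) ** 2 for yi, ti in zip(y, theta))
    penalty = sum(abs(theta[i] - theta[j]) for i, j in edges)
    return fit + lam * penalty
```

A fitted estimate would minimize `fused_lasso_objective` over `theta`; a larger `lam` pushes neighbouring points toward equal fitted values, which yields piecewise-constant estimates over the graph.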
Full Text Available
