-
Data from high-energy observations are usually obtained as lists of photon events. A common analysis task for such data is to identify whether diffuse emission exists, and to estimate its surface brightness, even in the presence of point sources that may be superposed. We have developed a novel nonparametric event list segmentation algorithm to divide up the field of view into distinct emission components. We use photon location data directly, without binning them into an image. We first construct a graph from the Voronoi tessellation of the observed photon locations and then grow segments using a new adaptation of seeded region growing that we call Seeded Region Growing on Graph, after which the overall method is named SRGonG. Starting with a set of seed locations, this results in an oversegmented data set, which SRGonG then coalesces using a greedy algorithm where adjacent segments are merged to minimize a model comparison statistic; we use the Bayesian Information Criterion. Using SRGonG we are able to identify point-like and diffuse extended sources in the data with equal facility. We validate SRGonG using simulations, demonstrating that it is capable of discerning irregularly shaped low-surface-brightness emission structures as well as point-like sources with strengths comparable to those seen in typical X-ray data. We demonstrate SRGonG's use on the Chandra data of the Antennae galaxies and show that it segments the complex structures appropriately.
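The graph construction and growing stages described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses the Delaunay triangulation (the dual of the Voronoi tessellation, so Delaunay edges connect photons whose Voronoi cells touch) and grows segments outward from the seeds by distance; the greedy BIC merging stage is omitted, and all function names and the toy data are hypothetical.

```python
import heapq
import numpy as np
from scipy.spatial import Delaunay

def delaunay_graph(points):
    """Adjacency lists from the Delaunay triangulation: two photons
    are neighbours iff their Voronoi cells share an edge."""
    tri = Delaunay(points)
    nbrs = [set() for _ in range(len(points))]
    for simplex in tri.simplices:
        for i in simplex:
            for j in simplex:
                if i != j:
                    nbrs[i].add(j)
    return nbrs

def seeded_region_growing(points, seeds):
    """Grow one segment per seed over the graph: unlabelled photons
    are claimed in order of distance to the frontier, so segments
    expand outward from their seeds until every photon is labelled."""
    nbrs = delaunay_graph(points)
    labels = np.full(len(points), -1, dtype=int)
    heap = []
    for lab, s in enumerate(seeds):
        labels[s] = lab
        for j in nbrs[s]:
            heapq.heappush(heap, (np.linalg.norm(points[s] - points[j]), j, lab))
    while heap:
        _, j, lab = heapq.heappop(heap)
        if labels[j] != -1:
            continue
        labels[j] = lab
        for k in nbrs[j]:
            if labels[k] == -1:
                heapq.heappush(heap, (np.linalg.norm(points[j] - points[k]), k, lab))
    return labels

rng = np.random.default_rng(0)
# Two clusters of "photons" plus a sparse uniform background
pts = np.vstack([rng.normal(0, 0.3, (80, 2)),
                 rng.normal(3, 0.3, (80, 2)),
                 rng.uniform(-2, 5, (40, 2))])
labels = seeded_region_growing(pts, seeds=[0, 80])
print(len(set(labels)))  # 2 segments
```

Because the Delaunay graph is connected, every photon ends up in some segment; in the full method the resulting oversegmentation would then be coalesced by greedily merging adjacent segments whenever the merge lowers the BIC.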
-
This article considers the problem of modeling a class of nonstationary time series using piecewise autoregressive (AR) processes in the presence of outliers. The number and locations of the piecewise AR segments, as well as the orders of the respective AR processes, are assumed to be unknown. In addition, each piece may contain an unknown number of innovational and/or additive outliers. The minimum description length (MDL) principle is applied to compare various segmented AR fits to the data. The goal is to find the "best" combination of the number of segments, the lengths of the segments, the orders of the piecewise AR processes, and the number and type of outliers. Such a "best" combination is implicitly defined as the optimizer of an MDL criterion. Since the optimization is carried over a large number of configurations of segments and positions of outliers, a genetic algorithm is used to find optimal or near-optimal solutions. Numerical results from simulation experiments and real data analyses show that the procedure enjoys excellent empirical properties.
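An MDL objective of the kind the genetic algorithm would optimize can be sketched as below. This is a simplified illustration under stated assumptions, not the paper's criterion: outlier terms are omitted, the AR fit is plain least squares, and the penalty follows the generic code-length recipe of roughly half a log of the segment length per estimated parameter. The function names and simulated data are hypothetical.

```python
import numpy as np

def fit_ar(x, p):
    """Least-squares AR(p) fit; returns the residual variance."""
    if p == 0:
        return np.var(x)
    X = np.column_stack([x[p - k - 1:len(x) - k - 1] for k in range(p)])
    y = x[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.mean((y - X @ coef) ** 2)

def mdl(x, breaks, orders):
    """MDL-style score for a piecewise AR fit with interior
    breakpoints `breaks` and per-segment AR orders `orders`:
    goodness of fit plus a code-length penalty per parameter."""
    segs = np.split(x, breaks)
    score = np.log(len(segs)) if len(segs) > 1 else 0.0
    for seg, p in zip(segs, orders):
        n = len(seg)
        s2 = fit_ar(seg, p)
        score += (np.log(max(p, 1)) + (p + 2) / 2 * np.log(n)
                  + n / 2 * np.log(2 * np.pi * s2))
    return score

# A series whose AR(1) coefficient flips sign halfway through
rng = np.random.default_rng(1)
x = np.empty(400)
x[0] = rng.normal()
for t in range(1, 400):
    phi = 0.9 if t < 200 else -0.9
    x[t] = phi * x[t - 1] + rng.normal()
print(mdl(x, [200], [1, 1]) < mdl(x, [], [1]))  # True: the break is favoured
```

The genetic algorithm's role is to search over the combinatorially large space of (breakpoints, orders, outlier positions) configurations with a score like this as the fitness function.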
-
Yanwu, Xu (Ed.)
Lung cancer is a major cause of cancer-related deaths, and early diagnosis and treatment are crucial for improving patients' survival outcomes. In this paper, we propose to employ convolutional neural networks to model the non-linear relationship between the risk of lung cancer and the lungs' morphology revealed in the CT images. We apply a mini-batched loss that extends the Cox proportional hazards model to handle the non-convexity induced by neural networks, which also enables the training of large data sets. Additionally, we propose to combine mini-batched loss and binary cross-entropy to predict both lung cancer occurrence and the risk of mortality. Simulation results demonstrate the effectiveness of both the mini-batched loss with and without the censoring mechanism, as well as its combination with binary cross-entropy. We evaluate our approach on the National Lung Screening Trial data set with several 3D convolutional neural network architectures, achieving high AUC and C-index scores for lung cancer classification and survival prediction. These results, obtained from simulations and real data experiments, highlight the potential of our approach for improving the diagnosis and treatment of lung cancer.
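The combined objective described above can be sketched as a mini-batch Cox partial likelihood plus binary cross-entropy. This is a minimal NumPy illustration of the idea rather than the paper's training code: the function names, the mixing weight `alpha`, and the random batch are all hypothetical, and a numerically stable log-sum-exp should be used in practice.

```python
import numpy as np

def cox_batch_loss(risk, time, event):
    """Negative Cox partial log-likelihood on one mini-batch: each
    observed event contributes its risk score minus the log-sum of
    exponentiated scores over batch members still at risk."""
    order = np.argsort(-time)            # decreasing survival time
    r, e = risk[order], event[order]
    log_risk_set = np.log(np.cumsum(np.exp(r)))  # risk set = all with time >= t_i
    return -np.sum(e * (r - log_risk_set)) / max(e.sum(), 1)

def combined_loss(risk, prob, time, event, label, alpha=0.5):
    """Mini-batched Cox loss on mortality risk plus binary
    cross-entropy on cancer occurrence, mixed by `alpha`."""
    eps = 1e-7
    bce = -np.mean(label * np.log(prob + eps)
                   + (1 - label) * np.log(1 - prob + eps))
    return alpha * cox_batch_loss(risk, time, event) + (1 - alpha) * bce

rng = np.random.default_rng(4)
risk = rng.normal(size=32)             # network risk scores for one batch
prob = rng.uniform(0.05, 0.95, 32)     # predicted occurrence probabilities
time = rng.exponential(5.0, 32)        # follow-up times
event = rng.integers(0, 2, 32)         # 1 = death observed, 0 = censored
label = rng.integers(0, 2, 32)         # cancer occurrence labels
loss = combined_loss(risk, prob, time, event, label)
print(np.isfinite(loss))  # True
```

Evaluating the partial likelihood per mini-batch, rather than over the full cohort, is what makes the Cox objective compatible with stochastic training of large 3D networks.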
-
We present a new method to estimate the boundary of extended sources in high-energy photon lists and to quantify the uncertainty in the boundary. This method extends the graphed seeded region growing method developed by M. Fan et al. Here, we describe how an unambiguous boundary of a centrally concentrated astronomical source may be defined by first spatially segmenting the photon list, then forcibly merging the segments until only two segments—an extended source and its background—remain, and finally constructing a boundary as the connected outer edges of the Voronoi tessellation of the photons included in the source segment. The resulting boundary is then modeled using Fourier descriptors to generate a smooth curve, and this curve is bootstrapped to generate uncertainties. We apply the method to photon event lists obtained during the observations of galaxies NGC 2300 and Arp 299. We demonstrate how the derived extent and enclosed flux of NGC 2300 obtained with Chandra and XMM-Newton are comparable. We also show how complex internal structure, as in the case of Arp 299, may be subsumed to construct a compact boundary of the object.
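The Fourier-descriptor smoothing step can be illustrated as follows: treat the closed boundary as a complex contour, keep only the lowest harmonics, and invert. This is a generic sketch of the technique, not the paper's code; the harmonic count and the noisy-circle test case are hypothetical, and the bootstrap over resampled photon lists is omitted.

```python
import numpy as np

def fourier_smooth(boundary, n_harmonics=8):
    """Smooth a closed boundary (N x 2 array of vertices) by keeping
    the lowest Fourier harmonics of the complex contour z = x + iy;
    higher harmonics carry the vertex-level segmentation jitter."""
    z = boundary[:, 0] + 1j * boundary[:, 1]
    Z = np.fft.fft(z)
    keep = np.zeros_like(Z)
    keep[:n_harmonics + 1] = Z[:n_harmonics + 1]   # DC + positive harmonics
    keep[-n_harmonics:] = Z[-n_harmonics:]         # negative harmonics
    smooth = np.fft.ifft(keep)
    return np.column_stack([smooth.real, smooth.imag])

# Noisy circle: low-order descriptors recover the round shape
theta = np.linspace(0, 2 * np.pi, 200, endpoint=False)
rng = np.random.default_rng(2)
r = 1 + 0.05 * rng.normal(size=200)
noisy = np.column_stack([r * np.cos(theta), r * np.sin(theta)])
sm = fourier_smooth(noisy, n_harmonics=4)
radii = np.hypot(sm[:, 0], sm[:, 1])
print(radii.std() < r.std())  # True: smoothing reduces radial scatter
```

In the full method, uncertainties would come from bootstrapping the photon list, re-deriving the boundary each time, and summarizing the spread of the smoothed curves.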
-
Discrimination-aware classification methods remedy socioeconomic disparities exacerbated by machine learning systems. In this paper, we propose a novel data pre-processing technique that assigns weights to training instances in order to reduce discrimination without changing any of the inputs or labels. While the existing reweighing approach only looks into sensitive attributes, we refine the weights by utilizing both sensitive and insensitive ones. We formulate our weight assignment as a linear programming problem. The weights can be directly used in any classification model into which they are incorporated. We demonstrate three advantages of our approach on synthetic and benchmark datasets. First, discrimination reduction comes at a small cost in accuracy. Second, our method is more scalable than most other pre-processing methods. Third, the trade-off between fairness and accuracy can be explicitly monitored by model users. Code is available at https://github.com/frnliang/refined_reweighing.
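For orientation, the baseline reweighing approach that the paper refines can be sketched as below: weight each instance by P(A=a)P(Y=y)/P(A=a,Y=y) so the sensitive attribute and the label look independent under the weighted data. This illustrates only the sensitive-attribute-only baseline, not the paper's linear-programming refinement over insensitive attributes; the function name and toy data are hypothetical.

```python
import numpy as np

def reweigh(sensitive, label):
    """Baseline reweighing: weight each instance by
    P(A=a) * P(Y=y) / P(A=a, Y=y), estimated from the data, so that
    A and Y are independent under the weighted empirical distribution."""
    a, y = np.asarray(sensitive), np.asarray(label)
    w = np.empty(len(a))
    for av in np.unique(a):
        for yv in np.unique(y):
            mask = (a == av) & (y == yv)
            if mask.any():
                w[mask] = np.mean(a == av) * np.mean(y == yv) / np.mean(mask)
    return w

a = np.array([0, 0, 0, 1, 1, 1, 1, 1])   # sensitive attribute
y = np.array([1, 0, 0, 1, 1, 1, 0, 1])   # label
w = reweigh(a, y)
# After weighting, the favourable-outcome rate is equal across groups
rate0 = np.sum(w[(a == 0) & (y == 1)]) / np.sum(w[a == 0])
rate1 = np.sum(w[(a == 1) & (y == 1)]) / np.sum(w[a == 1])
print(np.isclose(rate0, rate1))  # True
```

The paper's contribution replaces this closed-form assignment with weights chosen by a linear program that also exploits insensitive attributes, which is what lets users tune the fairness-accuracy trade-off explicitly.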
-
In this paper, we propose a new framework to detect adversarial examples motivated by the observations that random components can improve the smoothness of predictors and make it easier to simulate the output distribution of a deep neural network. With these observations, we propose a novel Bayesian adversarial example detector, short for BATER, to improve the performance of adversarial example detection. Specifically, we study the distributional difference of hidden layer output between natural and adversarial examples, and propose to use the randomness of the Bayesian neural network to simulate hidden layer output distribution and leverage the distribution dispersion to detect adversarial examples. The advantage of a Bayesian neural network is that the output is stochastic while a deep neural network without random components does not have such characteristics. Empirical results on several benchmark datasets against popular attacks show that the proposed BATER outperforms the state-of-the-art detectors in adversarial example detection.
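The dispersion statistic at the heart of this scheme can be sketched as follows: run the stochastic network many times on the same input and measure how much the hidden-layer outputs disagree. This is a toy illustration, not BATER itself: a dropout mask stands in for Bayesian weight sampling, and the network, threshold, and function names are all hypothetical.

```python
import numpy as np

def hidden_dispersion(forward, x, n_samples=50):
    """Dispersion of a stochastic network's hidden-layer output:
    repeat the (random) forward pass and return the mean per-unit
    standard deviation across passes."""
    outs = np.stack([forward(x) for _ in range(n_samples)])
    return outs.std(axis=0).mean()

def detect(forward, x, threshold):
    """Flag x as adversarial when its dispersion exceeds a threshold
    calibrated on natural examples (e.g. a high quantile of their
    dispersion scores)."""
    return hidden_dispersion(forward, x) > threshold

rng = np.random.default_rng(3)
W = rng.normal(size=(16, 8))          # toy hidden layer

def stochastic_forward(x, p=0.5):
    h = np.maximum(W @ x, 0.0)
    mask = rng.random(16) < p          # dropout as a stand-in for BNN sampling
    return h * mask / p

x = rng.normal(size=8)
d = hidden_dispersion(stochastic_forward, x)
print(d > 0)  # True: stochastic passes disagree; a deterministic net gives 0
```

A deterministic network would yield zero dispersion for every input, which is why the Bayesian (or otherwise stochastic) components are essential to the detector.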