-
Data from high-energy observations are usually obtained as lists of photon events. A common analysis task for such data is to identify whether diffuse emission exists, and to estimate its surface brightness, even in the presence of point sources that may be superposed. We have developed a novel nonparametric event list segmentation algorithm to divide up the field of view into distinct emission components. We use photon location data directly, without binning them into an image. We first construct a graph from the Voronoi tessellation of the observed photon locations and then grow segments using a new adaptation of seeded region growing that we call Seeded Region Growing on Graph, after which the overall method is named SRGonG. Starting with a set of seed locations, this results in an oversegmented data set, which SRGonG then coalesces using a greedy algorithm where adjacent segments are merged to minimize a model comparison statistic; we use the Bayesian Information Criterion. Using SRGonG we are able to identify point-like and diffuse extended sources in the data with equal facility. We validate SRGonG using simulations, demonstrating that it is capable of discerning irregularly shaped low-surface-brightness emission structures as well as point-like sources with strengths comparable to those seen in typical X-ray data. We demonstrate SRGonG's use on the Chandra data of the Antennae galaxies and show that it segments the complex structures appropriately.
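The graph construction and growing stages described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses the Delaunay triangulation (the dual of the Voronoi tessellation, so Delaunay edges connect photons whose Voronoi cells touch) and grows segments outward from the seeds by distance; the greedy BIC merging stage is omitted, and all function names and the toy data are hypothetical.

```python
import heapq
import numpy as np
from scipy.spatial import Delaunay

def delaunay_graph(points):
    """Adjacency lists from the Delaunay triangulation: two photons
    are neighbours iff their Voronoi cells share an edge."""
    tri = Delaunay(points)
    nbrs = [set() for _ in range(len(points))]
    for simplex in tri.simplices:
        for i in simplex:
            for j in simplex:
                if i != j:
                    nbrs[i].add(j)
    return nbrs

def seeded_region_growing(points, seeds):
    """Grow one segment per seed over the graph: unlabelled photons
    are claimed in order of distance to the frontier, so segments
    expand outward from their seeds until every photon is labelled."""
    nbrs = delaunay_graph(points)
    labels = np.full(len(points), -1, dtype=int)
    heap = []
    for lab, s in enumerate(seeds):
        labels[s] = lab
        for j in nbrs[s]:
            heapq.heappush(heap, (np.linalg.norm(points[s] - points[j]), j, lab))
    while heap:
        _, j, lab = heapq.heappop(heap)
        if labels[j] != -1:
            continue
        labels[j] = lab
        for k in nbrs[j]:
            if labels[k] == -1:
                heapq.heappush(heap, (np.linalg.norm(points[j] - points[k]), k, lab))
    return labels

rng = np.random.default_rng(0)
# Two clusters of "photons" plus a sparse uniform background
pts = np.vstack([rng.normal(0, 0.3, (80, 2)),
                 rng.normal(3, 0.3, (80, 2)),
                 rng.uniform(-2, 5, (40, 2))])
labels = seeded_region_growing(pts, seeds=[0, 80])
print(len(set(labels)))  # 2 segments
```

Because the Delaunay graph is connected, every photon ends up in some segment; in the full method the resulting oversegmentation would then be coalesced by greedily merging adjacent segments whenever the merge lowers the BIC.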
-
This article considers the problem of modeling a class of nonstationary time series using piecewise autoregressive (AR) processes in the presence of outliers. The number and locations of the piecewise AR segments, as well as the orders of the respective AR processes, are assumed to be unknown. In addition, each piece may contain an unknown number of innovational and/or additive outliers. The minimum description length (MDL) principle is applied to compare various segmented AR fits to the data. The goal is to find the "best" combination of the number of segments, the lengths of the segments, the orders of the piecewise AR processes, and the number and type of outliers. Such a "best" combination is implicitly defined as the optimizer of an MDL criterion. Since the optimization is carried over a large number of configurations of segments and positions of outliers, a genetic algorithm is used to find optimal or near-optimal solutions. Numerical results from simulation experiments and real data analyses show that the procedure enjoys excellent empirical properties.
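An MDL objective of the kind the genetic algorithm would optimize can be sketched as below. This is a simplified illustration under stated assumptions, not the paper's criterion: outlier terms are omitted, the AR fit is plain least squares, and the penalty follows the generic code-length recipe of roughly half a log of the segment length per estimated parameter. The function names and simulated data are hypothetical.

```python
import numpy as np

def fit_ar(x, p):
    """Least-squares AR(p) fit; returns the residual variance."""
    if p == 0:
        return np.var(x)
    X = np.column_stack([x[p - k - 1:len(x) - k - 1] for k in range(p)])
    y = x[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.mean((y - X @ coef) ** 2)

def mdl(x, breaks, orders):
    """MDL-style score for a piecewise AR fit with interior
    breakpoints `breaks` and per-segment AR orders `orders`:
    goodness of fit plus a code-length penalty per parameter."""
    segs = np.split(x, breaks)
    score = np.log(len(segs)) if len(segs) > 1 else 0.0
    for seg, p in zip(segs, orders):
        n = len(seg)
        s2 = fit_ar(seg, p)
        score += (np.log(max(p, 1)) + (p + 2) / 2 * np.log(n)
                  + n / 2 * np.log(2 * np.pi * s2))
    return score

# A series whose AR(1) coefficient flips sign halfway through
rng = np.random.default_rng(1)
x = np.empty(400)
x[0] = rng.normal()
for t in range(1, 400):
    phi = 0.9 if t < 200 else -0.9
    x[t] = phi * x[t - 1] + rng.normal()
print(mdl(x, [200], [1, 1]) < mdl(x, [], [1]))  # True: the break is favoured
```

The genetic algorithm's role is to search over the combinatorially large space of (breakpoints, orders, outlier positions) configurations with a score like this as the fitness function.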
-
Yanwu, Xu (Ed.)
Lung cancer is a major cause of cancer-related deaths, and early diagnosis and treatment are crucial for improving patients' survival outcomes. In this paper, we propose to employ convolutional neural networks to model the non-linear relationship between the risk of lung cancer and the lungs' morphology revealed in the CT images. We apply a mini-batched loss that extends the Cox proportional hazards model to handle the non-convexity induced by neural networks, which also enables the training of large data sets. Additionally, we propose to combine mini-batched loss and binary cross-entropy to predict both lung cancer occurrence and the risk of mortality. Simulation results demonstrate the effectiveness of both the mini-batched loss with and without the censoring mechanism, as well as its combination with binary cross-entropy. We evaluate our approach on the National Lung Screening Trial data set with several 3D convolutional neural network architectures, achieving high AUC and C-index scores for lung cancer classification and survival prediction. These results, obtained from simulations and real data experiments, highlight the potential of our approach for improving the diagnosis and treatment of lung cancer.
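The combined objective described above can be sketched as a mini-batch Cox partial likelihood plus binary cross-entropy. This is a minimal NumPy illustration of the idea rather than the paper's training code: the function names, the mixing weight `alpha`, and the random batch are all hypothetical, and a numerically stable log-sum-exp should be used in practice.

```python
import numpy as np

def cox_batch_loss(risk, time, event):
    """Negative Cox partial log-likelihood on one mini-batch: each
    observed event contributes its risk score minus the log-sum of
    exponentiated scores over batch members still at risk."""
    order = np.argsort(-time)            # decreasing survival time
    r, e = risk[order], event[order]
    log_risk_set = np.log(np.cumsum(np.exp(r)))  # risk set = all with time >= t_i
    return -np.sum(e * (r - log_risk_set)) / max(e.sum(), 1)

def combined_loss(risk, prob, time, event, label, alpha=0.5):
    """Mini-batched Cox loss on mortality risk plus binary
    cross-entropy on cancer occurrence, mixed by `alpha`."""
    eps = 1e-7
    bce = -np.mean(label * np.log(prob + eps)
                   + (1 - label) * np.log(1 - prob + eps))
    return alpha * cox_batch_loss(risk, time, event) + (1 - alpha) * bce

rng = np.random.default_rng(4)
risk = rng.normal(size=32)             # network risk scores for one batch
prob = rng.uniform(0.05, 0.95, 32)     # predicted occurrence probabilities
time = rng.exponential(5.0, 32)        # follow-up times
event = rng.integers(0, 2, 32)         # 1 = death observed, 0 = censored
label = rng.integers(0, 2, 32)         # cancer occurrence labels
loss = combined_loss(risk, prob, time, event, label)
print(np.isfinite(loss))  # True
```

Evaluating the partial likelihood per mini-batch, rather than over the full cohort, is what makes the Cox objective compatible with stochastic training of large 3D networks.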
-
We present a new method to estimate the boundary of extended sources in high-energy photon lists and to quantify the uncertainty in the boundary. This method extends the graphed seeded region growing method developed by M. Fan et al. Here, we describe how an unambiguous boundary of a centrally concentrated astronomical source may be defined by first spatially segmenting the photon list, then forcibly merging the segments until only two segments—an extended source and its background—remain, and finally constructing a boundary as the connected outer edges of the Voronoi tessellation of the photons included in the source segment. The resulting boundary is then modeled using Fourier descriptors to generate a smooth curve, and this curve is bootstrapped to generate uncertainties. We apply the method to photon event lists obtained during the observations of galaxies NGC 2300 and Arp 299. We demonstrate how the derived extent and enclosed flux of NGC 2300 obtained with Chandra and XMM-Newton are comparable. We also show how complex internal structure, as in the case of Arp 299, may be subsumed to construct a compact boundary of the object.
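The Fourier-descriptor smoothing step can be illustrated as follows: treat the closed boundary as a complex contour, keep only the lowest harmonics, and invert. This is a generic sketch of the technique, not the paper's code; the harmonic count and the noisy-circle test case are hypothetical, and the bootstrap over resampled photon lists is omitted.

```python
import numpy as np

def fourier_smooth(boundary, n_harmonics=8):
    """Smooth a closed boundary (N x 2 array of vertices) by keeping
    the lowest Fourier harmonics of the complex contour z = x + iy;
    higher harmonics carry the vertex-level segmentation jitter."""
    z = boundary[:, 0] + 1j * boundary[:, 1]
    Z = np.fft.fft(z)
    keep = np.zeros_like(Z)
    keep[:n_harmonics + 1] = Z[:n_harmonics + 1]   # DC + positive harmonics
    keep[-n_harmonics:] = Z[-n_harmonics:]         # negative harmonics
    smooth = np.fft.ifft(keep)
    return np.column_stack([smooth.real, smooth.imag])

# Noisy circle: low-order descriptors recover the round shape
theta = np.linspace(0, 2 * np.pi, 200, endpoint=False)
rng = np.random.default_rng(2)
r = 1 + 0.05 * rng.normal(size=200)
noisy = np.column_stack([r * np.cos(theta), r * np.sin(theta)])
sm = fourier_smooth(noisy, n_harmonics=4)
radii = np.hypot(sm[:, 0], sm[:, 1])
print(radii.std() < r.std())  # True: smoothing reduces radial scatter
```

In the full method, uncertainties would come from bootstrapping the photon list, re-deriving the boundary each time, and summarizing the spread of the smoothed curves.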
-
Discrimination-aware classification methods remedy socioeconomic disparities exacerbated by machine learning systems. In this paper, we propose a novel data pre-processing technique that assigns weights to training instances in order to reduce discrimination without changing any of the inputs or labels. While the existing reweighing approach only looks into sensitive attributes, we refine the weights by utilizing both sensitive and insensitive ones. We formulate our weight assignment as a linear programming problem. The weights can be directly used in any classification model into which they are incorporated. We demonstrate three advantages of our approach on synthetic and benchmark datasets. First, discrimination reduction comes at a small cost in accuracy. Second, our method is more scalable than most other pre-processing methods. Third, the trade-off between fairness and accuracy can be explicitly monitored by model users. Code is available at https://github.com/frnliang/refined_reweighing.
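For orientation, the baseline reweighing approach that the paper refines can be sketched as below: weight each instance by P(A=a)P(Y=y)/P(A=a,Y=y) so the sensitive attribute and the label look independent under the weighted data. This illustrates only the sensitive-attribute-only baseline, not the paper's linear-programming refinement over insensitive attributes; the function name and toy data are hypothetical.

```python
import numpy as np

def reweigh(sensitive, label):
    """Baseline reweighing: weight each instance by
    P(A=a) * P(Y=y) / P(A=a, Y=y), estimated from the data, so that
    A and Y are independent under the weighted empirical distribution."""
    a, y = np.asarray(sensitive), np.asarray(label)
    w = np.empty(len(a))
    for av in np.unique(a):
        for yv in np.unique(y):
            mask = (a == av) & (y == yv)
            if mask.any():
                w[mask] = np.mean(a == av) * np.mean(y == yv) / np.mean(mask)
    return w

a = np.array([0, 0, 0, 1, 1, 1, 1, 1])   # sensitive attribute
y = np.array([1, 0, 0, 1, 1, 1, 0, 1])   # label
w = reweigh(a, y)
# After weighting, the favourable-outcome rate is equal across groups
rate0 = np.sum(w[(a == 0) & (y == 1)]) / np.sum(w[a == 0])
rate1 = np.sum(w[(a == 1) & (y == 1)]) / np.sum(w[a == 1])
print(np.isclose(rate0, rate1))  # True
```

The paper's contribution replaces this closed-form assignment with weights chosen by a linear program that also exploits insensitive attributes, which is what lets users tune the fairness-accuracy trade-off explicitly.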
-
In this paper, we propose a new framework to detect adversarial examples motivated by the observations that random components can improve the smoothness of predictors and make it easier to simulate the output distribution of a deep neural network. With these observations, we propose a novel Bayesian adversarial example detector, short for BATER, to improve the performance of adversarial example detection. Specifically, we study the distributional difference of hidden layer output between natural and adversarial examples, and propose to use the randomness of the Bayesian neural network to simulate hidden layer output distribution and leverage the distribution dispersion to detect adversarial examples. The advantage of a Bayesian neural network is that the output is stochastic while a deep neural network without random components does not have such characteristics. Empirical results on several benchmark datasets against popular attacks show that the proposed BATER outperforms the state-of-the-art detectors in adversarial example detection.
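The dispersion statistic at the heart of this scheme can be sketched as follows: run the stochastic network many times on the same input and measure how much the hidden-layer outputs disagree. This is a toy illustration, not BATER itself: a dropout mask stands in for Bayesian weight sampling, and the network, threshold, and function names are all hypothetical.

```python
import numpy as np

def hidden_dispersion(forward, x, n_samples=50):
    """Dispersion of a stochastic network's hidden-layer output:
    repeat the (random) forward pass and return the mean per-unit
    standard deviation across passes."""
    outs = np.stack([forward(x) for _ in range(n_samples)])
    return outs.std(axis=0).mean()

def detect(forward, x, threshold):
    """Flag x as adversarial when its dispersion exceeds a threshold
    calibrated on natural examples (e.g. a high quantile of their
    dispersion scores)."""
    return hidden_dispersion(forward, x) > threshold

rng = np.random.default_rng(3)
W = rng.normal(size=(16, 8))          # toy hidden layer

def stochastic_forward(x, p=0.5):
    h = np.maximum(W @ x, 0.0)
    mask = rng.random(16) < p          # dropout as a stand-in for BNN sampling
    return h * mask / p

x = rng.normal(size=8)
d = hidden_dispersion(stochastic_forward, x)
print(d > 0)  # True: stochastic passes disagree; a deterministic net gives 0
```

A deterministic network would yield zero dispersion for every input, which is why the Bayesian (or otherwise stochastic) components are essential to the detector.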