Rationale: Nitrogen isotopic compositions (δ15N) of source and trophic amino acids (AAs) are crucial tracers of N sources and trophic enrichments in diverse fields, including archeology, astrobiochemistry, ecology, oceanography, and paleo‐sciences. The current analytical technique using gas chromatography‐combustion‐isotope ratio mass spectrometry (GC/C/IRMS) requires derivatization, which is not compatible with some key AAs. Another approach using high‐performance liquid chromatography‐elemental analyzer‐IRMS (HPLC/EA/IRMS) may experience coelution issues with other compounds in certain types of samples, and the highly sensitive nano‐EA/IRMS instrumentations are not widely available.
Methods: We present a method for high‐precision δ15N measurements of AAs (δ15N‐AA) optimized for canonical source AA‐phenylalanine (Phe) and trophic AA‐glutamic acid (Glu). This offline approach entails purification and separation via high‐pressure ion‐exchange chromatography (IC) with automated fraction collection, the sequential chemical conversion of AA to nitrite and then to nitrous oxide (N2O), and the final determination of δ15N of the produced N2O via purge‐and‐trap continuous‐flow isotope ratio mass spectrometry (PT/CF/IRMS).
Results: The cross‐plots of δ15N of Glu and Phe standards (four different natural‐abundance levels) generated by this method and their accepted values have a linear regression slope of 1 and small intercepts, demonstrating high accuracy. The precisions were 0.36‰–0.67‰ for Phe standards and 0.27‰–0.35‰ for Glu standards. Our method and the GC/C/IRMS approach produced equivalent δ15N values for two lab standards (McCarthy Lab AA mixture and cyanobacteria) within error. We further tested our method on a wide range of natural sample matrices and obtained reasonable results.
Conclusions: Our method provides a reliable alternative to the current methods for δ15N‐AA measurement, as IC or HPLC‐based techniques that can collect underivatized AAs are widely available. Our chemical approach that converts AA to N2O can be easily implemented in laboratories currently analyzing δ15N of N2O using PT/CF/IRMS. This method will help promote the use of δ15N‐AA in important studies of N cycling and trophic ecology in a wide range of research areas.
Probabilistic methods for approximate archetypal analysis
Abstract: Archetypal analysis (AA) is an unsupervised learning method for exploratory data analysis. One major challenge that limits the applicability of AA in practice is the inherent computational complexity of the existing algorithms. In this paper, we provide a novel approximation approach to partially address this issue. Utilizing probabilistic ideas from high-dimensional geometry, we introduce two preprocessing techniques to reduce the dimension and representation cardinality of the data, respectively. We prove that, provided the data are approximately embedded in a low-dimensional linear subspace and the convex hull of the corresponding representations is well approximated by a polytope with a few vertices, our method can effectively reduce the scaling of AA. Moreover, the solution of the reduced problem is near-optimal in terms of prediction errors. Our approach can be combined with other acceleration techniques to further mitigate the intrinsic complexity of AA. We demonstrate the usefulness of our results by applying our method to summarize several moderately large-scale datasets.
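The abstract does not spell out the preprocessing steps. As a rough illustration of the kind of probabilistic dimension reduction it alludes to, the sketch below applies a Gaussian random projection to data lying near a low-dimensional subspace and checks how well pairwise distances survive. The choice of a Gaussian projection is an assumption made here for illustration and is not necessarily the paper's construction; the names `random_project` and `pairwise_dists`, the synthetic data, and all sizes are hypothetical.

```python
import numpy as np


def random_project(X, k, seed=0):
    """Map the rows of X (n x d) to k dimensions with a scaled Gaussian matrix.

    Illustrative assumption: a plain Gaussian random projection, not
    necessarily the preprocessing used in the paper.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    R = rng.standard_normal((d, k)) / np.sqrt(k)  # scaling keeps norms comparable on average
    return X @ R


def pairwise_dists(Z):
    """Euclidean distance matrix between the rows of Z."""
    sq = (Z ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * Z @ Z.T
    return np.sqrt(np.maximum(d2, 0.0))  # clip tiny negatives from round-off


# Hypothetical synthetic data: 300 points near a 5-dimensional subspace of R^200.
rng = np.random.default_rng(1)
basis = rng.standard_normal((5, 200))
X = rng.standard_normal((300, 5)) @ basis + 0.01 * rng.standard_normal((300, 200))

Y = random_project(X, k=50)

# Ratio of projected to original distance for every distinct pair of points.
iu = np.triu_indices(len(X), k=1)
ratio = pairwise_dists(Y)[iu] / pairwise_dists(X)[iu]
print(Y.shape, float(ratio.mean()))
```

Any downstream AA solver would then run on `Y` (50 columns) instead of `X` (200 columns), which is where the cost saving would come from under these assumptions.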
- PAR ID: 10367051
- Publisher / Repository: Oxford University Press
- Date Published:
- Journal Name: Information and Inference: A Journal of the IMA
- Volume: 12
- Issue: 1
- ISSN: 2049-8772
- Page Range / eLocation ID: p. 466-493
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
Abstract: The arrival time prediction of coronal mass ejections (CMEs) is an area of active research. Many methods with varying levels of complexity have been developed to predict CME arrival. However, the mean absolute error (MAE) of predictions remains above 12 hr, even with the increasing complexity of methods. In this work we develop a new method for CME arrival time prediction that uses magnetohydrodynamic simulations involving data-constrained flux-rope-based CMEs, which are introduced in a data-driven solar wind background. We found that for six CMEs studied in this work the MAE in arrival time was ∼8 hr. We further improved our arrival time predictions by using ensemble modeling and comparing the ensemble solutions with STEREO-A and STEREO-B heliospheric imager (HI) data. This was done by using our simulations to create synthetic J-maps. A machine-learning (ML) method, lasso regression, was used for this comparison. Using this approach, we could reduce the MAE to ∼4 hr. Another ML method based on neural networks (NNs) made it possible to reduce the MAE to ∼5 hr for the cases when HI data from both STEREO-A and STEREO-B were available. NNs are capable of providing similar MAE when only the STEREO-A data are used. Our methods also resulted in very encouraging values of standard deviation (precision) of arrival time. The methods discussed in this paper demonstrate significant improvements in the CME arrival time predictions. Our work highlights the importance of using ML techniques in combination with data-constrained magnetohydrodynamic modeling to improve space weather predictions.
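For reference, the MAE quoted above is presumably the standard mean absolute error. The symbols below are introduced here for illustration and do not come from the abstract: $N$ is the number of CME events, and $t_i^{\mathrm{pred}}$ and $t_i^{\mathrm{obs}}$ are the predicted and observed arrival times of event $i$.

```latex
\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| t_i^{\mathrm{pred}} - t_i^{\mathrm{obs}} \right|
```

Under this definition, the ∼8 hr figure for the six CMEs would correspond to $N = 6$.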
We present a novel method for identifying topological features of chromatin domains in live cells using single-particle tracking and topological data analysis (TDA). By applying TDA to particle trajectories, we can effectively detect complex spatial patterns, such as loops, that are often missed by traditional time series analysis. Using simulations of polymer bead–spring chains, we have validated the accuracy of our method and determined its limitations for detecting loops. Our approach offers a promising avenue for exploring the topological complexity of chromatin in living cells using TDA techniques.
Abstract: Causal mediation analysis aims to examine the role of a mediator or a group of mediators that lie in the pathway between an exposure and an outcome. Recent biomedical studies often involve a large number of potential mediators based on high‐throughput technologies. Most of the current analytic methods focus on settings with one or a moderate number of potential mediators. With the expanding growth of ‐omics data, joint analysis of molecular‐level genomics data with epidemiological data through mediation analysis is becoming more common. However, such joint analysis requires methods that can simultaneously accommodate high‐dimensional mediators, and such methods are currently lacking. To address this problem, we develop a Bayesian inference method using continuous shrinkage priors to extend previous causal mediation analysis techniques to a high‐dimensional setting. Simulations demonstrate that our method improves the power of global mediation analysis compared to simpler alternatives and performs reasonably well in identifying true nonnull contributions to the mediation effects of the pathway. The Bayesian method also helps us to understand the structure of the composite null cases for inactive mediators in the pathway. We applied our method to the Multi‐Ethnic Study of Atherosclerosis and identified DNA methylation regions that may actively mediate the effect of socioeconomic status on cardiometabolic outcomes.
Archetypal analysis (AA) is a versatile data analysis method to cluster distinct features within a data set. Here, we demonstrate a framework showing the power of AA to spatio-temporally resolve events in calcium imaging, an imaging modality commonly used in neurobiology and neuroscience to capture neuronal communication patterns. After validation of our AA-based approach on synthetic data sets, we were able to characterize neuronal communication patterns in recorded calcium waves. Clinical relevance: Transient calcium events play an essential role in brain cell communication, growth, and network formation, as well as in neurodegeneration. To reliably interpret calcium events from personalized medicine data, where patterns may differ from patient to patient, appropriate image processing and signal analysis methods need to be developed for optimal network characterization.