Rationale: Nitrogen isotopic compositions (δ15N) of source and trophic amino acids (AAs) are crucial tracers of N sources and trophic enrichment in diverse fields, including archeology, astrobiochemistry, ecology, oceanography, and the paleo-sciences. The current analytical technique, gas chromatography/combustion/isotope ratio mass spectrometry (GC/C/IRMS), requires derivatization, which is not compatible with some key AAs. Another approach, high-performance liquid chromatography/elemental analyzer/IRMS (HPLC/EA/IRMS), can suffer from coelution with other compounds in certain sample types, and the highly sensitive nano-EA/IRMS instrumentation is not widely available.
Methods: We present a method for high-precision δ15N measurement of AAs (δ15N-AA), optimized for the canonical source AA phenylalanine (Phe) and trophic AA glutamic acid (Glu). This offline approach entails purification and separation via high-pressure ion-exchange chromatography (IC) with automated fraction collection, sequential chemical conversion of the AA to nitrite and then to nitrous oxide (N2O), and final determination of the δ15N of the produced N2O via purge-and-trap continuous-flow isotope ratio mass spectrometry (PT/CF/IRMS).
Results: Cross-plots of the δ15N of Glu and Phe standards (four natural-abundance levels) measured by this method against their accepted values have a linear regression slope of 1 and small intercepts, demonstrating high accuracy. Precision was 0.36‰–0.67‰ for Phe standards and 0.27‰–0.35‰ for Glu standards. Our method and the GC/C/IRMS approach produced equivalent δ15N values, within error, for two lab standards (McCarthy Lab AA mixture and cyanobacteria). We further tested the method on a wide range of natural sample matrices and obtained reasonable results.
Conclusions: Our method provides a reliable alternative to current δ15N-AA methods, as IC- or HPLC-based systems that can collect underivatized AAs are widely available. The chemical conversion of AA to N2O can be readily implemented in laboratories that already analyze the δ15N of N2O by PT/CF/IRMS. This method should help promote the use of δ15N-AA in studies of N cycling and trophic ecology across a wide range of research areas.
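For reference, the δ15N notation used throughout, and the standard Glu–Phe trophic position equation that motivates measuring this amino acid pair, are sketched below. The β and TEF values in the second equation are commonly cited literature constants for aquatic food webs, not values taken from this abstract.

```latex
% delta notation: per-mil deviation of the 15N/14N ratio from atmospheric N2 (AIR)
\delta^{15}\mathrm{N} =
  \left(\frac{\left(^{15}\mathrm{N}/^{14}\mathrm{N}\right)_{\mathrm{sample}}}
             {\left(^{15}\mathrm{N}/^{14}\mathrm{N}\right)_{\mathrm{AIR}}} - 1\right)
  \times 1000\ \text{(in \textperthousand)}

% trophic position from a trophic (Glu) and a source (Phe) amino acid;
% beta ~ 3.4 permil and TEF ~ 7.6 permil are commonly used aquatic values
\mathrm{TP} =
  \frac{\delta^{15}\mathrm{N}_{\mathrm{Glu}} - \delta^{15}\mathrm{N}_{\mathrm{Phe}} - \beta}
       {\mathrm{TEF}} + 1
```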
                    
                            
                            Probabilistic methods for approximate archetypal analysis
                        
                    
    
Abstract: Archetypal analysis (AA) is an unsupervised learning method for exploratory data analysis. One major challenge that limits the applicability of AA in practice is the inherent computational complexity of the existing algorithms. In this paper, we provide a novel approximation approach that partially addresses this issue. Utilizing probabilistic ideas from high-dimensional geometry, we introduce two preprocessing techniques to reduce the dimension and the representation cardinality of the data, respectively. We prove that, provided the data are approximately embedded in a low-dimensional linear subspace and the convex hull of the corresponding representations is well approximated by a polytope with few vertices, our method can effectively reduce the scaling of AA. Moreover, the solution of the reduced problem is near-optimal in terms of prediction error. Our approach can be combined with other acceleration techniques to further mitigate the intrinsic complexity of AA. We demonstrate the usefulness of our results by applying our method to summarize several moderately large-scale datasets.
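As a concrete illustration of the two preprocessing steps, here is a minimal Python sketch: a Gaussian random projection to reduce dimension, and a greedy directional-support heuristic to keep only approximate convex-hull vertices. This is not the paper's exact algorithm; the function names and the heuristic are our own choices for illustration.

```python
import numpy as np

def reduce_dimension(X, target_dim, rng):
    """Johnson-Lindenstrauss-style Gaussian random projection.

    X: (n_samples, n_features). Returns (n_samples, target_dim)."""
    G = rng.standard_normal((X.shape[1], target_dim)) / np.sqrt(target_dim)
    return X @ G

def select_hull_candidates(X, n_directions, rng):
    """Greedy heuristic: for each random direction u, the point maximizing
    <x, u> is an extreme point (vertex) of the sample's convex hull."""
    U = rng.standard_normal((n_directions, X.shape[1]))
    return np.unique(np.argmax(X @ U.T, axis=0))

rng = np.random.default_rng(0)
X = rng.standard_normal((5000, 200))              # toy data
X_low = reduce_dimension(X, target_dim=20, rng=rng)
cand = select_hull_candidates(X_low, n_directions=500, rng=rng)
print(X_low.shape, cand.size)
# AA can now be run on X_low[cand] instead of X; the abstract's claim is
# that the reduced solution stays near-optimal in prediction error.
```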
        
    
    
- PAR ID: 10367051
- Publisher / Repository: Oxford University Press
- Date Published:
- Journal Name: Information and Inference: A Journal of the IMA
- Volume: 12
- Issue: 1
- ISSN: 2049-8772
- Page Range / eLocation ID: p. 466-493
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- 
Abstract: The arrival-time prediction of coronal mass ejections (CMEs) is an area of active research. Many methods of varying complexity have been developed to predict CME arrival, yet the mean absolute error (MAE) of predictions remains above 12 hr. In this work we develop a new method for CME arrival-time prediction that uses magnetohydrodynamic simulations of data-constrained flux-rope-based CMEs introduced into a data-driven solar wind background. We found that for the six CMEs studied in this work the MAE in arrival time was ∼8 hr. We further improved the arrival-time predictions by using ensemble modeling and comparing the ensemble solutions with STEREO-A and STEREO-B heliospheric imager (HI) data, using our simulations to create synthetic J-maps. A machine-learning (ML) method, lasso regression, was used for this comparison (see the sketch after this list). Using this approach, we could reduce the MAE to ∼4 hr. Another ML method, based on neural networks (NNs), made it possible to reduce the MAE to ∼5 hr when HI data from both STEREO-A and STEREO-B were available; NNs provide a similar MAE when only STEREO-A data are used. Our methods also yielded encouraging standard deviations (precision) of the arrival time. The methods discussed in this paper demonstrate significant improvements in CME arrival-time prediction, and our work highlights the importance of combining ML techniques with data-constrained magnetohydrodynamic modeling to improve space weather predictions.
- 
We present a novel method for identifying topological features of chromatin domains in live cells using single-particle tracking and topological data analysis (TDA). By applying TDA to particle trajectories, we can effectively detect complex spatial patterns, such as loops, that are often missed by traditional time-series analysis. Using simulations of polymer bead–spring chains, we have validated the accuracy of our method and determined its limitations for detecting loops. Our approach offers a promising avenue for exploring the topological complexity of chromatin in living cells using TDA techniques (see the persistence sketch after this list).
- 
Abstract: End-member mixing analysis (EMMA) is widely used to analyze geoscience data for end-members and mixing proportions. Many traditional EMMA methods depend on known end-members, which are sometimes uncertain or unknown. Unsupervised EMMA methods infer end-members from the data, but many existing ones do not strictly enforce the necessary constraints and lack full mathematical interpretability. Here, we introduce a novel unsupervised machine learning method, simplex projected gradient descent-archetypal analysis (SPGD-AA), which uses the ML model archetypal analysis to infer end-members intuitively and interpretably without prior knowledge. SPGD-AA takes extreme corners of the data as end-members, or "archetypes," and represents the data as mixtures of these end-members (see the solver sketch after this list). The method is best suited to linear (conservative) mixing problems in which samples with characteristics similar to the end-members are present in the data. Validation on synthetic and real data sets, including river chemistry, deep-sea sediment elemental composition, and hyperspectral imaging, shows that SPGD-AA effectively recovers end-members consistent with domain expertise and outperforms conventional approaches. SPGD-AA is applicable to a wide range of geoscience data sets and beyond.
- 
Summary: Many high-dimensional classification techniques based on sparse linear discriminant analysis have been proposed in the literature. To use them efficiently, sparsity of the linear classifier is a prerequisite, but such sparsity may not be readily available in applications, and rotations of the data may be required to create it. We propose a family of rotations to create the required sparsity. The basic idea is to rotate the data using the principal components of the sample covariance matrix of the pooled samples (and its variants), and then apply an existing high-dimensional classifier (see the rotate-and-solve sketch after this list). This rotate-and-solve procedure can be combined with any existing classifier and is robust to the level of sparsity of the true model. We show that these rotations do create the sparsity needed for high-dimensional classification, and we provide theoretical understanding of why such rotations work empirically. The effectiveness of the proposed method is demonstrated on several simulated and real data examples, and its improvements over some popular high-dimensional classification rules are clearly shown.
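Below is a minimal sketch of the lasso comparison step from the CME arrival-time abstract above. The feature construction (each column a hypothetical ensemble member's synthetic J-map elongation profile, fit against an observed HI track) is our assumption for illustration; only the use of lasso regression to weight ensemble members is taken from the abstract.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n_times, n_members = 300, 24

# Hypothetical stand-ins: columns are synthetic J-map elongation profiles
# from individual ensemble members; `observed` is the HI elongation track.
ensemble = np.cumsum(rng.random((n_times, n_members)), axis=0)
truth = ensemble[:, :5] @ np.array([0.4, 0.3, 0.15, 0.1, 0.05])
observed = truth + rng.normal(0, 0.5, n_times)

# Lasso drives most member weights to zero, keeping the few members whose
# synthetic tracks best explain the observed track.
model = Lasso(alpha=0.1, positive=True, fit_intercept=False)
model.fit(ensemble, observed)
weights = model.coef_ / model.coef_.sum()  # normalized member weights
print("members kept:", np.flatnonzero(weights > 0))
# A weighted mean of the kept members' arrival times would give the prediction.
```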
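Next, a minimal sketch of the TDA idea from the chromatin abstract: persistent homology applied to a trajectory point cloud reports loops as long-lived H1 features. The `ripser` package and the synthetic noisy circular trajectory are our choices for illustration, not the authors' pipeline.

```python
import numpy as np
from ripser import ripser  # pip install ripser

rng = np.random.default_rng(2)

# Synthetic single-particle trajectory that traces a noisy loop.
t = np.linspace(0, 2 * np.pi, 200)
trajectory = np.column_stack([np.cos(t), np.sin(t)]) + rng.normal(0, 0.05, (200, 2))

# Persistent homology up to dimension 1; H1 features correspond to loops.
h1 = ripser(trajectory, maxdim=1)["dgms"][1]
persistence = h1[:, 1] - h1[:, 0]
print("most persistent H1 feature lives for", persistence.max())
# A genuine loop shows up as one H1 point with much larger persistence
# than the noise-induced features.
```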
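For the SPGD-AA abstract, here is a generic simplex-projected gradient descent solver for archetypal analysis, applied to synthetic three-end-member mixing data. The alternating updates, step sizes, and initialization are our own simple choices consistent with the abstract's general description, not the published algorithm's exact details.

```python
import numpy as np

def project_rows_to_simplex(V):
    """Euclidean projection of each row of V onto the probability simplex."""
    U = -np.sort(-V, axis=1)                       # rows sorted descending
    css = np.cumsum(U, axis=1)
    k = np.arange(1, V.shape[1] + 1)
    cond = U * k > css - 1
    rho = cond.cumsum(axis=1).argmax(axis=1)       # last True per row
    theta = (css[np.arange(len(V)), rho] - 1) / (rho + 1)
    return np.maximum(V - theta[:, None], 0)

def spgd_aa(X, n_archetypes, n_iter=500, seed=0):
    """Fit X ~ A @ Z with archetypes Z = B @ X; rows of A and B on the simplex."""
    rng = np.random.default_rng(seed)
    A = project_rows_to_simplex(rng.random((X.shape[0], n_archetypes)))
    B = project_rows_to_simplex(rng.random((n_archetypes, X.shape[0])))
    for _ in range(n_iter):
        Z = B @ X
        R = X - A @ Z                              # residual for the A step
        A = project_rows_to_simplex(A + R @ Z.T / (np.linalg.norm(Z, 2) ** 2 + 1e-12))
        R = X - A @ (B @ X)                        # residual for the B step
        lip = (np.linalg.norm(A, 2) * np.linalg.norm(X, 2)) ** 2 + 1e-12
        B = project_rows_to_simplex(B + A.T @ R @ X.T / lip)
    return A, B @ X

# Three synthetic end-members; data are noisy convex mixtures of them.
rng = np.random.default_rng(3)
E = np.array([[10.0, 0.0, 1.0], [0.0, 8.0, 2.0], [1.0, 1.0, 9.0]])
W = rng.dirichlet(np.full(3, 0.3), size=400)       # many samples near corners
X = W @ E + rng.normal(0, 0.05, (400, 3))
A, Z = spgd_aa(X, n_archetypes=3)
print(np.round(Z, 1))  # should approximate the rows of E, up to permutation
```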
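Finally, a sketch of the rotate-and-solve idea from the last abstract: rotate the data with the principal components of the pooled-sample covariance, then fit a sparse linear classifier in the rotated basis. We use l1-penalized logistic regression as a stand-in for the sparse discriminant rule; the paper's classifier and rotation variants may differ.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(4)

# Two-class toy data whose class-mean difference is dense in the original
# coordinates, so a sparse linear classifier is a poor fit as-is.
n, p = 200, 50
X = rng.standard_normal((n, p))
y = rng.integers(0, 2, size=n)
X[y == 1] += 0.8  # dense mean shift across all coordinates

# Principal components of the covariance of the pooled sample (classes not
# separated): the dense mean-shift direction dominates the top eigenvector.
pooled_cov = np.cov(X, rowvar=False)
_, rotation = np.linalg.eigh(pooled_cov)   # columns ordered by eigenvalue
X_rot = X @ rotation                       # rotate first ...

# ... then solve with a sparse classifier; the discriminant direction is now
# concentrated on a few rotated coordinates, which suits the l1 penalty.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.5)
clf.fit(X_rot, y)
print("nonzero coefficients after rotation:", np.count_nonzero(clf.coef_))
```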