Title: Functional data analysis using deep neural networks
Abstract: Functional data analysis is an evolving field focused on analyzing data that reveals insights into curves, surfaces, or entities within a continuous domain. This type of data is typically distinguished by the inherent dependence and smoothness observed within each data curve. Traditional functional data analysis approaches have predominantly relied on linear models, which, while foundational, often fall short in capturing the intricate, nonlinear relationships within the data. This paper seeks to bridge this gap by reviewing the integration of deep neural networks into functional data analysis. Deep neural networks present a transformative approach to navigating these complexities, excelling particularly in high‐dimensional spaces and demonstrating unparalleled flexibility in managing diverse data constructs. This review aims to advance functional data regression, classification, and representation by integrating deep neural networks with functional data analysis, fostering a harmonious and synergistic union between these two fields. The remarkable ability of deep neural networks to adeptly navigate the intricate functional data highlights a wealth of opportunities for ongoing exploration and research across various interdisciplinary areas. This article is categorized under: Data: Types and Structure > Time Series, Stochastic Processes, and Functional Data; Statistical Learning and Exploratory Methods of the Data Sciences > Deep Learning; Statistical Learning and Exploratory Methods of the Data Sciences > Neural Networks
Award ID(s):
2319342
PAR ID:
10656816
Author(s) / Creator(s):
Publisher / Repository:
Wiley WIREs Computational Statistics
Date Published:
Journal Name:
WIREs Computational Statistics
Volume:
16
Issue:
4
ISSN:
1939-5108
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract: Optimal transport (OT) methods seek a transformation map (or plan) between two probability measures, such that the transformation has the minimum transportation cost. Such a minimum transport cost, with a certain power transform, is called the Wasserstein distance. Recently, OT methods have drawn great attention in statistics, machine learning, and computer science, especially in deep generative neural networks. Despite its broad applications, the estimation of high‐dimensional Wasserstein distances is a well‐known challenging problem owing to the curse of dimensionality. There are some cutting‐edge projection‐based techniques that tackle high‐dimensional OT problems. Three major approaches of this kind are introduced: the slicing approach, the iterative projection approach, and the projection robust OT approach. Open challenges are discussed at the end of the review. This article is categorized under: Statistical and Graphical Methods of Data Analysis > Dimension Reduction; Statistical Learning and Exploratory Methods of the Data Sciences > Manifold Learning
  2. Abstract: A fundamental problem in functional data analysis is to classify a functional observation based on training data. The application of functional data classification has gained immense popularity and utility across a wide array of disciplines, encompassing biology, engineering, environmental science, medical science, neurology, social science, and beyond. The phenomenal growth of the application of functional data classification indicates the urgent need for a systematic approach to develop efficient classification methods and scalable algorithmic implementations. Therefore, we here conduct a comprehensive review of classification methods for functional data. The review aims to bridge the gap between the functional data analysis community and the machine learning community, and to inspire new principles for functional data classification. This article is categorized under: Statistical Learning and Exploratory Methods of the Data Sciences > Clustering and Classification; Statistical Models > Classification Models; Data: Types and Structure > Time Series, Stochastic Processes, and Functional Data
  3. Abstract: Fusion learning methods, developed for the purpose of analyzing datasets from many different sources, have become a popular research topic in recent years. Individualized inference approaches through fusion learning extend fusion learning approaches to individualized inference problems over a heterogeneous population, where similar individuals are fused together to enhance the inference over the target individual. Both classical fusion learning and individualized inference approaches through fusion learning are established based on weighted aggregation of individual information, but the weight used in the latter is localized to the target individual. This article provides a review on two individualized inference methods through fusion learning, iFusion and iGroup, that are developed under different asymptotic settings. Both procedures guarantee optimal asymptotic theoretical performance and computational scalability. This article is categorized under: Statistical Learning and Exploratory Methods of the Data Sciences > Manifold Learning; Statistical Learning and Exploratory Methods of the Data Sciences > Modeling Methods; Statistical and Graphical Methods of Data Analysis > Nonparametric Methods; Data: Types and Structure > Massive Data
  4. Abstract: The rapid development of modeling techniques has brought many opportunities for data‐driven discovery and prediction. However, this also leads to the challenge of selecting the most appropriate model for any particular data task. Information criteria, such as the Akaike information criterion (AIC) and Bayesian information criterion (BIC), have been developed as a general class of model selection methods with profound connections with foundational thoughts in statistics and information theory. Many perspectives and theoretical justifications have been developed to understand when and how to use information criteria, which often depend on particular data circumstances. This review article will revisit information criteria by summarizing their key concepts, evaluation metrics, fundamental properties, interconnections, recent advancements, and common misconceptions to enrich the understanding of model selection in general. This article is categorized under: Data: Types and Structure > Traditional Statistical Data; Statistical Learning and Exploratory Methods of the Data Sciences > Modeling Methods; Statistical and Graphical Methods of Data Analysis > Information Theoretic Methods; Statistical Models > Model Selection
  5. Summary: Leaf vein network geometry can predict levels of resource transport, defence and mechanical support that operate at different spatial scales. However, it is challenging to quantify network architecture across scales due to the difficulties both in segmenting networks from images and in extracting multiscale statistics from subsequent network graph representations. Here we developed deep learning algorithms using convolutional neural networks (CNNs) to automatically segment leaf vein networks. Thirty‐eight CNNs were trained on subsets of manually defined ground‐truth regions from >700 leaves representing 50 southeast Asian plant families. Ensembles of six independently trained CNNs were used to segment networks from larger leaf regions (c. 100 mm²). Segmented networks were analysed using hierarchical loop decomposition to extract a range of statistics describing scale transitions in vein and areole geometry. The CNN approach gave a precision‐recall harmonic mean of 94.5% ± 6%, outperforming other current network extraction methods, and accurately described the widths, angles and connectivity of veins. Multiscale statistics then enabled the identification of previously undescribed variation in network architecture across species. We provide a LeafVeinCNN software package to enable multiscale quantification of leaf vein networks, facilitating the comparison across species and the exploration of the functional significance of different leaf vein architectures.