NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Structured and Balanced Multicomponent and Multilayer Neural Networks

https://doi.org/10.1137/24M1675990

Zhang, Shijun; Zhao, Hongkai; Zhong, Yimin; Zhou, Haomin (October 2025, SIAM Journal on Scientific Computing)

In this work, we propose a balanced multicomponent and multilayer neural network (MMNN) structure to accurately and efficiently approximate functions with complex features in terms of both degrees of freedom and computational cost. The main idea is inspired by a multicomponent approach in which each component can be effectively approximated by a single-layer network, combined with a multilayer decomposition strategy to capture the complexity of the target function. Although MMNNs can be viewed as a simple modification of fully connected neural networks (FCNNs) or multilayer perceptrons (MLPs) by introducing balanced multicomponent structures, they achieve a significant reduction in training parameters, a much more efficient training process, and improved accuracy compared to FCNNs or MLPs. Extensive numerical experiments demonstrate the effectiveness of MMNNs in approximating highly oscillatory functions and their ability to automatically adapt to localized features. Our code and implementations are available at GitHub.
more » « less
Free, publicly-accessible full text available October 31, 2026
Why shallow networks struggle to approximate and learn high frequencies

https://doi.org/10.1093/imaiai/iaaf022

Zhang, Shijun; Zhao, Hongkai; Zhong, Yimin; Zhou, Haomin (July 2025, Information and Inference: A Journal of the IMA)

Abstract In this work, we present a comprehensive study combining mathematical and computational analysis to explain why a two-layer neural network struggles to handle high frequencies in both approximation and learning, especially when machine precision, numerical noise and computational cost are significant factors in practice. Specifically, we investigate the following fundamental computational issues: (1) the minimal numerical error achievable under finite precision, (2) the computational cost required to attain a given accuracy and (3) the stability of the method with respect to perturbations. The core of our analysis lies in the conditioning of the representation and its learning dynamics. Explicit answers to these questions are provided, along with supporting numerical evidence.
more » « less
On Enhancing Expressive Power via Compositions of Single Fixed-Size ReLU Network

Zhang, Shijun; Lu, Jianfeng; Zhao, Hongkai (October 2023, Proceedings of the 40th International Conference on Machine Learning)

Full Text Available
Optimal approximation rate of ReLU networks in terms of width and depth

https://doi.org/10.1016/j.matpur.2021.07.009

Shen, Zuowei; Yang, Haizhao; Zhang, Shijun (January 2022, Journal de Mathématiques Pures et Appliquées)

Full Text Available
Neural network approximation: Three hidden layers are enough

https://doi.org/10.1016/j.neunet.2021.04.011

Shen, Zuowei; Yang, Haizhao; Zhang, Shijun (September 2021, Neural Networks)
null (Ed.)
Full Text Available
Deep Network With Approximation Error Being Reciprocal of Width to Power of Square Root of Depth

https://doi.org/10.1162/neco_a_01364

Shen, Zuowei; Yang, Haizhao; Zhang, Shijun (January 2021, Neural Computation)
null (Ed.)
A new network with super-approximation power is introduced. This network is built with Floor (⌊x⌋) or ReLU (max{0,x}) activation function in each neuron; hence, we call such networks Floor-ReLU networks. For any hyperparameters N∈N+ and L∈N+, we show that Floor-ReLU networks with width max{d,5N+13} and depth 64dL+3 can uniformly approximate a Hölder function f on [0,1]d with an approximation error 3λdα/2N-αL, where α∈(0,1] and λ are the Hölder order and constant, respectively. More generally for an arbitrary continuous function f on [0,1]d with a modulus of continuity ωf(·), the constructive approximation rate is ωf(dN-L)+2ωf(d)N-L. As a consequence, this new class of networks overcomes the curse of dimensionality in approximation power when the variation of ωf(r) as r→0 is moderate (e.g., ωf(r)≲rα for Hölder continuous functions), since the major term to be considered in our approximation rate is essentially d times a function of N and L independent of d within the modulus of continuity.
more » « less
Full Text Available
Deep Network Approximation for Smooth Functions

https://doi.org/10.1137/20M134695X

Lu, Jianfeng; Shen, Zuowei; Yang, Haizhao; Zhang, Shijun (January 2021, SIAM Journal on Mathematical Analysis)

Full Text Available
Translational Knowledge Discovery Between Drug Interactions and Pharmacogenetics

https://doi.org/10.1002/cpt.1745

Wu, Heng‐Yi; Shendre, Aditi; Zhang, Shijun; Zhang, Pengyue; Wang, Lei; Zeruesenay, Desta; Rocha, Luis M.; Shatkay, Hagit; Quinney, Sara K.; Ning, Xia; et al (April 2020, Clinical Pharmacology & Therapeutics)

Full Text Available

Search for: All records