This content will become publicly available on January 22, 2026

Title: Expressivity of Neural Networks with Random Weights and Learned Biases
Landmark universal function approximation results for neural networks with trained weights and biases provided the impetus for the ubiquitous use of neural networks as learning models in neuroscience and Artificial Intelligence (AI). Recent work has extended these results to networks in which a smaller subset of weights (e.g., output weights) are tuned, leaving other parameters random. However, it remains an open question whether universal approximation holds when only biases are learned, despite evidence from neuroscience and AI that biases significantly shape neural responses. The current paper answers this question. We provide theoretical and numerical evidence demonstrating that feedforward neural networks with fixed random weights can approximate any continuous function on compact sets. We further show an analogous result for the approximation of dynamical systems with recurrent neural networks. Our findings are relevant to neuroscience, where they demonstrate the potential for behaviourally relevant changes in dynamics without modifying synaptic weights, as well as for AI, where they shed light on recent fine-tuning methods for large language models, like bias and prefix-based approaches.
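The bias-only setting described in the abstract can be illustrated with a toy sketch. This is not the authors' construction or experiment: the one-hidden-layer ReLU architecture, the target function `sin(pi * x)`, the hidden width, the step-halving rule, and every name below are assumptions made for illustration. Input and output weights are drawn once at random and never updated; only the hidden biases move.

```python
import math
import random


def relu(z):
    return z if z > 0.0 else 0.0


def predict(x, w_in, b, w_out):
    # One hidden layer: sum_j w_out[j] * relu(w_in[j] * x + b[j])
    return sum(wo * relu(wi * x + bj) for wi, bj, wo in zip(w_in, b, w_out))


def mse(xs, ys, w_in, b, w_out):
    return sum((predict(x, w_in, b, w_out) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def bias_gradient(xs, ys, w_in, b, w_out):
    # Gradient of the mean squared error with respect to the biases only.
    grad = [0.0] * len(b)
    for x, y in zip(xs, ys):
        err = predict(x, w_in, b, w_out) - y
        for j, (wi, bj, wo) in enumerate(zip(w_in, b, w_out)):
            if wi * x + bj > 0.0:  # ReLU derivative is 1 on the active side
                grad[j] += 2.0 * err * wo / len(xs)
    return grad


def train_biases(xs, ys, w_in, w_out, steps=200, lr=0.1):
    # Gradient descent on the biases; a step is kept only if it lowers the
    # loss, otherwise the step size is halved (an assumed safeguard).
    b = [0.0] * len(w_in)
    losses = [mse(xs, ys, w_in, b, w_out)]
    for _ in range(steps):
        g = bias_gradient(xs, ys, w_in, b, w_out)
        candidate = [bj - lr * gj for bj, gj in zip(b, g)]
        cand_loss = mse(xs, ys, w_in, candidate, w_out)
        if cand_loss < losses[-1]:
            b = candidate
            losses.append(cand_loss)
        else:
            lr *= 0.5
            losses.append(losses[-1])
    return b, losses


random.seed(0)
n_hidden = 50
w_in = [random.gauss(0.0, 1.0) for _ in range(n_hidden)]   # fixed random weights
w_out = [random.gauss(0.0, 1.0) / math.sqrt(n_hidden) for _ in range(n_hidden)]
frozen = (list(w_in), list(w_out))                          # copies for comparison

xs = [i / 20.0 - 1.0 for i in range(41)]                    # grid on [-1, 1]
ys = [math.sin(math.pi * x) for x in xs]

biases, losses = train_biases(xs, ys, w_in, w_out)
print("initial loss:", losses[0], "final loss:", losses[-1])
```

Because a step is accepted only when it reduces the error, the recorded loss never increases. How closely such a small network fits the target depends on the random draw and the width, so the sketch shows the mechanics of learning biases alone rather than the approximation guarantee itself.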
Award ID(s):
2238247
PAR ID:
10631315
Publisher / Repository:
International Conference on Learning Representations
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. To verify safety and robustness of neural networks, researchers have successfully applied abstract interpretation, primarily using the interval abstract domain. In this paper, we study the theoretical power and limits of the interval domain for neural-network verification. First, we introduce the interval universal approximation (IUA) theorem. IUA shows that neural networks not only can approximate any continuous function f (universal approximation), as we have known for decades, but that we can find a neural network, using any well-behaved activation function, whose interval bounds are an arbitrarily close approximation of the set semantics of f (the result of applying f to a set of inputs). We call this notion of approximation interval approximation. Our theorem generalizes the recent result of Baader et al. from ReLUs to a rich class of activation functions that we call squashable functions. Additionally, the IUA theorem implies that we can always construct provably robust neural networks under the ℓ∞-norm using almost any practical activation function. Second, we study the computational complexity of constructing neural networks that are amenable to precise interval analysis. This is a crucial question, as our constructive proof of IUA is exponential in the size of the approximation domain. We boil this question down to the problem of approximating the range of a neural network with squashable activation functions. We show that the range approximation problem (RA) is a Δ₂-intermediate problem, which is strictly harder than NP-complete problems, assuming coNP ⊄ NP. As a result, IUA is an inherently hard problem: no matter what abstract domain or computational tools we consider to achieve interval approximation, there is no efficient construction of such a universal approximator. This implies that it is hard to construct a provably robust network, even if we have a robust network to start with.
  2. In computational neuroscience, recurrent neural networks are widely used to model neural activity and learning. In many studies, fixed points of recurrent neural networks are used to model neural responses to static or slowly changing stimuli, such as visual cortical responses to static visual stimuli. These applications raise the question of how to train the weights in a recurrent neural network to minimize a loss function evaluated on fixed points. In parallel, training fixed points is a central topic in the study of deep equilibrium models in machine learning. A natural approach is to use gradient descent on the Euclidean space of weights. We show that this approach can lead to poor learning performance due in part to singularities that arise in the loss surface. We use a reparameterization of the recurrent network model to derive two alternative learning rules that produce more robust learning dynamics. We demonstrate that these learning rules avoid singularities and learn more effectively than standard gradient descent. The new learning rules can be interpreted as steepest descent and gradient descent, respectively, under a non-Euclidean metric on the space of recurrent weights. Our results question the common, implicit assumption that learning in the brain should be expected to follow the negative Euclidean gradient of synaptic weights. 
  3. One of the reasons why many neural networks are capable of replicating complicated tasks or functions is their universal approximation property. Though the past few decades have seen tremendous advances in theories of neural networks, a single constructive and elementary framework for neural network universality remains unavailable. This paper is an effort to provide a unified and constructive framework for the universality of a large class of activation functions including most of the existing ones. At the heart of the framework is the concept of neural network approximate identity (nAI). The main result is as follows: any nAI activation function is universal in the space of continuous functions on compacta. It turns out that most of the existing activation functions are nAI, and thus universal. The framework offers several advantages over its contemporary counterparts. First, it is constructive with elementary means from functional analysis, probability theory, and numerical analysis. Second, it is one of the first unified and constructive attempts that is valid for most of the existing activation functions. Third, it provides new proofs for most activation functions. Fourth, for a given activation and error tolerance, the framework provides precisely the architecture of the corresponding one-hidden-layer neural network with a predetermined number of neurons and the values of weights/biases. Fifth, the framework allows us to abstractly present the first universal approximation with a favorable non-asymptotic rate. Sixth, our framework also provides insights into the developments of some of the existing approaches, and hence provides constructive derivations of them.
  4. The learning speed of feed-forward neural networks is notoriously slow and has presented a bottleneck in deep learning applications for several decades. For instance, gradient-based learning algorithms, which are used extensively to train neural networks, tend to work slowly when all of the network parameters must be iteratively tuned. To counter this, both researchers and practitioners have tried introducing randomness to reduce the learning requirement. Based on the original construction of Igelnik and Pao, single-layer neural networks with random input-to-hidden layer weights and biases have seen success in practice, but the necessary theoretical justification is lacking. In this study, we begin to fill this theoretical gap. We provide a (corrected) rigorous proof that the Igelnik and Pao construction is a universal approximator for continuous functions on compact domains, with approximation error squared decaying asymptotically like O(1/n) for the number n of network nodes. We then extend this result to the non-asymptotic setting using a concentration inequality for Monte-Carlo integral approximations, proving that one can achieve any desired approximation error with high probability provided n is sufficiently large. We further adapt this randomized neural network architecture to approximate functions on smooth, compact submanifolds of Euclidean space, providing theoretical guarantees in both the asymptotic and non-asymptotic forms. Finally, we illustrate our results on manifolds with numerical experiments.
  5. The classical universal approximation (UA) theorem for neural networks establishes mild conditions under which a feedforward neural network can approximate a continuous function f with arbitrary accuracy. A recent result shows that neural networks also enjoy a more general interval universal approximation (IUA) theorem, in the sense that the abstract interpretation semantics of the network using the interval domain can approximate the direct image map of f (i.e., the result of applying f to a set of inputs) with arbitrary accuracy. These theorems, however, rest on the unrealistic assumption that the neural network computes over infinitely precise real numbers, whereas their software implementations in practice compute over finite-precision floating-point numbers. An open question is whether the IUA theorem still holds in the floating-point setting. This paper introduces the first IUA theorem for floating-point neural networks that proves their remarkable ability to perfectly capture the direct image map of any rounded target function f, showing no limits exist on their expressiveness. Our IUA theorem in the floating-point setting exhibits material differences from the real-valued setting, which reflects the fundamental distinctions between these two computational models. This theorem also implies surprising corollaries, which include (i) the existence of provably robust floating-point neural networks; and (ii) the computational completeness of the class of straight-line programs that use only floating-point additions and multiplications for the class of all floating-point programs that halt.
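The interval abstract domain referred to in items 1 and 5 can be illustrated with a minimal sketch. It is not taken from either paper: the two-neuron ReLU network computing |x| and the helper names `interval_affine` and `interval_relu` are assumptions chosen to show how interval bounds are propagated layer by layer, and why they can be sound yet loose.

```python
def interval_affine(lo, hi, weights, bias):
    # Propagate the box [lo, hi] through y = W x + b, one output row at a time.
    out_lo, out_hi = [], []
    for row, b in zip(weights, bias):
        low = high = b
        for w, a, c in zip(row, lo, hi):
            if w >= 0.0:
                low += w * a
                high += w * c
            else:  # a negative weight swaps which endpoint gives the bound
                low += w * c
                high += w * a
        out_lo.append(low)
        out_hi.append(high)
    return out_lo, out_hi


def interval_relu(lo, hi):
    # ReLU is monotone, so it can be applied to each endpoint separately.
    return [max(0.0, v) for v in lo], [max(0.0, v) for v in hi]


def network(x):
    # Concrete network: relu(x) + relu(-x), which equals |x|.
    return max(0.0, x) + max(0.0, -x)


# Abstractly evaluate the same network on the input interval [-1, 1].
lo, hi = [-1.0], [1.0]
lo, hi = interval_affine(lo, hi, [[1.0], [-1.0]], [0.0, 0.0])
lo, hi = interval_relu(lo, hi)
out_lo, out_hi = interval_affine(lo, hi, [[1.0, 1.0]], [0.0])

# Sample the concrete network to estimate its true output range.
samples = [network(i / 100.0 - 1.0) for i in range(201)]
print("interval bounds:", out_lo, out_hi)
print("sampled range:", min(samples), max(samples))
```

On this example the interval analysis reports the output range [0, 2], while the network's actual range on [-1, 1] is [0, 1]. The bounds are sound but not tight, which is the gap that interval approximation results are concerned with.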
    more » « less