skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Title: A sequential simplex algorithm for automatic data and center selecting radial basis functions
We propose a sequential algorithm for learning sparse radial basis approximations for streaming data. The initial phase of the algorithm formulates the RBF training as a convex optimization problem with an objective function on the expansion weights while the data fitting problem imposed only as an ℓ∞-norm constraint. Each new data point observed is tested for feasibility, i.e., whether the data fitting constraint is satisfied. If so, that point is discarded and no model update is required. If it is infeasible, a new basic variable is added to the linear program. The result is a primal infeasible-dual feasible solution. The dual simplex algorithm is applied to determine a new optimal solution. A large fraction of the streaming data points does not require updates to the RBF model since they are similar enough to previously observed data and satisfy the data fitting constraints. The structure of the simplex algorithm makes the update to the solution particularly efficient given the inverse of the new basis matrix is easily computed from the old inverse. The second phase of the algorithm involves a non-convex refinement of the convex problem. Given the sparse nature of the LP solution, the computational expense of the non-convex algorithm is greatly reduced. We have also found that a small subset of the training data that includes the novel data identified by the algorithm can be used to train the non-convex optimization problem with substantial computation savings and comparable errors on the test data. We illustrate the method on the Mackey-Glass chaotic time-series, the monthly sunspot data, and a Fort Collins, Colorado weather data set. In each case we compare the results to artificial neural networks (ANN) and standard skew-RBFs.  more » « less
Award ID(s):
1633830
PAR ID:
10045395
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
2017 International Joint Conference on Neural Networks (IJCNN)
Page Range / eLocation ID:
549 to 556
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Graphical representations are essential for comprehending high-dimensional data across diverse fields, yet their construction often presents challenges due to the limitations of traditional methods. This paper introduces a novel methodology, Beyond Simplex Sparse Representation (BSSR), which addresses critical issues such as parameter dependencies, scale inconsistencies, and biased data interpretation in constructing similarity graphs. BSSR leverages the robustness of sparse representation to noise and outliers, while incorporating deep learning techniques to enhance scalability and accuracy. Furthermore, we tackle the optimization of the standard simplex, a pervasive problem, by introducing a transformative approach that converts the constraint into a smooth manifold using the Hadamard parametrization. Our proposed Tangent Perturbed Riemannian Gradient Descent (T-PRGD) algorithm provides an efficient and scalable solution for optimization problems with standard simplex or L1-norm sphere constraints. These contributions, including the BSSR methodology, robustness and scalability through deep representation, shift-invariant sparse representation, and optimization on the unit sphere, represent major advancements in the field. Our work offers novel perspectives on data representation challenges and sets the stage for more accurate analysis in the era of big data. 
    more » « less
  2. Supervised learning models have been used in various domains such as lending, college admission, face recognition, natural language processing, etc. However, they may inherit pre-existing biases from training data and exhibit discrimination against protected social groups. Various fairness notions have been proposed to address unfairness issues. In this work, we focus on Equalized Loss (EL), a fairness notion that requires the expected loss to be (approximately) equalized across different groups. Imposing EL on the learning process leads to a non-convex optimization problem even if the loss function is convex, and the existing fair learning algorithms cannot properly be adopted to find the fair predictor under the EL constraint. This paper introduces an algorithm that can leverage off-the-shelf convex programming tools (e.g., CVXPY (Diamond and Boyd, 2016; Agrawal et al., 2018)) to efficiently find the global optimum of this non-convex optimization. In particular, we propose the ELminimizer algorithm, which finds the optimal fair predictor under EL by reducing the non-convex optimization to a sequence of convex optimization problems. We theoretically prove that our algorithm finds the global optimal solution under certain conditions. Then, we support our theoretical results through several empirical studies 
    more » « less
  3. Supervised learning models have been used in various domains such as lending, college admission, face recognition, natural language processing, etc. However, they may inherit pre-existing biases from training data and exhibit discrimination against protected social groups. Various fairness notions have been proposed to address unfairness issues. In this work, we focus on Equalized Loss (EL), a fairness notion that requires the expected loss to be (approximately) equalized across different groups. Imposing EL on the learning process leads to a non-convex optimization problem even if the loss function is convex, and the existing fair learning algorithms cannot properly be adopted to find the fair predictor under the EL constraint. This paper introduces an algorithm that can leverage off-the-shelf convex programming tools (e.g., CVXPY) to efficiently find the global optimum of this non-convex optimization. In particular, we propose the ELminimizer algorithm, which finds the optimal fair predictor under EL by reducing the non-convex optimization to a sequence of convex optimization problems. We theoretically prove that our algorithm finds the global optimal solution under certain conditions. Then, we support our theoretical results through several empirical studies 
    more » « less
  4. We consider the problem of subset selection in the online setting, where data arrive incrementally. Instead of storing and running subset selection on the entire dataset, we propose an incremental subset selection framework that, at each time instant, uses the previously selected set of representatives and the new batch of data in order to update the set of representatives. We cast the problem as an integer bi- nary optimization minimizing the encoding cost of the data via representatives regularized by the number of selected items. As the proposed optimization is, in general, NP-hard and non-convex, we study a greedy approach based on un- constrained submodular optimization and also propose an efficient convex relaxation. We show that, under appropriate conditions, the solution of our proposed convex algorithm achieves the global optimal solution of the non-convex problem. Our results also address the conventional problem of subset selection in the offline setting, as a special case. By extensive experiments on the problem of video summarization, we demonstrate that our proposed online subset selection algorithms perform well on real data, capturing diverse representative events in videos, while they obtain objective function values close to the offline setting. 
    more » « less
  5. We consider the problem of subset selection in the online setting, where data arrive incrementally. Instead of storing and running subset selection on the entire dataset, we propose an incremental subset selection framework that, at each time instant, uses the previously selected set of representatives and the new batch of data in order to update the set of representatives. We cast the problem as an integer binary optimization minimizing the encoding cost of the data via representatives regularized by the number of selected items. As the proposed optimization is, in general, NP-hard and non-convex, we study a greedy approach based on unconstrained submodular optimization and also propose an efficient convex relaxation. We show that, under appropriate conditions, the solution of our proposed convex algorithm achieves the global optimal solution of the non-convex problem. Our results also address the conventional problem of subset selection in the offline setting, as a special case. By extensive experiments on the problem of video summarization, we demonstrate that our proposed online subset selection algorithms perform well on real data, capturing diverse representative events in videos, while they obtain objective function values close to the offline setting. 
    more » « less