CondenseNet: An Efficient DenseNet using Learned Group Convolutions
Deep neural networks are increasingly used on mobile devices, where computational resources are limited. In this paper we develop CondenseNet, a novel network architecture with unprecedented efficiency. It combines dense connectivity between layers with a mechanism to remove unused connections. The dense connectivity facilitates feature re-use in the network, whereas learned group convolutions remove connections between layers for which this feature re-use is superfluous. At test time, our model can be implemented using standard grouped convolutions, allowing for efficient computation in practice. Our experiments demonstrate that CondenseNets are much more efficient than state-of-the-art compact convolutional networks such as MobileNets and ShuffleNets.
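The test-time form of a learned group convolution can be pictured with a short sketch. The following is a minimal illustration, assuming PyTorch; the class name `TestTimeLearnedGroupConv1x1` and the toy index pattern are our own, not taken from the authors' released code. After training-time condensation prunes input connections, inference needs only a fixed channel gather followed by a standard grouped 1x1 convolution.

```python
# Minimal sketch of the test-time form of a learned group convolution,
# assuming PyTorch. Names are illustrative, not from the authors' code.
import torch
import torch.nn as nn

class TestTimeLearnedGroupConv1x1(nn.Module):
    def __init__(self, in_channels, out_channels, groups, kept_index):
        super().__init__()
        # kept_index: LongTensor of input-channel indices selected during
        # training (condensation), arranged group by group.
        self.register_buffer("kept_index", kept_index)
        self.conv = nn.Conv2d(len(kept_index), out_channels,
                              kernel_size=1, groups=groups, bias=False)

    def forward(self, x):
        # Gather the surviving input channels, then apply an ordinary
        # grouped convolution -- no sparse kernels needed at test time.
        x = x.index_select(1, self.kept_index)
        return self.conv(x)

# Toy usage: 16 input channels, keep 8 of them, 4 groups of 2 each.
idx = torch.tensor([0, 5, 2, 9, 1, 7, 3, 12])
layer = TestTimeLearnedGroupConv1x1(16, 32, groups=4, kept_index=idx)
out = layer(torch.randn(2, 16, 8, 8))   # -> shape (2, 32, 8, 8)
```

This gather-then-grouped-conv structure is what makes the pruned network fast in practice: the sparsity learned during training is realized as dense, regular computation rather than sparse kernels.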
- Award ID(s): 1740822
- PAR ID: 10064651
- Date Published:
- Journal Name: CVPR 2018
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Attention networks have successfully boosted performance in various vision problems. Previous works emphasize designing new attention modules and plugging them into networks individually. Our paper proposes a novel and simple framework that shares an attention module throughout different network layers to encourage the integration of layer-wise information; we refer to this parameter-sharing module as the Dense-and-Implicit-Attention (DIA) unit. Many choices of module can be used in the DIA unit. Since Long Short-Term Memory (LSTM) can capture long-distance dependencies, we focus on the case where the DIA unit is a modified LSTM (called DIA-LSTM). Experiments on benchmark datasets show that the DIA-LSTM unit emphasizes layer-wise feature interrelation and leads to significant improvements in image classification accuracy. We further show empirically that DIA-LSTM has a strong regularizing effect that stabilizes the training of deep networks, as demonstrated by experiments removing skip connections (He et al. 2016a) or Batch Normalization (Ioffe and Szegedy 2015) from the whole residual network. A minimal sketch of the shared-module idea follows below.
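To make the parameter-sharing idea concrete, here is a minimal sketch, assuming PyTorch; `SharedDIALSTM` and its gating details are our illustration of the mechanism, not the authors' implementation. A single LSTM cell is reused at every layer, carrying a recurrent state across depth and emitting channel-wise gates.

```python
# A minimal sketch of the parameter-sharing idea behind the DIA unit,
# assuming PyTorch. This is our illustration, not the authors' code.
import torch
import torch.nn as nn

class SharedDIALSTM(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.cell = nn.LSTMCell(channels, channels)  # one cell, shared by all layers

    def forward(self, feat, state):
        # feat: (N, C, H, W) feature map from the current layer.
        pooled = feat.mean(dim=(2, 3))               # (N, C) global average pool
        h, c = self.cell(pooled, state)              # recurrence runs across depth
        gate = torch.sigmoid(h).unsqueeze(-1).unsqueeze(-1)
        return feat * gate, (h, c)                   # channel-recalibrated features

# Toy usage: the same module modulates three consecutive layers.
dia = SharedDIALSTM(channels=64)
x = torch.randn(4, 64, 16, 16)
state = (torch.zeros(4, 64), torch.zeros(4, 64))
for _ in range(3):                                   # stand-in for 3 network layers
    x, state = dia(x, state)
```

Because one cell serves every layer, the attention parameters are shared across depth and the recurrent state implicitly links layer-wise features, which is the "dense and implicit" aspect the abstract describes.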
- Space division multiplexed elastic optical networks (SDM-EONs) enhance service provisioning by offering increased fiber capacity through flexible spectrum allocation, multiple spatial modes, and efficient modulations. In these networks, allocating resources for connections involves assigning routes, modulations, cores, and spectrum (RMCSA). However, intercore crosstalk (XT) between ongoing connections on adjacent cores can degrade signal transmission, so it must be handled properly during resource assignment. The use of multiple modulations in translucent optical networks presents a challenge in balancing spectrum utilization against XT accumulation. In this paper, we propose a dual-optimized RMCSA algorithm called the Capacity Loss Aware Resource Assignment Algorithm (CLARA+), which optimizes network capacity utilization to improve resource availability and network performance. A two-step machine-learning-enabled optimization improves resource allocations by balancing the tradeoff between spectrum utilization and XT accumulation, aided by feature extraction from the network. Extensive simulations demonstrate that CLARA+ significantly reduces bandwidth blocking probability and enhances resource utilization across various scenarios. Applied to several algorithms from the literature, our strategy improves the bandwidth blocking probability by up to three orders of magnitude, and the algorithm balances spectrum utilization and XT accumulation more efficiently than existing algorithms in the literature. A simplified sketch of the core/spectrum step appears below.
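As a rough illustration of the spectrum/core step in an RMCSA-style assignment (not the CLARA+ algorithm itself, which adds machine-learning-guided optimization), here is a hedged first-fit sketch in Python; all names and the crude crosstalk proxy are our own assumptions.

```python
# A highly simplified, illustrative sketch of an XT-aware first-fit
# core/spectrum search. Not CLARA+; all names here are our own.

def first_fit_xt_aware(occupied, adjacency, demand, xt_limit):
    """occupied[core][slot] -> bool; adjacency[core] -> adjacent core ids."""
    n_slots = len(occupied[0])
    for core in range(len(occupied)):
        for start in range(n_slots - demand + 1):
            block = range(start, start + demand)
            if any(occupied[core][s] for s in block):
                continue  # block not free on this core
            # Count occupied overlapping slots on adjacent cores as a
            # crude proxy for accumulated crosstalk on this candidate.
            xt = sum(occupied[adj][s]
                     for adj in adjacency[core] for s in block)
            if xt <= xt_limit:
                return core, start  # first feasible placement
    return None  # blocked: no core/slot block satisfies both constraints

# Toy usage: 3 cores in a ring, 8 slots, request for 3 contiguous slots.
occ = [[False] * 8 for _ in range(3)]
occ[0][0:3] = [True] * 3            # core 0 partly busy
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(first_fit_xt_aware(occ, adj, demand=3, xt_limit=1))  # -> (0, 3)
```

The tradeoff the abstract describes shows up directly in `xt_limit`: a loose limit packs spectrum tightly but accumulates crosstalk, while a strict limit protects signal quality at the cost of fragmenting the spectrum.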
- Noise and inconsistency commonly exist in real-world information networks, owing to inherently error-prone human behavior or user privacy concerns. To date, tremendous efforts have been made to advance feature learning from networks, including the most recent graph convolutional networks (GCNs) and attention GCNs, by integrating node content and topology structure. However, all existing methods treat networks as error-free sources and treat the feature content of each node as independent and equally important for modeling node relations. Noisy node content, combined with sparse features, poses essential challenges for applying existing methods to real-world noisy networks. In this article, we propose the feature-based attention GCN (FA-GCN), a feature-attention graph convolution learning framework that handles networks with noisy and sparse node content. To tackle noise and sparse content in each node, FA-GCN first employs a long short-term memory (LSTM) network to learn a dense representation for each node feature. To model interactions between neighboring nodes, a feature-attention mechanism is introduced that allows neighboring nodes to learn and vary feature importance with respect to their connections. Through a spectral-based graph convolution aggregation process, each node can concentrate on the neighborhood features most relevant to the corresponding learning task. Experiments and validations with respect to different noise levels demonstrate that FA-GCN outperforms state-of-the-art methods in both noise-free and noisy network environments. A toy sketch of the feature-attention idea follows below.
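The following toy sketch, assuming PyTorch, conveys the feature-attention idea: each node reweights the individual feature dimensions of its neighbors before aggregating, rather than weighting whole neighbors uniformly. It uses a dense adjacency matrix and is our simplification; the paper's spectral-based aggregation and LSTM feature encoder are not reproduced here.

```python
# A toy feature-attention aggregation in the spirit of FA-GCN,
# assuming PyTorch. This is our simplification, not the paper's method.
import torch
import torch.nn as nn

class FeatureAttentionConv(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.att = nn.Linear(2 * in_dim, in_dim)  # per-feature gates per edge
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # x: (N, F) node features; adj: (N, N) 0/1 adjacency with self-loops.
        n = x.size(0)
        pairs = torch.cat([x.unsqueeze(1).expand(n, n, -1),
                           x.unsqueeze(0).expand(n, n, -1)], dim=-1)
        gates = torch.sigmoid(self.att(pairs))      # (N, N, F) feature gates
        mask = adj.unsqueeze(-1)                    # zero out non-neighbors
        agg = (gates * mask * x.unsqueeze(0)).sum(1)
        agg = agg / adj.sum(1, keepdim=True).clamp(min=1)  # mean over neighbors
        return torch.relu(self.lin(agg))

# Toy usage: 5 nodes, 8 features, a small random graph with self-loops.
adj = (torch.rand(5, 5) > 0.5).float()
adj = ((adj + adj.t() + torch.eye(5)) > 0).float()
layer = FeatureAttentionConv(8, 16)
out = layer(torch.randn(5, 8), adj)                 # -> shape (5, 16)
```

The key contrast with ordinary neighbor attention is that `gates` is a full (N, N, F) tensor: a noisy feature dimension on one neighbor can be suppressed without discarding that neighbor's other features.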
- The lack of large-scale, continuously evolving empirical data usually limits the study of networks to the analysis of snapshots in time. This approach has been used to verify network evolution mechanisms such as preferential attachment. However, these studies are mostly restricted to analyzing the first links established by a new node in the network and typically ignore connections made after each node's initial introduction. Here, we show that the subsequent actions of individuals, such as their second network link, are not random and can be decoupled from the mechanism behind the first network link. We show that this feature strongly influences the network topology. Moreover, snapshots in time can now provide information on the mechanism used to establish the second connection. We interpret these empirical results by introducing the "propinquity model," in which we control and vary the distance of the second link established by a new node, and find that this can lead to networks with tunable density scaling, as found in real networks. Our work shows that sociologically meaningful mechanisms influence network evolution and indicates the importance of measuring the distance between successive connections. A toy generative sketch of this idea follows below.
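A toy generative sketch of the propinquity idea, in plain Python: the first link of each new node follows preferential attachment, while the second link goes to a node at a controlled hop distance d from the first neighbor. All details below are our own simplification, not the authors' exact model.

```python
# Toy "propinquity"-style growth: first link preferential, second link
# at a controlled hop distance d. Our simplification, not the paper's model.
import random
from collections import deque

def nodes_at_distance(adj, src, d):
    """BFS: return the nodes exactly d hops from src."""
    dist, frontier = {src: 0}, deque([src])
    while frontier:
        u = frontier.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                frontier.append(v)
    return [v for v, k in dist.items() if k == d]

def grow(n, d, seed=0):
    random.seed(seed)
    adj = {0: {1}, 1: {0}}               # seed graph: a single edge
    stubs = [0, 1]                       # degree-weighted sampling pool
    for new in range(2, n):
        first = random.choice(stubs)     # first link: preferential attachment
        adj[new] = {first}
        adj[first].add(new)
        stubs += [new, first]
        cands = [v for v in nodes_at_distance(adj, first, d) if v != new]
        if cands:                        # second link: controlled distance d
            second = random.choice(cands)
            adj[new].add(second)
            adj[second].add(new)
            stubs += [new, second]
    return adj

g = grow(200, d=2)   # d=1 closes triangles; larger d spreads links outward
print(sum(len(v) for v in g.values()) // 2, "edges")
```

Varying d is the experiment's single knob: small d concentrates both links in one neighborhood (dense local clustering), while large d decouples the second link from the first, which is how the model produces tunable density scaling.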