Title: Spatial-Frequency Network for Segmentation of Remote Sensing Images
We describe a deep learning system for satellite image segmentation. Our CNN model embeds contextual feature dependencies in both the spatial and frequency domains. Its Spatial Weighting Module uses a multi-scale pooling layer to represent correlations at longer length scales in the spatial domain. Its Frequency Weighting Module uses frequency-domain information to better discriminate between object classes. Experimental results on the Potsdam dataset demonstrate that our model achieves a 1.9% higher average F1 score than previous methods.
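The abstract does not say how the average F1 figure is computed. As a rough illustration only, the sketch below assumes an unweighted mean of per-class F1 scores taken from a pixel-level confusion matrix, which is a common convention for segmentation benchmarks. The function names and the toy two-class matrix are hypothetical and do not come from the paper.

```python
def per_class_f1(confusion):
    """Return one F1 score per class.

    confusion[i][j] is the number of pixels whose true class is i
    and whose predicted class is j (assumed layout, not from the paper).
    """
    n = len(confusion)
    scores = []
    for k in range(n):
        tp = confusion[k][k]
        # Pixels predicted as class k that belong to another class.
        fp = sum(confusion[i][k] for i in range(n)) - tp
        # Pixels of class k that were predicted as another class.
        fn = sum(confusion[k]) - tp
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def average_f1(confusion):
    """Unweighted (macro) mean of the per-class F1 scores."""
    scores = per_class_f1(confusion)
    return sum(scores) / len(scores)


# Hypothetical two-class example: rows are true classes, columns are predictions.
toy_confusion = [
    [50, 10],
    [5, 35],
]

if __name__ == "__main__":
    print(per_class_f1(toy_confusion))
    print(average_f1(toy_confusion))
```

Under this convention every class counts equally, so a rare class such as cars weighs as much as a dominant one such as buildings. A pixel-weighted average would instead favor the frequent classes.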
Award ID(s):
2008151
PAR ID:
10483594
Publisher / Repository:
Proc. Int. Conf. on Image Processing
Date Published:
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. We introduce Spatial-Temporal Memory Networks for video object detection. At its core, a novel Spatial-Temporal Memory module (STMM) serves as the recurrent computation unit to model long-term temporal appearance and motion dynamics. The STMM's design enables full integration of pretrained backbone CNN weights, which we find to be critical for accurate detection. Furthermore, in order to tackle object motion in videos, we propose a novel MatchTrans module to align the spatial-temporal memory from frame to frame. Our method produces state-of-the-art results on the benchmark ImageNet VID dataset, and our ablative studies clearly demonstrate the contribution of our different design choices. 
  2. This paper presents a 3D model of a photoconductive antenna (PCA) on a semiconductor substrate. The simulations were conducted using the COMSOL Multiphysics package. The model considers the laser excitation and the carrier generation and acceleration in the semiconductor layer. The computational work was carried out using the frequency-domain RF module and the semiconductor module. The results demonstrate that simulating the active area alone produces sufficient accuracy: ~0.01% in the RF module solution (the electric and magnetic fields) and ~0.23% in the semiconductor solution (the photocurrent). Reducing the simulated area helps minimize the CPU time and memory required by the 3D model at THz frequencies. The largest case in this study was simulated at the National XSEDE Supercomputing with ~0.3 billion unknowns and a memory requirement of ~3.2 TB in the RF module.
  3. The shortest light pulses produced to date are of the order of a few tens of attoseconds, with central frequencies in the extreme UV range and bandwidths exceeding tens of electronvolts. They are often produced as a train of pulses separated by half the driving laser period, leading in the frequency domain to a spectrum of high, odd-order harmonics. As light pulses become shorter and more spectrally wide, the widely used approximation consisting of writing the optical waveform as a product of temporal and spatial amplitudes does not apply anymore. Here, we investigate the interplay of temporal and spatial properties of attosecond pulses. We show that the divergence and focus position of the generated harmonics often strongly depend on their frequency, leading to strong chromatic aberrations of the broadband attosecond pulses. Our argument uses a simple analytical model based on Gaussian optics, numerical propagation calculations, and experimental harmonic divergence measurements. This effect needs to be considered for future applications requiring high-quality focusing while retaining the broadband/ultrashort characteristics of the radiation. 
  4. The uneven distribution of earthquakes and stations in seismic tomography leads to slower convergence of nonlinear inversions and spatial bias in inversion results. Including dense regional arrays, such as USArray or Hi-Net, in global tomography causes severe convergence and spatial bias problems, against which conventional pre-conditioning schemes are ineffective. To save computational cost and reduce model bias, we propose a new strategy based on a geographical weighting of sources and receivers. Unlike approaches based on ray density or the Voronoi tessellation, this method scales to large full-waveform inversion problems and avoids instabilities at the edges of dense receiver or source clusters. We validate our strategy using a 2-D global waveform inversion test and show that the new weighting scheme leads to a nearly twofold reduction in model error and much faster convergence relative to a conventionally pre-conditioned inversion. We implement this geographical weighting strategy for global adjoint tomography.
  5. A promising approach to preserving model performance in linearized transformers is to employ position-based re-weighting functions. However, state-of-the-art re-weighting functions rely heavily on target sequence lengths, making it difficult or impossible to apply them to autoregressive and simultaneous tasks, where the target and sometimes even the input sequence length are unknown. To address this issue, we propose Learned Proportions (LeaP) and LeaPformers. Our contribution is built on two major components. First, we generalize the dependence on explicit positional representations and sequence lengths into dependence on sequence proportions for re-weighting. Second, we replace static positional representations with dynamic proportions derived via a compact module, enabling more flexible attention concentration patterns. We evaluate LeaPformer against eight representative efficient transformers on the Long-Range Arena benchmark, where we show that it achieves the best quality-throughput trade-off. We also apply LeaPformer to Wikitext-103b autoregressive language modeling and to simultaneous speech-to-text translation for two language pairs, achieving competitive results in both tasks.