Abstract To explore the hypothesis of a common source of variability in two time series, observers may estimate the magnitude‐squared coherence (MSC), which is a frequency‐domain view of the cross correlation. For time series that do not have uniform observing cadence, MSC can be estimated using Welch's overlapping segment averaging. However, multitaper has superior statistical properties to Welch's method in terms of the tradeoff between bias, variance, and bandwidth. The classical multitaper technique has recently been extended to accommodate time series with underlying uniform observing cadence from which some observations are missing. This situation is common for solar and geomagnetic data sets, which may have gaps due to breaks in satellite coverage, instrument downtime, or poor observing conditions. We demonstrate the scientific use of missing‐data multitaper magnitude‐squared coherence by detecting known solar mid‐term oscillations in simultaneous, missing‐data time series of solar Lyman flux and geomagnetic Disturbance Storm Time index. Due to their superior statistical properties, we recommend that multitaper methods be used for all heliospheric time series with underlying uniform observing cadence.
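As a rough illustration of the Welch-based MSC estimate mentioned in this abstract, the sketch below uses `scipy.signal.coherence` on two synthetic series that share one oscillation. It assumes uniform observing cadence with no gaps, so it is not the missing-data multitaper estimator the abstract recommends. The helper name `welch_msc`, the segment length, and the test signal are illustrative choices, not taken from the paper.

```python
import numpy as np
from scipy.signal import coherence


def welch_msc(x, y, fs=1.0, nperseg=256):
    """Welch-averaged magnitude-squared coherence of two evenly sampled series.

    Hypothetical helper: Hann-windowed segments with 50% overlap.
    """
    freqs, msc = coherence(
        x, y, fs=fs, window="hann", nperseg=nperseg, noverlap=nperseg // 2
    )
    return freqs, msc


# Synthetic example (assumed data): a shared sinusoid buried in independent noise.
rng = np.random.default_rng(0)
n = 4096
t = np.arange(n)
f0 = 0.05  # shared oscillation frequency in cycles per sample (illustrative)
common = np.sin(2 * np.pi * f0 * t)
x = common + rng.normal(size=n)
y = 0.8 * common + rng.normal(size=n)

freqs, msc = welch_msc(x, y)
peak_freq = freqs[np.argmax(msc)]  # frequency where the two series are most coherent
```

Averaging over more segments lowers the variance of the estimate at the cost of frequency resolution, which is the bias, variance, and bandwidth tradeoff the abstract refers to.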
Optimal Frequency-domain Analysis for Spacecraft Time Series: Introducing the Missing-data Multitaper Power Spectrum Estimator
Abstract While the Lomb–Scargle periodogram is foundational to astronomy, it has a significant shortcoming: the variance in the estimated power spectrum does not decrease as more data are acquired. Statisticians have a 60 yr history of developing variance-suppressing power spectrum estimators, but most are not used in astronomy because they are formulated for time series with uniform observing cadence and without seasonal or daily gaps. Here we demonstrate how to apply the missing-data multitaper power spectrum estimator to spacecraft data with uniform time intervals between observations but missing data during thruster fires or momentum dumps. The F-test for harmonic components may be applied to multitaper power spectrum estimates to identify statistically significant oscillations that would not rise above a white noise–based false alarm probability. Multitapering improves the dynamic range of the power spectrum estimate and suppresses spectral window artifacts. We show that the multitaper–F-test combination applied to Kepler observations of KIC 6102338 detects differential rotation without requiring iterative sinusoid fitting and subtraction. Significant signals reside at harmonics of both fundamental rotation frequencies and suggest an antisolar rotation profile. Next we use the missing-data multitaper power spectrum estimator to identify the oscillation modes responsible for the complex “scallop-shell” shape of the K2 light curve of EPIC 203354381. We argue that multitaper power spectrum estimators should be used for all time series with regular observing cadence.
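To make the variance-suppression idea concrete, here is a minimal sketch of the classical multitaper power spectrum estimate for a gap-free, uniformly sampled series, built from the Slepian (DPSS) tapers in `scipy.signal.windows.dpss`. It is not the missing-data estimator described in the abstract, which needs tapers adapted to the gap pattern, and it omits adaptive weighting and the F-test. The function name `multitaper_psd`, the time-bandwidth product `nw=4`, and the synthetic signal are assumptions for illustration.

```python
import numpy as np
from scipy.signal.windows import dpss


def multitaper_psd(x, dt=1.0, nw=4.0, k=None):
    """Equal-weight multitaper PSD estimate for a gap-free, evenly sampled series.

    Hypothetical helper. Returns non-negative frequencies and a two-sided
    power spectral density (units of variance per unit frequency).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if k is None:
        k = int(2 * nw) - 1  # common rule of thumb for the number of tapers
    tapers = dpss(n, nw, Kmax=k, norm=2)  # shape (k, n), each taper has unit energy
    x = x - x.mean()  # remove the mean before tapering
    # One eigenspectrum per taper: squared magnitude of the tapered series' DFT.
    eigenspectra = np.abs(np.fft.rfft(tapers * x, axis=1)) ** 2 * dt
    psd = eigenspectra.mean(axis=0)  # averaging across tapers reduces variance
    freqs = np.fft.rfftfreq(n, d=dt)
    return freqs, psd


# Synthetic example (assumed data): one sinusoid in white noise.
rng = np.random.default_rng(0)
n = 2048
t = np.arange(n)
x = np.sin(2 * np.pi * 0.1 * t) + rng.normal(size=n)

freqs, psd = multitaper_psd(x)
peak_freq = freqs[np.argmax(psd)]  # should lie within the bandwidth nw / n of 0.1
```

Each taper yields an approximately independent spectrum estimate, so averaging `k` of them reduces the variance roughly by a factor of `k`, while the spectral resolution is set by the half-bandwidth `nw / (n * dt)`.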
- PAR ID: 10480048
- Publisher / Repository: DOI PREFIX: 10.3847
- Date Published:
- Journal Name: The Astronomical Journal
- Volume: 167
- Issue: 1
- ISSN: 0004-6256
- Format(s): Medium: X Size: Article No. 22
- Size(s): Article No. 22
- Sponsoring Org: National Science Foundation
More Like this
Abstract GJ 1002 is reported to host two Earth-mass planets in the habitable zone. We applied frequency-domain techniques for validating planet discoveries to the ESPRESSO and CARMENES radial velocity (RV) and cross-correlation full width at half maximum time series. Siegel’s test applied to Welch’s power spectrum estimates suggests that periodicity in the RV time series may not be statistically significant. The frequency response to a sinusoid with the frequency of planet b, or pseudowindow, shows excess power near the frequency of planet c. Finally, a Monte Carlo experiment where we create new RV time series realizations by adding white noise to original data reveals that the highest periodogram peak sometimes occurs at half the frequency of planet c. More extreme-precision observations are required to confirm RV oscillations and constrain the rotation period.
-
Abstract We present the power spectrum methodology used for the first-season COMAP analysis, and assess the quality of the current data set. The main results are derived through the Feed–Feed Pseudo-Cross-Spectrum (FPXS) method, which is a robust estimator with respect to both noise modeling errors and experimental systematics. We use effective transfer functions to take into account the effects of instrumental beam smoothing and various filter operations applied during the low-level data processing. The power spectra estimated in this way have allowed us to identify a systematic error associated with one of our two scanning strategies, believed to be due to residual ground or atmospheric contamination. We omit these data from our analysis and no longer use this scanning technique for observations. We present the power spectra from our first season of observing, and demonstrate that the uncertainties are integrating as expected for uncorrelated noise, with any residual systematics suppressed to a level below the noise. Using the FPXS method, and combining data on scales $k = 0.051$–$0.62\,\mathrm{Mpc}^{-1}$, we estimate $P_{\mathrm{CO}}(k) = -2.7 \pm 1.7 \times 10^{4}\,\mu\mathrm{K}^{2}\,\mathrm{Mpc}^{3}$, the first direct 3D constraint on the clustering component of the CO(1–0) power spectrum in the literature.
-
Abstract Multiyear observations from the Sloan Digital Sky Survey (SDSS) Reverberation Mapping (RM) project have significantly increased the number of quasars with reliable RM lag measurements. We statistically analyze target properties, light-curve characteristics, and survey design choices to identify factors crucial for successful and efficient RM surveys. Analyzing 172 high-confidence (“gold”) lag measurements from SDSS-RM for the Hβ, Mg II, and C IV emission lines, we find that the Durbin–Watson statistic (a statistical test for residual correlation) is the most significant predictor of light curves suitable for lag detection. The variability signal-to-noise ratio and emission-line placement on the detector also correlate with successful lag measurements. We further investigate the impact of the observing cadence on the survey design by analyzing the effect of reducing observations in the first year of SDSS-RM. Our results demonstrate that a modest reduction in the observing cadence to ∼1.5 weeks between observations can retain approximately 90% of the lag measurements compared to twice-weekly observations in the initial year. Provided similar and uniform sampling in subsequent years, this adjustment has a minimal effect on the overall recovery of lags across all emission lines. These results provide valuable inputs for optimizing future RM surveys.
-
We develop an analytical framework to appropriately model and adequately analyze A/B tests in the presence of nonparametric nonstationarities in the targeted business metrics. A/B tests, also known as online randomized controlled experiments, have been used at scale by data-driven enterprises to guide decisions and test innovative ideas to improve core business metrics. Meanwhile, nonstationarities, such as the time-of-day effect and the day-of-week effect, can often arise nonparametrically in key business metrics involving purchases, revenue, conversions, customer experiences, and so on. First, we develop a generic nonparametric stochastic model to capture nonstationarities in A/B test experiments, where each sample represents a visit or action associated with a time label. We build a practically relevant limiting regime to facilitate analyzing large-sample estimator performances under nonparametric nonstationarities. Second, we show that ignoring or inadequately addressing nonstationarities can cause standard A/B test estimators to have suboptimal variance and nonvanishing bias, therefore leading to loss of statistical efficiency and accuracy. We provide a new estimator that views time as a continuous stratum and performs poststratification with a data-dependent number of stratification levels. Without making parametric assumptions, we prove a central limit theorem for the proposed estimator and show that the estimator attains the best achievable asymptotic variance and is asymptotically unbiased. Third, we propose a time-grouped randomization that is designed to balance treatment and control assignments at granular time scales. We show that when the time-grouped randomization is integrated into standard experimental designs to generate experiment data, simple A/B test estimators can achieve asymptotically optimal variance. A brief account of numerical experiments is given to illustrate the analysis.
This paper was accepted by Baris Ata, stochastic models and simulation. Supplemental Material: The online appendices and data files are available at https://doi.org/10.1287/mnsc.2022.01205.
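The poststratification idea in this abstract can be sketched as follows: bin the visit times into strata, take the treatment-minus-control difference in means within each stratum, and average those differences weighted by stratum size. This is only a generic illustration, not the paper's estimator. The function name `poststratified_effect`, the equal-width strata, the cube-root rule for the number of strata, and the simulated data are all assumptions.

```python
import numpy as np


def poststratified_effect(times, treated, outcomes, n_strata):
    """Size-weighted average of within-stratum treatment effects (hypothetical helper)."""
    times = np.asarray(times, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    outcomes = np.asarray(outcomes, dtype=float)
    # Equal-width time strata; labels run from 0 to n_strata - 1.
    edges = np.linspace(times.min(), times.max(), n_strata + 1)
    labels = np.digitize(times, edges[1:-1])
    weighted_sum, total = 0.0, 0
    for s in range(n_strata):
        in_s = labels == s
        t_s, c_s = in_s & treated, in_s & ~treated
        if t_s.any() and c_s.any():  # skip strata missing either arm
            weighted_sum += in_s.sum() * (outcomes[t_s].mean() - outcomes[c_s].mean())
            total += in_s.sum()
    if total == 0:
        raise ValueError("no stratum contains both treated and control samples")
    return weighted_sum / total


# Simulated experiment (assumed data): a strong time-of-day effect, true effect 1.0.
rng = np.random.default_rng(1)
n = 20000
times = rng.uniform(0.0, 1.0, size=n)
treated = rng.random(n) < 0.5
outcomes = 5.0 * np.sin(2 * np.pi * times) + 1.0 * treated + rng.normal(size=n)

n_strata = int(round(n ** (1 / 3)))  # assumed data-dependent rule, not the paper's
effect = poststratified_effect(times, treated, outcomes, n_strata)
naive = outcomes[treated].mean() - outcomes[~treated].mean()
```

Comparing within strata removes most of the time-of-day variation from the comparison, so `effect` is typically less noisy than the pooled difference `naive`, although both are unbiased under simple randomization.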
