Structured point process data harvested from various platforms poses new challenges to the machine learning community. To cluster repeatedly observed marked point processes, we propose a novel mixture model of multi-level marked point processes for identifying potential heterogeneity in the observed data. Specifically, we study a matrix whose entries are marked log-Gaussian Cox processes and cluster rows of such a matrix. An efficient semi-parametric Expectation-Solution (ES) algorithm combined with functional principal component analysis (FPCA) of point processes is proposed for model estimation. The effectiveness of the proposed framework is demonstrated through simulation studies and real data analyses.
Smoothability of relative stable maps to stacky curves
Using log geometry, we study smoothability of genus zero twisted stable maps to stacky curves relative to a collection of marked points. One application is to smoothing semi-log canonical fibered surfaces with marked singular fibers.
- PAR ID: 10428571
- Date Published:
- Journal Name: Épijournal de Géométrie Algébrique
- Volume: Volume 7
- ISSN: 2491-6765
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
The manner in which acoustic features contribute to perceiving speaker identity remains unclear. In an attempt to better understand speaker perception, we investigated human and machine speaker discrimination with utterances shorter than 2 seconds. Sixty-five listeners performed a same vs. different task. Machine performance was estimated with i-vector/PLDA-based automatic speaker verification systems, one using mel-frequency cepstral coefficients (MFCCs) and the other using voice quality features (VQual2) inspired by a psychoacoustic model of voice quality. Machine performance was measured in terms of the detection and log-likelihood-ratio cost functions. Humans showed higher confidence for correct target decisions compared to correct non-target decisions, suggesting that they rely on different features and/or decision making strategies when identifying a single speaker compared to when distinguishing between speakers. For non-target trials, responses were highly correlated between humans and the VQual2-based system, especially when speakers were perceptually marked. Fusing human responses with an MFCC-based system improved performance over human-only or MFCC-only results, while fusing with the VQual2-based system did not. The study is a step towards understanding human speaker discrimination strategies and suggests that automatic systems might be able to supplement human decisions especially when speakers are marked.
-
The 1-s-resolution U.S. radiosonde data are analyzed for unstable layers, where the potential temperature decreases with increasing altitude, in the troposphere and lower stratosphere (LS). Care is taken to exclude spurious unstable layers arising from noise in the soundings and also to allow for the destabilizing influence of water vapor in saturated layers. Riverton, Wyoming, and Greensboro, North Carolina, in the extratropics, are analyzed in detail, where it is found that the annual and diurnal variations are largest, and the interannual variations are smallest in the LS. More unstable layer occurrences in the LS at Riverton are found at 0000 UTC, while at Greensboro, more unstable layer occurrences in the LS are at 1200 UTC, consistent with a geographical pattern where greater unstable layer occurrences in the LS are at 0000 UTC in the western United States, while greater unstable layer occurrences are at 1200 UTC in the eastern United States. The picture at Koror, Palau, in the tropics is different in that the diurnal and interannual variations in unstable layer occurrences in the LS are largest, with much smaller annual variations. At Koror, more frequent unstable layer occurrences in the LS occur at 0000 UTC. Also, a “notch” in the frequencies of occurrence of thin unstable layers at about 12 km is observed at Koror, with large frequencies of occurrence of thick layers at that altitude. Histograms are produced for the two midlatitude stations and one tropical station analyzed. The log–log slopes for troposphere histograms are in reasonable agreement with earlier results, but the LS histograms show a steeper log–log slope, consistent with more thin unstable layers and fewer thick unstable layers there. Some radiosonde stations are excluded from this analysis since a marked change in unstable layer occurrences was identified when a change in radiosonde instrumentation occurred.
-
Data scientists have embraced computational notebooks to author analysis code and accompanying visualizations within a single document. Currently, although these media may be interleaved, they remain siloed: interactive visualizations must be manually specified as they are divorced from the analysis provenance expressed via dataframes, while code cells have no access to users' interactions with visualizations, and hence no way to operate on the results of interaction. To bridge this divide, we present B2, a set of techniques grounded in treating data queries as a shared representation between the code and interactive visualizations. B2 instruments data frames to track the queries expressed in code and synthesize corresponding visualizations. These visualizations are displayed in a dashboard to facilitate interactive analysis. When an interaction occurs, B2 reifies it as a data query and generates a history log in a new code cell. Subsequent cells can use this log to further analyze interaction results and, when marked as reactive, to ensure that code is automatically recomputed when a new interaction occurs. In an evaluative study with data scientists, we find that B2 promotes a tighter feedback loop between coding and interacting with visualizations. All participants frequently moved from code to visualization and vice versa, which facilitated their exploratory data analysis in the notebook.
-
Continuous-time event data are common in applications such as individual behavior data, financial transactions, and medical health records. Modeling such data can be very challenging, in particular for applications with many different types of events, since it requires a model to predict the event types as well as the time of occurrence. Recurrent neural networks that parameterize time-varying intensity functions are the current state-of-the-art for predictive modeling with such data. These models typically assume that all event sequences come from the same data distribution. However, in many applications event sequences are generated by different sources, or users, and their characteristics can be very different. In this paper, we extend the broad class of neural marked point process models to mixtures of latent embeddings, where each mixture component models the characteristic traits of a given user. Our approach relies on augmenting these models with a latent variable that encodes user characteristics, represented by a mixture model over user behavior that is trained via amortized variational inference. We evaluate our methods on four large real-world datasets and demonstrate systematic improvements from our approach over existing work for a variety of predictive metrics such as log-likelihood, next event ranking, and source-of-sequence identification.

