Sparseness-constrained nonnegative tensor factorization for detecting topics at different time scales

Kassab, Lara; Kryshchenko, Alona; Lyu, Hanbaek; Molitor, Denali; Needell, Deanna; Rebrova, Elizaveta; Yuan, Jiahong

doi:10.3389/fams.2024.1287074


Temporal text data, such as news articles or Twitter feeds, often comprises a mixture of long-lasting trends and transient topics. Effective topic modeling strategies should detect both types and clearly locate them in time. We first demonstrate that nonnegative CANDECOMP/PARAFAC decomposition (NCPD) can automatically identify topics of variable persistence. We then introduce sparseness-constrained NCPD (S-NCPD) and its online variant to control the duration of the detected topics more effectively and efficiently, along with theoretical analysis of the proposed algorithms. Through an extensive study on both semi-synthetic and real-world datasets, we find that our S-NCPD and its online variant can identify both short- and long-lasting temporal topics in a quantifiable and controlled manner, which traditional topic modeling methods are unable to achieve. Additionally, the online variant of S-NCPD shows a faster reduction in reconstruction error and results in more coherent topics compared to S-NCPD, thus achieving both computational efficiency and quality of the resulting topics. Our findings indicate that S-NCPD and its online variant are effective tools for detecting and controlling the duration of topics in temporal text data, providing valuable insights into both persistent and transient trends.

Award ID(s): 2011140; 2023239; 2206296; 2232241

PAR ID: 10529237

Author(s) / Creator(s): Kassab, Lara; Kryshchenko, Alona; Lyu, Hanbaek; Molitor, Denali; Needell, Deanna; Rebrova, Elizaveta; Yuan, Jiahong

Publisher / Repository: Frontiers

Date Published: 2024-07-22

Journal Name: Frontiers in Applied Mathematics and Statistics

Volume: 10

ISSN: 2297-4687

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article: https://doi.org/10.3389/fams.2024.1287074
