NSF PAR Search | NSF Public Access Repository

COUnty aggRegation mixup AuGmEntation (COURAGE) COVID-19 prediction

https://doi.org/10.1038/s41598-021-93545-6

Er, Siawpeng; Yang, Shihao; Zhao, Tuo (December 2021, Scientific Reports)

Abstract The global spread of COVID-19, the disease caused by the novel coronavirus SARS-CoV-2, has cast a significant threat to mankind. As the COVID-19 situation continues to evolve, predicting localized disease severity is crucial for advance resource allocation. This paper proposes a method named COURAGE (COUnty aggRegation mixup AuGmEntation) to generate a short-term prediction of 2-week-ahead COVID-19 related deaths for each county in the United States, leveraging modern deep learning techniques. Specifically, our method adopts a self-attention model from Natural Language Processing, known as the transformer model, to capture both short-term and long-term dependencies within the time series while remaining computationally efficient. Our model uses only publicly available information on COVID-19 related confirmed cases, deaths, community mobility trends and demographics, and can produce state-level predictions as an aggregation of the corresponding county-level predictions. Our numerical experiments demonstrate that our model achieves state-of-the-art performance among the publicly available benchmark models.
Full Text Available
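
The COURAGE abstract above says state-level predictions are produced as an aggregation of county-level predictions. A minimal sketch of that step follows, assuming the aggregation is a plain sum (the abstract does not specify the rule). The function name, the dictionary layout, and the forecast values are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def aggregate_to_states(county_predictions, county_to_state):
    """Sum county-level death predictions into state-level totals.

    Assumption: aggregation is a simple sum over the counties of each state.
    """
    state_totals = defaultdict(float)
    for county_id, predicted_deaths in county_predictions.items():
        state_totals[county_to_state[county_id]] += predicted_deaths
    return dict(state_totals)


# Made-up forecast values keyed by county FIPS code, for illustration only.
county_predictions = {"13121": 12.0, "13089": 7.5, "01073": 4.0}
county_to_state = {"13121": "GA", "13089": "GA", "01073": "AL"}

state_predictions = aggregate_to_states(county_predictions, county_to_state)
print(state_predictions)
```

Keeping the county-to-state mapping as a separate input means the same function could roll predictions up to any other grouping, such as health districts, without changes.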
Named Entity Recognition with Small Strongly Labeled and Large Weakly Labeled Data

https://doi.org/10.18653/v1/2021.acl-long.140

Jiang, Haoming; Zhang, Danqing; Cao, Tianyu; Yin, Bing; Zhao, Tuo (August 2021, Annual Meeting of the Association for Computational Linguistics)

Full Text Available
Super Tickets in Pre-Trained Language Models: From Model Compression to Improving Generalization

https://doi.org/10.18653/v1/2021.acl-long.510

Liang, Chen; Zuo, Simiao; Chen, Minshuo; Jiang, Haoming; Liu, Xiaodong; He, Pengcheng; Zhao, Tuo; Chen, Weizhu (July 2021, Annual Meeting of the Association for Computational Linguistics)

Full Text Available
Besov Function Approximation and Binary Classification on Low-Dimensional Manifolds Using Convolutional Residual Networks

Liu, Hao; Chen, Minshuo; Zhao, Tuo; Liao, Wenjing (July 2021, International Conference on Machine Learning)

Full Text Available
A Hypergradient Approach to Robust Regression without Correspondence

Xie, Yujia; Mao, Yixiu; Zuo, Simiao; Xu, Hongteng; Ye, Xiaojing; Zhao, Tuo; Zha, Hongyuan (April 2021, International Conference on Learning Representations)

Full Text Available
Learning to Defend by Learning to Attack

Jiang, Haoming; Chen, Zhehui; Shi, Yuyang; Dai, Bo; Zhao, Tuo (April 2021, International Conference on Artificial Intelligence and Statistics)

Full Text Available
BOND: Bert-Assisted Open-Domain Named Entity Recognition with Distant Supervision

https://doi.org/10.1145/3394486.3403149

Liang, Chen; Yu, Yue; Jiang, Haoming; Er, Siawpeng; Wang, Ruijia; Zhao, Tuo; Zhang, Chao (August 2020, ACM SIGKDD International Conference on Knowledge Discovery and Data Mining)

We study the open-domain named entity recognition (NER) problem under distant supervision. Distant supervision, though it does not require large amounts of manual annotation, yields highly incomplete and noisy distant labels via external knowledge bases. To address this challenge, we propose a new computational framework, BOND, which leverages the power of pre-trained language models (e.g., BERT and RoBERTa) to improve the prediction performance of NER models. Specifically, we propose a two-stage training algorithm: in the first stage, we adapt the pre-trained language model to the NER tasks using the distant labels, which can significantly improve the recall and precision; in the second stage, we drop the distant labels and propose a self-training approach to further improve the model performance. Thorough experiments on 5 benchmark datasets demonstrate the superiority of BOND over existing distantly supervised NER methods. The code and distantly labeled data have been released at https://github.com/cliang1453/BOND.
Full Text Available
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization

https://doi.org/10.18653/v1/2020.acl-main.197

Jiang, Haoming; He, Pengcheng; Chen, Weizhu; Liu, Xiaodong; Gao, Jianfeng; Zhao, Tuo (July 2020, Annual Meeting of the Association for Computational Linguistics)

Transfer learning has fundamentally changed the landscape of natural language processing (NLP). Many state-of-the-art models are first pre-trained on a large text corpus and then fine-tuned on downstream tasks. However, due to limited data resources from downstream tasks and the extremely high complexity of pre-trained models, aggressive fine-tuning often causes the fine-tuned model to overfit the training data of downstream tasks and fail to generalize to unseen data. To address such an issue in a principled manner, we propose a new learning framework for robust and efficient fine-tuning for pre-trained models to attain better generalization performance. The proposed framework contains two important ingredients: 1. Smoothness-inducing regularization, which effectively manages the complexity of the model; 2. Bregman proximal point optimization, which is an instance of trust-region methods and can prevent aggressive updating. Our experiments show that the proposed framework achieves new state-of-the-art performance on a number of NLP tasks including GLUE, SNLI, SciTail and ANLI. Moreover, it also outperforms the state-of-the-art T5 model, which is the largest pre-trained model containing 11 billion parameters, on GLUE.
Full Text Available
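
The SMART abstract above names smoothness-inducing regularization as one of its two ingredients. The sketch below illustrates the general idea on a toy linear classifier: penalize how much the predicted distribution changes when the input is perturbed within a small range. It is an assumption-laden illustration, not the authors' implementation. In particular, the worst-case perturbation is approximated by random sampling, symmetric KL divergence is an assumed choice of distance, and the Bregman proximal point step is not shown. All function names are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    """Toy stand-in for a fine-tuned model: softmax(W x)."""
    return softmax([sum(w * xi for w, xi in zip(row, x)) for row in weights])


def symmetric_kl(p, q):
    """KL(p || q) + KL(q || p) for two discrete distributions."""
    return sum(pi * math.log(pi / qi) + qi * math.log(qi / pi)
               for pi, qi in zip(p, q))


def smoothness_penalty(weights, x, epsilon, num_samples=8, seed=0):
    """Largest output change seen over sampled perturbations of size <= epsilon.

    Assumption: random sampling replaces a proper inner maximization.
    """
    rng = random.Random(seed)
    clean = predict(weights, x)
    worst = 0.0
    for _ in range(num_samples):
        perturbed = [xi + rng.uniform(-epsilon, epsilon) for xi in x]
        worst = max(worst, symmetric_kl(clean, predict(weights, perturbed)))
    return worst


weights = [[1.0, -1.0], [0.5, 2.0]]  # made-up parameters
x = [1.0, 2.0]                       # made-up input
penalty_none = smoothness_penalty(weights, x, epsilon=0.0)
penalty_small = smoothness_penalty(weights, x, epsilon=0.1)
print(penalty_none, penalty_small)
```

In training, a penalty of this kind would be added to the task loss with a weighting coefficient, so that parameters producing sharply varying outputs near the training inputs are discouraged.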
Transformer Hawkes Process

Zuo, Simiao; Jiang, Haoming; Li, Zichong; Zhao, Tuo; Zha, Hongyuan (July 2020, International Conference on Machine Learning)

Modern data acquisition routinely produces massive amounts of event sequence data in various domains, such as social media, healthcare, and financial markets. These data often exhibit complicated short-term and long-term temporal dependencies. However, most of the existing recurrent neural network-based point process models fail to capture such dependencies, and yield unreliable prediction performance. To address this issue, we propose a Transformer Hawkes Process (THP) model, which leverages the self-attention mechanism to capture long-term dependencies while remaining computationally efficient. Numerical experiments on various datasets show that THP outperforms existing models in terms of both likelihood and event prediction accuracy by a notable margin. Moreover, THP is quite general and can incorporate additional structural knowledge. We provide a concrete example, where THP achieves improved prediction performance for learning multiple point processes when incorporating their relational information.
Full Text Available
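
The Transformer Hawkes Process abstract above relies on self-attention to relate each event to earlier events in the sequence. The sketch below shows generic scaled dot-product self-attention with a causal restriction, so that event i only attends to events at positions up to i. It assumes causal masking (natural for event sequences, though not stated in the abstract) and uses the raw embeddings as queries, keys, and values. It is not the THP architecture itself, and the embeddings are made up.

```python
import math


def causal_self_attention(queries, keys, values):
    """Scaled dot-product attention where position i sees only positions j <= i."""
    d = len(keys[0])
    outputs = []
    for i, q in enumerate(queries):
        # Similarity of event i to itself and to every earlier event.
        scores = [sum(a * b for a, b in zip(q, keys[j])) / math.sqrt(d)
                  for j in range(i + 1)]
        # Softmax over the visible history (shifted by the max for stability).
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Weighted average of the visible value vectors.
        outputs.append([sum(w * values[j][k] for j, w in enumerate(weights))
                        for k in range(len(values[0]))])
    return outputs


# Made-up 2-dimensional event embeddings, for illustration only.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs = causal_self_attention(embeddings, embeddings, embeddings)
print(outputs)
```

Because every event attends directly to all earlier events, a distant event can influence the current representation in a single step, which is the property the abstract contrasts with recurrent models.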
Multi-Domain Neural Machine Translation with Word-Level Adaptive Layer-wise Domain Mixing

https://doi.org/10.18653/v1/2020.acl-main.165

Jiang, Haoming; Liang, Chen; Wang, Chong; Zhao, Tuo (July 2020, Annual Meeting of the Association for Computational Linguistics)

Full Text Available
