Abstract Text augmentation is an effective technique in alleviating overfitting in NLP tasks. In existing methods, text augmentation and downstream tasks are mostly performed separately. As a result, the augmented texts may not be optimal to train the downstream model. To address this problem, we propose a three-level optimization framework to perform text augmentation and the downstream task end-to- end. The augmentation model is trained in a way tailored to the downstream task. Our framework consists of three learning stages. A text summarization model is trained to perform data augmentation at the first stage. Each summarization example is associated with a weight to account for its domain difference with the text classification data. At the second stage, we use the model trained at the first stage to perform text augmentation and train a text classification model on the augmented texts. At the third stage, we evaluate the text classification model trained at the second stage and update weights of summarization examples by minimizing the validation loss. These three stages are performed end-to-end. We evaluate our method on several text classification datasets where the results demonstrate the effectiveness of our method. Code is available at https://github.com/Sai-Ashish/End-to-End-Text-Augmentation.
more »
« less
This content will become publicly available on April 1, 2026
System-Level Design Space Exploration for High-Level Synthesis Under End-to-End Latency Constraints
- Award ID(s):
- 1844952
- PAR ID:
- 10625524
- Publisher / Repository:
- IEEE
- Date Published:
- Journal Name:
- IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
- Volume:
- 44
- Issue:
- 4
- ISSN:
- 0278-0070
- Page Range / eLocation ID:
- 1354 to 1365
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
Convolutional Neural Networks (CNNs) are widely used due to their effectiveness in various AI applications such as object recognition, speech processing, etc., where the multiply-and-accumulate (MAC) operation contributes to ∼95% of the computation time. From the hardware implementation perspective, the performance of current CMOS-based MAC accelerators is limited mainly due to their von-Neumann architecture and corresponding limited memory bandwidth. In this way, silicon photonics has been recently explored as a promising solution for accelerator design to improve the speed and power-efficiency of the designs as opposed to electronic memristive crossbars. In this work, we briefly study recent silicon photonics accelerators and take initial steps to develop an open-source and adaptive crossbar architecture simulator for that. Keeping the original functionality of the MNSIM tool [1], we add a new photonic mode that utilizes the pre-existing algorithm to work with a photonic Phase Change Memory (pPCM) based crossbar structure. With inputs from the CNN's topology, the accelerator configuration, and experimentally-benchmarked data, the presented simulator can report the optimal crossbar size, the number of crossbars needed, and the estimation of total area, power, and latency.more » « less
-
Disfluency detection and classification on children’s speech has a great potential for teaching reading skills. Word-level assessment of children’s speech can help teachers to effectively gauge their students’ progress. Hence, we propose a novel attention-based model to perform word-level disfluency detection and classification in a fully end-to-end (E2E) manner making it fast and easy to use. We develop a word-level disfluency annotation scheme using which we annotate a dataset of children read speech, the reading races dataset (READR). We also annotate disfluencies in the existing CMU Kids corpus. The proposed model significantly outperforms traditional cascaded baselines, which use forced alignments, on both datasets. To deal with the inevitable class-imbalance in the datasets, we propose a novel technique called HiDeC (Hierarchical Detection and Classification) which yields a detection improvement of 23% and 16% and a classification improvement of 3.8% and 19.3% relative F1-score on the READR and CMU Kids datasets respectively.more » « less
-
Dialog history enhances downstream classification performance in both speech and text based dialog systems. However, there still exists a gap in dialog history integration in a fully end-to-end (E2E) spoken dialog system (SDS) versus a textual dia- log system. Text-based dialog systems use large language models (LLMs) to encode long-range dependencies by attending to the entire conversation as a contiguous token sequence. This is not possible in an E2E SDS, as speech sequences can be intractably long. We propose a convolution subsampling approach to make the speech sequence of a conversation tractable and use a conformer to attend to the speech-based conversation in a fine-grained manner. This model is further enhanced via a conversation-level knowledge transfer from a LLM using a token-level alignment strategy. Finetuning the E2E model pretrained this way gives significant gains, of up to 8%, over strong non-contextual baselines in the E2E dialog act classification task on two datasets.more » « less
An official website of the United States government
