Title: A Survey of Knowledge-Enhanced Text Generation
The goal of text-to-text generation is to enable machines to express themselves like humans in applications such as conversation, summarization, and translation. It is one of the most important yet challenging tasks in natural language processing (NLP). Various neural encoder-decoder models have been proposed to achieve this goal by learning to map input text to output text. However, the input text alone often provides limited knowledge for generating the desired output, so the performance of text generation remains far from satisfactory in many real-world scenarios. To address this issue, researchers have considered incorporating (i) internal knowledge embedded in the input text and (ii) external knowledge from outside sources, such as knowledge bases and knowledge graphs, into the text generation system. This research topic is known as knowledge-enhanced text generation. In this survey, we present a comprehensive review of the research on this topic over the past five years. The main content includes two parts: (i) general methods and architectures for integrating knowledge into text generation; (ii) specific techniques and applications according to different forms of knowledge data. This survey is intended for a broad audience of researchers and practitioners in academia and industry.
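As a rough illustration of the external-knowledge idea described in the abstract (not a method taken from the survey itself), a retrieval step can prepend relevant facts from a knowledge base to the generation input before it reaches an encoder-decoder model. The toy knowledge base, the keyword-match retrieval, and the `[SEP]` delimiter below are all illustrative assumptions:

```python
# Minimal sketch: augment a generation input with retrieved external knowledge.
# The toy knowledge base and keyword-match retrieval are illustrative only.

TOY_KB = {
    "eiffel tower": "The Eiffel Tower is a landmark in Paris, France.",
    "mount everest": "Mount Everest is the highest mountain above sea level.",
}

def retrieve_facts(input_text, kb):
    """Return KB facts whose key entities appear in the input text."""
    text = input_text.lower()
    return [fact for entity, fact in kb.items() if entity in text]

def build_augmented_input(input_text, kb):
    """Concatenate retrieved facts with the original input, so an
    encoder-decoder model can condition on both."""
    facts = retrieve_facts(input_text, kb)
    if not facts:
        return input_text
    return " ".join(facts) + " [SEP] " + input_text

augmented = build_augmented_input("How tall is the Eiffel Tower?", TOY_KB)
print(augmented)
```

Real systems replace the keyword match with learned retrieval and fuse the knowledge inside the model rather than in the input string, but the input-augmentation view above is the simplest instance of the idea.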
Award ID(s): 1849816, 1901059, 2119531
PAR ID: 10334382
Journal Name: ACM Computing Surveys
ISSN: 0360-0300
Format(s): Medium: X
Sponsoring Org: National Science Foundation
More Like this
  1. Co-creation in higher education is the process in which students collaborate with instructors to design the curriculum and associated educational material. This can take place in different scenarios, such as integrating co-creation into an ongoing course, modifying a previously taken course, or creating a new course. In this Work-In-Progress, we investigate training and formative assessment models for preparing graduate students in engineering to participate as co-creators of educational material on an interdisciplinary topic. The topic of this co-creation project is cyber-physical systems engineering and product lifecycle management, with application to structural health monitoring. This entails not only topics from the disciplines of civil, computer, electrical, and environmental engineering, business, and information sciences, but also humanistic issues of sustainability, environment, and ethical and legal concerns in data-driven decision-making that support the control of cyber-physical systems. Aside from the objective of creating modules accessible to students with different levels of disciplinary knowledge, the goal of this research is to investigate whether the co-creation process and the resulting modules also promote interest and engagement in interdisciplinary research. A literature survey of effective training approaches for co-creation and the associated educational theories is summarized. For students, essential training components include providing (i) opportunities to align their interests, knowledge, skills, and values with the topic presented; (ii) experiential learning on the topic to help develop and enhance critical thinking and question-posing skills; and (iii) safe spaces to reflect and to voice their opinions, concerns, and suggestions.
In this research, we investigate the adaptation of project-based learning (PjBL) strategies and practices to support items (i) and (ii), and the use of focus groups for participatory action research (PAR) as safe spaces for reflection, feedback, and action in item (iii). The co-creation process is assessed through qualitative analysis of data collected through the PjBL activities, the PAR focus groups, and other qualitative sources (i.e., focus group transcripts, interview transcripts, project materials, fieldnotes, etc.). The eventual outcome of the co-creation process will be an online course module designed to be integrated into existing engineering graduate and undergraduate courses at four institutions: two state universities and two historically black colleges and universities.
  2. Cyber-physical systems (CPS) have been increasingly attacked by hackers. CPS are especially vulnerable to attackers who have full knowledge of the system's configuration. Therefore, novel anomaly detection algorithms that remain effective in the presence of a knowledgeable adversary need to be developed. However, this research is still in its infancy due to limited availability of attack data and test beds. By proposing a holistic attack modeling framework, we aim to show the vulnerability of existing detection algorithms and to provide a basis for novel sensor-based cyber-attack detection. Stealthy Attack GEneration (SAGE) for CPS serves as a tool for cyber-risk assessment of existing systems and detection algorithms for practitioners and researchers alike. Stealthy attacks are characterized by malicious injections into the CPS through input, output, or both, which produce only bounded changes in the detection residue. Using the SAGE framework, we generate stealthy attacks that achieve three objectives: (i) maximize damage, (ii) avoid detection, and (iii) minimize the attack cost. Additionally, an attacker needs to adhere to the physical principles of the CPS (objective iv). The goal of SAGE is to model worst-case attacks, in which we assume limited information asymmetry between attackers and defenders (e.g., insider knowledge held by the attacker). These worst-case attacks are the hardest to detect, yet they are common in practice, and modeling them makes the maximum conceivable damage understandable. We propose an efficient solution procedure for the novel SAGE optimization problem. The SAGE framework is illustrated in three case studies, which serve as modeling guidelines for the development of novel attack detection algorithms and for comprehensive cyber-physical risk assessment of CPS. The results show that SAGE attacks can cause severe damage to a CPS while changing the input control signals only minimally, which avoids detection and keeps the cost of an attack low. This highlights the need for more advanced detection algorithms and novel research in cyber-physical security.
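To make the "bounded detection residue" idea concrete, the toy simulation below shows a stealthy sensor-bias ramp slipping past a simple residual-based detector while an abrupt injection of the same total size is caught. This is a sketch in the spirit of the SAGE objectives, not the SAGE method itself; the plant model, threshold, and attack schedule are all illustrative assumptions:

```python
# Minimal sketch of a stealthy sensor-bias attack against a residual-based
# detector: slow ramps keep each per-step residual below the alarm threshold
# while the accumulated damage grows large.

THRESHOLD = 0.5   # detector alarms when |residual| exceeds this bound

def detector(expected, measured):
    """Simple residual test: compare a measurement against the prediction."""
    return abs(measured - expected) > THRESHOLD

def simulate(attack_step, horizon=20):
    """Constant plant output; the attacker adds attack_step of sensor bias
    per time step. Returns (total_bias, alarm_raised)."""
    true_output = 1.0
    bias = 0.0
    alarm = False
    for _ in range(horizon):
        bias += attack_step
        measured = true_output + bias
        # The naive detector compares against the previously accepted
        # measurement, so a slow ramp yields a small per-step residual.
        expected = true_output + (bias - attack_step)
        alarm = alarm or detector(expected, measured)
    return bias, alarm

# A slow ramp accumulates a large bias without triggering the detector,
# while an abrupt injection of the same total size raises the alarm.
stealthy_bias, stealthy_alarm = simulate(attack_step=0.4)
abrupt_bias, abrupt_alarm = simulate(attack_step=8.0, horizon=1)
print(stealthy_bias, stealthy_alarm)   # large bias, no alarm
print(abrupt_bias, abrupt_alarm)       # same-order bias, alarm raised
```

The gap between the two runs is exactly what stealthy-attack generation exploits, and why detectors that only test per-step residuals are insufficient.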
  3. A Natural Language Interface (NLI) enables the use of human languages to interact with computer systems, including smartphones and robots. Compared to other types of interfaces, such as command-line interfaces (CLIs) or graphical user interfaces (GUIs), NLIs stand to give more people access to the functionality behind databases or APIs, as they only require knowledge of natural languages. Many NLI applications involve structured data for the domain (e.g., hotel booking, product search, and factual question answering). Thus, to fully process user questions, understanding of structured data is crucial for the model, in addition to natural language comprehension. In this paper, we study neural network methods for building NLIs, with a focus on learning structured data representations that can generalize to novel data sources and schemata not seen at training time. Specifically, we review two tasks related to natural language interfaces: (i) semantic parsing, where we focus on text-to-SQL for database access, and (ii) task-oriented dialog systems for API access. We survey representative methods for the text-to-SQL and task-oriented dialog tasks, focusing on representing and incorporating structured data. Lastly, we present two of our original studies on structured data representation methods for NLIs that enable access to (i) databases and (ii) visualization APIs.
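A common ingredient in schema-generalizing text-to-SQL models is serializing the question together with the database schema into one model input, so the same encoder can handle tables and columns it never saw at training time. The delimiter tokens and the toy schema below are hypothetical illustrations, not a format from the surveyed papers:

```python
# Minimal sketch: serialize a question plus a database schema into a single
# input string for a text-to-SQL model. [TAB]/[COL] delimiters and the toy
# schema are illustrative assumptions.

def serialize(question, schema):
    """schema: dict mapping table name -> list of column names."""
    parts = [question]
    for table, columns in schema.items():
        parts.append("[TAB] " + table + " [COL] " + " [COL] ".join(columns))
    return " ".join(parts)

toy_schema = {
    "hotels": ["name", "city", "price"],
    "bookings": ["hotel_name", "guest", "date"],
}
encoded = serialize("Which hotels in Paris cost under 100?", toy_schema)
print(encoded)
```

Because the schema travels with every input, a model trained this way can, in principle, be pointed at a new database by swapping in its schema at inference time.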
  4. This report discusses the importance of network security. Network security matters because it prevents hackers from gaining access to data and personal information. Users have their data stolen every day and fear that their information will be exposed to the world. In this paper, I discuss the importance of network security and how it can change your day-to-day life. In addition, I survey computer science majors to see whether they consider network security important, and I also survey a DISA employee to get his perspective and comments on this topic. The best method for incorporating both user input and research into this paper is to use user input to support the research; user input is a valuable addition because it gives readers a real-world opinion on whether this topic is valid.
  5. Fields in the social sciences, such as education research, have started to expand the use of computer-based research methods to supplement traditional research approaches. Natural language processing techniques, such as topic modeling, may support qualitative data analysis by providing early categories that researchers can interpret and refine. This study contributes to this body of research and answers the following research questions: (RQ1) What is the relative coverage of the latent Dirichlet allocation (LDA) topic model and human coding in terms of the breadth of the topics/themes extracted from the text collection? (RQ2) What is the relative depth, or level of detail, among topics identified using LDA topic models and human coding approaches? A dataset of student reflections was qualitatively analyzed using both LDA topic modeling and human coding, and the results were compared. The findings suggest that topic models can provide reliable coverage and depth of the themes present in a textual collection, comparable to human coding, but require manual interpretation of the topics. The breadth and depth of human coding output are heavily dependent on the expertise of the coders and the size of the collection; these factors are better handled by the topic modeling approach.