
This content will become publicly available on January 1, 2024

Title: What Do NLP Researchers Believe? Results of the NLP Community Metasurvey
Journal Name:
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Page Range / eLocation ID:
16334 to 16368
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Traditional manual building code compliance checking is costly, time-consuming, and error-prone. With the adoption of Building Information Modeling (BIM), automating this checking process becomes more feasible. However, existing methods still offer limited automation when applied to different building codes. To address this, the authors proposed a new framework that requires minimal user input and strives for full automation: the Invariant signature, logic reasoning, and Semantic Natural language processing (NLP)-based Automated building Code compliance Checking (I-SNACC) framework. The authors developed an automated building code compliance checking (ACC) prototype system under this framework and tested it on Chapter 10 of the International Building Code 2015 (IBC 2015). On two real projects, the system achieved 95.2% precision and 100% recall in non-compliance detection. The experiments show that the framework is promising for building code compliance checking. Compared with state-of-the-art methods, the new framework increases the degree of automation and saves manual effort in finding non-compliance cases.
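The precision and recall figures reported above follow the standard definitions over flagged non-compliance cases. A minimal sketch, using hypothetical true/false positive counts (not the paper's actual case counts) that happen to reproduce 95.2% precision and 100% recall:

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN),
    where positives are detected non-compliance cases."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical counts: 20 true detections, 1 false alarm, 0 missed cases.
p, r = precision_recall(20, 1, 0)
# round(p, 3) -> 0.952, r -> 1.0
```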
  2. Despite many successful phishing email detectors, phishing emails still cost businesses and individuals millions of dollars per year. Most of these models rely on features like n-grams and part-of-speech tags while ignoring features like word count, stopword count, and punctuation. Previous phishing email research typically ignores or removes stopwords, and punctuation-related features play only a minor role in detectors. Yet with a deliberately unconventional focus on word counts, stopwords, punctuation, and uniqueness factors, an ensemble learning model based on a linear-kernel SVM achieved a true positive rate of 83% and a true negative rate of 96%. Moreover, these features can be extracted robustly even from noisy email data: they are much easier to detect than correct part-of-speech tags or named entities in emails.
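The surface features the abstract above emphasizes (word count, stopword count, punctuation, uniqueness) are cheap to compute from raw text. A minimal sketch of such feature extraction; the function name, the tiny stopword list, and the example email are hypothetical, not from the paper, which would feed features like these into its SVM-based ensemble:

```python
import string

# Tiny illustrative stopword list; a real detector would use a fuller
# set (e.g. a standard stopword corpus).
STOPWORDS = {"the", "a", "an", "is", "to", "and", "of", "in", "you", "your"}

def email_features(text):
    """Extract simple surface features: word count, stopword count,
    punctuation count, and a uniqueness ratio (distinct / total words)."""
    punct_count = sum(1 for ch in text if ch in string.punctuation)
    words = [w.strip(string.punctuation).lower() for w in text.split()]
    words = [w for w in words if w]
    word_count = len(words)
    stopword_count = sum(1 for w in words if w in STOPWORDS)
    uniqueness = len(set(words)) / word_count if word_count else 0.0
    return {
        "word_count": word_count,
        "stopword_count": stopword_count,
        "punct_count": punct_count,
        "uniqueness": uniqueness,
    }

feats = email_features("Verify your account now! Click the link to avoid suspension.")
# feats["word_count"] -> 10, feats["stopword_count"] -> 3,
# feats["punct_count"] -> 2, feats["uniqueness"] -> 1.0
```

Unlike part-of-speech tags or named entities, these counts are unaffected by the grammatical noise common in phishing emails, which is the robustness argument the abstract makes.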