Most existing automated code compliance checking (ACC) methods cannot convert complex building-code requirements into computer-processable forms fully automatically. Such requirements usually have hierarchically complex clause and sentence structures. There is, thus, a need to decompose them into hierarchies of much smaller, manageable requirement units that most existing ACC methods can process. Rule-based methods have been used to deal with such complex requirements and have achieved high performance. However, they lack scalability, because the rules are developed manually and need to be updated and/or adapted when applied to a different type of building code. More research is, thus, needed to develop a scalable method that automatically converts complex requirements into hierarchies of requirement units to facilitate the succeeding steps of ACC, such as information extraction and compliance reasoning. To address this need, this paper proposes a new, machine learning-based method to automatically extract requirement hierarchies from building codes. The proposed method consists of five main steps: (1) data preparation and preprocessing; (2) data adaptation; (3) deep neural network model training for dependency parsing; (4) automated requirement segmentation and restriction interpretation based on the extracted dependencies; and (5) evaluation. The proposed method was trained using the English Treebank data and tested on sentences from the 2009 International Building Code (IBC) and the Champaign 2015 IBC Amendments. The preliminary results show that the proposed method achieved an average normalized edit distance of 0.32, a precision of 89%, a recall of 76%, and an F1-measure of 82%, which indicates good requirement hierarchy extraction performance.
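The evaluation metrics named above can be illustrated with a minimal sketch. It assumes the edit distance is a Levenshtein distance over sequences of requirement-unit labels, normalized by the length of the longer sequence; the abstract does not state the exact normalization, so that choice, the function names, and the example labels are assumptions. The F1 helper uses the standard harmonic mean of precision and recall.

```python
from typing import Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two sequences of items."""
    prev = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        curr = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_edit_distance(a: Sequence[str], b: Sequence[str]) -> float:
    """Edit distance divided by the longer length (assumed normalization)."""
    longest = max(len(a), len(b))
    return edit_distance(a, b) / longest if longest else 0.0


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical extracted vs. gold-standard requirement-unit labels.
predicted = ["R1", "R1.1", "R1.2"]
gold = ["R1", "R1.1", "R1.1.1", "R1.2"]
print(normalized_edit_distance(predicted, gold))

# Consistency check on the reported precision and recall.
print(round(f1_score(0.89, 0.76), 2))
```

With the reported precision of 89% and recall of 76%, the harmonic mean rounds to 0.82, which agrees with the reported F1-measure.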
Interactive Visual Representation of Inter-Connected Requirements in Building Codes
To facilitate a better understanding of building codes, the visualization of the embedded structures of the provisions and requirements of the codes is needed. Existing research efforts in building code compliance checking mostly do not purposefully represent building codes in formats that facilitate human understanding and interaction with the codes, such as XML and hypertext (text with links to other text). Visual programming approaches commonly represent building codes more visually, as flowcharts. However, flowcharts are static, and their generation is still manual. To address this lack of interactive visual representation of building code requirement structures, this paper proposes an automated building code structure extraction and visualization method that shows the inter-connections between requirements clearly and allows intuitive user interaction. The method, named Building Code Network Generator (BCNG), extracts the chapter-section-subsection hierarchical structure and the cross-reference structure of a building code and automatically generates an interactive visualization as a directed network. The performance of the proposed BCNG was empirically tested on Chapters 5 and 10 of the International Building Code 2015, with a resulting precision, recall, and F1-score of 99.4%, 96.3%, and 97.8%, respectively. In addition, the extracted hierarchical and cross-reference structures were displayed using an open-source network visualization tool to facilitate human understanding of, and interaction with, the building code requirements in automated compliance checking systems.
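The idea of a directed network that combines a section hierarchy with cross-references can be sketched as follows. This is not the BCNG implementation, which the abstract does not describe in detail. It is a minimal illustration that assumes section identifiers follow a dotted numbering scheme (for example `1004.1`), that a parent is obtained by dropping the last dotted component, and that cross-references appear as the word "Section" followed by an identifier. The regular expressions, function names, and sample text are all hypothetical.

```python
import re
from typing import Dict, List, Optional, Tuple

# Assumed identifier shape: 3-4 digits plus optional dotted sub-levels.
SECTION_ID = r"\d{3,4}(?:\.\d+)*"
# Assumed cross-reference shape: "Section 1005.1" or "Sections 1005.1".
# Only the first identifier after the keyword is captured.
CROSS_REF = re.compile(r"\bSections?\s+(" + SECTION_ID + r")")

Edge = Tuple[str, str, str]  # (source, target, edge type)


def parent_of(section_id: str) -> Optional[str]:
    """Return the parent identifier, or None for a top-level section."""
    if "." not in section_id:
        return None
    return section_id.rsplit(".", 1)[0]


def build_edges(sections: Dict[str, str]) -> List[Edge]:
    """Build directed edges for hierarchy and cross-reference links."""
    edges: List[Edge] = []
    for sec_id, text in sections.items():
        parent = parent_of(sec_id)
        if parent is not None:
            edges.append((parent, sec_id, "hierarchy"))
        for target in CROSS_REF.findall(text):
            if target != sec_id:  # skip self-references
                edges.append((sec_id, target, "cross-reference"))
    return edges


# Placeholder text written for illustration, not quoted from the IBC.
sample = {
    "1004": "Placeholder heading.",
    "1004.1": "This provision shall comply with Section 1005.1.",
    "1005": "Placeholder heading.",
    "1005.1": "Placeholder provision.",
}
for edge in build_edges(sample):
    print(edge)
```

The resulting edge list could be loaded into any network visualization tool. As a consistency check on the reported figures, a precision of 99.4% and a recall of 96.3% give a harmonic mean of about 97.8%, matching the reported F1-score.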
- Award ID(s): 1827733
- PAR ID: 10324475
- Date Published:
- Journal Name: Construction Research Congress 2022
- Page Range / eLocation ID: 1004 to 1012
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- One main challenge in the full automation of building code compliance checking is the extraction and transformation of building code requirements into computable representations. A semantic rule-based approach has mainly been taken, because it is expected to perform better than a machine learning-based approach on this particular task. With the recent advancement in deep learning AI, particularly the launch of ChatGPT by OpenAI, there is potential for this landscape to shift, given the highly regarded capabilities of ChatGPT in processing (i.e., understanding and generating) natural language texts and computer codes. In this paper, the author preliminarily explored the use of ChatGPT in converting (i.e., extracting and transforming) building code requirements into computer codes, and compared it with the results from a cutting-edge semantic rule-based approach. It was found that, compared with the semantic rule-based approach, the conversion results from ChatGPT still have limitations, but there is great potential for it to help speed up the implementation and scale-up of automated building code compliance checking systems.
- Existing automated code checking methods/tools are unable to automatically analyze and represent all types of requirements (e.g., requirements that are too complex or that require human judgement). Recent efforts in the area of augmented data analytics have proposed the use of templates to facilitate the analysis of text. However, most of these efforts have constructed such templates manually, which is labor-intensive. More importantly, it is difficult for manually developed templates to capture the linguistic variations in building codes. More research is, thus, needed to automate the generation of templates to support the tagging and extraction of information from building codes. To address this need, this paper proposes an unsupervised machine learning-based method to extract sentence templates that describe syntactic and semantic features and patterns from building codes. The proposed method is composed of four main steps: (1) data preprocessing; (2) identifying the different groups of sentence fragments using clustering; (3) identifying the fixed parts and the slots in the templates based on the syntactic and semantic patterns of the sentence fragment groups; and (4) evaluating the extracted templates. The proposed method was implemented and tested on a corpus of text from the International Building Code. An accuracy of 0.76 was achieved.
- As the number, size, and complexity of building construction projects increase, code compliance checking becomes more challenging because of the time-consuming, costly, and error-prone nature of a manual checking process. Fully automated code compliance checking would be desirable to make code checking more efficient, cost-effective, and human error-proof. Such automation requires automated information extraction from building designs and building codes, and automated information transformation to a format that allows automated reasoning. Natural Language Processing (NLP) is an important technology to support such automated processing of building codes, because building codes are represented in natural language texts. Part-of-speech (POS) tagging, as an important basis of NLP tasks, must have a high performance to ensure the quality of the automated processing of building codes in such a compliance checking system. However, no systematic testing of existing POS taggers on domain-specific building code data has been performed. To address this gap, the authors analyzed the performance of seven state-of-the-art POS taggers on tagging building codes and compared their results to a manually labeled gold standard. The authors aim to: (1) find the best performing tagger in terms of accuracy, and (2) identify common sources of errors. In providing the POS tags, the authors used the Penn Treebank tagset, which is a widely used tagset with a proper balance between conciseness and information richness. An average accuracy of 88.80% was found on the testing data. The Stanford CoreNLP tagger outperformed the other taggers in the experiment. Common sources of errors were identified to be: (1) word ambiguity, (2) rare words, and (3) unique meaning of common English words in the construction context. The measured performance of machine taggers on building codes calls for improvement, for example through error-fixing transformational rules and machine taggers that are trained on building codes.
- Traditional manual building code compliance checking is costly, time-consuming, and human error-prone. With the adoption of Building Information Modeling (BIM), automation in such a checking process becomes more feasible. However, existing methods still face limited automation when applied to different building codes. To address that, in this paper, the authors proposed a new framework that requires minimal input from users and strives for full automation, namely, the Invariant signature, logic reasoning, and Semantic Natural language processing (NLP)-based Automated building Code compliance Checking (I-SNACC) framework. The authors developed an automated building code compliance checking (ACC) prototype system under this framework and tested it on Chapter 10 of the International Building Code 2015 (IBC 2015). The system was tested on two real projects and achieved 95.2% precision and 100% recall in non-compliance detection. The experiment showed that the framework is promising in building code compliance checking. Compared to the state-of-the-art methods, the new framework increases the degree of automation and saves manual effort in finding non-compliance cases.

