

Title: Crowdsourcing BIM-guided collection of construction material library from site photologs
Abstract

Background

With advances in technologies that enable massive visual data collection and BIM, the AEC industry now has an unprecedented amount of visual data (e.g., images and videos) and BIMs. One past effort to leverage these data is the Construction Material Library (CML), which was created for inferring construction progress by automatically detecting construction materials. The CML covers only a limited number of construction material classes because it is nearly impossible for an individual or a group of researchers to collect all possible variations of construction materials.

Methods

This paper proposes a web-based platform that streamlines the data collection process for creating annotated material patches guided by BIM overlays.

Result

Construction site images with BIM overlays are automatically generated after image-based 3D reconstruction. These images are deployed on a web-based platform for annotations.

Conclusion

The proposed crowdsourcing method using this platform has the potential to scale up data collection and expand the existing CML. A case study was conducted to validate the feasibility of the proposed method and to improve the web interface before deployment to a public cloud environment.
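A minimal sketch, not the paper's implementation, of how a BIM overlay and a candidate material patch might be produced once a site photo has been registered through image-based 3D reconstruction. The camera pose (rvec, tvec), the intrinsics K, and the 3D vertices of a BIM element are assumed inputs recovered from an SfM pipeline; OpenCV handles the projection.

```python
# Sketch only: project a BIM element into a registered site photo, draw the overlay
# shown to annotators, and crop a candidate material patch for the CML.
import cv2
import numpy as np

def project_bim_element(vertices_3d, rvec, tvec, K, dist=None):
    """Project 3D BIM element vertices (Nx3, same frame as the SfM reconstruction) into the image."""
    pts, _ = cv2.projectPoints(
        np.asarray(vertices_3d, dtype=np.float64), rvec, tvec, K,
        np.zeros(5) if dist is None else dist)
    return pts.reshape(-1, 2)

def overlay_and_crop(image, vertices_2d, color=(0, 255, 0)):
    """Draw the projected element outline and crop its bounding box as a material patch."""
    overlay = image.copy()
    poly = vertices_2d.astype(np.int32)
    cv2.polylines(overlay, [poly], isClosed=True, color=color, thickness=2)
    x, y, w, h = cv2.boundingRect(poly)
    patch = image[y:y + h, x:x + w]
    return overlay, patch

# Hypothetical usage for one registered photo and one BIM wall element:
# image = cv2.imread("site_photo_0001.jpg")
# verts2d = project_bim_element(wall_vertices, rvec, tvec, K)
# overlay_img, material_patch = overlay_and_crop(image, verts2d)
# cv2.imwrite("overlay_0001.png", overlay_img)   # shown on the web annotation platform
# cv2.imwrite("patch_0001.png", material_patch)  # candidate CML material patch
```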

 
NSF-PAR ID: 10273159
Author(s) / Creator(s):
Publisher / Repository: Springer Science + Business Media
Date Published:
Journal Name: Visualization in Engineering
Volume: 5
Issue: 1
ISSN: 2213-7459
Format(s): Medium: X
Sponsoring Org: National Science Foundation

More Like this
  1. Automated monitoring of dark web (DW) platforms on a large scale is the first step toward developing proactive Cyber Threat Intelligence (CTI). While there are efficient methods for collecting data from the surface web, large-scale dark web data collection is often hindered by anti-crawling measures. In particular, text-based CAPTCHA is the most prevalent and prohibitive of these measures in the dark web. Text-based CAPTCHA identifies and blocks automated crawlers by forcing the user to enter a combination of hard-to-recognize alphanumeric characters. In the dark web, CAPTCHA images are meticulously designed with additional background noise and variable character length to prevent automated CAPTCHA breaking. Existing automated CAPTCHA-breaking methods have difficulty overcoming these dark web challenges. As such, solving dark web text-based CAPTCHA has relied heavily on human involvement, which is labor-intensive and time-consuming. In this study, we propose a novel framework for automated breaking of dark web CAPTCHA to facilitate dark web data collection. This framework encompasses a novel generative method to recognize dark web text-based CAPTCHA with noisy backgrounds and variable character length. To eliminate the need for human involvement, the proposed framework utilizes a Generative Adversarial Network (GAN) to counteract dark web background noise and leverages an enhanced character segmentation algorithm to handle CAPTCHA images with variable character length. Our proposed framework, DW-GAN, was systematically evaluated on multiple dark web CAPTCHA testbeds. DW-GAN significantly outperformed the state-of-the-art benchmark methods on all datasets, achieving a success rate of over 94.4% on a carefully collected real-world dark web dataset. We further conducted a case study on an emergent Dark Net Marketplace (DNM) to demonstrate that DW-GAN eliminated human involvement by automatically solving CAPTCHA challenges in no more than three attempts. Our research enables the CTI community to develop advanced, large-scale dark web monitoring. We make the DW-GAN code available to the community as an open-source tool on GitHub.
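A minimal sketch, not the DW-GAN code, of the kind of pipeline the abstract describes: a denoising step (standing in for the GAN generator), segmentation of a variable number of characters via a vertical-projection profile, and per-character recognition. The `denoiser` and `char_classifier` callables are assumptions standing in for trained networks.

```python
# Sketch only: denoise -> binarize -> segment variable-length characters -> recognize.
import cv2
import numpy as np

def segment_characters(binary, min_width=4):
    """Split a binarized CAPTCHA into character crops using the vertical projection profile."""
    profile = (binary > 0).sum(axis=0)          # amount of "ink" per column
    in_char, start, boxes = False, 0, []
    for x, count in enumerate(profile):
        if count > 0 and not in_char:
            in_char, start = True, x
        elif count == 0 and in_char:
            in_char = False
            if x - start >= min_width:
                boxes.append((start, x))
    if in_char:
        boxes.append((start, len(profile)))
    return [binary[:, a:b] for a, b in boxes]

def solve_captcha(image, denoiser, char_classifier, max_attempts=3):
    """Loosely mirrors the retry budget in the abstract by trying a few binarization thresholds."""
    clean = denoiser(image)                      # a GAN generator would remove background noise here
    gray = cv2.cvtColor(clean, cv2.COLOR_BGR2GRAY)
    for attempt in range(max_attempts):
        _, binary = cv2.threshold(gray, 127 + 20 * attempt, 255, cv2.THRESH_BINARY_INV)
        chars = segment_characters(binary)
        if chars:
            return "".join(char_classifier(c) for c in chars)
    return None
```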
  2. Nowadays, to assess and document construction and building performance, large amounts of visual data are captured and stored through camera-equipped platforms such as wearable cameras, unmanned aerial/ground vehicles, and smartphones. However, because such visual data are recorded continuously, not all frames in the captured footage are intentionally taken, and thus not every frame is worth processing for construction and building performance analysis. Since many frames simply contain non-construction-related content, the content of each recorded frame would otherwise have to be manually reviewed for its relevance to the goal of the visual assessment before the data are processed. To address these challenges, this paper aims to automatically filter construction big visual data without requiring human annotations. To overcome the limitations of a purely discriminative approach that relies on manually labeled images, we construct a generative model from an unlabeled visual dataset and use it to find construction-related frames in big visual data from jobsites. First, through composition-based snap point detection together with domain adaptation, we filter out most of the accidentally recorded frames in the footage. Then, we create a discriminative classifier trained with visual data from jobsites to eliminate non-construction-related images. To evaluate the reliability of the proposed method, we obtained ground truth based on human judgment for each photo in our testing dataset. Despite learning without any explicit labels, the proposed method achieves a practically reasonable range of accuracy and generally outperforms prior snap point detection. The fidelity of the algorithm is discussed in detail through case studies. By being able to focus on selective visual data, practitioners will spend less time browsing large amounts of visual data and more time on leveraging the visual data to facilitate decision-making in built environments.
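A minimal sketch, under stated assumptions, of the two-stage filtering idea described above: first discard accidentally recorded frames (here approximated with a simple sharpness heuristic standing in for composition-based snap point detection), then keep only frames a construction-relevance classifier accepts. The `is_construction` callable is a hypothetical classifier trained on jobsite imagery.

```python
# Sketch only: coarse intentionality filter followed by a relevance classifier.
import cv2

def looks_intentional(frame, blur_threshold=100.0):
    """Crude stand-in for snap point detection: sharper frames score higher."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return cv2.Laplacian(gray, cv2.CV_64F).var() > blur_threshold

def filter_footage(frames, is_construction):
    """Keep only frames that are both intentionally captured and construction-related."""
    kept = []
    for frame in frames:
        if looks_intentional(frame) and is_construction(frame):
            kept.append(frame)
    return kept
```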
  3. Building Information Modelling (BIM) is an integrated informational process and plays a key role in enabling efficient planning and control of a project in the Architecture, Engineering, and Construction (AEC) domain. Industry Foundation Classes (IFC)-based BIM allows building information to be interoperable among different BIM applications. Different stakeholders take on different responsibilities in a project and therefore keep different types of information to meet project requirements. In this paper, the authors proposed and adopted a six-step methodology to support BIM interoperability between architectural design and structural analysis at both the AEC project level and the information level, in which: (1) the intrinsic and extrinsic information transferred between architectural models and structural models was analyzed and demonstrated by a Business Process Model and Notation (BPMN) model that the authors developed; (2) the proposed technical routes, with different combinations and their applications to different project delivery methods, provided new instruments for efficient and accurate decision-making by stakeholders in industry; (3) the material-centered invariant signature, which is portable, can improve information exchange between different data formats and models to support interoperable BIM applications; and (4) a formal material information representation and checking method was developed and tested on a case study, where its efficiency was demonstrated to outperform (a) proprietary representations and an information checking method based on manual operation, and (b) an MVD-based information checking method. The proposed invariant-signature-based material information representation and checking method improves the efficiency of information transfer between architectural design and structural analysis, which can have a significant positive effect on project delivery given the frequent and iterative updates of a project design. It improves information transfer and coordination between architects and structural engineers and therefore the efficiency of the whole project. The proposed method can be extended and applied to other application phases and functions such as cost estimation, scheduling, and energy analysis.
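A minimal sketch, not the authors' formalism, of what a material-centered "invariant signature" check could look like: material properties are canonicalized into a representation that does not depend on the authoring tool or file format, and signatures from the architectural and structural models are compared. The property names and dictionary layout here are illustrative assumptions.

```python
# Sketch only: format-independent material signatures compared across two models.
import hashlib
import json

def material_signature(material: dict) -> str:
    """Build a canonical, format-independent signature from key material properties."""
    canonical = {
        "name": material.get("name", "").strip().lower(),
        "density_kg_m3": round(float(material.get("density", 0.0)), 3),
        "elastic_modulus_mpa": round(float(material.get("elastic_modulus", 0.0)), 3),
        "grade": material.get("grade", "").strip().upper(),
    }
    payload = json.dumps(canonical, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def check_consistency(arch_materials, struct_materials):
    """Report materials whose signatures differ between the architectural and structural models."""
    arch = {m["name"].lower(): material_signature(m) for m in arch_materials}
    struct = {m["name"].lower(): material_signature(m) for m in struct_materials}
    return [name for name in arch.keys() & struct.keys() if arch[name] != struct[name]]
```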
  4. Abstract

    Many vision‐based indoor localization methods require tedious and comprehensive pre‐mapping of built environments. This research proposes a mapping‐free approach to estimating indoor camera poses based on a 3D style‐transferred building information model (BIM) and a photogrammetry technique. To address the cross‐domain gap between virtual 3D models and real‐life photographs, a CycleGAN model was developed to transform BIM renderings into photorealistic images. A photogrammetry‐based algorithm was developed to estimate camera pose using the visual and spatial information extracted from the style‐transferred BIM. The experiments demonstrated the efficacy of CycleGAN in bridging the cross‐domain gap, which significantly improved performance in terms of image retrieval and feature correspondence detection. With the 3D coordinates retrieved from BIM, the proposed method can achieve near real‐time camera pose estimation with an accuracy of 1.38 m and 10.1° in indoor environments.
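A minimal sketch, under assumptions, of the pose-estimation step described above: 2D keypoints in a query photo are matched to a style-transferred BIM rendering for which the 3D coordinate of each matched pixel is retrievable from BIM, and the camera pose is recovered with a PnP solver. Image retrieval, feature matching, and the CycleGAN style transfer itself are outside this sketch.

```python
# Sketch only: recover camera pose from 2D-3D correspondences with RANSAC PnP.
import cv2
import numpy as np

def estimate_pose(pts_2d_query, pts_3d_bim, K, dist_coeffs=None):
    """Estimate camera rotation/translation from matched 2D image points and 3D BIM points."""
    dist = np.zeros(5) if dist_coeffs is None else dist_coeffs
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.asarray(pts_3d_bim, dtype=np.float64),
        np.asarray(pts_2d_query, dtype=np.float64),
        K, dist)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)               # rotation vector -> 3x3 rotation matrix
    camera_position = (-R.T @ tvec).ravel()  # camera center in the BIM coordinate frame
    return camera_position, rvec, tvec, inliers
```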

     
  5. Abstract

    Motivation

    Timetrees depict evolutionary relationships between species and the geological times of their divergence. Hundreds of research articles containing timetrees are published in scientific journals every year. The TimeTree (TT) project has been manually locating, curating and synthesizing timetrees from these articles for almost two decades into a TimeTree of Life, delivered through a unique, user-friendly web interface (timetree.org). The manual process of finding articles containing timetrees is becoming increasingly expensive and time-consuming. So, we have explored the effectiveness of text-mining approaches and developed optimizations to find research articles containing timetrees automatically.

    Results

    We have developed an optimized machine learning system to determine whether a research article contains an evolutionary timetree appropriate for inclusion in the TT resource. We found that BERT classification fine-tuned on whole-text articles achieved an F1 score of 0.67, which we increased to 0.88 by text-mining article excerpts surrounding mentions of figures. The new method is implemented in the TimeTreeFinder (TTF) tool, which automatically processes millions of articles to discover timetree-containing articles. We estimate that the TTF tool would identify twice as many timetree-containing articles as have been discovered manually, and their inclusion in the TT database would potentially double the knowledge accessible to the wider community. Manual inspection showed a precision of 87% on out-of-distribution, recently published articles. This automation will speed up the collection and curation of timetrees at much lower human and time costs.
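A minimal sketch, not the TimeTreeFinder code, of the excerpt-based classification idea: pull text windows around mentions of figures and score them with a fine-tuned BERT sequence classifier. The model path is a placeholder assumption, not a published checkpoint; the actual tool is linked under "Availability and implementation" below.

```python
# Sketch only: classify figure-centered excerpts with a fine-tuned BERT model.
import re
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_PATH = "path/to/fine-tuned-bert"   # placeholder checkpoint, assumed to exist
tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_PATH, num_labels=2)

def figure_excerpts(full_text, window=500):
    """Return text windows centered on figure mentions (e.g., 'Fig. 2', 'Figure 3')."""
    excerpts = []
    for m in re.finditer(r"\bFig(?:ure)?\.?\s*\d+", full_text, flags=re.IGNORECASE):
        start = max(0, m.start() - window)
        excerpts.append(full_text[start:m.end() + window])
    return excerpts

def contains_timetree(full_text, threshold=0.5):
    """Flag an article as timetree-containing if any figure excerpt scores above threshold."""
    for excerpt in figure_excerpts(full_text):
        inputs = tokenizer(excerpt, truncation=True, max_length=512, return_tensors="pt")
        with torch.no_grad():
            probs = torch.softmax(model(**inputs).logits, dim=-1)
        if probs[0, 1].item() > threshold:
            return True
    return False
```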

    Availability and implementation

    https://github.com/marija-stanojevic/time-tree-classification.

    Supplementary information

    Supplementary data are available at Bioinformatics online.

     