skip to main content


Title: Towards Intelligent Agents to Assist in Modular Construction: Evaluation of Datasets Generated in Virtual Environments for AI training
Modular construction aims at overcoming challenges faced by the traditional construction process such as the shortage of skilled workers, fast-track project requirements, and cost associated with on-site productivity losses and recurrent rework. Since manufacturing is done off-site in controlled factory settings, modular construction is associated with increased productivity and better quality control. However, because every construction project is unique and results in distinct work pieces and building elements to be assembled, modular construction factories necessitate better mechanisms to assist workers during the assembly process in order to minimize errors in selecting the pieces to be assembled and idle times while figuring out the next step in an assembly sequence. Machine intelligence provides opportunities for such assistance; however, a challenge is to rapidly generate large datasets with rich contextual data to train such intelligent agents. This work overviews a mechanism to generate such datasets in virtual environments and evaluates the performance of AI models trained using data generated in virtual environments in recognizing the next installation step in modular assembly sequences. Performance of the trained MV-CNN models (with accuracy of 0.97) shows that virtual environments can potentially be used to generate the required datasets for AI without the costly, time-consuming, and labor-intensive investments needed upfront for capturing real-world data.  more » « less
Award ID(s):
2036870
NSF-PAR ID:
10388386
Author(s) / Creator(s):
Date Published:
Journal Name:
Proceedings of the ISARC
ISSN:
2413-5844
Page Range / eLocation ID:
327-333
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Nicewonger, Todd E. ; McNair, Lisa D. ; Fritz, Stacey (Ed.)
    https://pressbooks.lib.vt.edu/alaskanative/ At the start of the pandemic, the editors of this annotated bibliography initiated a remote (i.e., largely virtual) ethnographic research project that investigated how COVID-19 was impacting off-site modular construction practices in Alaska Native communities. Many of these communities are located off the road system and thus face not only dramatically higher costs but multiple logistical challenges in securing licensed tradesmen and construction crews and in shipping building supplies and equipment to their communities. These barriers, as well as the region’s long winters and short building seasons, complicate the construction of homes and related infrastructure projects. Historically, these communities have also grappled with inadequate housing, including severe overcrowding and poor-quality building stock that is rarely designed for northern Alaska’s climate (Marino 2015). Moreover, state and federal bureaucracies and their associated funding opportunities often further complicate home building by failing to accommodate the digital divide in rural Alaska and the cultural values and practices of Native communities.[1] It is not surprising, then, that as we were conducting fieldwork for this project, we began hearing stories about these issues and about how the restrictions caused by the pandemic were further exacerbating them. Amidst these stories, we learned about how modular home construction was being imagined as a possible means for addressing both the complications caused by the pandemic and the need for housing in the region (McKinstry 2021). As a result, we began to investigate how modular construction practices were figuring into emergent responses to housing needs in Alaska communities. We soon realized that we needed to broaden our focus to capture a variety of prefabricated building methods that are often colloquially or idiomatically referred to as “modular.” This included a range of prefabricated building systems (e.g., manufactured, volumetric modular, system-built, and Quonset huts and other reused military buildings[2]). Our further questions about prefabricated housing in the region became the basis for this annotated bibliography. Thus, while this bibliography is one of multiple methods used to investigate these issues, it played a significant role in guiding our research and helped us bring together the diverse perspectives we were hearing from our interviews with building experts in the region and the wider debates that were circulating in the media and, to a lesser degree, in academia. The actual research for each of three sections was carried out by graduate students Lauren Criss-Carboy and Laura Supple.[3] They worked with us to identify source materials and their hard work led to the team identifying three themes that cover intersecting topics related to housing security in Alaska during the pandemic. The source materials collected in these sections can be used in a variety of ways depending on what readers are interested in exploring, including insights into debates on housing security in the region as the pandemic was unfolding (2021-2022). The bibliography can also be used as a tool for thinking about the relational aspects of these themes or the diversity of ways in which information on housing was circulating during the pandemic (and the implications that may have had on community well-being and preparedness). That said, this bibliography is not a comprehensive analysis. Instead, by bringing these three sections together with one another to provide a snapshot of what was happening at that time, it provides a critical jumping off point for scholars working on these issues. The first section focuses on how modular housing figured into pandemic responses to housing needs. In exploring this issue, author Laura Supple attends to both state and national perspectives as part of a broader effort to situate Alaska issues with modular housing in relation to wider national trends. This led to the identification of multiple kinds of literature, ranging from published articles to publicly circulated memos, blog posts, and presentations. These materials are important source materials that will likely fade in the vastness of the Internet and thus may help provide researchers with specific insights into how off-site modular construction was used – and perhaps hyped – to address pandemic concerns over housing, which in turn may raise wider questions about how networks, institutions, and historical experiences with modular construction are organized and positioned to respond to major societal disruptions like the pandemic. As Supple pointed out, most of the material identified in this review speaks to national issues and only a scattering of examples was identified that reflect on the Alaskan context. The second section gathers a diverse set of communications exploring housing security and homelessness in the region. The lack of adequate, healthy housing in remote Alaska communities, often referred to as Alaska’s housing crisis, is well-documented and preceded the pandemic (Guy 2020). As the pandemic unfolded, journalists and other writers reported on the immense stress that was placed on already taxed housing resources in these communities (Smith 2020; Lerner 2021). The resulting picture led the editors to describe in their work how housing security in the region exists along a spectrum that includes poor quality housing as well as various forms of houselessness including, particularly relevant for the context, “hidden homelessness” (Hope 2020; Rogers 2020). The term houseless is a revised notion of homelessness because it captures a richer array of both permanent and temporary forms of housing precarity that people may experience in a region (Christensen et al. 2107). By identifying sources that reflect on the multiple forms of housing insecurity that people were facing, this section highlights the forms of disparity that complicated pandemic responses. Moreover, this section underscores ingenuity (Graham 2019; Smith 2020; Jason and Fashant 2021) that people on the ground used to address the needs of their communities. The third section provides a snapshot from the first year of the pandemic into how CARES Act funds were allocated to Native Alaska communities and used to address housing security. This subject was extremely complicated in Alaska due to the existence of for-profit Alaska Native Corporations and disputes over eligibility for the funds impacted disbursements nationwide. The resources in this section cover that dispute, impacts of the pandemic on housing security, and efforts to use the funds for housing as well as barriers Alaska communities faced trying to secure and use the funds. In summary, this annotated bibliography provides an overview of what was happening, in real time, during the pandemic around a specific topic: housing security in largely remote Alaska Native communities. The media used by housing specialists to communicate the issues discussed here are diverse, ranging from news reports to podcasts and from blogs to journal articles. This diversity speaks to the multiple ways in which information was circulating on housing at a time when the nightly news and radio broadcasts focused heavily on national and state health updates and policy developments. Finding these materials took time, and we share them here because they illustrate why attention to housing security issues is critical for addressing crises like the pandemic. For instance, one theme that emerged out of a recent National Science Foundation workshop on COVID research in the North NSF Conference[4] was that Indigenous communities are not only recovering from the pandemic but also evaluating lessons learned to better prepare for the next one, and resilience will depend significantly on more—and more adaptable—infrastructure and greater housing security. 
    more » « less
  2. Relevance to proposal: This project evaluates the generalizability of real and synthetic training datasets which can be used to train model-free techniques for multi-agent applications. We evaluate different methods of generating training corpora and machine learning techniques including Behavior Cloning and Generative Adversarial Imitation Learning. Our results indicate that the utility-guided selection of representative scenarios to generate synthetic data can have significant improvements on model performance. Paper abstract: Crowd simulation, the study of the movement of multiple agents in complex environments, presents a unique application domain for machine learning. One challenge in crowd simulation is to imitate the movement of expert agents in highly dense crowds. An imitation model could substitute an expert agent if the model behaves as good as the expert. This will bring many exciting applications. However, we believe no prior studies have considered the critical question of how training data and training methods affect imitators when these models are applied to novel scenarios. In this work, a general imitation model is represented by applying either the Behavior Cloning (BC) training method or a more sophisticated Generative Adversarial Imitation Learning (GAIL) method, on three typical types of data domains: standard benchmarks for evaluating crowd models, random sampling of state-action pairs, and egocentric scenarios that capture local interactions. Simulated results suggest that (i) simpler training methods are overall better than more complex training methods, (ii) training samples with diverse agent-agent and agent-obstacle interactions are beneficial for reducing collisions when the trained models are applied to new scenarios. We additionally evaluated our models in their ability to imitate real world crowd trajectories observed from surveillance videos. Our findings indicate that models trained on representative scenarios generalize to new, unseen situations observed in real human crowds. 
    more » « less
  3. Abstract

    Data-driven generative design (DDGD) methods utilize deep neural networks to create novel designs based on existing data. The structure-aware DDGD method can handle complex geometries and automate the assembly of separate components into systems, showing promise in facilitating creative designs. However, determining the appropriate vectorized design representation (VDR) to evaluate 3D shapes generated from the structure-aware DDGD model remains largely unexplored. To that end, we conducted a comparative analysis of surrogate models’ performance in predicting the engineering performance of 3D shapes using VDRs from two sources: the trained latent space of structure-aware DDGD models encoding structural and geometric information and an embedding method encoding only geometric information. We conducted two case studies: one involving 3D car models focusing on drag coefficients and the other involving 3D aircraft models considering both drag and lift coefficients. Our results demonstrate that using latent vectors as VDRs can significantly deteriorate surrogate models’ predictions. Moreover, increasing the dimensionality of the VDRs in the embedding method may not necessarily improve the prediction, especially when the VDRs contain more information irrelevant to the engineering performance. Therefore, when selecting VDRs for surrogate modeling, the latent vectors obtained from training structure-aware DDGD models must be used with caution, although they are more accessible once training is complete. The underlying physics associated with the engineering performance should be paid attention. This paper provides empirical evidence for the effectiveness of different types of VDRs of structure-aware DDGD for surrogate modeling, thus facilitating the construction of better surrogate models for AI-generated designs.

     
    more » « less
  4. Abstract Introduction Digital twins, a form of artificial intelligence, are virtual representations of the physical world. In the past 20 years, digital twins have been utilized to track wind turbines' operations, monitor spacecraft's status, and even create a model of the Earth for climate research. While digital twins hold much promise for the neurocritical care unit, the question remains on how to best establish the rules that govern these models. This model will expand on our group’s existing digital twin model for the treatment of sepsis. Methods The authors of this project collaborated to create a Direct Acyclic Graph (DAG) and an initial series of 20 DELPHI statements, each with six accompanying sub-statements that captured the pathophysiology surrounding the management of acute ischemic strokes in the practice of Neurocritical Care (NCC). Agreement from a panel of 18 experts in the field of NCC was collected through a 7-point Likert scale with consensus defined a-priori by ≥ 80% selection of a 6 (“agree”) or 7 (“strongly agree”). The endpoint of the study was defined as the completion of three separate rounds of DELPHI consensus. DELPHI statements that had met consensus would not be included in subsequent rounds of DELPHI consensus. The authors refined DELPHI statements that did not reach consensus with the guidance of de-identified expert comments for subsequent rounds of DELPHI. All DELPHI statements that reached consensus by the end of three rounds of DELPHI consensus would go on to be used to inform the construction of the digital twin model. Results After the completion of three rounds of DELPHI, 93 (77.5%) statements reached consensus, 11 (9.2%) statements were excluded, and 16 (13.3%) statements did not reach a consensus of the original 120 DELPHI statements. Conclusion This descriptive study demonstrates the use of the DELPHI process to generate consensus among experts and establish a set of rules for the development of a digital twin model for use in the neurologic ICU. Compared to associative models of AI, which develop rules based on finding associations in datasets, digital twin AI created by the DELPHI process are easily interpretable models based on a current understanding of underlying physiology. 
    more » « less
  5. Obeid, Iyad ; Picone, Joseph ; Selesnick, Ivan (Ed.)
    The Neural Engineering Data Consortium (NEDC) is developing a large open source database of high-resolution digital pathology images known as the Temple University Digital Pathology Corpus (TUDP) [1]. Our long-term goal is to release one million images. We expect to release the first 100,000 image corpus by December 2020. The data is being acquired at the Department of Pathology at Temple University Hospital (TUH) using a Leica Biosystems Aperio AT2 scanner [2] and consists entirely of clinical pathology images. More information about the data and the project can be found in Shawki et al. [3]. We currently have a National Science Foundation (NSF) planning grant [4] to explore how best the community can leverage this resource. One goal of this poster presentation is to stimulate community-wide discussions about this project and determine how this valuable resource can best meet the needs of the public. The computing infrastructure required to support this database is extensive [5] and includes two HIPAA-secure computer networks, dual petabyte file servers, and Aperio’s eSlide Manager (eSM) software [6]. We currently have digitized over 50,000 slides from 2,846 patients and 2,942 clinical cases. There is an average of 12.4 slides per patient and 10.5 slides per case with one report per case. The data is organized by tissue type as shown below: Filenames: tudp/v1.0.0/svs/gastro/000001/00123456/2015_03_05/0s15_12345/0s15_12345_0a001_00123456_lvl0001_s000.svs tudp/v1.0.0/svs/gastro/000001/00123456/2015_03_05/0s15_12345/0s15_12345_00123456.docx Explanation: tudp: root directory of the corpus v1.0.0: version number of the release svs: the image data type gastro: the type of tissue 000001: six-digit sequence number used to control directory complexity 00123456: 8-digit patient MRN 2015_03_05: the date the specimen was captured 0s15_12345: the clinical case name 0s15_12345_0a001_00123456_lvl0001_s000.svs: the actual image filename consisting of a repeat of the case name, a site code (e.g., 0a001), the type and depth of the cut (e.g., lvl0001) and a token number (e.g., s000) 0s15_12345_00123456.docx: the filename for the corresponding case report We currently recognize fifteen tissue types in the first installment of the corpus. The raw image data is stored in Aperio’s “.svs” format, which is a multi-layered compressed JPEG format [3,7]. Pathology reports containing a summary of how a pathologist interpreted the slide are also provided in a flat text file format. A more complete summary of the demographics of this pilot corpus will be presented at the conference. Another goal of this poster presentation is to share our experiences with the larger community since many of these details have not been adequately documented in scientific publications. There are quite a few obstacles in collecting this data that have slowed down the process and need to be discussed publicly. Our backlog of slides dates back to 1997, meaning there are a lot that need to be sifted through and discarded for peeling or cracking. Additionally, during scanning a slide can get stuck, stalling a scan session for hours, resulting in a significant loss of productivity. Over the past two years, we have accumulated significant experience with how to scan a diverse inventory of slides using the Aperio AT2 high-volume scanner. We have been working closely with the vendor to resolve many problems associated with the use of this scanner for research purposes. This scanning project began in January of 2018 when the scanner was first installed. The scanning process was slow at first since there was a learning curve with how the scanner worked and how to obtain samples from the hospital. From its start date until May of 2019 ~20,000 slides we scanned. In the past 6 months from May to November we have tripled that number and how hold ~60,000 slides in our database. This dramatic increase in productivity was due to additional undergraduate staff members and an emphasis on efficient workflow. The Aperio AT2 scans 400 slides a day, requiring at least eight hours of scan time. The efficiency of these scans can vary greatly. When our team first started, approximately 5% of slides failed the scanning process due to focal point errors. We have been able to reduce that to 1% through a variety of means: (1) best practices regarding daily and monthly recalibrations, (2) tweaking the software such as the tissue finder parameter settings, and (3) experience with how to clean and prep slides so they scan properly. Nevertheless, this is not a completely automated process, making it very difficult to reach our production targets. With a staff of three undergraduate workers spending a total of 30 hours per week, we find it difficult to scan more than 2,000 slides per week using a single scanner (400 slides per night x 5 nights per week). The main limitation in achieving this level of production is the lack of a completely automated scanning process, it takes a couple of hours to sort, clean and load slides. We have streamlined all other aspects of the workflow required to database the scanned slides so that there are no additional bottlenecks. To bridge the gap between hospital operations and research, we are using Aperio’s eSM software. Our goal is to provide pathologists access to high quality digital images of their patients’ slides. eSM is a secure website that holds the images with their metadata labels, patient report, and path to where the image is located on our file server. Although eSM includes significant infrastructure to import slides into the database using barcodes, TUH does not currently support barcode use. Therefore, we manage the data using a mixture of Python scripts and manual import functions available in eSM. The database and associated tools are based on proprietary formats developed by Aperio, making this another important point of community-wide discussion on how best to disseminate such information. Our near-term goal for the TUDP Corpus is to release 100,000 slides by December 2020. We hope to continue data collection over the next decade until we reach one million slides. We are creating two pilot corpora using the first 50,000 slides we have collected. The first corpus consists of 500 slides with a marker stain and another 500 without it. This set was designed to let people debug their basic deep learning processing flow on these high-resolution images. We discuss our preliminary experiments on this corpus and the challenges in processing these high-resolution images using deep learning in [3]. We are able to achieve a mean sensitivity of 99.0% for slides with pen marks, and 98.9% for slides without marks, using a multistage deep learning algorithm. While this dataset was very useful in initial debugging, we are in the midst of creating a new, more challenging pilot corpus using actual tissue samples annotated by experts. The task will be to detect ductal carcinoma (DCIS) or invasive breast cancer tissue. There will be approximately 1,000 images per class in this corpus. Based on the number of features annotated, we can train on a two class problem of DCIS or benign, or increase the difficulty by increasing the classes to include DCIS, benign, stroma, pink tissue, non-neoplastic etc. Those interested in the corpus or in participating in community-wide discussions should join our listserv, nedc_tuh_dpath@googlegroups.com, to be kept informed of the latest developments in this project. You can learn more from our project website: https://www.isip.piconepress.com/projects/nsf_dpath. 
    more » « less