Title: MOSS—Multi-Modal Best Subset Modeling in Smart Manufacturing
Smart manufacturing, which integrates a multi-sensing system with physical manufacturing processes, has been widely adopted in industry to support online, real-time decision making and improve manufacturing quality. A multi-sensing system for a specific manufacturing process can efficiently collect in situ process variables from different sensor modalities to reflect process variations in real time. In practice, however, cost constraints usually limit the number of sensors that can be installed in each manufacturing process. It is also important that the model make the relationship between the sensing modalities and the quality variables interpretable. It is therefore necessary to model the quality-process relationship by selecting, from the multi-modal sensing system, the sensor modalities most relevant to the specific quality measurement. In this research, we adopt the concept of best subset variable selection and propose a new model called Multi-mOdal beSt Subset modeling (MOSS). MOSS can effectively select the important sensor modalities and improve accuracy in quality-process modeling via functional norms that characterize the overall effects of individual modalities. The significance of sensor modalities can be used to determine the sensor placement strategy in smart manufacturing. Moreover, the selected modalities make the quality-process model easier to interpret by identifying the most correlated root cause of quality variations. The merits of the proposed model are illustrated by both simulations and a real case study in an additive manufacturing (i.e., fused deposition modeling) process.
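The idea of selecting whole sensor modalities rather than individual variables can be illustrated with a minimal sketch. This is not the MOSS algorithm from the paper. It assumes each modality is represented by a block of feature columns, enumerates every subset of `k` modalities by brute force, and scores each modality by the Euclidean norm of its coefficient block as a simple stand-in for the functional norms mentioned in the abstract. The function name `best_modality_subset` and the synthetic data are hypothetical.

```python
from itertools import combinations

import numpy as np


def best_modality_subset(blocks, y, k):
    """Pick the k modalities whose joint least-squares fit has the lowest RSS.

    blocks: list of (n_samples, n_features_i) arrays, one per modality.
    Exhaustive enumeration is an illustrative assumption, not the paper's solver.
    """
    best = None
    for subset in combinations(range(len(blocks)), k):
        X = np.hstack([blocks[i] for i in subset])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        rss = float(np.sum((y - X @ coef) ** 2))
        if best is None or rss < best[0]:
            best = (rss, subset, coef)

    rss, subset, coef = best
    # Summarize each selected modality by the norm of its coefficient block.
    norms, start = {}, 0
    for i in subset:
        width = blocks[i].shape[1]
        norms[i] = float(np.linalg.norm(coef[start:start + width]))
        start += width
    return subset, norms, rss


# Synthetic example: four modalities, only modalities 0 and 2 drive the quality variable.
rng = np.random.default_rng(0)
n = 200
blocks = [rng.normal(size=(n, 5)) for _ in range(4)]
coef_0 = np.array([1.0, -2.0, 0.5, 0.0, 1.5])
coef_2 = np.array([0.8, 0.0, -1.2, 2.0, 0.3])
y = blocks[0] @ coef_0 + blocks[2] @ coef_2 + 0.1 * rng.normal(size=n)

selected, norms, rss = best_modality_subset(blocks, y, k=2)
print("selected modalities:", selected)
print("per-modality coefficient norms:", norms)
```

Brute-force enumeration is only practical for a handful of modalities, because the number of candidate subsets grows combinatorially with the number of sensors.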
Award ID(s):
1916174
NSF-PAR ID:
10281315
Author(s) / Creator(s):
; ;
Date Published:
Journal Name:
Sensors
Volume:
21
Issue:
1
ISSN:
1424-8220
Page Range / eLocation ID:
243
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Modeling corrosion growth for complex systems such as an oil refinery is a major challenge because the corrosion process of oil and gas pipelines is inherently stochastic and depends on many factors, including exposure to environmental conditions, operating conditions, and electrochemical reactions. Moreover, the number of sensors is usually limited, and sensor data are incomplete and scattered, which hinders the ability to capture corrosion growth behaviors. Therefore, this paper proposes the Multi-sensor Corrosion Growth Model with Latent Variables to predict the corrosion growth process in oil refinery piping. The proposed model combines a hierarchical clustering algorithm with a vector autoregression (VAR) model. The clustering algorithm finds the hidden (i.e., latent) clusters in the measured time series data; time series from the same cluster are then included in the VAR model to predict the corrosion depth from multiple sensors. The model can capture the relationship between sensor time series data and identify latent variables. A real case study of an oil refinery system, in which in-line inspection (ILI) data were collected, was used to validate the model. For corrosion growth prediction, the paper compares the prediction accuracy of the VAR model with three forms of the power law model, which is widely accepted for predicting the time-dependent depth of corrosion: the power function (PF), PF with initiation time of corrosion (PFIT), and PF with initiation time of corrosion and covariates (PFCOV). The results show that the VAR model has the lowest prediction error on test data, as measured by the mean absolute percentage error (MAPE). Finally, the proposed model is believed to be useful for complex systems that exhibit a variety of corrosion growth behaviors, such as oil refinery systems, and it can also be applied in other real-time applications.
  2. Training and on-site assistance are critical to help workers master required skills, improve productivity, and guarantee product quality. Traditional training methods lack the worker-centered considerations that are particularly needed when workers face ever-changing demands. In this study, we propose a worker-centered training & assistant system for intelligent manufacturing that features self-awareness and active guidance. Multi-modal sensing techniques are applied to perceive each individual worker, and a deep learning approach is developed to understand the worker’s behavior and intention. Moreover, an object detection algorithm is implemented to identify the parts and tools the worker is interacting with. The worker’s current state is then inferred and used to quantify and assess worker performance, from which the worker’s potential guidance demands are analyzed. Furthermore, on-site guidance with multi-modal augmented reality is provided actively and continuously during the operational process. Two case studies demonstrate the feasibility and potential of the proposed approach and system for frontline workers in the manufacturing industry.
  3. Introduction: Computed tomography perfusion (CTP) imaging requires injection of an intravenous contrast agent and increased exposure to ionizing radiation. This process can be lengthy, costly, and potentially dangerous to patients, especially in emergency settings. We propose MAGIC, a multitask, generative adversarial network-based deep learning model to synthesize an entire CTP series from only a non-contrasted CT (NCCT) input. Materials and Methods: NCCT and CTP series were retrospectively retrieved from 493 patients at UF Health with IRB approval. The data were deidentified and all images were resized to 256x256 pixels. The collected perfusion data were analyzed using the RapidAI CT Perfusion analysis software (iSchemaView, Inc. CA) to generate each CTP map. For each subject, 10 CTP slices were selected. Each slice was paired with one NCCT slice at the same location and two NCCT slices at a predefined vertical offset, resulting in 4.3K CTP images and 12.9K NCCT images used for training. The incorporation of a spatial offset into the NCCT input allows MAGIC to more accurately synthesize cerebral perfusive structures, increasing the quality of the generated images. The studies included a variety of indications, including healthy tissue, mild infarction, and severe infarction. The proposed MAGIC model incorporates a novel multitask architecture, allowing for the simultaneous synthesis of four CTP modalities: mean transit time (MTT), cerebral blood flow (CBF), cerebral blood volume (CBV), and time to peak (TTP). We propose a novel Physicians-in-the-loop module in the model's architecture, acting as a tunable layer that allows physicians to manually adjust the amount of anatomic detail present in the synthesized CTP series. Additionally, we propose two novel loss terms: multi-modal connectivity loss and extrema loss. 
The multi-modal connectivity loss leverages the multitask nature of the model to enforce the mathematical relationship between MTT, CBF, and CBV. The extrema loss aids in learning regions of elevated and decreased activity in each modality, allowing MAGIC to accurately learn the characteristics of diagnostic regions of interest. Corresponding NCCT and CTP slices were paired along the vertical axis. The model was trained for 100 epochs on an NVIDIA TITAN X GPU. Results and Discussion: The MAGIC model’s performance was evaluated on a sample of 40 patients from the UF Health dataset. Across all CTP modalities, MAGIC produced images with high structural agreement between the synthesized and clinical perfusion images (SSIMmean=0.801, UQImean=0.926). MAGIC was able to synthesize CTP images that accurately characterize cerebral circulatory structures and identify regions of infarct tissue, as shown in Figure 1. A blind binary evaluation was conducted to assess the presence of cerebral infarction in both the synthesized and clinical perfusion images; the synthesized images correctly predicted the presence of cerebral infarction with 87.5% accuracy. Conclusions: We proposed MAGIC, a model whose novel deep learning structures and loss terms enable high-quality synthesis of CTP maps and characterization of circulatory structures solely from NCCT images, potentially eliminating the need for intravenous contrast injection and elevated radiation exposure during perfusion imaging. This makes MAGIC a beneficial tool in clinical scenarios, increasing the overall safety, accessibility, and efficiency of cerebral perfusion imaging and facilitating better patient outcomes. Acknowledgements: This work was partially supported by the National Science Foundation, IIS-1908299 III: Small: Modeling Multi-Level Connectivity of Brain Dynamics + REU Supplement, to the University of Florida.
  4. In multistage manufacturing systems (MMS), modeling multiple quality indices based on the process sensing variables is important. However, the classic modeling technique predicts each quality variable one at a time, which fails to consider the correlation within or between stages. We propose a deep multistage multi-task learning framework that jointly predicts all output sensing variables in a unified end-to-end manner, following the sequential system architecture of the MMS. Our numerical studies and real case study show that the new model outperforms many benchmark methods and offers strong interpretability through the variable selection techniques we developed.
  5. In this paper, we present ViTag to associate user identities across multimodal data, particularly those obtained from cameras and smartphones. ViTag associates a sequence of bounding boxes generated by a vision tracker with Inertial Measurement Unit (IMU) data and Wi-Fi Fine Time Measurements (FTM) from smartphones. We formulate the problem as association by sequence-to-sequence (seq2seq) translation. In this two-step process, our system first performs cross-modal translation using a multimodal LSTM encoder-decoder network (X-Translator) that translates one modality to another, e.g., reconstructing IMU and FTM readings purely from camera bounding boxes. Second, an association module finds identity matches between the camera and phone domains by matching the translated modality with the observed data from the same modality. In contrast to existing works, our proposed approach can associate identities in multi-person scenarios where all users may be performing the same activity. Extensive experiments in real-world indoor and outdoor environments demonstrate that online association on camera and phone data (IMU and FTM) achieves an average Identity Precision Accuracy (IDP) of 88.39% on a 1-to-3-second window, outperforming the state-of-the-art Vi-Fi (82.93%). Further study on modalities within the phone domain shows that FTM can improve association performance by 12.56% on average. Finally, results from our sensitivity experiments demonstrate the robustness of ViTag under different noise and environment variations.