Vision Transformers (ViTs) have shown impressive performance but still incur a high computation cost compared to convolutional neural networks (CNNs); one reason is that ViTs' attention measures global similarities and thus has quadratic complexity in the number of input tokens. Existing efficient ViTs adopt local attention or linear attention, which sacrifice ViTs' capability of capturing either global or local context. In this work, we ask an important research question: Can ViTs learn both global and local context while being more efficient during inference? To this end, we propose a framework called Castling-ViT, which trains ViTs using both linear-angular attention and masked softmax-based quadratic attention, but then switches to having only linear-angular attention during inference. Our Castling-ViT leverages angular kernels to measure the similarities between queries and keys via spectral angles. We further simplify it with two techniques: (1) a novel linear-angular attention mechanism: we decompose the angular kernels into linear terms and high-order residuals, and only keep the linear terms; and (2) two parameterized modules to approximate the high-order residuals: a depthwise convolution and an auxiliary masked softmax attention that help learn global and local information, where the masks for the softmax attention are regularized to gradually become zeros and thus incur no overhead during inference. Extensive experiments validate the effectiveness of our Castling-ViT, e.g., achieving up to a 1.8% higher accuracy or a 40% MACs reduction on classification and a 1.2 higher mAP on detection under comparable FLOPs, as compared to ViTs with vanilla softmax-based attentions. The project page is available here.
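For the linear-angular attention described above, the following is a minimal illustrative sketch, not the authors' released implementation. It assumes unit-normalized queries and keys, approximates the angular kernel by its constant-plus-linear part (roughly 1/2 + (q·k)/π), and uses the standard kernel trick so attention cost is linear in the token count. The tensor shapes and the exact kernel form are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def linear_angular_attention(q, k, v, eps=1e-6):
    """Illustrative sketch: keep only the linear term of an angular kernel.

    With unit-normalized q and k, the spectral angle is arccos(q.k), and
    arccos(x) = pi/2 - x - (higher-order terms). Dropping the residual leaves
    a similarity that is affine in q.k, so attention can be computed with the
    (K^T V) trick in O(N) instead of O(N^2).
    Shapes: q, k are (batch, heads, tokens, dim); v is (batch, heads, tokens, dim_v).
    """
    q = F.normalize(q, dim=-1)
    k = F.normalize(k, dim=-1)
    n = q.shape[-2]

    # Assumed similarity: sim(q_i, k_j) = 1/2 + (q_i . k_j) / pi  (constant + linear term).
    # Numerator: sum_j sim(q_i, k_j) v_j = (1/2) sum_j v_j + (1/pi) q_i . (sum_j k_j v_j^T)
    kv = torch.einsum("bhnd,bhne->bhde", k, v)          # key/value summary, O(N) to build
    v_sum = v.sum(dim=-2, keepdim=True)
    num = 0.5 * v_sum + torch.einsum("bhnd,bhde->bhne", q, kv) / torch.pi

    # Denominator: sum_j sim(q_i, k_j) = n/2 + (1/pi) q_i . (sum_j k_j), always positive here.
    k_sum = k.sum(dim=-2)
    den = 0.5 * n + torch.einsum("bhnd,bhd->bhn", q, k_sum) / torch.pi
    return num / (den.unsqueeze(-1) + eps)
```

During training, Castling-ViT additionally approximates the discarded high-order residual with a depthwise convolution and an auxiliary masked softmax attention; only the linear part above would remain at inference.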
This content will become publicly available on April 19, 2026.

Transformer-based approach for printing quality recognition in fused filament fabrication
Ensuring high-quality prints in additive manufacturing is a critical challenge due to variability in materials, process parameters, and equipment. Machine learning models are increasingly being employed for real-time quality monitoring, enabling the detection and classification of defects such as under-extrusion and over-extrusion. Vision Transformers (ViTs), with their global self-attention mechanisms, offer a promising alternative to traditional convolutional neural networks (CNNs). This paper presents a transformer-based approach for print quality recognition in additive manufacturing technologies, with a focus on fused filament fabrication (FFF), leveraging advanced self-supervised representation learning techniques to enhance the robustness and generalizability of ViTs. We show that the ViT model effectively classifies printing quality into different levels of extrusion, achieving exceptional performance across varying dataset scales and noise levels. Training evaluations show a steady decrease in cross-entropy loss, with prediction accuracy, precision, recall, and the harmonic mean of precision and recall (F1) reaching close to 1 within 40 epochs, demonstrating excellent performance across all classes. The macro and micro F1 scores further emphasize the ViT's ability to handle both class imbalance and instance-level accuracy effectively. Our results also demonstrate that the ViT outperforms a CNN in all scenarios, particularly in noisy conditions and with small datasets. Comparative analysis reveals the ViT's advantages, particularly in leveraging global self-attention and robust feature extraction, which enhance its ability to generalize effectively and remain resilient with limited data. These findings underline the potential of the transformer-based approach as a scalable, interpretable, and reliable solution for real-time quality monitoring in FFF, addressing key challenges in additive manufacturing defect detection and ensuring process efficiency.
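As a concrete illustration of the classification setup described in the abstract, the sketch below fine-tunes a pretrained ViT for a three-way extrusion-quality task and reports macro and micro F1. It is a minimal sketch, not the paper's pipeline: the backbone (torchvision's ViT-B/16 with ImageNet weights standing in for the self-supervised pretraining), the label set, and the hyperparameters are all assumptions.

```python
import torch
import torch.nn as nn
from torchvision.models import vit_b_16, ViT_B_16_Weights
from sklearn.metrics import f1_score

NUM_CLASSES = 3  # assumed labels: under-extrusion, good quality, over-extrusion

# Start from a pretrained ViT-B/16 and replace the classification head.
model = vit_b_16(weights=ViT_B_16_Weights.IMAGENET1K_V1)
model.heads.head = nn.Linear(model.heads.head.in_features, NUM_CLASSES)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5, weight_decay=0.05)

def train_epoch(loader, device="cpu"):
    """One supervised fine-tuning epoch over (image, label) batches."""
    model.to(device).train()
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        loss = criterion(model(images), labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

@torch.no_grad()
def evaluate(loader, device="cpu"):
    """Report macro and micro F1 over a labeled evaluation set."""
    model.to(device).eval()
    preds, targets = [], []
    for images, labels in loader:
        preds += model(images.to(device)).argmax(dim=1).cpu().tolist()
        targets += labels.tolist()
    return (f1_score(targets, preds, average="macro"),
            f1_score(targets, preds, average="micro"))
```

Macro F1 averages per-class scores (sensitive to class imbalance), while micro F1 aggregates over all instances, which is why the abstract reports both.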
- Award ID(s): 2323731
- PAR ID: 10643106
- Publisher / Repository: Springer Nature
- Date Published:
- Journal Name: npj Advanced Manufacturing
- Volume: 2
- Issue: 1
- ISSN: 3004-8621
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- The remarkable success of the Transformer model in Natural Language Processing (NLP) is increasingly capturing the attention of vision researchers. The Vision Transformer (ViT) model effectively models long-range dependencies through a self-attention mechanism, converting image information into meaningful representations. Moreover, the parallelism of ViT offers better scalability and model generalization than Recurrent Neural Networks (RNNs). However, developing robust ViT models for high-risk vision applications, such as self-driving cars, is critical. Deterministic ViT models are susceptible to noise and adversarial attacks and cannot yield a level of confidence in their output predictions. Quantifying the confidence (or uncertainty) in a decision is highly important in such real-world applications. In this work, we introduce a probabilistic framework for ViT to quantify the level of uncertainty in the model's decisions. We approximate the posterior distribution of the network parameters using variational inference and deploy a first-order Taylor approximation to pass through the non-linear layers. The developed framework propagates the mean and covariance of the posterior distribution through the layers of the probabilistic ViT model and quantifies uncertainty at the output predictions. Quantified uncertainty provides warning signals to real-world applications in noisy situations. Experimental results from extensive simulations on benchmark datasets (e.g., MNIST and Fashion-MNIST) for image classification exhibit (1) higher accuracy of the proposed probabilistic ViT under noise or adversarial attacks compared to the deterministic ViT, and (2) self-evaluation through uncertainty that becomes notably pronounced as noise levels escalate. Simulations were conducted at the Texas Advanced Computing Center (TACC) on the Lonestar6 supercomputer node; with the help of this vital resource, we completed all experiments within a reasonable period. (A minimal sketch of the mean-and-covariance propagation step appears after this list.)
- The emergence of bio-additive manufacturing marks a crucial advancement in the field of biomedical engineering. For successful biomedical applications, including bioprinted organ transplants, ensuring the quality of printed structures poses a significant challenge. Among the major challenges in ensuring the structural integrity of bioprinting, nozzle clogging stands out as one of the most frequent concerns: it disrupts the uniform distribution of extrusion pressure, leading to the formation of defective structures. This study focuses on detecting defects arising from irregularities in extrusion pressure. To address this concern, a video-based motion estimation technique, which has emerged as a novel non-contact and non-destructive way of assessing bio-3D-printed structures, is employed, whereas contact-based and laser-based approaches may offer limited performance due to the soft, lightweight, and translucent nature of bioconstructs. In this study, defective and non-defective ear models are additively manufactured by an extrusion-based bioprinter with pneumatic dispensing. Extrusion pressure was strategically controlled to introduce defective bioprints similar to those caused by nozzle malfunctions. The vibration characteristics of the ear structures are captured by a high-speed camera and analyzed using phase-based motion estimation. In addition to ambient excitations from the printing process, acoustic excitations from a subwoofer are employed to assess their impact on print quality. The increase in extrusion pressure, simulating clogged-nozzle issues, resulted in significant changes in the vibration characteristics, including shifts in the resonance frequencies. By monitoring these modal property changes, defective bioconstructs could be reliably identified. These findings suggest that the proposed approach can effectively verify the structural integrity of additively manufactured bioconstructs. Implementing this method alongside real-time defect detection will significantly enhance the structural integrity of additively manufactured bioconstructs and ultimately improve the production of healthy artificial organs, potentially saving countless lives. (A toy example of detecting such a resonance-frequency shift appears after this list.)
- Fused deposition modeling (FDM™), also referred to as fused filament fabrication (FFF), is an additive manufacturing technique in which extruded material is deposited into roads and layers to form complex products. This paper provides a physics-based model for predicting and controlling the effect of compressibility in material extrusion, including elasticity in the driven filament and compression of the melt in the hot end. The model is validated with a test part embodying a full-factorial design of experiments at three print speeds. Used for control, the model eliminates 50% of the road-width variance associated with compressibility, thereby enabling higher quality even at higher print speeds. (A toy lumped compressibility model is sketched after this list.)
- Wire arc additive manufacturing (WAAM) has gained attention as a feasible process for large-scale metal additive manufacturing due to its high deposition rate, cost efficiency, and material diversity. However, WAAM introduces a degree of uncertainty in process stability and part quality owing to its non-equilibrium thermal cycles and layer-by-layer stacking mechanism. Anomaly detection is therefore necessary for quality monitoring of the parts. Most relevant studies have applied machine learning to derive data-driven models that detect defects through feature and pattern learning. However, acquiring sufficient data is time- and/or resource-intensive, which makes machine-learning-based anomaly detection challenging to apply. This study proposes a multisource transfer learning method that generates anomaly detection models for balling-defect detection, thus ensuring quality monitoring in WAAM. The proposed method uses convolutional neural network models to extract sufficient image features from multisource materials, then transfers and fine-tunes the models for anomaly detection in the target material. Stepwise learning is applied to extract image features sequentially from individual source materials, and composite learning is employed to assign the optimal frozen ratio for converging transferred and present features. Experiments were performed using a gas tungsten arc welding-based WAAM process to validate the classification accuracy of the models on low-carbon steel, stainless steel, and Inconel. (A minimal frozen-ratio fine-tuning sketch appears after this list.)
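For the probabilistic ViT item above, the following sketch isolates the core moment-propagation step: exact propagation of a mean and covariance through a linear layer, followed by a first-order Taylor (delta-method) step through an elementwise nonlinearity. It is an illustrative toy, not the paper's variational ViT; the GELU choice, shapes, and values are assumptions.

```python
import torch

def propagate_linear(mean, cov, W, b):
    """Exact moments of y = W x + b when x ~ N(mean, cov)."""
    return W @ mean + b, W @ cov @ W.T

def propagate_gelu(mean, cov):
    """First-order Taylor (delta-method) moments of y = GELU(x).

    Linearizing f around the input mean gives
        E[y] ~= f(mean),  Cov[y] ~= J Cov[x] J^T,  J = diag(f'(mean)).
    """
    mean = mean.detach().requires_grad_(True)
    out = torch.nn.functional.gelu(mean)
    (grad,) = torch.autograd.grad(out.sum(), mean)  # elementwise derivative at the mean
    J = torch.diag(grad)                            # diagonal Jacobian of the nonlinearity
    return out.detach(), J @ cov @ J.T

# Toy usage on a 4-dimensional activation (values assumed for illustration).
m, S = torch.zeros(4), 0.1 * torch.eye(4)
W, b = torch.randn(4, 4) / 2, torch.zeros(4)
m, S = propagate_linear(m, S, W, b)
m, S = propagate_gelu(m, S)
print(m.shape, S.shape)  # torch.Size([4]) torch.Size([4, 4])
```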
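For the bioprinted-ear monitoring item above, the paper relies on phase-based motion estimation from high-speed video; the toy sketch below substitutes a generic vibration signal and only illustrates the final decision step of flagging a part whose dominant resonance shifts away from a non-defective baseline. The sampling rate, frequencies, and 5% tolerance are made-up values.

```python
import numpy as np

def dominant_frequency(signal, fs):
    """Return the frequency (Hz) of the largest spectral peak in a vibration signal."""
    signal = signal - np.mean(signal)                      # remove DC offset
    spectrum = np.abs(np.fft.rfft(signal * np.hanning(len(signal))))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    return freqs[np.argmax(spectrum)]

def flag_defect(baseline_hz, measured_hz, tol=0.05):
    """Flag a print when its resonance deviates from the non-defective baseline."""
    return abs(measured_hz - baseline_hz) / baseline_hz > tol

# Toy usage: a 120 Hz baseline part versus a shifted 135 Hz part, sampled at 2 kHz.
fs = 2000
t = np.arange(0, 2, 1 / fs)
good = np.sin(2 * np.pi * 120 * t) + 0.1 * np.random.randn(t.size)
bad = np.sin(2 * np.pi * 135 * t) + 0.1 * np.random.randn(t.size)
f0 = dominant_frequency(good, fs)
print(flag_defect(f0, dominant_frequency(bad, fs)))        # True for a large enough shift
```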
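For the FDM compressibility item above, the paper's physics-based model is not reproduced here; the sketch below is only a lumped, first-order-lag caricature of the same qualitative effect, in which filament elasticity and melt compression delay the flow leaving the nozzle so that road width lags the commanded flow. The time constant, layer height, and print speed are arbitrary illustrative values.

```python
import numpy as np

def simulate_road_width(q_cmd, dt, tau=0.15, layer_h=0.2, speed=60.0):
    """Toy first-order-lag model of extrusion compressibility.

    The flow out of the nozzle q_out relaxes toward the commanded flow q_cmd
    with time constant tau (filament elasticity and melt compression lumped
    together); road width = q_out / (layer height * print speed).
    """
    q_out = np.zeros_like(q_cmd)
    for i in range(1, len(q_cmd)):
        q_out[i] = q_out[i - 1] + (dt / tau) * (q_cmd[i - 1] - q_out[i - 1])
    return q_out / (layer_h * speed)

# Step in commanded flow (mm^3/s): the road width approaches its target gradually,
# which is the kind of transient a model-based controller would pre-compensate.
dt = 0.01
q_cmd = np.where(np.arange(0, 2, dt) < 0.5, 2.0, 6.0)
width = simulate_road_width(q_cmd, dt)
print(width[0], width[-1])   # starts near 0, settles near 6 / (0.2 * 60) = 0.5 mm
```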
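For the WAAM transfer-learning item above, the sketch below shows one common way to implement a "frozen ratio" during fine-tuning: freeze the earliest fraction of a source-pretrained CNN's parameter tensors and train the remainder plus a new binary head (normal vs. balling defect). The ResNet-18 backbone, ImageNet weights (standing in for the multisource pretraining stage), and the 0.7 ratio are assumptions, not the paper's configuration.

```python
import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights

def build_finetune_model(num_classes=2, frozen_ratio=0.7):
    """Source-pretrained CNN, fine-tuned on the target material with a frozen ratio.

    frozen_ratio is the fraction of (earliest) parameter tensors kept fixed;
    only the remaining layers and the new classification head are updated.
    """
    model = resnet18(weights=ResNet18_Weights.IMAGENET1K_V1)
    model.fc = nn.Linear(model.fc.in_features, num_classes)   # normal vs. balling defect

    params = list(model.parameters())
    n_frozen = int(frozen_ratio * len(params))
    for p in params[:n_frozen]:
        p.requires_grad = False                                # freeze early layers
    return model

model = build_finetune_model(frozen_ratio=0.7)
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(f"{len(trainable)} trainable parameter tensors")         # later blocks + new head
```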