Do multimodal large language models understand welding?

Khvatskii, Grigorii; Lee, Yong Suk; Angst, Corey; Gibbs, Maria; Landers, Robert; Chawla, Nitesh V

doi:10.1016/j.inffus.2025.103121

This content will become publicly available on August 1, 2026

This paper examines the performance of Multimodal LLMs (MLLMs) in skilled production work, with a focus on welding. Using a novel data set of real-world and online weld images, annotated by a domain expert, we evaluate the performance of two state-of-the-art MLLMs in assessing weld acceptability across three contexts: RV & Marine, Aeronautical, and Farming. While both models perform better on online images, likely due to prior exposure or memorization, they also perform relatively well on unseen, real-world weld images. Additionally, we introduce WeldPrompt, a prompting strategy that combines Chain-of-Thought generation with in-context learning to mitigate hallucinations and improve reasoning. WeldPrompt improves model recall in certain contexts but exhibits inconsistent performance across others. These results underscore the limitations and potential of MLLMs in high-stakes technical domains and highlight the importance of fine-tuning, domain-specific data, and more sophisticated prompting strategies to improve model reliability. The study opens avenues for further research into multimodal learning in industry applications.

Award ID(s): 2222751

PAR ID: 10640736

Author(s) / Creator(s): Khvatskii, Grigorii; Lee, Yong Suk; Angst, Corey; Gibbs, Maria; Landers, Robert; Chawla, Nitesh V

Publisher / Repository: Elsevier

Date Published: 2025-08-01

Journal Name: Information Fusion

Volume: 120

ISSN: 1872-6305

Subject(s) / Keyword(s): AI in manufacturing; Multimodal Large Language Models (MLLMs); Welding; Skilled production work; Real-world image classification; WeldPrompt

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Journal Article:
https://doi.org/10.1016/j.inffus.2025.103121
