Discovering novel molecules with targeted properties remains a formidable challenge in materials science, often likened to finding a needle in a haystack. Traditional experimental approaches are slow, costly, and inefficient. In this study, we present an inverse design framework based on a molecular graph conditional variational autoencoder (CVAE) that enables the generation of new molecules with user-specified optical properties, particularly the molar extinction coefficient ($$\varepsilon$$). Our model encodes molecular graphs, derived from SMILES strings, into a structured latent space, and then decodes them into valid molecular structures conditioned on a target $$\varepsilon$$ value. Trained on a curated dataset of known molecules with corresponding extinction coefficients, the CVAE learns to generate chemically valid structures, as verified by RDKit. Subsequent Density Functional Theory (DFT) simulations confirm that many of the generated molecules exhibit electronic structures similar to those of molecules with the desired $$\varepsilon$$ values. We have also verified the $$\varepsilon$$ values of the generated molecules using a graph neural network (GNN) and the synthesizability of those molecules using an open-source module named ASKCOS. This approach demonstrates the potential of CVAEs to accelerate molecular discovery by enabling user-guided, property-driven molecule generation -- offering a scalable, data-driven alternative to traditional trial-and-error synthesis.
                    This content will become publicly available on June 23, 2026
                            
                            Potency of Latent Spaces in Inverse Quantum Dye Design
                        
                    
    
The discovery of functional dye materials with superior optical properties is crucial for advancing technologies in biomedical imaging, organic photovoltaics, and quantum information systems. Recent advancements highlight the need to accelerate this discovery process by integrating computational strategies with experimental methods. In this regard, we employ a computational approach that explores the latent space of dye materials, using swarm optimization techniques to navigate complex chemical spaces efficiently and machine learning methods to identify optimal values of molecular properties for a given target, such as a high extinction coefficient ($$\varepsilon$$). The latent-space-based evaluation outperformed all available features of a domain. By implementing VAEs, this approach enhances inverse material design by systematically correlating molecular parameters with desired optical characteristics. With target properties defined as inputs, the model effectively determines the key molecular features necessary for engineering high-performance dye compounds.
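The swarm-based latent-space search described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `surrogate_property` is a hypothetical stand-in for a trained property predictor (for example, a model estimating $$\varepsilon$$ from a latent vector), the three-dimensional latent space and all hyperparameters are assumptions chosen for illustration, and the step of decoding the best latent vector back into a molecule with a VAE decoder is omitted.

```python
import random


def surrogate_property(z):
    """Hypothetical property predictor over latent vectors (assumption).

    A real pipeline would call a trained model here; this toy function
    simply peaks at the latent point (1.0, -2.0, 0.5).
    """
    target = (1.0, -2.0, 0.5)
    return -sum((zi - ti) ** 2 for zi, ti in zip(z, target))


def particle_swarm_maximize(objective, dim, n_particles=30, n_iters=200,
                            bounds=(-5.0, 5.0), inertia=0.7,
                            c_personal=1.5, c_global=1.5, seed=0):
    """Basic global-best particle swarm optimization (maximization)."""
    rng = random.Random(seed)
    lo, hi = bounds

    # Random initial positions inside the box, zero initial velocities.
    positions = [[rng.uniform(lo, hi) for _ in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]

    # Each particle remembers its own best position and score.
    best_pos = [p[:] for p in positions]
    best_val = [objective(p) for p in positions]

    # The swarm shares the best position found by any particle.
    g = max(range(n_particles), key=lambda i: best_val[i])
    global_pos, global_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull toward the particle's own
                # best, and pull toward the swarm's best.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_personal * r1 * (best_pos[i][d] - positions[i][d])
                    + c_global * r2 * (global_pos[d] - positions[i][d])
                )
                # Move the particle and clamp it to the search box.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))

            val = objective(positions[i])
            if val > best_val[i]:
                best_val[i], best_pos[i] = val, positions[i][:]
                if val > global_val:
                    global_val, global_pos = val, positions[i][:]

    return global_pos, global_val


best_z, best_score = particle_swarm_maximize(surrogate_property, dim=3)
print("best latent vector:", best_z)
print("predicted property score:", best_score)
```

In a full inverse-design loop, the returned latent vector would be passed to a decoder to obtain a candidate structure, which could then be checked for chemical validity before any further evaluation.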
    
    
- PAR ID: 10631942
- Publisher / Repository: ACM
- Date Published:
- ISBN: 9798400714627
- Page Range / eLocation ID: 1 to 7
- Format(s): Medium: X
- Location: Columbus USA
- Sponsoring Org: National Science Foundation
More Like this
- Graph neural networks (GNNs) have been used extensively for addressing problems in drug design and discovery. Both ligand and target molecules are represented as graphs with node and edge features encoding information about atomic elements and bonds respectively. Although existing deep learning models perform remarkably well at predicting physicochemical properties and binding affinities, the generation of new molecules with optimized properties remains challenging. Inherently, most GNNs perform poorly in whole-graph representation due to the limitations of the message-passing paradigm. Furthermore, step-by-step graph generation frameworks that use reinforcement learning or other sequential processing can be slow and result in a high proportion of invalid molecules with substantial post-processing needed in order to satisfy the principles of stoichiometry. To address these issues, we propose a representation-first approach to molecular graph generation. We guide the latent representation of an autoencoder by capturing graph structure information with the geometric scattering transform and apply penalties that structure the representation also by molecular properties. We show that this highly structured latent space can be directly used for molecular graph generation by the use of a GAN. We demonstrate that our architecture learns meaningful representations of drug datasets and provides a platform for goal-directed drug synthesis.
- Generating molecular structures with desired properties is a critical task with broad applications in drug discovery and materials design. We propose 3M-Diffusion, a novel multi-modal molecular graph generation method, to generate diverse, ideally novel molecular structures with desired properties. 3M-Diffusion encodes molecular graphs into a graph latent space which it then aligns with the text space learned by encoder-based LLMs from textual descriptions. It then reconstructs the molecular structure and atomic attributes based on the given text descriptions using the molecule decoder. It then learns a probabilistic mapping from the text space to the latent molecular graph space using a diffusion model. The results of our extensive experiments on several datasets demonstrate that 3M-Diffusion can generate high-quality, novel and diverse molecular graphs that semantically match the textual description provided. The code is available on GitHub.
- This review provides an overview of the fabrication methods for Ti3C2Tx MXene-based hybrid photocatalysts and evaluates their role in degrading organic dye pollutants. Ti3C2Tx MXene has emerged as a promising material for hybrid photocatalysts due to its high metallic conductivity, excellent hydrophilicity, strong molecular adsorption, and efficient charge transfer. These properties facilitate faster charge separation and minimize electron–hole recombination, leading to exceptional photodegradation performance, long-term stability, and significant attention in dye degradation applications. Ti3C2Tx MXene-based hybrid photocatalysts significantly improve dye degradation efficiency, as evidenced by higher percentage degradation and reduced degradation time compared to conventional semiconducting materials. This review also highlights computational techniques employed to assess and enhance the performance of Ti3C2Tx MXene-based hybrid photocatalysts for dye degradation. It identifies the challenges associated with Ti3C2Tx MXene-based hybrid photocatalyst research and proposes potential solutions, outlining future research directions to address these obstacles effectively.
- Different mechanisms are used for the discovery of materials. These include creating a material by a trial-and-error process without knowing its properties. Other methods are based on computational simulations or mathematical and statistical approaches, such as Density Functional Theory (DFT). A well-known strategy combines elements to predict their properties and selects a set of those with the properties of interest. Carrying out exhaustive calculations to predict the properties of these found compounds may require a high computational cost. Therefore, there is a need to create methods for identifying materials with a desired set of properties while reducing the search space and, consequently, the computational cost. In this work, we present a genetic algorithm that can find a higher percentage of compounds with specific properties than state-of-the-art methods, such as those based on combinatorial screening. Both methods are compared in the search for ternary compounds in an unconstrained space, using a Deep Neural Network (DNN) to predict properties such as formation enthalpy, band gap, and stability; we will focus on formation enthalpy. As a result, we provide a genetic algorithm capable of finding up to 60% more compounds with atypical values of properties, using DNNs for their prediction.
 An official website of the United States government
