-
Inverse molecular generation is an essential task for drug discovery, and generative models, diffusion models in particular, offer a promising avenue. Despite their success, existing methods are inherently limited by the lack of a semantic latent space that can be navigated to perform targeted exploration and generate molecules with desired properties. Here, we present a property-guided diffusion model for generating desired molecules. It incorporates a diffusion process that captures the interactions of nodes and edges within molecular graphs, and it leverages a time-dependent molecular property classifier to integrate desired properties into the diffusion sampling process. Furthermore, we extend our model to a multi-property-guided paradigm. Experimental results underscore the competitiveness of our approach in molecular generation and highlight its advantage in generating desired molecules without the need for additional optimization steps. Free, publicly accessible full text available April 14, 2025.
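The guidance step can be illustrated with the standard classifier-guidance identity for score-based diffusion. This is a general sketch and not necessarily the paper's exact formulation: the symbols $G_t$ (noisy molecular graph at time $t$), $y$ (target property), $s_\theta$ (learned score network), $p_\phi$ (time-dependent property classifier), and the guidance weight $\lambda$ are assumed notation.

```latex
\begin{align}
\nabla_{G_t} \log p_t(G_t \mid y)
  &= \nabla_{G_t} \log p_t(G_t) + \nabla_{G_t} \log p_t(y \mid G_t), \\
\hat{s}(G_t, t, y)
  &= s_\theta(G_t, t) + \lambda \, \nabla_{G_t} \log p_\phi(y \mid G_t, t).
\end{align}
```

The first line follows from Bayes' rule, since $\log p_t(y)$ does not depend on $G_t$. The second line is the guided score a sampler would use in place of $s_\theta$; $\lambda = 1$ matches the first line, and larger values weight the property more strongly.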
-
Abstract We establish an orientifold Calabi-Yau threefold database for $h^{1,1}(X) \leq 6$ by considering non-trivial $\mathbb{Z}_2$ divisor exchange involutions, using a toric Calabi-Yau database (www.rossealtman.com/tcy). We first determine the topology for each individual divisor (Hodge diamond), then identify and classify the proper involutions which are globally consistent across all disjoint phases of the Kähler cone for each unique geometry. Each of the proper involutions will result in an orientifold Calabi-Yau manifold. Then we clarify all possible fixed loci under the proper involution, thereby determining the locations of different types of O-planes. It is shown that under the proper involutions, one typically ends up with a system of O3/O7-planes, and most of these will further admit naive Type IIB string vacua. The geometries with freely acting involutions are also determined. We further determine the splitting of the Hodge numbers into odd/even parity in the orbifold limit. The final result is a class of orientifold Calabi-Yau threefolds with non-trivial odd class cohomology ($h^{1,1}_{-}(X/\sigma^{*}) \neq 0$).
-
Abstract Long-read sequencing technology enables significant progress in de novo genome assembly. However, the high error rate and the wide error distribution of raw reads result in a large number of errors in the assembly. Polishing is a procedure to fix errors in the draft assembly and improve the reliability of genomic analysis. However, existing methods treat all regions of the assembly equally, even though the error distributions of these regions differ fundamentally. Achieving very high accuracy in genome assembly remains a challenging problem. Motivated by the uneven errors in different regions of the assembly, we propose a novel polishing workflow named BlockPolish. In this method, we divide contigs into low-complexity (trivial) and high-complexity (complex) blocks according to statistics of the aligned nucleotide bases. Multiple sequence alignment is applied to realign raw reads in complex blocks and optimize the alignment result. Because the error rates are distributed differently in trivial and complex blocks, two multitask bidirectional long short-term memory (LSTM) networks are proposed to predict the consensus sequences. In the whole-genome assemblies of NA12878 assembled by Wtdbg2 and Flye using Nanopore data, BlockPolish has a higher polishing accuracy than other state-of-the-art methods, including Racon, Medaka and MarginPolish & HELEN. In all assemblies, errors are predominantly indels, and BlockPolish performs well in correcting them. In addition to the Nanopore assemblies, we further demonstrate that BlockPolish can also reduce the errors in the PacBio assemblies. The source code of BlockPolish is freely available on GitHub (https://github.com/huangnengCSU/BlockPolish).
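The block-splitting idea can be sketched in a few lines. This is an illustrative assumption, not BlockPolish's actual statistic or code: here a pileup column counts as "trivial" when the most common aligned symbol reaches a hypothetical agreement threshold of 0.8, and consecutive columns with the same label are merged into blocks. The function names are invented for this sketch.

```python
from collections import Counter
from itertools import groupby
from typing import List, Tuple


def column_agreement(column: str) -> float:
    """Fraction of symbols in a non-empty pileup column equal to the most common one."""
    counts = Counter(column)
    return counts.most_common(1)[0][1] / len(column)


def split_into_blocks(pileup: List[str], threshold: float = 0.8) -> List[Tuple[str, int, int]]:
    """Label each column, then merge runs of equal labels.

    Returns (label, start, end) tuples with half-open column ranges.
    The threshold is an assumed value chosen for illustration only.
    """
    labels = [
        "trivial" if column_agreement(col) >= threshold else "complex"
        for col in pileup
    ]
    blocks: List[Tuple[str, int, int]] = []
    pos = 0
    for label, run in groupby(labels):
        length = len(list(run))
        blocks.append((label, pos, pos + length))
        pos += length
    return blocks
```

Each string in `pileup` holds the symbols that the reads place at one contig position, with `-` standing for a gap. Complex blocks would then go on to realignment, while trivial blocks could skip it.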
-
Robinson, Peter (Ed.) Abstract Motivation: Oxford Nanopore sequencing, which produces long reads at low cost, has enabled many breakthroughs in genomics studies. However, the large number of errors in Nanopore genome assemblies affects the accuracy of genome analysis. Polishing is a procedure to correct the errors in a genome assembly and can improve the reliability of downstream analysis. However, the performance of existing polishing methods is still not satisfactory. Results: We developed a novel polishing method, NeuralPolish, to correct the errors in assemblies based on alignment matrix construction and orthogonal Bi-GRU networks. In this method, we designed an alignment feature matrix for representing read-to-assembly alignment. Each row of the matrix represents a read, and each column represents the bases aligned at one position of the contig. In the network architecture, a bi-directional GRU network extracts the sequence information inside each read by processing the alignment matrix row by row. The resulting feature matrix is then processed column by column by another bi-directional GRU network to calculate the probability distribution. Finally, a CTC decoder generates a polished sequence with a greedy algorithm. We used five real datasets and three assembly tools (Wtdbg2, Flye and Canu) for testing, and compared the results of different polishing methods including NeuralPolish, Racon, MarginPolish, HELEN and Medaka. Comprehensive experiments demonstrate that NeuralPolish achieves more accurate assemblies with fewer errors than other polishing methods and can improve the accuracy of assemblies obtained by different assemblers. Availability and implementation: https://github.com/huangnengCSU/NeuralPolish.git. Supplementary information: Supplementary data are available at Bioinformatics online.
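The final step, greedy CTC decoding, can be sketched as follows. This is a minimal illustration of the general technique and not NeuralPolish's code: the alphabet, the choice of index 0 as the blank label, and the function name `ctc_greedy_decode` are assumptions.

```python
from typing import List, Sequence

# Hypothetical label set: index 0 is the CTC blank, the rest are bases.
ALPHABET = ["-", "A", "C", "G", "T"]
BLANK = 0


def ctc_greedy_decode(probs: Sequence[Sequence[float]]) -> str:
    """Greedy CTC decoding.

    Takes the most probable label at each step, merges consecutive
    repeats of the same label, and drops blanks.
    """
    decoded: List[str] = []
    prev = BLANK
    for step in probs:
        best = max(range(len(step)), key=lambda i: step[i])
        # Emit only when the label changes and is not the blank.
        if best != prev and best != BLANK:
            decoded.append(ALPHABET[best])
        prev = best
    return "".join(decoded)
```

A blank between two identical labels keeps them separate, so the per-step labels `A A - A C C -` decode to `AAC` rather than `AC`. This is how a CTC output can represent homopolymer runs.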
-
As HCI pedagogy research grows, so too does an emerging set of evidence-based teaching and curricular recommendations. Yet, few studies have implemented and examined these recommendations in the classroom. In this paper, we present a Research Through Design investigation of a studio-based HCI course, which was revised based on HCI education literature. Drawing on reflection surveys, video recordings of student-led user sessions, final project artifacts, and student interviews, we explore how students responded to key educational changes, the strategies that supported and hindered their reflective practices, and how reflection afforded new student insights. Our findings highlight the utility of video-based reflection exercises to support student learning in designing and running user sessions, the importance of multi-faceted reflection prompts, and how students noticed moments of inclusion and exclusion by attending to users’ non-verbal cues. Additionally, we empirically demonstrate the importance of implementing and studying HCI education research recommendations in the classroom.
-
Point sets are a major 3D structure representation format, characterized by data availability and compactness. Most previous deep learning-based point set models pay equal attention to different point set regions and channels, and thus have limited ability to focus on the small regions and specific channels that are important for characterizing the object of interest. In this paper, we introduce a novel model named Attention-based Point Network (AttPNet). It uses attention mechanisms for both global feature masking and channel weighting to focus on characteristic regions and channels. There are two branches in our model. The first branch calculates an attention mask for every point. The second branch uses convolution layers to abstract global features from point sets, where a channel attention block is adapted to focus on important channels. Evaluations on the ModelNet40 benchmark dataset show that our model outperforms the existing best model in classification tasks by 0.7% without voting. In addition, experiments on augmented data demonstrate that our model is robust to rotational perturbations and missing points. We also design an Electron Cryo-Tomography (ECT) point cloud dataset and further demonstrate our model’s ability to deal with fine-grained structures on the ECT dataset.