Diffusion-based Text-to-Image (T2I) models have achieved impressive success in generating high-quality images from textual prompts. While large language models (LLMs) effectively leverage Direct Preference Optimization (DPO) for fine-tuning on human preference data without the need for reward models, preference optimization for diffusion models has not been extensively explored. Current preference learning methods for T2I diffusion models directly adapt existing techniques from LLMs. However, this direct adaptation introduces an estimated loss specific to T2I diffusion models, and our empirical results suggest that this estimation can lead to suboptimal performance. In this work, we propose Direct Score Preference Optimization (DSPO), a novel algorithm that aligns the pretraining and fine-tuning objectives of diffusion models by leveraging score matching, the same objective used during pretraining. DSPO introduces a new perspective on preference learning for diffusion models. Specifically, it distills the score function of human-preferred image distributions into pretrained diffusion models, fine-tuning the model to generate outputs that align with human preferences. We theoretically show that DSPO shares the same optimization direction as reinforcement learning algorithms in diffusion models under certain conditions. Our experimental results demonstrate that DSPO outperforms preference learning baselines for T2I diffusion models in human preference evaluation tasks and enhances both the visual appeal and the prompt alignment of generated images.
This content will become publicly available on March 19, 2026
Mechano-diffusion of particles in stretchable hydrogels
We report a mechano-diffusion mechanism that harnesses mechanical deformation to control particle diffusion in stretchable hydrogels with a significantly enlarged tuning ratio and highly expanded tuning freedom.
- PAR ID: 10599451
- Publisher / Repository: ROYAL SOCIETY OF CHEMISTRY
- Date Published:
- Journal Name: Soft Matter
- Volume: 21
- Issue: 12
- ISSN: 1744-683X
- Page Range / eLocation ID: 2230 to 2241
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
- Kehtarnavaz, Nasser; Shirvaikar, Mukul V (Ed.) Recent diffusion-based generative models employ methods such as one-shot fine-tuning of an image diffusion model for video generation. However, this leads to long video generation times and suboptimal efficiency. To reduce generation time, zero-shot text-to-video models eliminate fine-tuning entirely and can generate novel videos from a text prompt alone. While zero-shot generation greatly reduces generation time, many models rely on inefficient cross-frame attention processors, which hinders the use of diffusion models for real-time video generation. We address this issue by introducing more efficient attention processors to a video diffusion model. Specifically, we use attention processors (i.e., xFormers, FlashAttention, and HyperAttention) that are highly optimized for efficiency and hardware parallelization. We then apply these processors to a video generator and test with both older diffusion models such as Stable Diffusion 1.5 and newer, high-quality models such as Stable Diffusion XL. Our results show that using efficient attention processors alone can reduce generation time by around 25% without any change in video quality. Combined with higher-quality models, efficient attention processors in zero-shot generation yield a substantial increase in efficiency and quality, greatly expanding the applicability of video diffusion models to real-time video generation.
- Finding correspondences between images is a fundamental problem in computer vision. In this paper, we show that correspondence emerges in image diffusion models without any explicit supervision. We propose a simple strategy to extract this implicit knowledge from diffusion networks as image features, namely DIffusion FeaTures (DIFT), and use them to establish correspondences between real images. Without any additional fine-tuning or supervision on task-specific data or annotations, DIFT outperforms both weakly-supervised methods and competitive off-the-shelf features in identifying semantic, geometric, and temporal correspondences. For semantic correspondence in particular, DIFT from Stable Diffusion outperforms DINO and OpenCLIP by 19 and 14 accuracy points, respectively, on the challenging SPair-71k benchmark. It even outperforms state-of-the-art supervised methods on 9 out of 18 categories while remaining on par in overall performance. Project page: https://diffusionfeatures.github.io.
- Graphene-based electrodes have been extensively investigated for supercapacitor applications. However, their ion diffusion efficiency is often hindered by graphene restacking. Although holey graphene is fabricated to address this issue by providing ion transport channels, those channels can still be blocked by densely stacked graphene nanosheets. To tackle this challenge, this research aims to improve the ion diffusion efficiency of microwave-synthesized holey graphene films by tuning the water interlayer spacer, with the goal of improved supercapacitor performance. By controlling the vacuum filtration during graphene-based electrode fabrication, we obtain dry films with dense packing and wet films with sparse packing. SEM images reveal that the interlayer distance in the wet film is 20 times larger than in its dry counterpart. The holey graphene wet film delivers a specific capacitance of 239 F/g, an enhancement of ~82% over the dry film (131 F/g). Through an integrated experimental and computational study, we quantitatively show that the interlayer spacing, in combination with the nanoholes in the basal plane, dominates the ion diffusion rate in holey graphene-based electrodes. Our study concludes that novel hierarchical structures should be further considered even in holey graphene thin films to fully exploit the advantages of graphene-based supercapacitors.
- Csikász-Nagy, Attila (Ed.) The ubiquity of microbial communities underscores the importance of understanding how species interact to coexist within a community and how they organize spatially. We study a two-species mutualistic cross-feeding model using a stochastic cellular automaton on a square lattice with kinetic Monte Carlo simulation. Our model encapsulates the essential dynamic processes: cell growth and nutrient excretion, diffusion, and uptake. Focusing on the interplay between nutrient diffusion and individual cell division, we discover three general classes of colony morphology: co-existing sectors, co-existing spirals, and engulfment. When the cross-feeding nutrient is widely available, either through high excretion or fast diffusion, a stable circular colony with alternating species sectors emerges. When the consumer cells rely on being spatially close to the producers, we observe a stable spiral. We also see one species being engulfed by the other when species interfaces merge due to stochastic fluctuations. By tuning the diffusion rate and the growth rate, we gain quantitative insights into the structures of the sectors and the spirals.
