NSF PAR Search | NSF Public Access Repository

SemCoder: Training Code Language Models with Comprehensive Semantics Reasoning

Ding, Yangruibo; Peng, Jinjun; Min, Marcus; Kaiser, Gail; Yang, Junfeng; Ray, Baishakhi (December 2024, Advances in Neural Information Processing Systems, NeurIPS 2024)

Full Text Available
kGym: A Platform and Dataset to Benchmark Large Language Models on Linux Kernel Crash Resolution

Mathai, Alex; Huang, Chenxi; Maniatis, Petros; Nogikh, Aleksandr; Ivancic, Franjo; Yang, Junfeng; Ray, Baishakhi (December 2024, Conference on Neural Information Processing Systems (NeurIPS))

Full Text Available
PropTest: Automatic Property Testing for Improved Visual Programming

https://doi.org/10.18653/v1/2024.findings-emnlp.483

Koo, Jaywon; Yang, Ziyan; Cascante-Bonilla, Paola; Ray, Baishakhi; Ordonez, Vicente (November 2024, Findings of the Association for Computational Linguistics)

Full Text Available
CYCLE: Learning to Self-Refine the Code Generation

https://doi.org/10.1145/3649825

Ding, Yangruibo; Min, Marcus J; Kaiser, Gail; Ray, Baishakhi (April 2024, Proceedings of the ACM on Programming Languages)

Pre-trained code language models (code LMs) have achieved promising performance in code generation and improved the programming efficiency of human developers. However, existing evaluations of code LMs typically overlook their self-refinement capability and focus only on the accuracy of the one-time prediction. When a code LM fails to produce the correct program, developers find it hard to debug and fix the faulty prediction because they did not write it themselves. Unfortunately, our study reveals that code LMs cannot efficiently self-refine their faulty generations either. In this paper, we propose the CYCLE framework, which learns to self-refine a faulty generation according to the available feedback, such as the execution results reported by test suites. We evaluate CYCLE on three popular code generation benchmarks: HumanEval, MBPP, and APPS. The results reveal that CYCLE maintains, and sometimes improves, the quality of one-time code generation while significantly improving the self-refinement capability of code LMs. We implement four variants of CYCLE with 350M, 1B, 2B, and 3B parameters, and the experiments show that CYCLE consistently boosts code generation performance, by up to 63.5
Full Text Available
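
The refinement loop described in the CYCLE abstract above (generate, execute the tests, feed the execution result back, regenerate) can be illustrated with a minimal sketch. This is not the authors' implementation: `run_tests`, `self_refine`, and `toy_generate` are hypothetical names, and `toy_generate` is a hard-coded stand-in for a code LM so that the example runs without a model.

```python
from typing import Callable, Optional, Tuple

# Hypothetical generator signature: (prompt, previous_code, feedback) -> new code.
Generator = Callable[[str, Optional[str], Optional[str]], str]


def run_tests(code: str, tests: str) -> Optional[str]:
    """Run the candidate code, then its tests; return None on success or an error summary."""
    namespace: dict = {}
    try:
        exec(code, namespace)   # define the candidate functions
        exec(tests, namespace)  # run assertions against them
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


def self_refine(prompt: str, generate: Generator, tests: str,
                max_rounds: int = 3) -> Tuple[str, bool]:
    """Generate once, then regenerate with execution feedback until the tests pass."""
    code = generate(prompt, None, None)  # one-time prediction
    for _ in range(max_rounds):
        feedback = run_tests(code, tests)
        if feedback is None:
            return code, True
        # Condition the next attempt on the faulty code and its test feedback.
        code = generate(prompt, code, feedback)
    return code, run_tests(code, tests) is None


def toy_generate(prompt: str, previous_code: Optional[str],
                 feedback: Optional[str]) -> str:
    """Stand-in for a code LM: buggy on the first attempt, fixed once feedback arrives."""
    if feedback is None:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


TESTS = "assert add(2, 3) == 5, 'add(2, 3) should be 5'"
final_code, passed = self_refine("Write add(a, b).", toy_generate, TESTS)
print(passed)
print(final_code)
```

In this toy run the first attempt fails its assertion, the failure message is passed back to the generator, and the second attempt passes. A real setup would replace `toy_generate` with a model call and execute untrusted code in a sandbox rather than with `exec`.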
SCoRD: Subject-Conditional Relation Detection with Text-Augmented Data

https://doi.org/10.1109/WACV57701.2024.00563

Yang, Ziyan; Kafle, Kushal; Lin, Zhe; Cohen, Scott; Ding, Zhihong; Ordonez, Vicente (January 2024, IEEE)

Full Text Available
Language-Guided Traffic Simulation via Scene-Level Diffusion

Zhong, Ziyuan; Rempe, Davis (November 2023, Proceedings of Machine Learning Research)

Full Text Available
Going Beyond Nouns With Vision & Language Models Using Synthetic Data

https://doi.org/10.1109/ICCV51070.2023.01844

Cascante-Bonilla, Paola; Shehada, Khaled; Smith, James Seale; Doveh, Sivan; Kim, Donghyun; Panda, Rameswar; Varol, Gül; Oliva, Aude; Ordonez, Vicente; Feris, Rogerio; et al (October 2023, IEEE/CVF International Conference on Computer Vision)

Full Text Available
Improving Visual Grounding by Encouraging Consistent Gradient-Based Explanations

https://doi.org/10.1109/CVPR52729.2023.01837

Yang, Ziyan; Kafle, Kushal; Dernoncourt, Franck; Ordonez, Vicente (June 2023, IEEE Conference on Computer Vision and Pattern Recognition)

Full Text Available
Neural Network Guided Evolutionary Fuzzing for Finding Traffic Violations of Autonomous Vehicles

https://doi.org/10.1109/TSE.2022.3195640

Zhong, Ziyuan; Kaiser, Gail; Ray, Baishakhi (April 2023, IEEE Transactions on Software Engineering)

Full Text Available
CLIP-Lite: Information Efficient Visual Representation Learning with Language Supervision

Shrivastava, Aman; Selvaraju, Ramprasaath R.; Naik, Nikhil; Ordonez, Vicente (January 2023, Proceedings of The 26th International Conference on Artificial Intelligence and Statistics)
Ruiz, Francisco; Dy, Jennifer; van de Meent, Jan-Willem (Ed.)
We propose CLIP-Lite, an information efficient method for visual representation learning by feature alignment with textual annotations. Compared to the previously proposed CLIP model, CLIP-Lite requires only one negative image-text sample pair for every positive image-text sample during the optimization of its contrastive learning objective. We accomplish this by taking advantage of an information efficient lower-bound to maximize the mutual information between the two input modalities. This allows CLIP-Lite to be trained with significantly reduced amounts of data and batch sizes while obtaining better performance than CLIP at the same scale. We evaluate CLIP-Lite by pretraining on the COCO-Captions dataset and testing transfer learning to other datasets. CLIP-Lite obtains a +14.0 mAP absolute gain in performance on Pascal VOC classification, and a +22.1 top-1 accuracy gain on ImageNet, while being comparable or superior to other, more complex, text-supervised models. CLIP-Lite is also superior to CLIP on image and text retrieval, zero-shot classification, and visual grounding. Finally, we show that CLIP-Lite can leverage language semantics to encourage bias-free visual representations that can be used in downstream tasks. Implementation: https://github.com/4m4n5/CLIP-Lite
Full Text Available
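
The CLIP-Lite abstract above describes a contrastive objective that needs only one negative image-text pair per positive pair, obtained from a lower bound on mutual information. The abstract does not name the bound, so the sketch below assumes a Jensen-Shannon-style estimator (constant offsets omitted) with a plain dot product as the scoring function. All function names are hypothetical, and negatives are formed by pairing each image with the next caption in the batch.

```python
import math
from typing import List, Sequence


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def jsd_mi_lower_bound(pos_scores: Sequence[float],
                       neg_scores: Sequence[float]) -> float:
    """Jensen-Shannon-style estimate: E_pos[-softplus(-s)] - E_neg[softplus(s)]."""
    if len(pos_scores) != len(neg_scores):
        raise ValueError("expected exactly one negative score per positive score")
    n = len(pos_scores)
    pos_term = sum(-softplus(-s) for s in pos_scores) / n
    neg_term = sum(softplus(s) for s in neg_scores) / n
    return pos_term - neg_term


def batch_bound(images: List[Sequence[float]],
                texts: List[Sequence[float]]) -> float:
    """Score matched pairs as positives and (image i, caption i+1) as the single negative."""
    n = len(images)
    pos = [dot(images[i], texts[i]) for i in range(n)]
    neg = [dot(images[i], texts[(i + 1) % n]) for i in range(n)]
    return jsd_mi_lower_bound(pos, neg)


# Toy 2-D embeddings: aligned captions should give a higher bound than swapped ones.
images = [[1.0, 0.0], [0.0, 1.0]]
aligned = batch_bound(images, [[1.0, 0.0], [0.0, 1.0]])
swapped = batch_bound(images, [[0.0, 1.0], [1.0, 0.0]])
print(aligned > swapped)
```

Training would maximize this quantity (equivalently, minimize its negative) with respect to the encoders that produce the embeddings. The point of the sketch is only that each positive pair is contrasted against a single negative rather than against every other item in the batch.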
