NSF PAR Search | NSF Public Access Repository

Lost in Translation: A Study of Bugs Introduced by Large Language Models while Translating Code

https://doi.org/10.1145/3597503.3639226

Pan, Rangeet; Ibrahimzada, Ali Reza; Krishna, Rahul; Sankar, Divya; Wassi, Lambert Pouguem; Merler, Michele; Sobolev, Boris; Pavuluri, Raju; Sinha, Saurabh; Jabbarvand, Reyhaneh (April 2024, Proceedings of the International Conference on Software Engineering)
Unicorn: reasoning about configurable system performance through the lens of causality

https://doi.org/10.1145/3492321.3519575

Iqbal, Md Shahriar; Krishna, Rahul; Javidian, Mohammad Ali; Ray, Baishakhi; Jamshidi, Pooyan (March 2022, Proceedings of the Seventeenth European Conference on Computer Systems)

Full Text Available
Deep Learning based Vulnerability Detection: Are We There Yet?

https://doi.org/10.1109/TSE.2021.3087402

Chakraborty, Saikat; Krishna, Rahul; Ding, Yangruibo; Ray, Baishakhi (January 2021, IEEE Transactions on Software Engineering)

Full Text Available
ConEx: Efficient Exploration of Big-Data System Configurations for Better Performance

https://doi.org/10.1109/TSE.2020.3007560

Krishna, Rahul; Tang, Chong; Sullivan, Kevin; Ray, Baishakhi (July 2020, IEEE Transactions on Software Engineering)

Configuration space complexity makes big-data software systems hard to configure well. Consider Hadoop: with over nine hundred parameters, developers often just use the default configurations provided with Hadoop distributions. The opportunity costs in lost performance are significant. Popular learning-based approaches to auto-tuning software do not scale well for big-data systems because of the high cost of collecting training data. We present a new method based on a combination of Evolutionary Markov Chain Monte Carlo (EMCMC) sampling and cost reduction techniques to find better-performing configurations for big data systems. For cost reduction, we developed and experimentally tested and validated two approaches: using scaled-up big data jobs as proxies for the objective function for larger jobs, and using a dynamic job similarity measure to infer that results obtained for one kind of big data problem will work well for similar problems. Our experimental results suggest that our approach promises to improve the performance of big data systems significantly and that it outperforms competing approaches based on random sampling, basic genetic algorithms (GA), and predictive model learning. The results also support the conclusion that our approach has the potential to improve the performance of big data systems significantly and frugally.
Full Text Available