NSF PAR Search | NSF Public Access Repository


HILDE: Intentional Code Generation via Human-in-the-Loop Decoding

Gonzalez, Emmanuel Anaya; Rothkopf, Raven; Lerner, Sorin; Polikarpova, Nadia (October 2025, Proceedings)

While AI programming tools hold the promise of increasing programmers’ capabilities and productivity to a remarkable degree, they often exclude users from essential decision making processes, causing many to effectively “turn off their brains” and over-rely on solutions provided by these systems. These behaviors can have severe consequences in critical domains, like software security. We propose Human-in-the-Loop Decoding, a novel interaction technique that allows users to observe and directly influence LLM decisions during code generation, in order to align the model’s output with their personal requirements. We implement this technique in HILDE, a code completion assistant that highlights critical decisions made by the LLM and provides local alternatives for the user to explore. In a within-subjects study (N=18) on security-related tasks, we found that HILDE led participants to generate significantly fewer vulnerabilities and better align code generation with their goals compared to a traditional code completion assistant.
Free, publicly-accessible full text available October 7, 2026
HILDE: Intentional Code Generation via Human-in-the-Loop Decoding

Gonzalez, Emmanuel Anaya; Rothkopf, Raven; Lerner, Sorin; Polikarpova, Nadia (October 2025, 2025 IEEE Symposium on Visual Languages and Human-Centric Computing)

Free, publicly-accessible full text available October 7, 2026
Comparative Assessment of Machine Learning Techniques for Modeling Energy Consumption of Heavy-Duty Battery Electric Trucks

https://doi.org/10.1109/FISTS60717.2024.10485539

Gonzalez, Emmanuel Hidalgo; Garrido, Jacqueline; Barth, Matthew; Boriboonsomsin, Kanok (February 2024, IEEE)

Efforts to decarbonize the heavy-duty vehicle sector have generated vast interest in transitioning from conventional diesel trucks to battery electric trucks (BETs). As a result, understanding energy consumption characteristics of BETs has become important for a variety of applications, for instance, assessing the feasibility of deploying BETs in place of conventional diesel trucks, predicting the state-of-charge (SOC) of BETs after specific duty cycles, and managing BET charging needs at the home base or en-route. For these applications, mesoscopic energy consumption models offer a good balance between the amount and fidelity of the input data needed, such as average traffic speed and road grade on a link-by-link basis, and the model performance. As a common intelligent transportation system (ITS) application, this paper presents a comparative assessment of mesoscopic energy consumption models for BETs developed using three different machine learning techniques. The results show that the random forest (RF) regression outperforms the extreme gradient boosting (XGBoost), the light gradient boosting machine (LightGBM), as well as the conventional linear regression as evidenced by the resulting model having a higher coefficient of determination (R²) value than that of its counterparts. When applied to the simulated dataset, the RF regression can capture the behaviors of BET energy consumption well where the R² value of the resulting model is 0.94.
Full Text Available
Quantifying leaf symptoms of sorghum charcoal rot in images of field‐grown plants using deep neural networks

https://doi.org/10.1002/ppj2.20110

Gonzalez, Emmanuel M; Zarei, Ariyan; Calleja, Sebastian; Christenson, Clay; Rozzi, Bruno; Demieville, Jeffrey; Hu, Jiahuai; Eveland, Andrea L; Dilkes, Brian; Barnard, Kobus; et al (June 2024, The Plant Phenome Journal)

Charcoal rot of sorghum (CRS) is a significant disease affecting sorghum crops, with limited genetic resistance available. The causative agent, Macrophomina phaseolina (Tassi) Goid, is a highly destructive fungal pathogen that targets over 500 plant species globally, including essential staple crops. Utilizing field image data for precise detection and quantification of CRS could greatly assist in the prompt identification and management of affected fields and thereby reduce yield losses. The objective of this work was to implement various machine learning algorithms to evaluate their ability to accurately detect and quantify CRS in red‐green‐blue images of sorghum plants exhibiting symptoms of infection. EfficientNet‐B3 and a fully convolutional network emerged as the top‐performing models for image classification and segmentation tasks, respectively. Among the classification models evaluated, EfficientNet‐B3 demonstrated superior performance, achieving an accuracy of 86.97%, a recall rate of 0.71, and an F1 score of 0.73. Of the segmentation models tested, FCN proved to be the most effective, exhibiting a validation accuracy of 97.76%, a recall rate of 0.68, and an F1 score of 0.66. As the size of the image patches increased, both models’ validation scores increased linearly, and their inference time decreased exponentially. This trend could be attributed to larger patches containing more information, improving model performance, and fewer patches reducing the computational load, thus decreasing inference time. The models, in addition to being immediately useful for breeders and growers of sorghum, advance the domain of automated plant phenotyping and may serve as a foundation for drone‐based or other automated field phenotyping efforts. Additionally, the models presented herein can be accessed through a web‐based application where users can easily analyze their own images.
Full Text Available
PhytoOracle: Scalable, modular phenomics data processing pipelines

https://doi.org/10.3389/fpls.2023.1112973

Gonzalez, Emmanuel M.; Zarei, Ariyan; Hendler, Nathanial; Simmons, Travis; Zarei, Arman; Demieville, Jeffrey; Strand, Robert; Rozzi, Bruno; Calleja, Sebastian; Ellingson, Holly; et al (March 2023, Frontiers in Plant Science)

As phenomics data volume and dimensionality increase due to advancements in sensor technology, there is an urgent need to develop and implement scalable data processing pipelines. Current phenomics data processing pipelines lack modularity, extensibility, and processing distribution across sensor modalities and phenotyping platforms. To address these challenges, we developed PhytoOracle (PO), a suite of modular, scalable pipelines for processing large volumes of field phenomics RGB, thermal, PSII chlorophyll fluorescence 2D images, and 3D point clouds. PhytoOracle aims to (i) improve data processing efficiency; (ii) provide an extensible, reproducible computing framework; and (iii) enable data fusion of multi-modal phenomics data. PhytoOracle integrates open-source distributed computing frameworks for parallel processing on high-performance computing, cloud, and local computing environments. Each pipeline component is available as a standalone container, providing transferability, extensibility, and reproducibility. The PO pipeline extracts and associates individual plant traits across sensor modalities and collection time points, representing a unique multi-system approach to addressing the genotype-phenotype gap. To date, PO supports lettuce and sorghum phenotypic trait extraction, with a goal of widening the range of supported species in the future. At the maximum number of cores tested in this study (1,024 cores), PO processing times were: 235 minutes for 9,270 RGB images (140.7 GB), 235 minutes for 9,270 thermal images (5.4 GB), and 13 minutes for 39,678 PSII images (86.2 GB). These processing times represent end-to-end processing, from raw data to fully processed numerical phenotypic trait data. Repeatability values of 0.39-0.95 (bounding area), 0.81-0.95 (axis-aligned bounding volume), 0.79-0.94 (oriented bounding volume), 0.83-0.95 (plant height), and 0.81-0.95 (number of points) were observed in Field Scanalyzer data. We also show the ability of PO to process drone data with a repeatability of 0.55-0.95 (bounding area).
Full Text Available
StarBLAST: a scalable BLAST+ solution for the classroom

https://doi.org/10.21105/jose.00102

Cosi, Michele; Forstedt, J.J.; Gonzalez, Emmanuel; Xu, Zhuoyun; Peri, Sateesh; Tuteja, Reetu; Blumberg, Kai; Campbell, Tanner; Merchant, Nirav; Lyons, Eric (April 2021, Journal of Open Source Education)
Full Text Available
