Do Large Language Models Pay Similar Attention Like Human Programmers When Generating Code?

Kou, Bonan; Chen, Shengmai; Wang, Zhijie; Ma, Lei; Zhang, Tianyi

doi:10.1145/3660807

Large Language Models (LLMs) have recently been widely used for code generation. Due to the complexity and opacity of LLMs, little is known about how these models generate code. We made the first attempt to bridge this knowledge gap by investigating whether LLMs attend to the same parts of a task description as human programmers during code generation. An analysis of six LLMs, including GPT-4, on two popular code generation benchmarks revealed a consistent misalignment between LLMs' and programmers' attention. We manually analyzed 211 incorrect code snippets and found five attention patterns that can be used to explain many code generation errors. Finally, a user study showed that model attention computed by a perturbation-based method is often favored by human programmers. Our findings highlight the need for human-aligned LLMs for better interpretability and programmer trust.

Award ID(s): 2333736

PAR ID: 10548056

Author(s) / Creator(s): Kou, Bonan; Chen, Shengmai; Wang, Zhijie; Ma, Lei; Zhang, Tianyi

Publisher / Repository: ACM

Date Published: 2024-07-12

Journal Name: Proceedings of the ACM on Software Engineering

Volume: 1

Issue: FSE

ISSN: 2994-970X

Page Range / eLocation ID: 2261 to 2284

Format(s): Medium: X

Sponsoring Org: National Science Foundation

Free Publicly Accessible Full Text
Accepted Manuscript 1.0
Journal Article:
https://doi.org/10.1145/3660807
