skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


This content will become publicly available on June 19, 2026

Title: A New Approach to Evaluating Nullability Inference Tools
Null-pointer exceptions are serious problem for Java, and researchers have developed type-based nullness checking tools to prevent them. These tools, however, have a downside: they require developers to write nullability annotations, which is time-consuming and hinders adoption. Researchers have therefore proposed nullability annotation inference tools, whose goal is to (partially) automate the task of annotating a program for nullability. However, prior works rely on differing theories of what makes a set of nullability annotations good, making comparing their effectiveness challenging. In this work, we identify a systematic bias in some prior experimental evaluation of these tools: the use of “type reconstruction” experiments to see if a tool can recover erased developer-written annotations. We show that developers make semantic code changes while adding annotations to facilitate typechecking, leading such experiments to overestimate the effectiveness of inference tools on never-annotated code. We propose a new definition of the “best” inferred annotations for a program that avoids this bias, based on a systematic exploration of the design space. With this new definition, we perform the first head-to-head comparison of three extant nullability inference tools. Our evaluation showed the complementary strengths of the tools and remaining weaknesses that could be addressed in future work.  more » « less
Award ID(s):
2223826 2223825
PAR ID:
10636876
Author(s) / Creator(s):
; ; ;
Publisher / Repository:
ACM
Date Published:
Journal Name:
Proceedings of the ACM on Software Engineering
Volume:
2
Issue:
FSE
ISSN:
2994-970X
Page Range / eLocation ID:
358 to 379
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Gradually typed languages allow programmers to mix statically and dynamically typed code, enabling them to incrementally reap the benefits of static typing as they add type annotations to their code. However, this type migration process is typically a manual effort with limited tool support. This paper examines the problem of automated type migration: given a dynamic program, infer additional or improved type annotations. Existing type migration algorithms prioritize different goals, such as maximizing type precision, maintaining compatibility with unmigrated code, and preserving the semantics of the original program. We argue that the type migration problem involves fundamental compromises: optimizing for a single goal often comes at the expense of others. Ideally, a type migration tool would flexibly accommodate a range of user priorities. We present TypeWhich, a new approach to automated type migration for the gradually-typed lambda calculus with some extensions. Unlike prior work, which relies on custom solvers, TypeWhich produces constraints for an off-the-shelf MaxSMT solver. This allows us to easily express objectives, such as minimizing the number of necessary syntactic coercions, and constraining the type of the migration to be compatible with unmigrated code. We present the first comprehensive evaluation of GTLC type migration algorithms, and compare TypeWhich to four other tools from the literature. Our evaluation uses prior benchmarks, and a new set of "challenge problems." Moreover, we design a new evaluation methodology that highlights the subtleties of gradual type migration. In addition, we apply TypeWhich to a suite of benchmarks for Grift, a programming language based on the GTLC. TypeWhich is able to reconstruct all human-written annotations on all but one program. 
    more » « less
  2. Statically typed languages offer numerous benefits to developers, such as improved code quality and reduced runtime errors, but they also require the overhead of manual type annotations. To mitigate this burden, language designers have started incorporating support for type inference, where the compiler infers the type of a variable based on its declaration/usage context. As a result, type annotations are optional in certain contexts, and developers are empowered to use type inference in these situations. However, the usage patterns of type annotations in languages that support type inference are unclear. These patterns can help provide evidence for further research in program comprehension, in language design, and for education. We conduct a large-scale empirical study using Boa, a tool for mining software repositories, to investigate when and where developers use type inference in 498,963 Kotlin projects. We choose Kotlin because it is the default language for Android development, one of the largest software marketplaces. Additionally, Kotlin has supported declaration-site optional type annotations from its initial release. Our findings reveal that type inference is frequently employed for local variables and variables initialized with method calls declared outside the file are more likely to use type inference. These results have significant implications for language designers, providing valuable insight into where to allow type inference and how to optimize type inference algorithms for maximum efficiency, ultimately improving the development experience for developers. 
    more » « less
  3. Inferring program transformations from concrete program changes has many potential uses, such as applying systematic program edits, refactoring, and automated program repair. Existing work for inferring program transformations usually rely on statistical information over a potentially large set of program-change examples. However, in many practical scenarios we do not have such a large set of program-change examples. In this paper, we address the challenge of inferring a program transformation from one single example. Our core insight is that "big code" can provide effective guide for the generalization of a concrete change into a program transformation, i.e., code elements appearing in many files are general and should not be abstracted away. We first propose a framework for transformation inference, where programs are represented as hypergraphs to enable fine-grained generalization of transformations. We then design a transformation inference approach, GENPAT, that infers a program transformation based on code context and statistics from a big code corpus. We have evaluated GENPAT under two distinct application scenarios, systematic editing and program repair. The evaluation on systematic editing shows that GENPAT significantly outperforms a state-of-the-art approach, SYDIT, with up to 5.5x correctly transformed cases. The evaluation on program repair suggests that GENPAT has the potential to be integrated in advanced program repair tools-GENPAT successfully repaired 19 real-world bugs in the Defects4J benchmark by simply applying transformations inferred from existing patches, where 4 bugs have never been repaired by any existing technique. Overall, the evaluation results suggest that GENPAT is effective for transformation inference and can potentially be adopted for many different applications. 
    more » « less
  4. null (Ed.)
    Many researchers have explored retrofitting static type systems to dynamic languages. This raises the question of how to add type annotations to code that was previously untyped. One obvious solution is type inference. However, in complex type systems, in particular those with structural types, type inference typically produces most general types that are large, hard to understand, and unnatural for programmers. To solve this problem, we introduce InferDL, a novel Ruby type inference system that infers sound and useful type annotations by incorporating heuristics that guess types. For example, we might heuristically guess that a parameter whose name ends in "count" is an integer. InferDL works by first running standard type inference and then applying heuristics to any positions for which standard type inference produces overly-general types. Heuristic guesses are added as constraints to the type inference problem to ensure they are consistent with the rest of the program and other heuristic guesses; inconsistent guesses are discarded. We formalized InferDL in a core type and constraint language. We implemented InferDL on top of RDL, an existing Ruby type checker. To evaluate InferDL, we applied it to four Ruby on Rails apps that had been previously type checked with RDL, and hence had type annotations. We found that, when using heuristics, InferDL inferred 22% more types that were as or more precise than the previous annotations, compared to standard type inference without heuristics. We also found one new type error. We further evaluated InferDL by applying it to six additional apps, finding five additional type errors. Thus, we believe InferDL represents a promising approach for inferring type annotations in dynamic languages. 
    more » « less
  5. A pluggable type system extends a host programming language with type qualifiers. It lets programmers write types like unsigned int, secret string, and nonnull object. Typechecking with pluggable types detects and prevents more errors than the host type system. However, programmers must write type qualifiers; this is the biggest obstacle to use of pluggable types in practice. Type inference can solve this problem. Traditional approaches to type inference are type-system-specific: for each new pluggable type system, the type inference algorithm must be extended to build and then solve a system of constraints corresponding to the rules of the underlying type system. We propose a novel type inference algorithm that can infer type qualifiers for any pluggable type system with little to no new type-system-specific code—that is, “for free”. The key insight is that extant practical pluggable type systems are flow-sensitive and therefore already implement local type inference. Using this insight, we can derive a global inference algorithm by re-using existing implementations of local inference. Our algorithm runs iteratively in rounds. Each round uses the results of local type inference to produce summaries (specifications) for procedures and fields. These summaries enable improved inference throughout the program in subsequent rounds. The algorithm terminates when the inferred summaries reach a fixed point. In practice, many pluggable type systems are built on frameworks. By implementing our algorithm once, at the framework level, it can be reused by any typechecker built using that framework. Using that insight, we have implemented our algorithm for the open-source Checker Framework project, which is widely used in industry and on which dozens of specialized pluggable typecheckers have been built. In experiments with 11 distinct pluggable type systems and 12 projects, our algorithm reduced, by 45% on average, the number of warnings that developers must resolve by writing annotations. 
    more » « less