skip to main content

Search for: All records

Creators/Authors contains: "Chiang, David"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Recursive calls over recursive data are useful for generating probability distributions, and probabilistic programming allows computations over these distributions to be expressed in a modular and intuitive way. Exact inference is also useful, but unfortunately, existing probabilistic programming languages do not perform exact inference on recursive calls over recursive data, forcing programmers to code many applications manually. We introduce a probabilistic language in which a wide variety of recursion can be expressed naturally, and inference carried out exactly. For instance, probabilistic pushdown automata and their generalizations are easy to express, and polynomial-time parsing algorithms for them are derived automatically. We eliminate recursive data types using program transformations related to defunctionalization and refunctionalization. These transformations are assured correct by a linear type system, and a successful choice of transformations, if there is one, is guaranteed to be found by a greedy algorithm. 
    more » « less
    Free, publicly-accessible full text available April 6, 2024
  2. Neural Machine Translation (NMT) systems are known to degrade when confronted with noisy data, especially when the system is trained only on clean data. In this paper, we show that augmenting training data with sentences containing artificially-introduced grammatical errors can make the system more robust to such errors. In combination with an automatic grammar error correction system, we can recover 1.0 BLEU out of 2.4 BLEU lost due to grammatical errors. We also present a set of Spanish translations of the JFLEG grammar error correction corpus, which allows for testing NMT robustness to real grammatical errors. 
    more » « less
  3. Discovering the underlying structures present in large real world graphs is a fundamental scientific problem. Recent work at the intersection of formal language theory and graph theory has found that a Probabilistic Hyperedge Replacement Grammar (PHRG) can be extracted from a tree decomposition of any graph. However, because the extracted PHRG is directly dependent on the shape and contents of the tree decomposition, rather than from the dynamics of the graph, it is unlikely that informative graph-processes are actually being captured with the PHRG extraction algorithm. To address this problem, the current work adapts a related formalism called Probabilistic Synchronous HRG (PSHRG) that learns synchronous graph production rules from temporal graphs. We introduce the PSHRG model and describe a method to extract growth rules from the graph. We find that SHRG rules capture growth patterns found in temporal graphs and can be used to predict the future evolution of a temporal graph. We perform a brief evaluation on small synthetic networks that demonstrate the prediction accuracy of PSHRG versus baseline and state of the art models. Ultimately, we find that PSHRGs seem to be very good at modelling dynamics of a temporal graph; however, our prediction algorithm, which is based on string parsing and generation algorithms, does not scale to practically useful graph sizes. 
    more » « less