Title: Worst-case versus average-case design for estimation from partial pairwise comparisons
Award ID(s):
1704967
NSF-PAR ID:
10170052
Journal Name:
Annals of Statistics
Volume:
48
Issue:
2
ISSN:
0090-5364
Page Range / eLocation ID:
1072 to 1097
Sponsoring Org:
National Science Foundation
More Like this
  1. This paper describes a systematic study of an approach to Farsi-Spanish low-resource Neural Machine Translation (NMT) that leverages monolingual data for joint learning of forward and backward translation models. As is standard for NMT systems, the training process begins using two pre-trained translation models that are iteratively updated by decreasing translation costs. In each iteration, either translation model is used to translate monolingual texts from one language to another, to generate synthetic datasets for the other translation model. Two new translation models are then learned from bilingual data along with the synthetic texts. The key distinguishing feature between our approach and standard NMT is an iterative learning process that improves the performance of both translation models, simultaneously producing a higher-quality synthetic training dataset upon each iteration. Our empirical results demonstrate that this approach outperforms baselines.
  2. Keystroke dynamics has gained relevance over the years for its potential in solving practical problems like online fraud and account takeovers. Statistical algorithms such as distance measures have long been a common choice for keystroke authentication due to their simplicity and ease of implementation. However, deep learning has recently started to gain popularity due to its ability to achieve better performance. When should statistical algorithms be preferred over deep learning and vice versa? To answer this question, we set up experiments to evaluate two state-of-the-art statistical algorithms, Scaled Manhattan and the Instance-based Tail Area Density (ITAD) metric, against a state-of-the-art deep learning model called TypeNet, on three datasets (one small and two large). Our results show that on the small dataset, statistical algorithms significantly outperform the deep learning approach (Equal Error Rate (EER) of 4.3% for Scaled Manhattan / 1.3% for ITAD versus 19.18% for TypeNet). However, on the two large datasets, the deep learning approach performs better (22.9% & 28.07% for Scaled Manhattan / 12.25% & 20.74% for ITAD versus 0.93% & 6.77% for TypeNet).
  3. Bakay, Özge ; Pratley, Breanna ; Neu, Eva ; Deal, Peyton (Ed.)
    The existence and nature of abstract Case has been debated in recent years (McFadden 2004, Landau 2006, Markman 2009), particularly in languages that show no morphological case marking (Diercks 2012, Sheehan & van der Wal 2016). Using data from original fieldwork, I argue that Nukuoro (Polynesian-Outlier) instantiates abstract ergative Case without morphological case or agreement. Nukuoro shows a range of syntactic phenomena indicative of abstract Case, including object shift and pseudo noun incorporation (e.g., Massam 2001), syntactic ergativity in A'-movement, and alternative licensing in tenseless clauses. This pattern provides support for modern theories of Case (Legate 2008), which cleave the assignment of abstract Case from its realization in the morphology; additionally, this pattern differs from other documented examples of unrealized abstract Case by having an ergative alignment, rather than a nominative one (Halpert 2016, Sheehan & van der Wal 2016). 