Search results: All records where Creators/Authors contains "Vasilis"


  1. Abstract: Over three percent of people carry a dominant pathogenic variant, yet only a fraction of carriers develop disease. Phenotypes among carriers of variants in the same gene range from mild to severe. Here, we investigate underlying mechanisms for this heterogeneity: variable variant effect sizes, carrier polygenic backgrounds, and modulation of carrier effect by genetic background (marginal epistasis). We leveraged exomes and clinical phenotypes from the UK Biobank and the Mt. Sinai BioMe Biobank to identify carriers of pathogenic variants affecting cardiometabolic traits. We employed recently developed methods to study these cohorts, observing strong statistical support and clinical translational potential for all three mechanisms of variable carrier penetrance and disease severity. For example, scores from our recent model of variant pathogenicity were tightly correlated with phenotype among clinical variant carriers, predicted the effects of variants of unknown significance, and distinguished gain- from loss-of-function variants. We also found that polygenic scores modify phenotypes among pathogenic variant carriers and that genetic background further alters the effects of pathogenic variants through interactions. (A toy polygenic-score computation is sketched after this entry.)
    Free, publicly-accessible full text available December 1, 2026
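    As context for the polygenic-background mechanism above, here is a minimal sketch of how a standard polygenic score is computed: a weighted sum of an individual's allele dosages using per-variant effect sizes. This illustrates the general technique only, not the paper's model; all names and numbers are hypothetical.

      import numpy as np

      def polygenic_score(dosages, effect_sizes):
          # Standard polygenic score: sum over variants of
          # (effect-allele dosage) * (per-allele effect size).
          return float(np.dot(dosages, effect_sizes))

      # Hypothetical example: 5 variants, dosages in {0, 1, 2}.
      rng = np.random.default_rng(0)
      dosages = rng.integers(0, 3, size=5)         # one carrier's genotypes
      effect_sizes = rng.normal(0.0, 0.1, size=5)  # stand-in GWAS weights
      print(polygenic_score(dosages, effect_sizes))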
  2. Free, publicly-accessible full text available July 2, 2026
  3. Traditional models of supervised learning require a learner, given examples from an arbitrary joint distribution on R^d × {±1}, to output a hypothesis that competes (to within ε) with the best-fitting concept from a class. To overcome hardness results for learning even simple concept classes, this paper introduces a smoothed-analysis framework that only requires competing with the best classifier that is robust to small random Gaussian perturbations. This subtle shift enables a wide array of learning results for any concept that (1) depends on a low-dimensional subspace (multi-index model) and (2) has bounded Gaussian surface area. This class includes functions of halfspaces and low-dimensional convex sets, which are otherwise only known to be learnable in non-smoothed settings with respect to highly structured distributions such as Gaussians. The analysis also yields new results for traditional non-smoothed frameworks such as learning with margin. In particular, the authors present the first algorithm for agnostically learning intersections of k halfspaces in time k · poly(log k, ε, γ), where γ is the margin parameter; previously, the best-known runtime was exponential in k (Arriaga and Vempala, 1999). (A toy illustration of robustness to Gaussian perturbations is sketched after this entry.)
    Free, publicly-accessible full text available April 30, 2026
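    The smoothed benchmark above competes with the best classifier that is robust to small Gaussian perturbations of the input. The sketch below gives a crude empirical proxy for that notion: it estimates how often a halfspace's label on a point survives perturbations x + σz with z ~ N(0, I). This is illustrative only and is not the paper's algorithm; w, x, and σ are hypothetical.

      import numpy as np

      def perturbation_agreement(w, x, sigma=0.1, trials=1000, seed=0):
          # Fraction of Gaussian perturbations x + sigma*z under which the
          # halfspace sign(<w, .>) keeps the same label as at x itself.
          rng = np.random.default_rng(seed)
          base = np.sign(w @ x)
          z = rng.normal(size=(trials, len(x)))
          labels = np.sign((x + sigma * z) @ w)
          return float(np.mean(labels == base))

      w = np.array([1.0, -2.0, 0.5])  # hypothetical halfspace normal
      x = np.array([0.3, 0.1, 0.2])   # hypothetical test point
      print(perturbation_agreement(w, x))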
  4. We study the distortion of one-sided and two-sided matching problems on the line. In the one-sided case, n agents need to be matched to n items, and each agent's cost in a matching is their distance from the item they were matched to. We propose an algorithm that is provided only with ordinal information regarding the agents' preferences (each agent's ranking of the items from most- to least-preferred) and returns a matching aiming to minimize the social cost with respect to the agents' true (cardinal) costs. We prove that our algorithm simultaneously achieves the best-possible approximation of 3 (known as distortion) with respect to a variety of social cost measures, including the utilitarian and egalitarian social cost. In the two-sided case, where the n agents need to be matched to n other agents and both sides report their ordinal preferences over each other, we show that it is always possible to compute an optimal matching. In fact, we show that this optimal matching can be achieved using even less information, and we provide bounds on the number of queries that suffice. (A toy illustration of the one-sided setup is sketched after this entry.)
    Free, publicly-accessible full text available September 1, 2026
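    To make the setup above concrete, the sketch below places agents and items on the line, derives each agent's ordinal ranking from the hidden cardinal distances, and brute-forces the matching minimizing utilitarian social cost. The positions are hypothetical, and this is not the paper's distortion-3 algorithm (which sees only the rankings).

      from itertools import permutations

      agents = [0.0, 2.0, 5.0]  # hypothetical agent positions on the line
      items = [1.0, 4.0, 6.0]   # hypothetical item positions

      def ordinal_prefs(a):
          # Agent's ranking of items, most- to least-preferred (by distance).
          return sorted(range(len(items)), key=lambda j: abs(a - items[j]))

      def utilitarian_cost(matching):
          # matching[i] is the index of the item assigned to agent i.
          return sum(abs(agents[i] - items[matching[i]]) for i in range(len(agents)))

      print([ordinal_prefs(a) for a in agents])  # what the algorithm observes
      best = min(permutations(range(len(items))), key=utilitarian_cost)
      print(best, utilitarian_cost(best))        # optimum w.r.t. hidden costs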
  5. Azar, Yossi; Panigrahi, Debmalya (Ed.)
    We provide the first analysis of (deferred-acceptance) clock auctions in the learning-augmented framework. These auctions satisfy a unique list of very appealing properties, including obvious strategyproofness, transparency, and unconditional winner privacy, making them particularly well-suited for real-world applications. However, early work that evaluated their performance from a worst-case analysis perspective concluded that no deterministic clock auction with n bidders can achieve an O(log^(1-ε) n) approximation of the optimal social welfare for any constant ε > 0, even in very simple settings. This overly pessimistic impossibility result heavily depends on the assumption that the designer has no information regarding the bidders' values. Leveraging the learning-augmented framework, we instead consider a designer equipped with some (machine-learned) advice regarding the optimal solution; this advice can provide useful guidance if accurate, but it may be unreliable. Our main results are learning-augmented clock auctions that use this advice to achieve much stronger performance guarantees whenever the advice is accurate (known as consistency), while maintaining worst-case guarantees even if the advice is arbitrarily inaccurate (known as robustness). Our first clock auction achieves the best of both worlds: (1+ε)-consistency for any desired constant ε > 0 and O(log n) robustness; we also extend this auction to achieve error tolerance. We then consider a much stronger notion of consistency, which we refer to as consistency∞, and provide an auction that achieves a near-optimal trade-off between consistency∞ and robustness. Finally, using our impossibility results regarding this trade-off, we prove lower bounds on the "cost of smoothness," i.e., on the robustness that is achievable if we also require that the auction's performance degrade smoothly as a function of the prediction error. (A minimal clock-auction toy example is sketched after this entry.)
    Free, publicly-accessible full text available January 28, 2026
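    For intuition about clock-auction dynamics (prices move monotonically, bidders exit irrevocably, and staying while the price is acceptable is obviously dominant), here is a minimal single-item ascending clock auction. It is a textbook-style toy, not one of the paper's learning-augmented auctions, and all values are hypothetical.

      def ascending_clock_auction(values, increment=1.0):
          # Price rises in fixed increments; a bidder exits (irrevocably)
          # once the price exceeds their value; the last bidder standing
          # wins at the current clock price.
          price = 0.0
          active = set(range(len(values)))
          while len(active) > 1:
              price += increment
              exiting = {i for i in active if values[i] < price}
              if exiting == active:  # everyone would exit: stop the clock
                  break
              active -= exiting
          winner = max(active, key=lambda i: values[i])
          return winner, price

      print(ascending_clock_auction([3.0, 7.0, 5.0]))  # -> (1, 6.0)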
  6. We study the problem of PAC learning γ-margin halfspaces with Massart noise. We propose a simple proper learning algorithm, the Perspectron, that has sample complexity Õ((εγ)^(-2)) and achieves classification error at most η + ε, where η is the Massart noise rate. Prior works [DGT19, CKMY20] came with worse sample complexity guarantees (in both ε and γ) or could only handle random classification noise [DDK+23, KIT+23], a much milder noise assumption. We also show that our results extend to the more challenging setting of learning generalized linear models with a known link function under Massart noise, achieving a similar sample complexity to the halfspace case. This significantly improves upon the prior state of the art in this setting due to [CKMY20], who introduced this model. (A classic margin-perceptron sketch, for context only, follows this entry.)
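    For context only: the Perspectron's name recalls the classic margin perceptron, a simple proper learner for γ-margin halfspaces in the noiseless case. The sketch below implements that classic baseline; it is NOT the paper's algorithm and does not handle Massart noise. The data-generation parameters are hypothetical.

      import numpy as np

      def margin_perceptron(X, y, gamma, max_passes=100):
          # Classic margin perceptron: update on any example whose
          # normalized margin falls below gamma / 2.
          w = np.zeros(X.shape[1])
          for _ in range(max_passes):
              updated = False
              for x, label in zip(X, y):
                  norm = np.linalg.norm(w)
                  if norm == 0.0 or label * (w @ x) / norm < gamma / 2:
                      w = w + label * x
                      updated = True
              if not updated:
                  break
          return w

      # Hypothetical separable data with functional margin 0.2 around w_star.
      rng = np.random.default_rng(1)
      w_star = np.array([1.0, 1.0]) / np.sqrt(2)
      X = rng.uniform(-1, 1, size=(200, 2))
      X = X[np.abs(X @ w_star) > 0.2]
      y = np.sign(X @ w_star)
      w_hat = margin_perceptron(X, y, gamma=0.2)
      print(np.mean(np.sign(X @ w_hat) == y))  # training accuracy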
  7. Globerson, A; Mackey, L; Belgrave, D; Fan, A; Paquet, U; Tomczak, J; Zhang, C (Ed.)
    In the strategic facility location problem, a set of agents report their locations in a metric space, and the goal is to use these reports to open a new facility, minimizing an aggregate distance measure from the agents to the facility. However, agents are strategic and may misreport their locations to influence the facility's placement in their favor. The aim is to design truthful mechanisms, ensuring agents cannot gain by misreporting. This problem was recently revisited through the learning-augmented framework, which aims to move beyond worst-case analysis and design truthful mechanisms that are augmented with (machine-learned) predictions. The focus of this prior work was on mechanisms that are deterministic and augmented with a prediction regarding the optimal facility location. In this paper, we provide a deeper understanding of this problem by exploring the power of randomization as well as the impact of different types of predictions on the performance of truthful learning-augmented mechanisms. We study both the single-dimensional and the Euclidean case and provide upper and lower bounds on the achievable approximation of the optimal egalitarian social cost. (The classic median mechanism is sketched after this entry for context.)
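    As background for the single-dimensional case above, here is the classic median mechanism, the standard example of a truthful facility-location mechanism on the line: it is optimal for utilitarian social cost and a 2-approximation for egalitarian social cost. It is prediction-free context, not one of the paper's learning-augmented mechanisms; the reported locations are hypothetical.

      def median_mechanism(reports):
          # Place the facility at the (left) median of the reports. No agent
          # can pull the median toward themselves by misreporting, so the
          # mechanism is truthful on the line.
          s = sorted(reports)
          return s[(len(s) - 1) // 2]

      reports = [0.0, 3.0, 10.0]  # hypothetical reported locations
      loc = median_mechanism(reports)
      print(loc, max(abs(r - loc) for r in reports))  # facility, egalitarian cost

    In this example the median places the facility at 3.0 with egalitarian cost 7.0, while the egalitarian optimum (at 5.0) costs 5.0, consistent with the factor-2 bound.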