-
Abstract: A key set-theoretic "spread" lemma has been central to two recent celebrated results in combinatorics: the improvements on the sunflower conjecture by Alweiss, Lovett, Wu, and Zhang; and the proof of the fractional Kahn–Kalai conjecture by Frankston, Kahn, Narayanan, and Park. In this work, we present a new proof of the spread lemma that, perhaps surprisingly, takes advantage of an explicit recasting of the proof in the language of Bayesian inference. We show that from this viewpoint the reasoning proceeds in a straightforward and principled probabilistic manner, leading to a truncated second moment calculation which concludes the proof.
Free, publicly-accessible full text available July 1, 2026
-
Background: The limited diagnostic accuracy of prostate-specific antigen screening for prostate cancer (PCa) has prompted innovative solutions, such as the state-of-the-art 18-gene urine test for clinically significant PCa, MyProstateScore2.0 (MPS2). Objective: We aim to develop a non-invasive biomarker test, the simplified MPS2 (sMPS2), which achieves state-of-the-art accuracy similar to that of MPS2 for predicting high-grade PCa but requires substantially fewer genes than the 18-gene MPS2, improving its accessibility for routine clinical care. Methods: We grounded the development of sMPS2 in the Predictability, Computability, and Stability (PCS) framework for veridical data science. Under this framework, we stress-tested the development of sMPS2 across various data preprocessing and modeling choices and developed a stability-driven PCS ranking procedure for selecting the most predictive and robust genes for use in sMPS2. Results: The final sMPS2 model consisted of 7 genes and achieved a 0.784 AUROC (95% confidence interval, 0.742–0.825) for predicting high-grade PCa on a blinded external validation cohort. This is only 2.3% lower than the 18-gene MPS2, which is similar in magnitude to the 1–2% in uncertainty induced by different data preprocessing choices. Conclusions: The 7-gene sMPS2 provides a unique opportunity to expand the reach and adoption of non-invasive PCa screening.
-
Free, publicly-accessible full text available July 19, 2026
-
Gradient Descent Converges Arbitrarily Fast for Logistic Regression via Large and Adaptive Stepsizes
Free, publicly-accessible full text available July 19, 2026
-
Free, publicly-accessible full text available May 31, 2026
-
Domain adaptation (DA) is a statistical learning problem that arises when the distribution of the source data used to train a model differs from that of the target data used to evaluate the model. While many DA algorithms have demonstrated considerable empirical success, blindly applying these algorithms can often lead to worse performance on new datasets. To address this, it is crucial to clarify the assumptions under which a DA algorithm has good target performance. In this work, we focus on the assumption of the presence of conditionally invariant components (CICs), which are relevant for prediction and remain conditionally invariant across the source and target data. We demonstrate that CICs, which can be estimated through conditional invariant penalty (CIP), play three prominent roles in providing target risk guarantees in DA. First, we propose a new algorithm based on CICs, importance-weighted conditional invariant penalty (IW-CIP), which has target risk guarantees beyond simple settings such as covariate shift and label shift. Second, we show that CICs help identify large discrepancies between source and target risks of other DA algorithms. Finally, we demonstrate that incorporating CICs into the domain invariant projection (DIP) algorithm can address its failure scenario caused by label-flipping features. We support our new algorithms and theoretical findings via numerical experiments on synthetic data, MNIST, CelebA, Camelyon17, and DomainNet datasets.
Free, publicly-accessible full text available May 25, 2026
-
Free, publicly-accessible full text available May 15, 2026
-
Free, publicly-accessible full text available May 5, 2026
