NSF PAR Search | NSF Public Access Repository

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

Private Geometric Median in Nearly-Linear Time

Kumar, S; Liu, D; Tian, K; Yang, C (May 2025, https://doi.org/10.48550/arXiv.2505.20189)

Estimating the geometric median of a dataset is a robust counterpart to mean estimation, and is a fundamental problem in computational geometry. Recently, [HSU24] gave an (ε,δ)-differentially private algorithm obtaining an α-multiplicative approximation to the geometric median objective, 1n∑i∈[n]‖⋅−xi‖, given a dataset :={xi}i∈[n]⊂ℝd. Their algorithm requires n≳d‾‾√⋅1αε samples, which they prove is information-theoretically optimal. This result is surprising because its error scales with the \emph{effective radius} of  (i.e., of a ball capturing most points), rather than the worst-case radius. We give an improved algorithm that obtains the same approximation quality, also using n≳d‾‾√⋅1αϵ samples, but in time O˜(nd+dα2). Our runtime is nearly-linear, plus the cost of the cheapest non-private first-order method due to [CLM+16]. To achieve our results, we use subsampling and geometric aggregation tools inspired by FriendlyCore [TCK+22] to speed up the "warm start" component of the [HSU24] algorithm, combined with a careful custom analysis of DP-SGD's sensitivity for the geometric median objective.
more » « less
Free, publicly-accessible full text available May 26, 2026
DS-LLM: Leveraging Dynamical Systems to Enhance Both Training and Inference of Large Language Models

Song, R; Liu, C; Wu, C; Li, A; Liu, D; Wu, Y N; Geng, T (May 2025, International Conference on Learning Representations (ICLR))

Free, publicly-accessible full text available May 1, 2026
Visual Agents as Fast and Slow Thinkers

Sun, G; Jin, M; Wang, Z; Wang, C L; Ma, S; Wang, Q; Geng, T; Wu, Y N; Zhang, Y; Liu, D (May 2025, International Conference on Learning Representations (ICLR))

Free, publicly-accessible full text available May 1, 2026
Visual Fourier Prompt Tuning

Zeng, R; Han, C; Wang, Q; Wu, C; Geng, T; Huang, L; Wu, Y N; Liu, D (December 2024, Advances in Neural Information Processing Systems (NeurIPS))

Free, publicly-accessible full text available December 1, 2025
Visual Fourier Prompt Tuning

Zeng, R; Han, C; Wang, Q; Wu, C; Geng, T; Huang, L; Wu, Y N; Liu, D (December 2024, NeurIPS)

Free, publicly-accessible full text available December 1, 2025
Improving indirect-call analysis in LLVM with type and data-flow co-analysis

Liu, D; Ji, S; Lu, K; He, Q (August 2024, USENIX)

Indirect function calls are widely used in building system software like OS kernels for their high flexibility and performance. Statically resolving indirect-call targets has been known to be a hard problem, which is a fundamental requirement for various program analysis and protection tasks. The state-of-the-art techniques, which use type analysis, are still imprecise. In this paper, we present a new approach, TFA, that precisely identifies indirect-call targets. The intuition behind TFA is that type-based analysis and data-flow analysis are inherently complementary in resolving indirect-call targets. TFA incorporates a co-analysis system that makes the best use of both type information and data-flow information. The co-analysis keeps refining the global call graph iteratively, allowing us to achieve an optimal indirect call analysis. We have implemented TFA in LLVM and evaluated it against five famous large-scale programs. The experimental results show that TFA eliminates additional 24% to 59% of indirect-call targets compared with the state-of-the-art approaches, without introducing new false negatives. With the precise indirect-call analysis, we further developed a strengthened fine-grained forward-edge control-flow integrity scheme and applied it to the Linux kernel. We have also used the refined indirect-call analysis results in bug detection, where we found 8 deep bugs in the Linux kernel. As a generic technique, the precise indirect-call analysis of TFA can also benefit other applications such as compiler optimization and software debloating.
more » « less
Full Text Available
Improving indirect-call analysis in LLVM with type and data-flow co-analysis

Liu, D; Ji, Shouling; Lu, K; He, Q (August 2024, USENIX)

Full Text Available
Diff-PIC: Revolutionizing Particle-In-Cell Nuclear Fusion Simulation with Diffusion Models

Liu, C; Wu, C; Cao, S; Chen, M; Liang, J C; Li, A; Huang, M; Ren, C; Wu, Y N; Liu, D; et al (December 2024, International Conference on Learning Representations (ICLR))

Free, publicly-accessible full text available December 1, 2025
NP-NDS: A Nature-Powered Nonlinear Dynamical System for Power Grid Forecasting

Wu, C; Song, R; Liu, C; Wang, Y; Chen, Y; Li, A; Liu, D; Wu, Y N; Huang, M; Geng, T (November 2024, The Third Learning on Graphs Conference (LoG))

Free, publicly-accessible full text available November 1, 2025
Private Stochastic Convex Optimization with Heavy Tails: Near-Optimality from Simple Reductions

Asi, H; Liu, D; Tian, K (June 2024, https://doi.org/10.48550/arXiv.2406.02789)

We study the problem of differentially private stochastic convex optimization (DP-SCO) with heavy-tailed gradients, where we assume a kth-moment bound on the Lipschitz constants of sample functions rather than a uniform bound. We propose a new reduction-based approach that enables us to obtain the first optimal rates (up to logarithmic factors) in the heavy-tailed setting, achieving error G2⋅1n√+Gk⋅(d√nϵ)1−1k under (ϵ,δ)-approximate differential privacy, up to a mild $$\textup{polylog}(\frac{1}{\delta})$$ factor, where G22 and Gkk are the 2nd and kth moment bounds on sample Lipschitz constants, nearly-matching a lower bound of [Lowy and Razaviyayn 2023]. We further give a suite of private algorithms in the heavy-tailed setting which improve upon our basic result under additional assumptions, including an optimal algorithm under a known-Lipschitz constant assumption, a near-linear time algorithm for smooth functions, and an optimal linear time algorithm for smooth generalized linear models.
more » « less
Full Text Available

« Prev Next »

Search for: All records