
Search for: All records

Creators/Authors contains: "Huang, W"


  1. Free, publicly-accessible full text available June 1, 2023
  2. Free, publicly-accessible full text available June 1, 2023
  3. Recent research shows that the dynamics of an infinitely wide neural network (NN) trained by gradient descent can be characterized by the Neural Tangent Kernel (NTK) [27]. Under the squared loss, the infinite-width NN trained by gradient descent with an infinitesimally small learning rate is equivalent to kernel regression with the NTK [4]. However, the equivalence is currently known only for ridge regression [6], while the equivalence between NNs and other kernel machines (KMs), e.g., the support vector machine (SVM), remains unknown. Therefore, in this work, we propose to establish the equivalence between NN and SVM, specifically between the infinitely wide NN trained by soft margin loss and the standard soft margin SVM with NTK trained by subgradient descent. Our main theoretical results include establishing the equivalence between NNs and a broad family of L2 regularized KMs with finite-width bounds, which cannot be handled by prior work, and showing that every finite-width NN trained by such regularized loss functions is approximately a KM. Furthermore, we demonstrate that our theory enables three practical applications: (i) a non-vacuous generalization bound for the NN via the corresponding KM; (ii) a nontrivial robustness certificate for the infinite-width NN (while existing robustness verification methods would provide vacuous bounds); and (iii) intrinsically more robust infinite-width NNs than those from previous kernel regression.
  4. Weinberger, A.; Chen, W.; Hernández-Leo, D.; Chen, B. (Ed.)
    Dynamically transitioning between individual and collaborative learning has been hypothesized to have positive effects, such as providing the optimal learning mode based on students’ needs. There are, however, challenges in orchestrating these transitions in real time while managing a classroom of students. AI-based orchestration tools have the potential to alleviate some of the orchestration load for teachers. In this study, we describe a sequence of three design sessions with teachers in which we refine prototypes of an orchestration tool to support dynamic transitions. We leverage design narratives and conjecture mapping for the design of our novel orchestration tool. Our contributions include the orchestration tool itself; a description of how novel tool features were revised throughout the sessions with teachers, including shared control between teachers, students, and AI and the use of AI to support dynamic transitions; and a reflection on the changes to our design and theoretical conjectures.
  5. By studying two interband cascade laser (ICL) wafers with structural parameters that deviated considerably from the design, the durability of the device performance against structural variations was explored. Even with the lasing wavelength blue-shifted by more than 700 nm from the designed value near 4.6 μm at 300 K, the ICLs still performed very well, with a threshold current density as low as 320 A/cm² at 300 K, providing solid experimental evidence of the tolerance of ICL performance to structural variations.
  6. We propose a simple, fast, and accurate one-stage approach to visual grounding, inspired by the following insight. The performance of existing propose-and-rank two-stage methods is capped by the quality of the region candidates they propose in the first stage: if none of the candidates covers the ground-truth region, the second stage has no hope of ranking the right region at the top. To avoid this limitation, we propose a one-stage model that enables end-to-end joint optimization. The main idea is as straightforward as fusing a text query’s embedding into the YOLOv3 object detector, augmented by spatial features so as to account for spatial mentions in the query. Despite being simple, this one-stage approach shows great potential in terms of both accuracy and speed for both phrase localization and referring expression comprehension, according to our experiments. Given these results, along with careful investigations into some popular region proposals, we advocate a paradigm shift in visual grounding from the conventional two-stage methods to the one-stage framework.
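
The fusion step described in item 6 (a text query's embedding combined with detector features and spatial features) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `spatial_features` and `fuse`, the eight-value coordinate encoding, and the use of plain concatenation are all assumptions made for illustration, and the YOLOv3 backbone and text encoder are replaced by toy lists.

```python
def spatial_features(row, col, height, width):
    """Normalized coordinates of one grid cell (hypothetical encoding):
    top-left corner, center, bottom-right corner, and cell size."""
    return [
        col / width, row / height,
        (col + 0.5) / width, (row + 0.5) / height,
        (col + 1) / width, (row + 1) / height,
        1.0 / width, 1.0 / height,
    ]


def fuse(visual_map, text_embedding):
    """Concatenate visual, text, and spatial features at every grid cell.

    visual_map: height x width nested list; each cell is a list of floats
        standing in for detector feature channels.
    text_embedding: list of floats, broadcast unchanged to every cell.
    """
    height = len(visual_map)
    width = len(visual_map[0])
    fused = []
    for row in range(height):
        fused_row = []
        for col in range(width):
            cell = (
                visual_map[row][col]
                + text_embedding
                + spatial_features(row, col, height, width)
            )
            fused_row.append(cell)
        fused.append(fused_row)
    return fused


# Toy input: a 2x2 grid with 3 visual channels and a 2-dimensional text embedding.
visual_map = [[[0.0, 0.0, 0.0] for _ in range(2)] for _ in range(2)]
text_embedding = [1.0, -1.0]
fused = fuse(visual_map, text_embedding)
```

Each cell of `fused` then carries 3 + 2 + 8 = 13 values. In a one-stage model, a detection head would predict a box directly from such fused cells, so no separate region-proposal stage is needed.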