

Search for: All records

Creators/Authors contains: "Kong, S."


  1. Contemporary autonomous vehicle (AV) benchmarks have advanced techniques for training 3D detectors, particularly on large-scale lidar data. Surprisingly, although semantic class labels naturally follow a long-tailed distribution, contemporary benchmarks focus on only a few common classes (e.g., pedestrian and car) and neglect many rare classes in-the-tail (e.g., debris and stroller). However, AVs must still detect rare classes to ensure safe operation. Moreover, semantic classes are often organized within a hierarchy, e.g., tail classes such as child and construction-worker are arguably subclasses of pedestrian. However, such hierarchical relationships are often ignored, which may lead to misleading estimates of performance and missed opportunities for algorithmic innovation. We address these challenges by formally studying the problem of Long-Tailed 3D Detection (LT3D), which evaluates on all classes, including those in-the-tail. We evaluate and innovate upon popular 3D detection codebases, such as CenterPoint and PointPillars, adapting them for LT3D. We develop hierarchical losses that promote feature sharing across common-vs-rare classes, as well as improved detection metrics that award partial credit to "reasonable" mistakes respecting the hierarchy (e.g., mistaking a child for an adult). Finally, we point out that fine-grained tail class accuracy is particularly improved via multimodal fusion of RGB images with lidar; simply put, small fine-grained classes are challenging to identify from sparse lidar geometry alone, suggesting that multimodal cues are crucial to long-tailed 3D detection. Our modifications improve accuracy by 5% AP on average for all classes, and dramatically improve AP for rare classes (e.g., stroller AP improves from 3.6 to 31.6)! Our code is available at this https URL.
  2. Monocular depth predictors are typically trained on large-scale training sets which are naturally biased w.r.t. the distribution of camera poses. As a result, trained predictors fail to make reliable depth predictions for testing examples captured under uncommon camera poses. To address this issue, we propose two novel techniques that exploit the camera pose during training and prediction. First, we introduce a simple perspective-aware data augmentation that synthesizes new training examples with more diverse views by perturbing the existing ones in a geometrically consistent manner. Second, we propose a conditional model that exploits the per-image camera pose as prior knowledge by encoding it as a part of the input. We show that jointly applying the two methods improves depth prediction on images captured under uncommon and even never-before-seen camera poses. We show that our methods improve performance when applied to a range of different predictor architectures. Lastly, we show that explicitly encoding the camera pose distribution improves the generalization performance of a synthetically trained depth predictor when evaluated on real images.
  3. Computer science (CS) has the potential to positively impact the economic well-being of those who pursue it, and the lives of those who benefit from its innovations. Yet, large CS learning opportunity gaps exist for students from historically underrepresented populations. The Computer Science for All (CS for All) movement has brought nationwide attention to these inequities in CS education. More recently, financial support for research-practice partnerships (RPPs) has increased to address these disparities because such collaborations can yield more relevant research for immediate educational/practical application. However, for partnerships to engage effectively in equity-focused initiatives that make computing inclusive, members need to begin with a shared definition of equity to which all are accountable. This poster takes a critical look at how a definition of equity was collaboratively developed and applied in a CS for All RPP of university researchers and administrators from local education agencies across the state of California. Details are shared about how the RPP collectively defined equity and how that definition evolved and informed the larger project's work with school administrators/educators.
  4. We present a method for establishing confidence in the decisions of an autonomous car which accounts for errors not only in control but also in perception. The key idea is that the controller generates a certificate, which is a kind of proof that its interpretation of the scene is accurate and its proposed action is safe. Checking the certificate is faster and simpler than generating it, which allows for a monitor that comprises a much smaller trusted base than the system as a whole. Simulation experiments suggest that the approach is practical.
  6. Certified control is a new architectural pattern for achieving high assurance of safety in autonomous cars. As with a traditional safety controller or interlock, a separate component oversees safety and intervenes to prevent safety violations. This component (along with sensors and actuators) comprises a trusted base that can ensure safety even if the main controller fails. But in certified control, the interlock does not use the sensors directly to determine when to intervene. Instead, the main controller is given the responsibility of presenting the interlock with a certificate that provides evidence that the proposed next action is safe. The interlock checks this certificate, and intervenes only if the check fails. Because generating such a certificate is usually much harder than checking one, the interlock can be smaller and simpler than the main controller, and thus assuring its correctness is more feasible. 
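
The certificate-checking pattern described in the certified control records above can be illustrated with a small Python sketch. It is not the authors' design: the `Certificate` fields, the `interlock_check` and `supervise` functions, and the deceleration, margin, and minimum-evidence values are all hypothetical choices made for illustration. The sketch assumes the main controller submits its proposed speed together with range readings for the lane ahead, and that the interlock only has to verify that the braking distance at that speed fits within the nearest reading.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Certificate:
    """Hypothetical evidence bundle sent by the main controller."""
    proposed_speed_mps: float        # speed the controller wants to hold
    lane_ranges_m: Sequence[float]   # distances to sensor returns in the lane ahead


def stopping_distance(speed_mps: float, max_decel_mps2: float) -> float:
    """Distance needed to brake to rest at constant deceleration: v^2 / (2a)."""
    return speed_mps ** 2 / (2.0 * max_decel_mps2)


def interlock_check(
    cert: Certificate,
    max_decel_mps2: float = 6.0,   # assumed braking capability
    margin_m: float = 2.0,         # assumed extra safety margin
    min_readings: int = 3,         # assumed minimum amount of evidence
) -> bool:
    """Cheap check run by the small trusted component."""
    # Reject certificates that carry too little evidence.
    if len(cert.lane_ranges_m) < min_readings:
        return False
    # Reject physically meaningless values.
    if cert.proposed_speed_mps < 0 or any(r <= 0 for r in cert.lane_ranges_m):
        return False
    # The proposed action is accepted only if the car could stop
    # before the nearest reported return, with margin to spare.
    nearest = min(cert.lane_ranges_m)
    needed = stopping_distance(cert.proposed_speed_mps, max_decel_mps2) + margin_m
    return needed <= nearest


def supervise(cert: Certificate) -> str:
    """Pass the controller's action through, or intervene if the check fails."""
    return "pass-through" if interlock_check(cert) else "emergency-brake"
```

The point of the sketch is the asymmetry the abstracts describe: producing the range readings may require a complex perception stack, while checking them is a few lines of arithmetic that are far easier to assure.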