
Search for: All records

Creators/Authors contains: "Zhang, Haoran"


  1. Endoscopic angle-resolved light scattering methods have been developed for early cancer detection, but they typically require multi-element coherent fiber optic bundles to recover scattering distributions from tissues. Recent work has focused on using a single multimode fiber (MMF) to measure angle-resolved scattering, but this approach has practical limitations to overcome before clinical translation. Here we address these limitations by proposing an MMF-based endoscope capable of measuring angular scattering patterns suitable for determining structure. Significantly, this approach implements a spectrally resolved detection scheme to reduce speckle and leverages the azimuthal symmetry of the angular scattering patterns to enable measurements that are robust to fiber bending. This results in a unique method that does not require matrix inversion or machine learning to measure a transmitted scattering distribution. The MMF utilized here is 1000 mm in length with a 200 µm core and is demonstrated to recover angular scattering distributions even with bending displacements of up to 30 cm. This advance has a significant impact on the clinical translation of biomedical endoscopic diagnostic techniques that use angular scattering to determine the size of cell nuclei to detect early cancer.

  2. Machine learning models frequently experience performance drops under distribution shifts. The underlying cause of such shifts may be multiple simultaneous factors such as changes in data quality, differences in specific covariate distributions, or changes in the relationship between label and features. When a model does fail during deployment, attributing performance change to these factors is critical for the model developer to identify the root cause and take mitigating actions. In this work, we introduce the problem of attributing performance differences between environments to distribution shifts in the underlying data generating mechanisms. We formulate the problem as a cooperative game where the players are distributions. We define the value of a set of distributions to be the change in model performance when only this set of distributions has changed between environments, and derive an importance weighting method for computing the value of an arbitrary set of distributions. The contribution of each distribution to the total performance change is then quantified as its Shapley value. We demonstrate the correctness and utility of our method on synthetic, semi-synthetic, and real-world case studies, showing its effectiveness in attributing performance changes to a wide range of distribution shifts. 
    Free, publicly-accessible full text available June 15, 2024
  3. Free, publicly-accessible full text available July 1, 2024
  4. Incorporating machine learning (ML) components into software products raises new software-engineering challenges and exacerbates existing ones. Many researchers have invested significant effort in understanding the challenges of industry practitioners working on building products with ML components, through interviews and surveys with practitioners. With the intention to aggregate and present their collective findings, we conduct a meta-summary study: Following guidelines for systematic literature reviews, we collected 50 relevant papers that together interacted with over 4758 practitioners. We then collected, grouped, and organized the over 500 mentions of challenges within those papers. We highlight the most commonly reported challenges and hope this meta-summary will be a useful resource for the research community to prioritize research and education in this field.
  5. While serverless platforms substantially simplify the provisioning, configuration, and management of cloud applications, implementing correct services on top of these platforms can present significant challenges to programmers. For example, serverless infrastructures introduce a host of failure modes that are not present in traditional deployments. Individual serverless instances can fail while others continue to make progress, correct but slow instances can be killed by the cloud provider as part of resource management, and providers will often respond to such failures by re-executing requests. For functions with side-effects, these scenarios can create behaviors that are not observable in serverful deployments. In this paper, we propose mu2sls, a framework for implementing microservice applications on serverless using standard Python code with two extra primitives: transactions and asynchronous calls. Our framework orchestrates user-written services to address several challenges, such as failures and re-executions, and provides formal guarantees that the generated serverless implementations are correct. To that end, we present a novel service specification abstraction and formalization of serverless implementations that facilitate reasoning about the correctness of a given application’s serverless implementation. This formalization forms the basis of the mu2sls prototype, which we then use to develop a few real-world microservice applications and show that the performance of the generated serverless implementations achieves significant scalability (3-5× the throughput of a sequential implementation) while providing correctness guarantees in the context of faults, re-execution, and concurrency. 
  6. We present a machine learning method for detecting and staging cervical dysplastic tissue using light scattering data based on a convolutional neural network (CNN) architecture. Depth-resolved angular scattering measurements from two clinical trials were used to generate independent training and validation sets as inputs to our model. We report 90.3% sensitivity, 85.7% specificity, and 87.5% accuracy in classifying cervical dysplasia, showing the uniformity of classification of a/LCI scans across different instruments. Further, our deep learning approach significantly improved processing speeds over the traditional Mie theory inverse light scattering analysis (ILSA) method, with a hundredfold reduction in processing time, offering a promising approach for a/LCI in the clinic for assessing cervical dysplasia.

  7. Eliassi-Rad, Tina (Ed.)
    Multidimensional unfolding methods are widely used for visualizing item response data. Such methods project respondents and items simultaneously onto a low-dimensional Euclidean space, in which respondents and items are represented by ideal points, with person-person, item-item, and person-item similarities being captured by the Euclidean distances between the points. In this paper, we study the visualization of multidimensional unfolding from a statistical perspective. We cast multidimensional unfolding into an estimation problem, where the respondent and item ideal points are treated as parameters to be estimated. An estimator is then proposed for the simultaneous estimation of these parameters. Asymptotic theory is provided for the recovery of the ideal points, shedding light on the validity of model-based visualization. An alternating projected gradient descent algorithm is proposed for the parameter estimation. We provide two illustrative examples, one on users’ movie ratings and the other on senate roll call voting.
  8. This paper introduces Beldi, a library and runtime system for writing and composing fault-tolerant and transactional stateful serverless functions. Beldi runs on existing providers and lets developers write complex stateful applications that require fault tolerance and transactional semantics without the need to deal with tasks such as load balancing or maintaining virtual machines. Beldi’s contributions include extending the log-based fault-tolerant approach in Olive (OSDI 2016) with new data structures, transaction protocols, function invocations, and garbage collection. They also include adapting the resulting framework to work over a federated environment where each serverless function has sovereignty over its own data. We implement three applications on Beldi, including a movie review service, a travel reservation system, and a social media site. Our evaluation on 1,000 AWS Lambdas shows that Beldi’s approach is effective and affordable.