AIDS is a syndrome caused by HIV. During the progression of AIDS, a patient's immune system is weakened, which increases the patient's susceptibility to infections and diseases. Although antiretroviral drugs can effectively suppress HIV, the virus mutates very quickly and can become resistant to treatment. In addition, through mutations the virus can also become resistant to other treatments not currently being used, which is known in the clinical research community as cross-resistance. Since a single HIV strain can be resistant to multiple drugs, this problem is naturally represented as a multilabel classification problem. Given this multilabel relationship, traditional single-label classification methods often fail to effectively identify the drug resistances that may develop after a particular virus mutation. In this work, we propose a novel multilabel Robust Sample Specific Distance (RSSD) method to identify multiclass HIV drug resistance. Our method is novel in that it can illustrate the relative strength of the drug resistance of a reverse transcriptase (RT) sequence against a given drug nucleoside analog and learn the distance metrics for all the drug resistances. To learn the proposed RSSDs, we formulate a learning objective that maximizes the ratio of the summations of a number of ℓ1-norm distances, which is difficult to solve in general. To solve this optimization problem, we derive an efficient, nongreedy iterative algorithm with rigorously proved convergence. Our new method has been verified on a public HIV type 1 drug resistance data set with over 600 RT sequences and five nucleoside analogs. We compared our method against several state-of-the-art multilabel classification methods, and the experimental results have demonstrated the effectiveness of our proposed method.
A weak‐signal‐assisted procedure for variable selection and statistical inference with an informative subsample
This paper is motivated by an HIV‐1 drug resistance study where we encounter three analytical challenges: to analyze data with an informative subsample, to take into account the weak signals, and to detect important signals and also conduct statistical inference. We start with an initial estimation method, which adopts a penalized pairwise conditional likelihood approach for variable selection. This initial estimator incorporates the informative subsample issue. To account for the effect of weak signals, we use a key idea of partial ridge regression. We also propose a one‐step estimation method for each of the signal coefficients and then construct confidence intervals accordingly. We apply the proposed method to the Stanford HIV‐1 drug resistance study and compare the results with existing approaches. We also conduct comprehensive simulation studies to demonstrate the superior performance of our proposed method.
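As a rough illustration of the partial ridge idea mentioned above, the following sketch fits a least-squares model in which only the coefficients outside a previously selected set receive a ridge penalty, so weak signals are shrunk rather than dropped. This is a linear-model analogue under stated assumptions, not the paper's pairwise conditional likelihood estimator; the function name `partial_ridge`, the penalty level `lam`, and the synthetic data are all hypothetical.

```python
import numpy as np


def partial_ridge(X, y, selected, lam):
    """Least squares with a ridge penalty on the unselected coefficients only.

    Minimizes ||y - X b||^2 + lam * sum_{j not in selected} b_j^2.
    (Hypothetical helper; a linear-model stand-in for illustration.)
    """
    p = X.shape[1]
    penalize = np.ones(p)
    # Coefficients chosen by the initial selection step stay unpenalized.
    penalize[np.asarray(list(selected), dtype=int)] = 0.0
    A = X.T @ X + lam * np.diag(penalize)
    return np.linalg.solve(A, X.T @ y)


# Synthetic example: two strong signals, two weak signals, two null features.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 6))
beta_true = np.array([2.0, -1.5, 0.2, -0.2, 0.0, 0.0])
y_demo = X_demo @ beta_true + rng.normal(size=200)

# Assume an initial selector kept only the strong signals (columns 0 and 1).
beta_hat = partial_ridge(X_demo, y_demo, selected=[0, 1], lam=5.0)
```

Setting `lam` to zero recovers ordinary least squares, while a very large `lam` forces the unselected coefficients toward zero, which mimics refitting on the selected set alone.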
- Award ID(s): 2019461
- PAR ID: 10450068
- Publisher / Repository: Oxford University Press
- Date Published:
- Journal Name: Biometrics
- Volume: 77
- Issue: 3
- ISSN: 0006-341X
- Format(s): Medium: X
- Size(s): p. 996-1010
- Sponsoring Org: National Science Foundation
More Like this
-
Acquired immunodeficiency syndrome (AIDS) is a syndrome caused by the human immunodeficiency virus (HIV). During the progression of AIDS, a patient’s immune system is weakened, which increases the patient’s susceptibility to infections and diseases. Although antiretroviral drugs can effectively suppress HIV, the virus mutates very quickly and can become resistant to treatment. In addition, through mutations the virus can also become resistant to other treatments not currently being used, which is known in the clinical research community as cross-resistance. Since a single HIV strain can be resistant to multiple drugs, this problem is naturally represented as a multi-label classification problem. Given this multi-class relationship, traditional single-label classification methods usually fail to effectively identify the drug resistances that may develop after a particular virus mutation. In this paper, we propose a novel multi-label Robust Sample Specific Distance (RSSD) method to identify multi-class HIV drug resistance. Our method is novel in that it can illustrate the relative strength of the drug resistance of a reverse transcriptase (RT) sequence against a given drug nucleoside analogue and learn the distance metrics for all the drug resistances. To learn the proposed RSSDs, we formulate a learning objective that maximizes the ratio of the summations of a number of ℓ1-norm distances, which is difficult to solve in general. To solve this optimization problem, we derive an efficient, non-greedy, iterative algorithm with rigorously proved convergence. Our new method has been verified on a public HIV-1 drug resistance data set with over 600 RT sequences and five nucleoside analogues. We compared our method against other state-of-the-art multi-label classification methods, and the experimental results have demonstrated the effectiveness of our proposed method.
-
Drug-resistant HIV-1 has caused growing concern in clinical practice and public health. Although combination antiretroviral therapy can contribute massively to the suppression of viral loads in patients with HIV-1, it cannot lead to viral eradication. Continuing viral replication during sub-optimal therapy (due to poor adherence or other reasons) may lead to the accumulation of drug resistance mutations, resulting in an increased risk of disease progression. Many studies also suggest that events occurring during the early stage of HIV-1 infection (i.e., the first few hours to days following HIV exposure) may determine whether the infection can be successfully established. However, the numbers of infected cells and viruses during the early stage are extremely low, and stochasticity may play a critical role in dictating the fate of infection. In this paper, we use stochastic models to investigate viral infection and the emergence of drug resistance of HIV-1. The stochastic model is formulated as a continuous-time Markov chain (CTMC), which is derived from an ordinary differential equation model proposed by Kitayimbwa et al. that includes both forward and backward mutations. An analytic estimate of the probability of clearance of HIV infection in the CTMC model near the infection-free equilibrium is obtained by a multitype branching process approximation. The analytical predictions are validated by numerical simulations. Unlike the deterministic dynamics, where the basic reproduction number $\mathcal{R}_0$ serves as a sharp threshold parameter (i.e., the disease dies out if $\mathcal{R}_0 < 1$ and persists if $\mathcal{R}_0 > 1$), the stochastic models indicate that there is always a positive probability for HIV infection to be eradicated in patients. In the presence of antiretroviral therapy, our results show that the chance of clearance of the infection tends to increase, although drug resistance is likely to emerge.
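The contrast between the deterministic threshold and the stochastic clearance probability can be sketched with a much simpler model than the one in the abstract: a single-type linear birth-death process, in which each infected cell produces a new infected cell at rate `birth` and is cleared at rate `death`. This is an assumption-laden toy, not the multitype branching process of the paper; the rates, the population cap, and both function names are hypothetical. For this toy model the ratio `birth / death` plays the role of the reproduction number, and the classical extinction probability from `n0` initial cells is `(death / birth) ** n0` when `birth > death` and 1 otherwise.

```python
import random


def extinction_probability(birth, death, n0=1):
    """Extinction probability of a linear birth-death process from n0 cells."""
    if birth <= death:
        return 1.0
    return (death / birth) ** n0


def simulate_extinction(birth, death, n0=1, cap=60, trials=5000, seed=0):
    """Monte Carlo estimate using the embedded jump chain.

    Because both event rates scale with the population size, each event is
    a birth with probability birth / (birth + death). A run that reaches
    `cap` cells is counted as an established infection (an approximation).
    """
    rng = random.Random(seed)
    p_up = birth / (birth + death)
    extinct = 0
    for _ in range(trials):
        n = n0
        while 0 < n < cap:
            n += 1 if rng.random() < p_up else -1
        if n == 0:
            extinct += 1
    return extinct / trials


# Hypothetical rates with birth / death = 2, i.e. above the threshold.
analytic = extinction_probability(2.0, 1.0)
estimate = simulate_extinction(2.0, 1.0)
```

Even with the reproduction ratio above one, the analytic value is strictly positive, which is the qualitative point the abstract makes about stochastic clearance.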
-
Although combination antiretroviral therapy (ART) with three or more drugs is highly effective in suppressing viral load for people with HIV (human immunodeficiency virus), many ART agents may exacerbate mental health‐related adverse effects including depression. Therefore, understanding the effects of combination ART on mental health can help clinicians personalize medicine with fewer adverse effects to avoid undesirable health outcomes. The emergence of electronic health records offers researchers unprecedented access to HIV data including individuals' mental health records, drug prescriptions, and clinical information over time. However, modeling such data is challenging due to the high dimensionality of the drug combination space, individual heterogeneity, and the sparseness of the observed drug combinations. To address these challenges, we develop a Bayesian nonparametric approach to learn the effect of drug combinations on mental health in people with HIV, adjusting for sociodemographic, behavioral, and clinical factors. The proposed method is built upon the subset‐tree kernel, which represents drug combinations in a way that synthesizes known regimen structure into a single mathematical representation. It also utilizes a distance‐dependent Chinese restaurant process to cluster heterogeneous populations while considering individuals' treatment histories. We evaluate the proposed approach through simulation studies and apply the method to a dataset from the Women's Interagency HIV Study, showing the clinical utility of our model in guiding clinicians to prescribe informed and effective personalized treatment based on individuals' treatment histories and clinical characteristics.
-
This article presents generalized semiparametric regression models for conditional cumulative incidence functions with competing risks data when covariates are missing by sampling design or happenstance. A doubly robust augmented inverse probability weighted (AIPW) complete‐case approach to estimation and inference is investigated. This approach modifies IPW complete‐case estimating equations by exploiting the key features in the relationship between the missing covariates and the phase‐one data to improve efficiency. An iterative numerical procedure is derived to solve the nonlinear estimating equations. The asymptotic properties of the proposed estimators are established. A simulation study examining the finite‐sample performances of the proposed estimators shows that the AIPW estimators are more efficient than the IPW estimators. The developed method is applied to the RV144 HIV‐1 vaccine efficacy trial to investigate vaccine‐induced IgG binding antibodies to HIV‐1 as correlates of acquisition of HIV‐1 infection while taking account of whether the HIV‐1 sequences are near or far from the HIV‐1 sequences represented in the vaccine construct.
