Title: Socio‐Environmental Determinants of Mental and Behavioral Disorders in Youth: A Machine Learning Approach
Abstract Growing evidence indicates that extreme environmental conditions in summer months have an adverse impact on mental and behavioral disorders (MBD), but research on youth populations is limited. The objective of this study was to apply machine learning approaches to identify key variables that predict MBD‐related emergency room (ER) visits among youths in select North Carolina cities. Daily MBD‐related ER visits, which totaled over 42,000 records, were paired with daily environmental conditions and sociodemographic variables to determine whether certain conditions lead to higher vulnerability to exacerbated mental health disorders. Four machine learning models (i.e., generalized linear model, generalized additive model, extreme gradient boosting, random forest) were used to assess how well multiple environmental and sociodemographic variables predict MBD‐related ER visits across all cities. The best‐performing machine learning model was then applied to each of the six individual cities. As a subanalysis, a distributed lag nonlinear model was used to confirm results. In the all‐cities scenario, sociodemographic variables contributed the most to the overall MBD prediction. In the individual‐cities scenario, the leading predictor of MBD ER visits was the 24‐hr difference in the maximum temperature in four cities; in two cities, it was the 24‐hr difference in the minimum temperature, maximum temperature, or Normalized Difference Vegetation Index. Results can inform the use of machine learning models for predicting MBD during high‐temperature events and identify variables that affect youth MBD responses during these events.
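Illustrative sketch (not from the study): the abstract names a "24‐hr difference" in temperature as a leading predictor. The snippet below shows one plausible way to derive such a feature, assuming it means a day's value minus the previous day's value. The function name `day_over_day_difference` and the temperature values are hypothetical and invented for illustration; the authors' actual preprocessing is not described in this record.

```python
from typing import List, Optional


def day_over_day_difference(values: List[float]) -> List[Optional[float]]:
    """Return values[t] - values[t-1] for each day.

    The first day has no predecessor, so its difference is None.
    """
    if not values:
        return []
    diffs: List[Optional[float]] = [None]
    for previous, current in zip(values, values[1:]):
        diffs.append(current - previous)
    return diffs


# Hypothetical daily maximum temperatures (deg C) for one city; not real data.
daily_tmax = [31.0, 33.5, 36.0, 34.0, 29.5]

# Assumed definition of the "24-hr difference" feature: today minus yesterday.
tmax_diff_24h = day_over_day_difference(daily_tmax)
print(tmax_diff_24h)
```

A feature built this way could then be supplied to any of the four model families named in the abstract alongside the sociodemographic variables.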
Award ID(s):
2044839
PAR ID:
10462423
Publisher / Repository:
DOI PREFIX: 10.1029
Journal Name:
GeoHealth
Volume:
7
Issue:
9
ISSN:
2471-1403
Format(s):
Medium: X
Sponsoring Org:
National Science Foundation
More Like this
  1. Abstract Mental distress among young people has increased in recent years. Research suggests that greenspace may benefit mental health. The objective of this exploratory study is to further understanding of place‐based differences (i.e., urbanity) in the greenspace‐mental health association. We leverage publicly available greenspace data sets to operationalize greenspace quantity, quality, and accessibility metrics at the community level. Emergency department visits for young people (ages 24 and under) were coded for anxiety, depression, mood disorders, mental and behavioral disorders, and substance use disorders. Generalized linear models investigated the association between greenspace metrics and community‐level mental health burden; results are reported as prevalence rate ratios (PRR). Urban and suburban communities with the lowest quantities of greenspace had the highest prevalence of poor mental health outcomes, particularly for mood disorders in urban areas (PRR: 1.19, 95% CI: 1.16–1.21) and substance use disorders in suburban areas (PRR: 1.35, 95% CI: 1.28–1.43). In urban, micropolitan, and rural/isolated areas, greater distance to greenspace was associated with a higher prevalence of poor mental health outcomes; this association was most pronounced for substance use disorders (urban PRR: 1.31, 95% CI: 1.29–1.32; micropolitan PRR: 1.47, 95% CI: 1.43–1.51; rural PRR: 2.38, 95% CI: 2.19–2.58). In small towns and rural/isolated communities, poor mental health outcomes were more prevalent in communities with the worst greenspace quality; this association was most pronounced for mental and behavioral disorders in small towns (PRR: 1.29, 95% CI: 1.24–1.35) and for anxiety disorders in rural/isolated communities (PRR: 1.61, 95% CI: 1.43–1.82). The association between greenspace metrics and mental health outcomes among young people is place‐based, with variations across the rural‐urban continuum.
  2. Soil nitrous oxide (N2O) emissions exhibit high variability in intensively managed cropping systems, which challenges our ability to understand their complex interactions with controlling factors. We leveraged 17 years (2003–2019) of measurements at the Kellogg Biological Station LTER/LTAR site to better understand controls of N2O emissions in four corn–soybean–winter wheat rotations employing Conventional, No-till, Reduced input, and Biologically-based/organic inputs. We used a Random Forest machine learning model to predict daily N2O fluxes, trained separately for each system with 70% of observations, using variables such as crop species, daily air temperature, cumulative 2-day precipitation, water-filled pore space, and soil nitrate and ammonium concentrations. The model explained 29% to 42% of daily N2O flux variability in test data, with greater predictability for the corn phase in each system. The long-term rotations showed different controlling factors and threshold conditions influencing N2O emissions. In the Conventional system, the model identified ammonium (>15 kg N ha⁻¹) and daily temperature (>23 °C) as the most influential variables; in the No-till system, climate variables such as precipitation and temperature were important. In low input and organic systems, where red clover (Trifolium repens L.; before corn) and cereal rye (Secale cereale L.; before soybean) cover crops were integrated, nitrate was the predominant variable, followed by precipitation and temperature. In low input and biologically-based systems, red clover residues increased soil nitrogen availability to influence N2O emissions. Long-term data facilitated machine learning for predicting N2O emissions in response to differential controls and threshold responses to management, environmental, and biogeochemical drivers.
  3. Abstract Emerging research suggests that internet search patterns may provide timely, actionable insights into adverse health impacts from, and behavioral responses to, days of extreme heat, but few studies have evaluated this hypothesis, and none have done so across the United States. We used two-stage distributed lag nonlinear models to quantify the interrelationships between daily maximum ambient temperature, internet search activity as measured by Google Trends, and heat-related emergency department (ED) visits among adults with commercial health insurance in 30 US metropolitan areas during the warm seasons (May to September) from 2016 to 2019. Maximum daily temperature was positively associated with internet searches relevant to heat, and searches were in turn positively associated with heat-related ED visits. Moreover, models combining internet search activity and temperature had better predictive ability for heat-related ED visits compared to models with temperature alone. These results suggest that internet search patterns may be useful as a leading indicator of heat-related illness or stress. 
  4. Abstract Soil nitrous oxide (N2O) emissions exhibit high variability in intensively managed cropping systems, which challenges our ability to understand their complex interactions with controlling factors. We leveraged 17 years (2003–2019) of measurements at the Kellogg Biological Station Long‐Term Ecological Research (LTER)/Long‐Term Agroecosystem Research (LTAR) site to better understand the controls of N2O emissions in four corn–soybean–winter wheat rotations employing conventional, no‐till, reduced input, and biologically based/organic inputs. We used a random forest machine learning model to predict daily N2O fluxes, trained separately for each system with 70% of observations, using variables such as crop species, daily air temperature, cumulative 2‐day precipitation, water‐filled pore space, and soil nitrate and ammonium concentrations. The model explained 29%–42% of daily N2O flux variability in the test data, with greater predictability for the corn phase in each system. The long‐term rotations showed different controlling factors and threshold conditions influencing N2O emissions. In the conventional system, the model identified ammonium (>15 kg N ha⁻¹) and daily air temperature (>23°C) as the most influential variables; in the no‐till system, climate variables such as precipitation and air temperature were important. In low‐input and organic systems, where red clover (Trifolium repens L.; before corn) and cereal rye (Secale cereale L.; before soybean) cover crops were integrated, nitrate was the predominant predictor of N2O emissions, followed by precipitation and air temperature. In low‐input and biologically based systems, red clover residues increased soil nitrogen availability to influence N2O emissions. Long‐term data facilitated machine learning for predicting N2O emissions in response to differential controls and threshold responses to management, environmental, and biogeochemical drivers.
  5. Abstract Predicting rain from large-scale environmental variables remains a challenging problem for climate models, and it is unclear how well numerical methods can predict the true characteristics of rainfall without smaller (storm) scale information. This study explores the ability of three statistical and machine learning methods to predict 3-hourly rain occurrence and intensity at 0.5° resolution over the tropical Pacific Ocean using rain observations from the Global Precipitation Measurement (GPM) satellite radar and large-scale environmental profiles of temperature and moisture from the MERRA-2 reanalysis. We also separated the rain into different types (deep convective, stratiform, and shallow convective) because their varying kinematic and thermodynamic structures might respond to the large-scale environment in different ways. Our expectation was that the popular machine learning methods (i.e., the neural network and random forest) would outperform a standard statistical method (a generalized linear model) because of their more flexible structures, especially in predicting the highly skewed distribution of rain rates for each rain type. However, none of the methods clearly distinguishes itself from the others, and each method still predicts rain too often and does not fully capture the high end of the rain rate distributions, both of which are common problems in climate models. One implication of this study is that machine learning tools must be carefully assessed and are not necessarily applicable to solving all big data problems. Another implication is that traditional climate model approaches are not sufficient to predict extreme rain events and that other avenues need to be pursued.
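Illustrative sketch (not from any of the listed studies): item 1 above reports prevalence rate ratios (PRR) with 95% confidence intervals. The snippet below shows how a crude, unadjusted PRR and a log-scale (Katz-style) interval could be computed from raw counts. It is a simplification: the study estimated PRRs with generalized linear models, which this sketch does not reproduce. The function name `prevalence_rate_ratio` and all counts are hypothetical.

```python
import math
from typing import Tuple


def prevalence_rate_ratio(
    cases_exposed: int,
    pop_exposed: int,
    cases_reference: int,
    pop_reference: int,
    z: float = 1.96,
) -> Tuple[float, float, float]:
    """Return (PRR, lower, upper) for two groups using a log-scale interval."""
    prevalence_exposed = cases_exposed / pop_exposed
    prevalence_reference = cases_reference / pop_reference
    prr = prevalence_exposed / prevalence_reference

    # Standard error of log(PRR) for two independent binomial proportions.
    se_log = math.sqrt(
        1 / cases_exposed - 1 / pop_exposed
        + 1 / cases_reference - 1 / pop_reference
    )
    lower = math.exp(math.log(prr) - z * se_log)
    upper = math.exp(math.log(prr) + z * se_log)
    return prr, lower, upper


# Hypothetical counts, invented for illustration: 120 visits per 1,000 youths
# in low-greenspace communities versus 100 per 1,000 in the reference group.
prr, ci_lower, ci_upper = prevalence_rate_ratio(120, 1000, 100, 1000)
print(f"PRR: {prr:.2f}, 95% CI: {ci_lower:.2f}-{ci_upper:.2f}")
```

A PRR above 1 with an interval that excludes 1 would indicate a higher prevalence in the exposed group; with the small hypothetical counts used here, the interval is wide.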