Cognitive tasks engage multiple brain regions. Studying how these regions interact is key to understand the neural bases of cognition. Standard approaches to model the interactions between brain regions rely on univariate statistical dependence. However, newly developed methods can capture multivariate dependence. Multivariate pattern dependence (MVPD) is a powerful and flexible approach that trains and tests multivariate models of the interactions between brain regions using independent data. In this article, we introduce PyMVPD: an open source toolbox for multivariate pattern dependence. The toolbox includes linear regression models and artificial neural network models of the interactions between regions. It is designed to be easily customizable. We demonstrate example applications of PyMVPD using well-studied seed regions such as the fusiform face area (FFA) and the parahippocampal place area (PPA). Next, we compare the performance of different model architectures. Overall, artificial neural networks outperform linear regression. Importantly, the best performing architecture is region-dependent: MVPD subdivides cortex in distinct, contiguous regions whose interaction with FFA and PPA is best captured by different models.
more »
« less
LOCALLY ADAPTIVE AND DIFFERENTIABLE REGRESSION
Overparameterized models like deep nets and random forests have become very popular in machine learning. However, the natural goals of continuity and differentiability, common in regression models, are now often ignored in modern overparametrized, locally adaptive models. We propose a general framework to construct a global continuous and differentiable model based on a weighted average of locally learned models in corresponding local regions. This model is competitive in dealing with data with different densities or scales of function values in different local regions. We demonstrate that when we mix kernel ridge and polynomial regression terms in the local models, and stitch themtogether continuously, we achieve faster statistical convergence in theory and improved performance in various practical settings.
more »
« less
- Award ID(s):
- 1953350
- PAR ID:
- 10537979
- Publisher / Repository:
- Begellhouse
- Date Published:
- Journal Name:
- Journal of Machine Learning for Modeling and Computing
- Volume:
- 4
- Issue:
- 4
- ISSN:
- 2689-3967
- Page Range / eLocation ID:
- 103 to 122
- Format(s):
- Medium: X
- Sponsoring Org:
- National Science Foundation
More Like this
-
-
The increasing adoption of machine learning tools has led to calls for accountability via model interpretability. But what does it mean for a machine learning model to be interpretable by humans, and how can this be assessed? We focus on two definitions of interpretability that have been introduced in the machine learning literature: simulatability (a user's ability to run a model on a given input) and "what if" local explainability (a user's ability to correctly determine a model's prediction under local changes to the input, given knowledge of the model's original prediction). Through a user study with 1,000 participants, we test whether humans perform well on tasks that mimic the definitions of simulatability and "what if" local explainability on models that are typically considered locally interpretable. To track the relative interpretability of models, we employ a simple metric, the runtime operation count on the simulatability task. We find evidence that as the number of operations increases, participant accuracy on the local interpretability tasks decreases. In addition, this evidence is consistent with the common intuition that decision trees and logistic regression models are interpretable and are more interpretable than neural networks.more » « less
-
In this paper, we propose a Spatial Robust Mixture Regression model to investigate the relationship between a response variable and a set of explanatory variables over the spatial domain, assuming that the relationships may exhibit complex spatially dynamic patterns that cannot be captured by constant regression coefficients. Our method integrates the robust finite mixture Gaussian regression model with spatial constraints, to simultaneously handle the spatial non-stationarity, local homogeneity, and outlier contaminations. Compared with existing spatial regression models, our proposed model assumes the existence a few distinct regression models that are estimated based on observations that exhibit similar response-predictor relationships. As such, the proposed model not only accounts for non-stationarity in the spatial trend, but also clusters observations into a few distinct and homogenous groups. This provides an advantage on interpretation with a few stationary sub-processes identified that capture the predominant relationships between response and predictor variables. Moreover, the proposed method incorporates robust procedures to handle contaminations from both regression outliers and spatial outliers. By doing so, we robustly segment the spatial domain into distinct local regions with similar regression coefficients, and sporadic locations that are purely outliers. Rigorous statistical hypothesis testing procedure has been designed to test the significance of such segmentation. Experimental results on many synthetic and real-world datasets demonstrate the robustness, accuracy, and effectiveness of our proposed method, compared with other robust finite mixture regression, spatial regression and spatial segmentation methods.more » « less
-
We model the societal task of redistricting political districts as a partitioning problem: Given a set of n points in the plane, each belonging to one of two parties, and a parameter k, our goal is to compute a partition P of the plane into regions so that each region contains roughly s = n/k points. P should satisfy a notion of "local" fairness, which is related to the notion of core, a well-studied concept in cooperative game theory. A region is associated with the majority party in that region, and a point is unhappy in P if it belongs to the minority party. A group D of roughly s contiguous points is called a deviating group with respect to P if majority of points in D are unhappy in P. The partition P is locally fair if there is no deviating group with respect to P.This paper focuses on a restricted case when points lie in 1D. The problem is non-trivial even in this case. We consider both adversarial and "beyond worst-case" settings for this problem. For the former, we characterize the input parameters for which a locally fair partition always exists; we also show that a locally fair partition may not exist for certain parameters. We then consider input models where there are "runs" of red and blue points. For such clustered inputs, we show that a locally fair partition may not exist for certain values of s, but an approximate locally fair partition exists if we allow some regions to have smaller sizes. We finally present a polynomial-time algorithm for computing a locally fair partition if one exists.more » « less
-
Traffic crashes significantly contribute to global fatalities, particularly in urban areas, highlighting the need to evaluate the relationship between urban environments and traffic safety. This study extends former spatial modeling frameworks by drawing paths between global models, including spatial lag (SLM), and spatial error (SEM), and local models, including geographically weighted regression (GWR), multi-scale geographically weighted regression (MGWR), and multi-scale geographically weighted regression with spatially lagged dependent variable (MGWRL). Utilizing the proposed framework, this study analyzes severe traffic crashes in relation to urban built environments using various spatial regression models within Leon County, Florida. According to the results, SLM outperforms OLS, SEM, and GWR models. Local models with lagged dependent variables outperform both the global and generic versions of the local models in all performance measures, whereas MGWR and MGWRL outperform GWR and GWRL. Local models performed better than global models, showing spatial non-stationarity; so, the relationship between the dependent and independent variables varies over space. The better performance of models with lagged dependent variables signifies that the spatial distribution of severe crashes is correlated. Finally, the better performance of multi-scale local models than classical local models indicates varying influences of independent variables with different bandwidths. According to the MGWRL model, census block groups close to the urban area with higher population, higher education level, and lower car ownership rates have lower crash rates. On the contrary, motor vehicle percentage for commuting is found to have a negative association with severe crash rate, which suggests the locality of the mentioned associations.more » « less
An official website of the United States government

