Abstract The study presented in this paper applies hidden Markov modeling (HMM) to uncover recurring patterns in a neural activation dataset collected while designers engaged in a design concept generation task. HMM is a probabilistic approach that describes data (here, fMRI neuroimaging data) as a dynamic sequence of discrete states. Without prior assumptions about the temporal and spatial properties of the fMRI data, HMM enables automatic inference of the states in neurocognitive activation data that are most likely to occur during concept generation. The states with a higher likelihood of occupancy show more activation in brain regions of the executive control network, the default mode network, and the middle temporal cortex. Distinct activation patterns and transitions are associated with these states, linking them to different cognitive functions, such as semantic processing, memory retrieval, executive control, and visual processing, that characterize possible transitions in cognition during concept generation. HMM offers new insights into cognitive dynamics in design by uncovering the temporal and spatial patterns in neurocognition related to concept generation. Future research can explore new avenues of data analysis to investigate design neurocognition and provide a more detailed description of cognitive dynamics in design.
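The abstract does not include implementation details, but the core inference step it describes, recovering the most likely sequence of discrete states from a continuous activation signal, can be sketched with a log-space Viterbi decoder. Everything below (the two states, their Gaussian emission parameters, and the toy signal) is a hypothetical illustration, not the study's actual model:

```python
import math

def log_gauss(x, mu, sigma):
    """Log-density of a 1-D Gaussian emission."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)

def viterbi(obs, pi, A, means, sigmas):
    """Most likely hidden-state sequence for a Gaussian-emission HMM (log space)."""
    n_states = len(pi)
    # Initialization: log prior + log emission for the first observation
    V = [[math.log(pi[s]) + log_gauss(obs[0], means[s], sigmas[s]) for s in range(n_states)]]
    back = []
    for t in range(1, len(obs)):
        row, ptr = [], []
        for s in range(n_states):
            # Best predecessor state for state s at time t
            best_prev = max(range(n_states), key=lambda p: V[-1][p] + math.log(A[p][s]))
            row.append(V[-1][best_prev] + math.log(A[best_prev][s])
                       + log_gauss(obs[t], means[s], sigmas[s]))
            ptr.append(best_prev)
        V.append(row)
        back.append(ptr)
    # Backtrace from the best final state
    state = max(range(n_states), key=lambda s: V[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Two hypothetical states: "low activation" (mean 0) and "high activation" (mean 3)
signal = [0.1, -0.2, 0.3, 2.9, 3.1, 2.8, 0.0]
states = viterbi(signal, pi=[0.5, 0.5], A=[[0.9, 0.1], [0.1, 0.9]],
                 means=[0.0, 3.0], sigmas=[1.0, 1.0])
print(states)  # → [0, 0, 0, 1, 1, 1, 0]
```

In the study's setting, the observations would be high-dimensional fMRI time series rather than a scalar signal, and the HMM parameters would themselves be estimated from data (for example, by expectation-maximization) rather than fixed by hand.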
Latent State Models of Training Dynamics
The impact of randomness on model training is poorly understood. How do differences in data order and initialization actually manifest in the model, such that some training runs outperform others or converge faster? Furthermore, how can we interpret the resulting training dynamics and the phase transitions that characterize different trajectories? To understand the effect of randomness on the dynamics and outcomes of neural network training, we train models multiple times with different random seeds and compute a variety of metrics throughout training, such as the norm, mean, and variance of the neural network's weights. We then fit a hidden Markov model (HMM) over the resulting sequences of metrics. The HMM represents training as a stochastic process of transitions between latent states, providing an intuitive overview of significant changes during training. Using our method, we produce a low-dimensional, discrete representation of training dynamics on grokking tasks, image classification, and masked language modeling. We use the HMM representation to study phase transitions and identify latent "detour" states that slow down convergence.
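One way to inspect such an HMM once it is fitted is to summarize the decoded state sequence as an empirical transition matrix; a "detour" state of the kind the abstract mentions would show up as a state with a high self-transition probability that training passes through before reaching its final state. A minimal sketch, using a made-up decoded run rather than real training metrics:

```python
from collections import Counter

def transition_matrix(states):
    """Empirical transition probabilities P[i][j] estimated from a decoded state sequence."""
    labels = sorted(set(states))
    counts = Counter(zip(states, states[1:]))  # adjacent pairs = observed transitions
    P = {}
    for i in labels:
        total = sum(counts[(i, j)] for j in labels)
        P[i] = {j: (counts[(i, j)] / total if total else 0.0) for j in labels}
    return P

# Hypothetical decoded run where state "B" acts as a detour between "A" and "C"
run = list("AAAABBBBBBCCCC")
P = transition_matrix(run)
print(P["B"]["B"])  # high self-transition probability ⇒ training lingers in state B
```

On real runs, the input would be the HMM's decoded sequence of latent states over training steps, and comparing these matrices across random seeds would show which runs take the detour and which do not.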
- Award ID(s): 1922658
- PAR ID: 10535868
- Publisher / Repository: Transactions on Machine Learning Research (TMLR) 2023
- Date Published:
- Format(s): Medium: X
- Sponsoring Org: National Science Foundation
More Like this
-
Cognition and attention arise from the adaptive coordination of neural systems in response to external and internal demands. The low-dimensional latent subspace that underlies large-scale neural dynamics and the relationships of these dynamics to cognitive and attentional states, however, are unknown. We conducted functional magnetic resonance imaging as human participants performed attention tasks, watched comedy sitcom episodes and an educational documentary, and rested. Whole-brain dynamics traversed a common set of latent states that spanned canonical gradients of functional brain organization, with global desynchronization among functional networks modulating state transitions. Neural state dynamics were synchronized across people during engaging movie watching and aligned to narrative event structures. Neural state dynamics reflected attention fluctuations such that different states indicated engaged attention in task and naturalistic contexts, whereas a common state indicated attention lapses in both contexts. Together, these results demonstrate that traversals along large-scale gradients of human brain organization reflect cognitive and attentional dynamics.
-
Existing Neural Architecture Search (NAS) methods either encode neural architectures using discrete encodings that do not scale well, or adopt supervised learning-based methods to jointly learn architecture representations and optimize architecture search on those representations, which incurs search bias. Despite their widespread use, architecture representations learned in NAS are still poorly understood. We observe that the structural properties of neural architectures are hard to preserve in the latent space if architecture representation learning and search are coupled, resulting in less effective search performance. In this work, we find empirically that pre-training architecture representations using only neural architectures, without their accuracies as labels, improves downstream architecture search efficiency. To explain this finding, we visualize how unsupervised architecture representation learning better encourages neural architectures with similar connections and operators to cluster together. This maps neural architectures with similar performance to the same regions of the latent space and makes the transition between architectures in the latent space relatively smooth, which considerably benefits diverse downstream search strategies.
-
Abstract The response process of problem-solving items contains rich information about respondents' behaviours and cognitive processes in digital tasks, but extracting that information is a major challenge. The aim of this study is to use a data-driven approach to explore the latent states and state transitions underlying the problem-solving process in order to reflect test-takers' behavioural patterns, and to investigate how these states and state transitions are associated with test-takers' performance. We employed the hidden Markov modelling approach to identify test-takers' hidden states during the problem-solving process and compared the frequencies of states and/or state transitions between different performance groups. We conducted comparable studies on two problem-solving items, focusing on the US sample collected in PIAAC 2012, and examined the correlation between those frequencies across the two items. Latent states, and the transitions between them, underlying the problem-solving process were identified and found to differ significantly by performance group. The groups with correct responses on both items were more engaged in the tasks and more often used efficient tools to solve problems, while the group with incorrect responses was more likely to use shorter action sequences and exhibit hesitant behaviours. Consistent behavioural patterns were identified across items. This study demonstrates the value of the data-driven HMM approach for better understanding respondents' behavioural patterns and cognitive transitions underneath the observable action sequences in complex problem-solving tasks.
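As an illustration of the modelling machinery involved (not the study's actual states or items), a categorical-emission HMM can score how well an observed action sequence fits a given behavioural model using the forward algorithm; the two states ("exploring", "executing") and the action vocabulary below are invented for the example:

```python
import math

def forward_loglik(obs, pi, A, B):
    """Log-likelihood of a discrete observation sequence under an HMM,
    computed with the scaled forward algorithm."""
    n_states = len(pi)
    log_lik = 0.0
    alpha = [pi[s] * B[s][obs[0]] for s in range(n_states)]
    for t in range(1, len(obs) + 1):
        scale = sum(alpha)           # accumulate log-likelihood via scaling factors
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
        if t < len(obs):             # propagate forward and apply the next emission
            alpha = [sum(alpha[p] * A[p][s] for p in range(n_states)) * B[s][obs[t]]
                     for s in range(n_states)]
    return log_lik

# Hypothetical two-state model over three action types
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.2, 0.8]]
B = [{"click": 0.5, "scroll": 0.4, "submit": 0.1},   # state 0: "exploring"
     {"click": 0.2, "scroll": 0.1, "submit": 0.7}]   # state 1: "executing"
focused = ["click", "submit", "submit"]
wandering = ["scroll", "scroll", "scroll"]
print(forward_loglik(focused, pi, A, B) > forward_loglik(wandering, pi, A, B))
```

In the study's pipeline, the states and emission distributions would instead be learned from the logged action sequences, and quantities such as state occupancy and transition frequencies would then be compared across performance groups.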
-
In this paper, a deep neural network hidden Markov model (DNN-HMM) is proposed to detect the location of pipeline leakage. A long pipeline is divided into several sections, and a leak in each section is defined as a different state of a hidden Markov model (HMM). The hybrid HMM, i.e., the DNN-HMM, uses a deep neural network (DNN) with multiple layers to exploit the non-linear data. The DNN is initialized using a deep belief network (DBN), a pre-trained model built by stacking restricted Boltzmann machines (RBMs), and computes the emission probabilities for the HMM in place of a Gaussian mixture model (GMM). Two comparative studies based on different numbers of states are performed using a Gaussian mixture model-hidden Markov model (GMM-HMM) and the DNN-HMM. The accuracy of the testing performance, comparing the detected state sequence against the actual state sequence, is measured by the micro F1 score. The micro F1 score approaches 0.94 for the GMM-HMM method and is close to 0.95 for the DNN-HMM method when the pipeline is divided into three sections. In the experiment that divides the pipeline into five sections, the micro F1 score for GMM-HMM is 0.69, while it approaches 0.96 with the DNN-HMM method. The results demonstrate that the DNN-HMM learns a better model of the non-linear data and achieves better performance than the GMM-HMM method.
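The micro F1 score used above pools true positives, false positives, and false negatives across all states before computing F1; for single-label multi-class sequences like these state assignments, it coincides with plain accuracy. A small self-contained sketch with made-up state sequences (not the paper's data):

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool TP/FP/FN over all classes, then compute F1 once."""
    tp = fp = fn = 0
    labels = set(y_true) | set(y_pred)
    for c in labels:
        tp += sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp += sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn += sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical detected vs. actual section-state sequences (3 sections: 0, 1, 2)
true_states = [0, 0, 1, 1, 2, 2, 2, 1]
pred_states = [0, 0, 1, 2, 2, 2, 1, 1]
print(round(micro_f1(true_states, pred_states), 2))  # → 0.75
```

Because every misclassified time step counts as one false positive (for the predicted state) and one false negative (for the true state), pooled precision and recall are equal here, which is why micro F1 reduces to accuracy in this setting.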