-
Abstract Advances in digital phenotyping have opened the door to continuous, individualised monitoring of mental health, but realising the full potential of these data demands machine learning models that can operate effectively in ‘small-data’ regimes—where per-user data are sparse, irregular and noisy. This article explores the feasibility, challenges and opportunities of small-data machine learning approaches for forecasting individual-level mental health trajectories. We examine the limitations of traditional clinical tools and population-level models and argue that fine-grained time-series forecasting, powered by models such as tabular prior-data fitted networks (TabPFN), Gaussian processes, Kalman filters and meta-learning strategies, offers a path towards personalised, proactive psychiatry. Emphasis is placed on key clinical requirements: real-time adaptation, uncertainty quantification, feature-level interpretability and respect for interindividual variability. We discuss implementation barriers including data quality, model transparency and ethical considerations and propose practical pathways for deployment—such as integrated biosensor platforms and just-in-time adaptive interventions (JITAIs). We highlight the emerging convergence of small-data ML, mobile sensing and clinical insight as a transformative force in mental healthcare. With interdisciplinary collaboration and prospective validation, these technologies have the potential to shift psychiatry from reactive symptom management to anticipatory, personalised intervention.
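As a concrete picture of the forecasting machinery this abstract names, here is a minimal sketch of a one-dimensional random-walk Kalman filter applied to a sparse per-user symptom series. It is not the article's implementation: the noise variances, the random-walk state model, and the handling of missing days are all illustrative assumptions, but it shows how such filters yield both point forecasts and the growing uncertainty across data gaps that the authors emphasise.

```python
import numpy as np

def kalman_forecast(observations, q=0.1, r=1.0):
    """Minimal 1-D random-walk Kalman filter for a per-user symptom series.

    q: process-noise variance (how fast the latent state drifts).
    r: observation-noise variance (how noisy each self-report/sensor reading is).
    Missing days are passed as np.nan and handled by skipping the update step.
    """
    mu, var = 0.0, 1.0          # prior over the latent symptom level
    means, variances = [], []
    for y in observations:
        # Predict: the latent state drifts between samples.
        var += q
        # Update: only if an observation arrived that day.
        if not np.isnan(y):
            k = var / (var + r)      # Kalman gain
            mu = mu + k * (y - mu)
            var = (1.0 - k) * var
        means.append(mu)
        variances.append(var)        # uncertainty grows across gaps
    return np.array(means), np.array(variances)

# Irregular, sparse self-reports: gaps yield wider predictive intervals.
y = np.array([3.0, np.nan, np.nan, 4.5, 5.0, np.nan, 6.0])
m, v = kalman_forecast(y)
print(m[-1], 1.96 * np.sqrt(v[-1]))  # point forecast and ~95% half-width
```

Gaps in the series widen the predictive variance rather than breaking the model, which is the property that makes state-space filters attractive in small-data regimes.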
-
Abstract Inferring gene regulatory networks (GRNs) from single-cell data is challenging due to heuristic limitations. Existing methods also lack estimates of uncertainty. Here we present Probabilistic Matrix Factorization for Gene Regulatory Network Inference (PMF-GRN). Using single-cell expression data, PMF-GRN infers latent factors capturing transcription factor activity and regulatory relationships. Using variational inference allows hyperparameter search for principled model selection and direct comparison to other generative models. We extensively test and benchmark our method using real single-cell datasets and synthetic data. We show that PMF-GRN infers GRNs more accurately than current state-of-the-art single-cell GRN inference methods, offering well-calibrated uncertainty estimates.
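The paper's full method uses variational inference; as a rough stand-in, the sketch below does MAP estimation for a Gaussian-prior matrix factorization of a toy expression matrix. All shapes, priors, and step sizes are invented for illustration, and PMF-GRN itself is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy shapes: cells x genes expression, factorized into (cells x TFs) activity
# and (TFs x genes) regulatory weights. All sizes and priors are illustrative.
n_cells, n_genes, n_tfs = 200, 50, 8
X = rng.poisson(2.0, size=(n_cells, n_genes)).astype(float)

U = 0.1 * rng.standard_normal((n_cells, n_tfs))   # latent TF activity
V = 0.1 * rng.standard_normal((n_tfs, n_genes))   # regulatory relationships
lam, lr = 0.1, 1e-3                               # Gaussian-prior strength, step size

for _ in range(500):
    E = U @ V - X                 # reconstruction residual
    dU = E @ V.T + lam * U        # gradient of squared loss plus prior
    dV = U.T @ E + lam * V
    U -= lr * dU
    V -= lr * dV

# |V[t, g]| is read as the inferred strength of TF t regulating gene g.
print(np.abs(V).max())
```

The variational version in the paper replaces these point estimates with approximate posteriors, which is what supplies the calibrated uncertainty the abstract highlights.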
-
Abstract Anomaly, or out-of-distribution, detection is a promising tool for aiding discoveries of new particles or processes in particle physics. In this work, we identify and address two overlooked opportunities to improve anomaly detection (AD) for high-energy physics. First, rather than train a generative model on the single most dominant background process, we build detection algorithms using representation learning from multiple background types, thus taking advantage of more information to improve estimation of what is relevant for detection. Second, we generalize decorrelation to the multi-background setting, thus directly enforcing a more complete definition of robustness for AD. We demonstrate the benefit of the proposed robust multi-background AD algorithms on a high-dimensional dataset of particle decays at the Large Hadron Collider.
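A hedged toy of the multi-background idea: rather than modelling only the dominant background, pool several background processes, learn a shared low-dimensional representation (plain PCA here, standing in for the paper's representation learning), and score events by how far they fall outside it. Dataset shapes and dimensions are invented, and the decorrelation step is omitted.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two toy background processes, each living on its own low-dimensional
# subspace of a 16-d feature space; pooled so the representation covers both.
z_a = rng.standard_normal((5000, 4)) @ rng.standard_normal((4, 16))
z_b = rng.standard_normal((3000, 4)) @ rng.standard_normal((4, 16))
backgrounds = np.vstack([z_a, z_b]) + 0.05 * rng.standard_normal((8000, 16))

mean = backgrounds.mean(axis=0)
_, _, vt = np.linalg.svd(backgrounds - mean, full_matrices=False)
components = vt[:8]                      # union of both background subspaces

def anomaly_score(x):
    """Reconstruction error outside the shared background representation."""
    centered = x - mean
    recon = centered @ components.T @ components
    return np.linalg.norm(centered - recon, axis=-1)

signal_like = rng.standard_normal((100, 16))          # off-manifold events
print(anomaly_score(signal_like).mean(),              # large
      anomaly_score(backgrounds[:100]).mean())        # small
```

Training on one background alone would leave the second subspace unmodelled, so ordinary events from it would wrongly score as anomalies; pooling fixes exactly that failure mode.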
-
Abstract Normative models of synaptic plasticity use computational rationales to arrive at predictions of behavioral and network-level adaptive phenomena. In recent years, there has been an explosion of theoretical work in this realm, but experimental confirmation remains limited. In this review, we organize work on normative plasticity models in terms of a set of desiderata that, when satisfied, are designed to ensure that a given model demonstrates a clear link between plasticity and adaptive behavior, is consistent with known biological evidence about neural plasticity and yields specific testable predictions. As a prototype, we include a detailed analysis of the REINFORCE algorithm. We also discuss how new models have begun to improve on the identified criteria and suggest avenues for further development. Overall, we provide a conceptual guide to help develop neural learning theories that are precise, powerful, and experimentally testable.
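Since the review singles out REINFORCE, a minimal sketch helps make the plasticity connection concrete: on a two-armed bandit, the REINFORCE update is a three-factor rule, reward multiplied by an action-prediction error that plays the role of an eligibility trace. The bandit, learning rate, and Bernoulli policy are illustrative choices, not the review's specific formulation.

```python
import numpy as np

rng = np.random.default_rng(2)

# REINFORCE on a two-armed bandit: the weight update has the three-factor
# form (reward x eligibility) that normative plasticity models predict
# at the synapse level.
w = 0.0                       # single "synaptic" weight -> choice preference
lr = 0.1
p_reward = {0: 0.2, 1: 0.8}   # arm 1 is better

for trial in range(2000):
    p1 = 1.0 / (1.0 + np.exp(-w))          # policy: P(choose arm 1)
    a = int(rng.random() < p1)             # sample an action
    r = float(rng.random() < p_reward[a])  # stochastic reward
    eligibility = a - p1                   # d log pi / d w for a Bernoulli policy
    w += lr * r * eligibility              # REINFORCE update

print(w, 1.0 / (1.0 + np.exp(-w)))  # preference converges toward arm 1
```

The eligibility term `a - p1` is exactly the gradient of the log-policy, which is what makes this update an unbiased estimate of the reward gradient and gives it the testable behavioural signature the review's desiderata ask for.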
-
Abstract Natural behaviors occur in closed action-perception loops and are supported by dynamic and flexible beliefs abstracted away from our immediate sensory milieu. How this real-world flexibility is instantiated in neural circuits remains unknown. Here, we have male macaques navigate in a virtual environment by primarily leveraging sensory (optic flow) signals, or by more heavily relying on acquired internal models. We record single-unit spiking activity simultaneously from the dorsomedial superior temporal area (MSTd), parietal area 7a, and the dorsolateral prefrontal cortex (dlPFC). Results show that while animals were able to maintain adaptive task-relevant beliefs regardless of sensory context, the fine-grained statistical dependencies between neurons, particularly in 7a and dlPFC, dynamically remapped with the changing computational demands. In dlPFC, but not 7a, destroying these statistical dependencies abolished the area’s ability for cross-context decoding. Lastly, correlational analyses suggested that the more unit-to-unit couplings remapped in dlPFC, and the less they did so in MSTd, the less population codes and behavior were impacted by the loss of sensory evidence. We conclude that dynamic functional connectivity between neurons in prefrontal cortex maintains a stable population code and context-invariant beliefs during naturalistic behavior.
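The shuffle logic behind "destroying these statistical dependencies" can be illustrated with a hedged toy (not the study's analysis pipeline): build a two-unit population whose trial-to-trial co-fluctuation, rather than its single-unit means, carries the latent variable, then break the unit-to-unit dependency by permuting one unit across trials and watch decoding collapse to chance.

```python
import numpy as np

rng = np.random.default_rng(3)

n_trials = 4000
s = rng.integers(0, 2, n_trials)              # latent belief per trial
shared = rng.standard_normal(n_trials)        # shared trial-to-trial fluctuation
sign = np.where(s == 1, 1.0, -1.0)            # belief flips the coupling sign
x1 = shared + 0.3 * rng.standard_normal(n_trials)
x2 = sign * shared + 0.3 * rng.standard_normal(n_trials)

# Single-unit means are identical across beliefs; only the unit-to-unit
# coupling carries the latent variable. Decode from the co-fluctuation sign.
acc = ((x1 * x2 > 0).astype(int) == s).mean()

# Destroy the statistical dependency: permute one unit across trials,
# preserving its marginal statistics exactly.
x2_shuffled = x2[rng.permutation(n_trials)]
acc_shuffled = ((x1 * x2_shuffled > 0).astype(int) == s).mean()

print(acc, acc_shuffled)   # well above chance vs. ~0.5 after shuffling
```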
-
Abstract Sensory-guided behavior requires reliable encoding of stimulus information in neural populations, and flexible, task-specific readout. The former has been studied extensively, but the latter remains poorly understood. We introduce a theory for adaptive sensory processing based on functionally targeted stochastic modulation. We show that responses of neurons in area V1 of monkeys performing a visual discrimination task exhibit low-dimensional, rapidly fluctuating gain modulation, which is stronger in task-informative neurons and can be used to decode from neural activity after a few training trials, consistent with observed behavior. In a simulated hierarchical neural network model, such modulatory labels are learned quickly and can be used to adapt downstream readout, even after several intervening processing stages. Consistently, we find the modulatory signal estimated in V1 is also present in the activity of simultaneously recorded MT units, and is again strongest in task-informative neurons. These results support the idea that co-modulation facilitates task-adaptive hierarchical information routing.
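A hedged sketch of how a shared, rapidly fluctuating gain signal might be recovered from population activity and compared against task-informativeness. It uses ground-truth tuning to remove stimulus drive and a plain SVD as the factor estimator, both simplifications relative to the paper's analysis; all sizes and modulation strengths are invented.

```python
import numpy as np

rng = np.random.default_rng(4)

n_neurons, n_trials = 60, 1000
tuning = rng.standard_normal(n_neurons)          # stimulus selectivity per neuron
informative = np.abs(tuning)                     # task-informativeness proxy
stim = rng.integers(0, 2, n_trials) * 2 - 1      # +/-1 stimulus per trial
g = rng.standard_normal(n_trials)                # shared, fast-fluctuating modulator

# Gain fluctuations are stronger in task-informative neurons (per the abstract).
mod_strength = 0.5 * informative
rates = (np.outer(stim, tuning)
         + np.outer(g, mod_strength)
         + 0.3 * rng.standard_normal((n_trials, n_neurons)))

# Estimate the modulator: first principal component of stimulus residuals.
# (Here we can subtract the true stimulus drive; real analyses must fit it.)
resid = rates - np.outer(stim, tuning)
_, _, vt = np.linalg.svd(resid - resid.mean(0), full_matrices=False)
loadings = np.abs(vt[0])

# Loadings should track informativeness if modulation targets useful neurons.
print(np.corrcoef(loadings, informative)[0, 1])  # close to 1 in this toy
```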
-
Chain of thought is a natural inference-time method for increasing the computational power of transformer-based large language models (LLMs), but comes at the cost of sequential decoding. Are there more efficient alternatives to expand a transformer's expressive power without adding parameters? We consider transformers with padding tokens as a form of parallelizable test-time compute. We show that averaging-hard-attention, masked-pre-norm transformers with polynomial padding recognize precisely the class FO-uniform TC^0 of extremely parallelizable problems. While the TC^0 upper bound was known, proving a matching lower bound had been elusive. Further, our novel analysis reveals the precise expanded power of padded transformers when coupled with another form of inference-time compute, namely dynamically increasing depth via looping. Our core technical contribution is to show how padding helps bring the notions of complete problems and reductions, which have been a cornerstone of classical complexity theory, to the formal study of transformers. Armed with this new tool, we prove that padded transformers with O(log^d n) looping on inputs of length n recognize exactly the class FO-uniform TC^d of moderately parallelizable problems. Thus, padding and looping together systematically expand transformers' expressive power: with polylogarithmic looping, polynomially padded transformers recognize precisely the class FO-uniform NC, the best that could be expected without losing parallelism (unless NC = P). Our results thus motivate further exploration of padding and looping as parallelizable alternatives to chain of thought for test-time compute.
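The two inference-time knobs are easy to picture in code. The sketch below uses a toy fixed-weight layer with ordinary softmax attention (the paper's results concern averaging-hard-attention, masked-pre-norm transformers, so this is a structural illustration only): padding appends blank tokens to widen the parallel computation without new parameters, and looping re-applies the same block to deepen it.

```python
import numpy as np

rng = np.random.default_rng(5)
d = 16

# Hypothetical fixed-weight block standing in for one transformer layer
# (softmax attention + ReLU MLP; masked pre-norm omitted for brevity).
W_q, W_k, W_v = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(3))
W_mlp = rng.standard_normal((d, d)) / np.sqrt(d)

def block(h):
    q, k, v = h @ W_q, h @ W_k, h @ W_v
    scores = q @ k.T / np.sqrt(d)
    attn = np.exp(scores - scores.max(-1, keepdims=True))
    attn /= attn.sum(-1, keepdims=True)
    h = h + attn @ v                     # attention with residual
    return h + np.maximum(h @ W_mlp, 0)  # MLP with residual

def run(tokens, n_pad, n_loops):
    """Inference-time compute knobs: width via padding, depth via looping."""
    pad = np.zeros((n_pad, d))           # blank padding tokens, no new parameters
    h = np.vstack([tokens, pad])
    for _ in range(n_loops):             # reuse the same block: dynamic depth
        h = block(h)
    return h[: len(tokens)]              # read answers off the original positions

x = rng.standard_normal((8, d))
# Polynomial padding plus polylog looping, echoing the FO-uniform NC regime.
out = run(x, n_pad=len(x) ** 2, n_loops=int(np.log2(len(x))) ** 2)
print(out.shape)
```

Unlike chain of thought, neither knob serializes decoding: the padded positions are processed in parallel, and the loop count, not the token count, sets the sequential depth.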
-
Language model (LM) agents are increasingly used as autonomous decision-makers which need to actively gather information to guide their decisions. A crucial cognitive skill for such agents is the efficient exploration and understanding of the causal structure of the world—key to robust, scientifically grounded reasoning. Yet, it remains unclear whether LMs possess this capability or exhibit systematic biases leading to erroneous conclusions. In this work, we examine LMs’ ability to explore and infer causal relationships, using the well-established Blicket Test paradigm from developmental psychology. We find that LMs reliably infer the common, intuitive disjunctive causal relationships but systematically struggle with the unusual, yet equally (or sometimes even more) evidenced conjunctive ones. This “disjunctive bias” persists across model families, sizes, and prompting strategies, and performance further declines as task complexity increases. Interestingly, an analogous bias appears in human adults, suggesting that LMs may have inherited deep-seated reasoning heuristics from their training data. To this end, we quantify similarities between LMs and humans, finding that LMs exhibit adult-like (but not child-like) inference profiles. Finally, we propose a test-time sampling method which explicitly samples and eliminates hypotheses about causal relationships from the LM. This scalable approach significantly reduces the disjunctive bias and moves LMs closer to the goal of scientific, causally rigorous reasoning.
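The proposed test-time method samples candidate causal hypotheses from the LM and eliminates those contradicted by evidence. A hedged, exhaustive-enumeration analogue over a three-object Blicket world is sketched below; the hypothesis space, trials, and rule forms are invented for illustration.

```python
from itertools import combinations

objects = ["A", "B", "C"]

# A hypothesis names which objects are blickets and whether the machine fires
# if ANY blicket is present (disjunctive) or only if ALL are (conjunctive).
def predicts(blickets, rule, on_machine):
    if rule == "OR":
        return bool(blickets & on_machine)
    return blickets <= on_machine and bool(blickets)

hypotheses = [(frozenset(c), rule)
              for r in range(1, len(objects) + 1)
              for c in combinations(objects, r)
              for rule in ("OR", "AND")]

# Observed trials: (objects placed on the machine, did it activate?).
# Consistent only with "A AND B" -- the conjunctive case LMs tend to miss.
trials = [({"A"}, False), ({"B"}, False), ({"A", "B"}, True), ({"A", "C"}, False)]

surviving = [h for h in hypotheses
             if all(predicts(h[0], h[1], frozenset(obs)) == outcome
                    for obs, outcome in trials)]
print(surviving)   # [(frozenset({'A', 'B'}), 'AND')]
```

Elimination treats disjunctive and conjunctive hypotheses symmetrically, which is exactly the property that counteracts the bias: a model that only ever entertains OR-type hypotheses can never land on the surviving conjunctive one.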
-
As a cornerstone in language modeling, tokenization involves segmenting text inputs into pre-defined atomic units. Conventional statistical tokenizers often disrupt constituent boundaries within words, thereby corrupting semantic information. To address this drawback, we introduce morphological structure guidance to tokenization and propose a deep model to induce character-level structures of words. Specifically, the deep model jointly encodes internal structures and representations of words, with a mechanism designed to ensure the indecomposability of morphemes. By training the model with self-supervised objectives, our method is capable of inducing character-level structures that align with morphological rules without annotated training data. Based on the induced structures, our algorithm tokenizes words through vocabulary matching in a top-down manner. Empirical results indicate that the proposed method effectively retains complete morphemes and outperforms widely adopted methods such as BPE and WordPiece on both morphological segmentation tasks and language modeling tasks.
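A hedged sketch of top-down tokenization by vocabulary matching: keep a segment whole when it is a known unit, otherwise split and recurse. The real method descends an induced character-level tree; here a greedy search over split points and a toy vocabulary stand in for that structure.

```python
# Toy top-down tokenizer: if a word (or segment) is in the vocabulary, emit it
# whole; otherwise split and recurse. The paper derives split points from an
# induced character-level tree; greedy search over splits is a stand-in.
vocab = {"un", "break", "able", "breakable", "token", "ize", "r"}

def tokenize(seg):
    if seg in vocab:
        return [seg]                       # complete morpheme: do not decompose
    if len(seg) == 1:
        return [seg]                       # fall back to single characters
    # Choose the split whose halves tokenize into the fewest pieces.
    best = None
    for i in range(1, len(seg)):
        cand = tokenize(seg[:i]) + tokenize(seg[i:])
        if best is None or len(cand) < len(best):
            best = cand
    return best

print(tokenize("unbreakable"))   # ['un', 'breakable']
print(tokenize("tokenizer"))     # ['token', 'ize', 'r']
```

Because whole-vocabulary matches are taken before any split is considered, complete morphemes are never cut apart, in contrast to bottom-up merge schemes like BPE that can fuse across morpheme boundaries.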