
Title: Scalable Biologically-Aware Skeleton Generation for Connectomic Volumes
As connectomic datasets exceed hundreds of terabytes in size, accurate and efficient skeleton generation of the label volumes has evolved into a critical component of the computation pipeline used for analysis, evaluation, visualization, and error correction. We propose a novel topological thinning strategy that uses biological constraints to produce accurate centerlines from segmented neuronal volumes while still maintaining biologically relevant properties. Current methods are either agnostic to the underlying biology, have non-linear running times as a function of the number of input voxels, or both. First, we eliminate from the input segmentation biologically-infeasible bubbles, pockets of voxels incorrectly labeled within a neuron, to improve segmentation accuracy, allow for more accurate centerlines, and increase processing speed. Next, a Convolutional Neural Network (CNN) detects cell bodies from the input segmentation, allowing us to anchor our skeletons to the somata. Lastly, a synapse-aware topological thinning approach produces expressive skeletons for each neuron with a nearly one-to-one correspondence between endpoints and synapses. We simultaneously estimate geometric properties of neurite width and geodesic distance between synapse and cell body, improving accuracy by 47.5% and 62.8% over baseline methods. We separate the skeletonization process into a series of computation steps, leveraging data-parallel strategies to increase throughput significantly. We demonstrate our results on over 1250 neurons and neuron fragments from three different species, processing over one million voxels per second per CPU with linear scalability.
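The bubble-removal step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a bubble is any pocket of voxels that carries a different label but is completely enclosed by one neuron under 6-connectivity, and it uses plain nested lists rather than a real volume format. The function name `fill_bubbles` and the `NEIGHBORS` table are hypothetical names chosen for this example.

```python
from collections import deque

# 6-connected neighbourhood: face-adjacent voxels only (an assumption of this sketch).
NEIGHBORS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def fill_bubbles(volume, label):
    """Return (filled_volume, n_filled) for one neuron `label`.

    `volume` is a nested list indexed as volume[z][y][x]. Voxels that do not
    carry `label` and cannot be reached from the volume border without
    crossing `label` are treated as bubbles and relabeled.
    """
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    reached = set()
    queue = deque()

    # Seed the flood fill with every non-label voxel on the volume border.
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                on_border = (z in (0, nz - 1) or y in (0, ny - 1)
                             or x in (0, nx - 1))
                if on_border and volume[z][y][x] != label:
                    reached.add((z, y, x))
                    queue.append((z, y, x))

    # Breadth-first flood fill through voxels that are not part of the neuron.
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in NEIGHBORS:
            n = (z + dz, y + dy, x + dx)
            inside = 0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx
            if inside and n not in reached and volume[n[0]][n[1]][n[2]] != label:
                reached.add(n)
                queue.append(n)

    # Any non-label voxel the fill never reached is enclosed by the neuron.
    filled = [[row[:] for row in plane] for plane in volume]
    n_filled = 0
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                if volume[z][y][x] != label and (z, y, x) not in reached:
                    filled[z][y][x] = label
                    n_filled += 1
    return filled, n_filled


# Usage: a 3x3x3 block of neuron 7 with one mislabeled voxel at its centre.
block = [[[7] * 3 for _ in range(3)] for _ in range(3)]
block[1][1][1] = 0
repaired, n_filled = fill_bubbles(block, 7)
```

Each voxel is visited a constant number of times, so this sketch runs in time linear in the number of voxels, in line with the linear scaling the abstract emphasizes. A production pipeline would work on array blocks rather than Python lists.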
Journal Name: IEEE Transactions on Medical Imaging
Sponsoring Org: National Science Foundation
More Like this
  1. Reconstructed terabyte and petabyte electron microscopy image volumes contain fully-segmented neurons at resolutions fine enough to identify every synaptic connection. After manual or automatic reconstruction, neuroscientists want to extract wiring diagrams and connectivity information to analyze the data at a higher level. Despite significant advances in image acquisition, neuron segmentation, and synapse detection techniques, the extracted wiring diagrams are still quite coarse, and often do not take into account the wealth of information in the densely reconstructed volumes. We propose a synapse-aware skeleton generation strategy to transform the reconstructed volumes into an information-rich yet abstract format on which neuroscientists can perform biological analysis and run simulations. Our method extends existing topological thinning strategies and guarantees a one-to-one correspondence between skeleton endpoints and synapses while simultaneously generating vital geometric statistics on the neuronal processes. We demonstrate our results on three large-scale connectomic datasets and compare against current state-of-the-art skeletonization algorithms.
  2. Following significant advances in image acquisition, synapse detection, and neuronal segmentation in connectomics, researchers have extracted an increasingly diverse set of wiring diagrams from brain tissue. Neuroscientists frequently represent these wiring diagrams as graphs with nodes corresponding to a single neuron and edges indicating synaptic connectivity. The edges can contain “colors” or “labels”, indicating excitatory versus inhibitory connections, among other things. By representing the wiring diagram as a graph, we can begin to identify motifs, the frequently occurring subgraphs that correspond to specific biological functions. Most analyses on these wiring diagrams have focused on hypothesized motifs, those we expect to find. However, one of the goals of connectomics is to identify biologically-significant motifs that we did not previously hypothesize. To identify these structures, we need large-scale subgraph enumeration to find the frequencies of all unique motifs. Exact subgraph enumeration is a computationally expensive task, particularly in the edge-dense wiring diagrams. Furthermore, most existing methods do not differentiate between types of edges, which can significantly affect the function of a motif. We propose a parallel, general-purpose subgraph enumeration strategy to count motifs in the connectome. Next, we introduce a divide-and-conquer community-based subgraph enumeration strategy that allows for enumeration per brain region. Lastly, we allow for differentiation of edges by types to better reflect the underlying biological properties of the graph. We demonstrate our results on eleven connectomes and publish for future analyses extensive overviews for the 26 trillion subgraphs enumerated that required approximately 9.25 years of computation time.
  3. We present a simple, yet effective, auxiliary learning task for the problem of neuron segmentation in electron microscopy volumes. The auxiliary task consists of the prediction of Local Shape Descriptors (LSDs), which we combine with conventional voxel-wise direct neighbor affinities for neuron boundary detection. The shape descriptors are designed to capture local statistics about the neuron to be segmented, such as diameter, elongation, and direction. In a large study comparing several existing methods across various specimens, imaging techniques, and resolutions, we find that auxiliary learning of LSDs consistently increases segmentation accuracy of affinity-based methods over a range of metrics. Furthermore, the addition of LSDs promotes affinity-based segmentation methods to be on par with the current state of the art for neuron segmentation (Flood-Filling Networks, FFN), while being two orders of magnitude more efficient, a critical requirement for the processing of future petabyte-sized datasets. Implementations of the new auxiliary learning task, network architectures, training, prediction, and evaluation code, as well as the datasets used in this study, are publicly available as a benchmark for future method contributions.

  4. Combining finite element methods for the incompressible Stokes equations with particle-in-cell methods is an important technique in computational geodynamics that has been widely applied in mantle convection, lithosphere dynamics and crustal-scale modelling. In these applications, particles are used to transport properties of the medium such as the temperature, chemical compositions or other material properties; the particle methods are therefore used to reduce the advection equation to an ordinary differential equation for each particle, resulting in a problem that is simpler to solve than the original equation, for which stabilization techniques are necessary to avoid oscillations.

    On the other hand, replacing field-based descriptions by quantities only defined at the locations of particles introduces numerical errors. These errors have previously been investigated, but a complete understanding from both the theoretical and practical sides was so far lacking. In addition, we are not aware of systematic guidance regarding the question of how many particles one needs to choose per mesh cell to achieve a certain accuracy.

    In this paper we modify two existing instantaneous benchmarks and present two new analytic benchmarks for time-dependent incompressible Stokes flow in order to compare the convergence rate and accuracy of various combinations of finite elements, particle advection and particle interpolation methods. Using these benchmarks, we find that in order to retain the optimal accuracy of the finite element formulation, one needs to use a sufficiently accurate particle interpolation algorithm. Additionally, we observe and explain that for our higher-order finite-element methods it is necessary to increase the number of particles per cell as the mesh resolution increases (i.e. as the grid cell size decreases) to avoid a reduction in convergence order.

    Our methods and results allow designing new particle-in-cell methods with specific convergence rates, and also provide guidance for the choice of common building blocks and parameters such as the number of particles per cell. In addition, our new time-dependent benchmark provides a simple test that can be used to compare different implementations and algorithms, and to assess new numerical methods for particle interpolation and advection. We provide a reference implementation of this benchmark in ASPECT (the ‘Advanced Solver for Problems in Earth’s ConvecTion’), an open source code for geodynamic modelling.
  5. Detecting synaptic connections using large-scale extracellular spike recordings presents a statistical challenge. Although previous methods often treat the detection of each putative connection as a separate hypothesis test, here we develop a modeling approach that infers synaptic connections while incorporating circuit properties learned from the whole network. We use an extension of the generalized linear model framework to describe the cross-correlograms between pairs of neurons and separate correlograms into two parts: a slowly varying effect due to background fluctuations and a fast, transient effect due to the synapse. We then use the observations from all putative connections in the recording to estimate two network properties: the presynaptic neuron type (excitatory or inhibitory) and the relationship between synaptic latency and distance between neurons. Constraining the presynaptic neuron’s type, synaptic latencies, and time constants improves synapse detection. In data from simulated networks, this model outperforms two previously developed synapse detection methods, especially on the weak connections. We also apply our model to in vitro multielectrode array recordings from the mouse somatosensory cortex. Here, our model automatically recovers plausible connections from hundreds of neurons, and the properties of the putative connections are largely consistent with previous research. NEW & NOTEWORTHY Detecting synaptic connections using large-scale extracellular spike recordings is a difficult statistical problem. Here, we develop an extension of a generalized linear model that explicitly separates fast synaptic effects and slow background fluctuations in cross-correlograms between pairs of neurons while incorporating circuit properties learned from the whole network. This model outperforms two previously developed synapse detection methods in the simulated networks and recovers plausible connections from hundreds of neurons in in vitro multielectrode array data.
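The cross-correlogram that item 5 models can be illustrated with a minimal sketch. It only builds the raw histogram of spike-time lags between two neurons, which is the input such a model would describe, not the generalized linear model itself. The function name `cross_correlogram`, its parameters, and the half-open bin convention are assumptions made for this example, not details taken from that paper.

```python
import math
from bisect import bisect_left, bisect_right


def cross_correlogram(pre, post, window=5, bin_width=1):
    """Histogram of lags (post - pre) within +/- `window`.

    `pre` and `post` are spike times in the same unit as `window` and
    `bin_width` (milliseconds in the usage below). Bin i covers lags in
    [-window + i * bin_width, -window + (i + 1) * bin_width).
    """
    n_bins = int(round(2 * window / bin_width))
    counts = [0] * n_bins
    post = sorted(post)
    for t in pre:
        # Restrict the inner loop to postsynaptic spikes near this spike.
        lo = bisect_left(post, t - window)
        hi = bisect_right(post, t + window)
        for s in post[lo:hi]:
            idx = math.floor((s - t + window) / bin_width)
            if 0 <= idx < n_bins:  # a lag of exactly +window falls outside
                counts[idx] += 1
    return counts


# Usage: the second neuron fires 2 ms after the first on every spike,
# so all counts land in the bin covering lags [2, 3) ms.
counts = cross_correlogram([10, 20, 30], [12, 22, 32], window=5, bin_width=1)
```

A sharp peak at a short positive lag, as in this toy input, is the kind of fast transient effect that the abstract attributes to a synapse. Slow background fluctuations would instead appear as a broad component across many bins.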