Plan emerges to deal with deluge of data from brain research
Brain research initiatives around the world are generating a tsunami of neuroscience data, yet until now there has been no coherent strategy for managing, sharing, and analyzing it.
Now, an international team of researchers led by Lawrence Berkeley National Laboratory has developed a plan for managing, sharing, and analyzing neuroscience data to better understand how brains work.
Kristofer Bouchard, a computational neuroscientist at Berkeley Lab, assembled a team of mathematicians, computer scientists, and physicists to tackle the “grand challenge” problems posed by the deluge of data generated by neuroscience initiatives such as the Obama administration’s Brain Research through Advancing Innovative Neurotechnologies (BRAIN) Initiative.
While many of these national and private initiatives are developing new tools to let researchers explore how the brain processes, uses, stores, and retrieves information, Bouchard says little attention has been paid to the computing challenges posed by the vast amounts of data these technologies create.
According to Bouchard and his research team, that’s where the computational power, memory and storage capabilities of high-performance computing can make a huge contribution by enabling exploratory analysis of massive datasets stored in standardized formats, hosted in open repositories, and integrated with simulations.
“High-performance computing (HPC) has revolutionized many scientific fields, in particular those studying systems of many heterogeneous elements governed by complex interaction across many spatiotemporal scales: brains fit this mold well,” they write this month in the journal Neuron.
“Scientific HPC has traditionally focused on large-scale simulations, and this is also true for neuroscience. However, there is a paradigm shift occurring within the HPC community toward harnessing this massive computing power to processing and analyzing experimental data,” the authors note. “The neuroscience community is not alone in the challenges of utilizing HPC—other scientific fields are being rapidly transformed through the application of HPC to process and analyze ever-increasing volumes of experimental data.”
To leverage the power of high-performance computing, Bouchard and his colleagues contend that:
- Persistent storage of large-scale neuroscience datasets from multiple brains in massive repositories is required to extract universal design principles and identify unique differences across individuals. HPC systems are uniquely positioned to enable this capability and should be utilized.
- Leveraging HPC resources for neuroscience “Grand Challenge Problems” requires significant investment in standardizing data and metadata, in developing and optimizing data preprocessing and analysis codes, and in enabling data-analysis frameworks (e.g., Spark) on HPC systems.
- Activities focused on enabling cloud computing for individual labs (e.g., “International Brain Station”) are important endeavors but do not address the challenges for which HPC will be required, and both should be pursued in parallel.
- International research and funding agreements, such as those used in high-energy physics, should ensure coordination across different national and private neuroscience initiatives.
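To make the second recommendation concrete, the kind of distributed analysis that frameworks like Spark bring to HPC follows a map-reduce pattern: an independent "map" step per recording channel, then a "reduce" step that aggregates across channels. The sketch below illustrates the pattern on synthetic spike-train data; all names, channel counts, and rates are hypothetical, and a thread pool stands in for the many nodes of an HPC system.

```python
"""Map-reduce sketch of distributed analysis of multichannel neural data.

Illustrative only: synthetic data, hypothetical parameters, and a thread
pool standing in for the distributed workers a framework like Spark
would manage on an HPC system.
"""
from concurrent.futures import ThreadPoolExecutor
import random

random.seed(0)

N_CHANNELS = 8     # hypothetical electrode count
DURATION_S = 10.0  # hypothetical recording length in seconds

# Synthetic spike trains: one sorted list of spike times per channel.
recordings = {
    ch: sorted(random.uniform(0, DURATION_S)
               for _ in range(random.randint(20, 80)))
    for ch in range(N_CHANNELS)
}

def firing_rate(spike_times):
    """Map step: reduce one channel's spike train to a firing rate (Hz)."""
    return len(spike_times) / DURATION_S

# Map: analyze every channel independently, in parallel across workers.
with ThreadPoolExecutor() as pool:
    rates = dict(zip(recordings, pool.map(firing_rate, recordings.values())))

# Reduce: aggregate per-channel results into a population-level summary.
mean_rate = sum(rates.values()) / len(rates)
print(f"analyzed {len(rates)} channels; population mean rate: {mean_rate:.2f} Hz")
```

On a real HPC deployment, the map step would run across compute nodes over data read from a standardized, shared repository, which is why the standardization and framework-enablement investments above go hand in hand.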
“Harnessing the power of HPC resources will require neuroscientists to work closely with computer scientists and will take time, so we recommend rapid and sustained investment in this endeavor now,” concludes Bouchard. “The insights generated from this effort will have high-payoff outcomes. They will support neuroscience efforts to reveal both the universal design features of a species' brain and help us understand what makes each individual unique.”