Barriers to sharing and accessing neuroimaging data have slowed progress in neuroscience, even as newer technologies offer more promise. Now, scientists from Stanford University are tackling those issues through a new way of organizing brain-imaging data that simplifies data analysis and helps researchers collaborate more effectively. They call it BIDS (Brain Imaging Data Structure).
The easier it becomes to analyze and organize data, said Russell Poldrack, a professor of psychology, the more easily that data can be shared among researchers, leading to more transparency and more progress in understanding the brain.
“We’ve been interested for a long time in finding ways to share data between groups. Sharing data is a good thing because it allows different research groups to reuse data and maximizes its potential,” Poldrack said.
Thousands of research MRI studies are performed every year, generating substantial amounts of data. However, there’s no consensus on how that data should be organized.
Conceivably, you could have two neuroscience researchers working side by side in the same lab analyzing the same MRI scans and recording the data differently. These labs also experience significant turnover, with doctoral students and postdoctoral scholars leaving for teaching and other research positions.
New researchers entering the lab may need to decipher data in a format they’re not accustomed to. The dilemma gets further complicated as new data analysis methods are being developed, providing even more ways to organize the data.
A. A total of 50 features are selected by the preprocessing steps. The features are ordered from highest median importance (the QI2) to lowest (the 5th percentile of the intensities within the GM mask). The boxplots represent the distribution of importances of a given feature across all trees in the ensemble. B. (Left) Four different examples of false negatives from the DS030 dataset. The red boxes indicate a ghosting artifact, present in more than 20% of the images. Only extreme cases where the ghost overlaps the cortical GM layer of the occipital lobes are presented. (Right) Two examples of false positives. Both are borderline cases that were rated as “doubtful”. Due to intra- and inter-rater variability, some data points with poorer overall quality are rated just “doubtful”. These images demonstrate the effects of noise in the quality labels. Credit: Oscar Esteban et al., PLOS
For example, Poldrack’s group is currently working on a project where participants undergo MRI scans to study their brain activity related to self-control. The data the team collects are images – up to 40 or 50 files – of the brain in various stages.
But transferring these files from the MRI scanner to a format the lab’s software program can read requires transforming the files – a process that has traditionally been idiosyncratic among different researchers.
Without a common standard, it becomes increasingly difficult for researchers to make the most of these valuable data sets. It would be as if thousands of U.S. Census takers gathering demographic information on Americans all over the country sent their survey results back in different languages.
Brain Imaging Data Structure
BIDS, the researchers say, solves that problem by providing a uniform standard.
“Basically, we constructed this language where all people collecting brain data understand each other,” said Chris Gorgolewski, co-director of the Stanford Center for Reproducible Neuroscience.
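As a rough illustration of what a uniform standard buys, BIDS file names are built from key-value pairs joined by underscores, followed by a suffix that names the kind of data. The sketch below is a simplified parser under that assumption, not the official validator; the function name `parse_bids_name` and the task label `stopsignal` are hypothetical, chosen only for illustration.

```python
def parse_bids_name(filename: str) -> dict:
    """Split a BIDS-style file name into its parts.

    Simplified assumption: the name looks like
    key-value_key-value_suffix.extension, for example
    sub-01_task-stopsignal_bold.nii.gz
    """
    # Everything after the first dot is treated as the extension.
    stem, _, ext = filename.partition(".")

    # The last underscore-separated chunk is the suffix (e.g. "bold");
    # the chunks before it are key-value pairs such as "sub-01".
    *pairs, suffix = stem.split("_")

    entities = {}
    for pair in pairs:
        key, sep, value = pair.partition("-")
        if not sep or not key or not value:
            raise ValueError(f"not a key-value pair: {pair!r}")
        entities[key] = value

    return {
        "entities": entities,
        "suffix": suffix,
        "extension": "." + ext if ext else "",
    }


# Hypothetical file name for one functional scan of one participant.
example = parse_bids_name("sub-01_task-stopsignal_bold.nii.gz")
print(example["entities"], example["suffix"], example["extension"])
```

Because every name follows the same pattern, a script written in one lab can locate the same kind of file in another lab’s dataset without a custom lookup table.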
BIDS is essentially a collection of related apps that help handle different aspects of data analysis and storage. Once a new app is tested and deployed, it resides in a cloud-based service, where other scientists can download it directly for their own use.
The group originally developed BIDS with support from the International Neuroinformatics Coordinating Facility, a global organization dedicated to promoting data sharing among neuroscientists. The Stanford Center for Reproducible Neuroscience has taken the lead in championing BIDS as the standard language for MRI data.
In addition to publishing research about BIDS, the center has also hosted two annual workshops, each bringing together about 30 researchers and developers from around the world to learn about and build these apps. The lab also received a $1.4 million grant last month from the National Institutes of Health BRAIN Initiative to further the development of BIDS.
Reproducibility And Transparency
The center’s researchers, including Gorgolewski and postdoctoral research fellow Oscar Esteban, have either built or facilitated the building of 22 BIDS apps. Their most recent innovation is the MRIQC tool (MRI Quality Control), which performs quality assessments and large-scale analysis of MRI data.
MRI analysis takes time, involves numerous steps and often requires external software. The BIDS apps, by contrast, are compatible with major operating systems and require minimal extra work from users. They are meant to be “plug and play,” Esteban said.
MRIQC generates one individual report per subject in the input folder and one group report covering all subjects. To visually assess MRI samples, the first step (1) is to open the group report. This report shows boxplots and strip-plots for each of the IQMs. Looking at the distributions, it is possible to find images of potentially low quality, as they generally appear as outliers in one or more strip-plots. For instance, in (2), hovering over a suspicious sample within the coefficient of joint variation (CJV) plot reveals the subject identifier (“sub-51296”). Clicking on that sample opens the individual report for that specific subject (3). Credit: Oscar Esteban et al., PLOS
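The screening step the caption describes, looking for samples that sit far from the rest of a metric’s distribution, can be approximated numerically. The sketch below is not MRIQC’s own code; it applies a common interquartile-range rule to made-up CJV values, and every number, every subject identifier other than “sub-51296”, and the function name `flag_outliers` are assumptions for illustration.

```python
from statistics import quantiles


def flag_outliers(values: dict, k: float = 1.5) -> list:
    """Return the subjects whose value lies outside the IQR fences.

    A value is flagged when it falls below Q1 - k*IQR or above
    Q3 + k*IQR, the same rule a boxplot uses to draw its whiskers.
    """
    q1, _, q3 = quantiles(values.values(), n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [subject for subject, v in values.items() if v < low or v > high]


# Hypothetical CJV values, one per subject (not taken from any real dataset).
cjv = {
    "sub-00001": 0.40,
    "sub-00002": 0.42,
    "sub-00003": 0.41,
    "sub-00004": 0.43,
    "sub-00005": 0.39,
    "sub-00006": 0.44,
    "sub-00007": 0.40,
    "sub-51296": 0.95,
}

print(flag_outliers(cjv))
```

A flagged subject is only a candidate for closer inspection; as the caption notes, the individual report is where the image itself gets checked.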
Poldrack readily admits that apps that help organize data sound “pretty boring.” He said he and his researchers sometimes see themselves as “plumbers” fixing infrastructure.
But fostering an open environment in which scholars around the world have access to critical data is worth the work.
“The bigger picture for us is transparency and reproducibility,” Poldrack said. “There are interesting scientific questions we want people to get at, questions about how our different psychological functions are related to each other. Part of what we want to do is to convince people to share their data when they run a study to do interesting science or reproduce the results.”
This work was supported by the Laura and John Arnold Foundation and the Swiss National Science Foundation.