Problem

Computational methods and tools have significantly advanced research on human neural networks. However, the large volume of data acquired by brain imaging techniques (e.g., fMRI) poses a serious computational challenge for voxel-wise processing. Standard functional connectivity (FC) analyses calculate a correlation matrix that captures the relationship between the timecourses of every pair of voxels. Unfortunately, this matrix can be enormous, which makes conventional processing methods impractical. For example, a standard 7-minute 3T fMRI scan can generate a timeseries of 210 time points for about 30,000 voxels in a single subject. Even if each correlation coefficient is stored as a 32-bit float and only half of the symmetric matrix is kept, one correlation matrix requires about 1.8 GB of storage (30K * 30K * 4 / 2 bytes). A matrix of this size puts heavy memory pressure on most standalone machines, and additional space is needed for intermediate results and for further processing based on the matrix, which can quickly exhaust the available memory.
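The storage estimate above can be reproduced with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the "half of the matrix" rule is the same approximation used in the text, and the pure-Python Pearson correlation is meant to show what a single matrix entry is, not how a real pipeline would compute it.

```python
import math


def timeseries_bytes(n_timepoints, n_voxels, bytes_per_value=4):
    """Storage for the raw voxel timeseries (time points x voxels)."""
    return n_timepoints * n_voxels * bytes_per_value


def correlation_matrix_bytes(n_voxels, bytes_per_value=4, symmetric=True):
    """Approximate storage for a voxel-by-voxel correlation matrix."""
    n_entries = n_voxels * n_voxels
    if symmetric:
        # Same approximation as in the text: keep about half of the entries.
        n_entries //= 2
    return n_entries * bytes_per_value


def pearson(x, y):
    """Pearson correlation of two equally long timecourses (one matrix entry)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


if __name__ == "__main__":
    # Input data for the scan described in the text: 210 time points, 30,000 voxels.
    print(f"timeseries: {timeseries_bytes(210, 30_000) / 1e6:.1f} MB")
    # The correlation matrix grows quadratically with the number of voxels.
    for n in (30_000, 60_000, 100_000):
        print(f"{n:>7} voxels: {correlation_matrix_bytes(n) / 1e9:.1f} GB")
```

The contrast is the point: the input timeseries for one subject is on the order of tens of megabytes, while the derived correlation matrix is on the order of gigabytes and grows with the square of the voxel count.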

Large data volumes have an even greater effect on more sophisticated network analysis approaches. Graph theory has been extensively applied to study the characteristics of region-of-interest (ROI) functional brain networks. A number of network measures, such as degree, strength, and betweenness centrality, are computed in this type of analysis. These measures are usually obtained through matrix manipulations, because an ROI network is generally represented as a connectivity matrix. Recently, interest in voxel-wise functional connectivity has grown. However, graph-theoretic analyses at the voxel level are far more computationally demanding because of the large number of voxels.
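As a minimal sketch of how such measures follow from a connectivity matrix, the Python snippet below computes node degree and strength for a small, made-up symmetric matrix. The function name, the example values, and the thresholding rule (keep connections whose weight exceeds a cutoff) are assumptions chosen for illustration; they are not prescribed by this proposal.

```python
def degree_and_strength(conn, threshold):
    """Return per-node degree and strength from a symmetric connectivity matrix.

    An edge i-j is counted when conn[i][j] exceeds the threshold (i != j).
    Degree is the number of such edges; strength is the sum of their weights.
    """
    n = len(conn)
    degree = [0] * n
    strength = [0.0] * n
    for i in range(n):
        for j in range(n):
            if i != j and conn[i][j] > threshold:
                degree[i] += 1
                strength[i] += conn[i][j]
    return degree, strength


# Hypothetical 4-node connectivity matrix (e.g., correlations between ROIs).
conn = [
    [1.0, 0.8, 0.1, 0.6],
    [0.8, 1.0, 0.5, 0.2],
    [0.1, 0.5, 1.0, 0.7],
    [0.6, 0.2, 0.7, 1.0],
]

degree, strength = degree_and_strength(conn, threshold=0.55)
print("degree:  ", degree)
print("strength:", [round(s, 2) for s in strength])
```

The double loop visits every pair of nodes, so the cost grows quadratically with the number of nodes. That is negligible for a few hundred ROIs but substantial for tens of thousands of voxels.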

These challenges in computing space and time motivate the use of high-performance computing (HPC). HPC is usually achieved with a cluster of dedicated machines that are interconnected through a high-speed network. In recent years, a new form of HPC has emerged as virtual machine and cloud computing technologies have matured. Within a cloud, a user may execute a computational job as large as one that runs on a large dedicated cluster. The advantages of conducting HPC in cloud environments are manifold. First, it opens HPC-dependent research to underdeveloped regions where resources are too limited to build and maintain a dedicated HPC site. Second, it gives researchers more flexibility in computing time, location, and resources, so that computation jobs can be tailored to each user’s needs. Based on these considerations, we believe that cloud-based high-performance computing will become popular and should be leveraged for network analyses of massive volumes of brain imaging data.

     


Sketch of Solution

Parallel software packages (e.g., PETSc and SLEPc) have been developed to reduce the complexity of HPC programming. These packages usually expose APIs that hide the low-level details of parallel programming and look like nonparallel APIs. However, their learning curve is still steep for HPC users, who are often domain experts rather than programmers. A neuroscientist who uses these packages to compute a specific measure of a voxel-wise FC network must first decompose the computation of that measure into basic matrix operations and then map those operations onto the APIs the packages provide. The scientist therefore needs both a thorough understanding of how the metric is computed and familiarity with the packages' matrix-operation APIs, which imposes a substantial overhead.
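To illustrate what such a decomposition involves, the sketch below expresses one common graph measure, the local clustering coefficient of a binary undirected network, purely in terms of matrix products: the number of closed triangles through node i is half of the i-th diagonal entry of A^3, so the coefficient is (A^3)_ii / (k_i (k_i - 1)), where k_i is the node's degree. The measure, the function names, and the plain-Python matrix product are illustrative assumptions; in a parallel package the two products would be replaced by that package's own matrix-matrix multiplication routine.

```python
def matmul(a, b):
    """Dense matrix product of two lists-of-lists (illustration only)."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def clustering_coefficients(adj):
    """Local clustering coefficient per node of a binary, undirected graph.

    Step 1: A^2 and A^3 via two matrix products.
    Step 2: node degree k_i as a row sum of A.
    Step 3: C_i = (A^3)_ii / (k_i * (k_i - 1)), defined as 0 when k_i < 2.
    """
    a3 = matmul(matmul(adj, adj), adj)
    coeffs = []
    for i, row in enumerate(adj):
        k = sum(row)
        coeffs.append(a3[i][i] / (k * (k - 1)) if k > 1 else 0.0)
    return coeffs


# Hypothetical 4-node graph: a triangle (0, 1, 2) with node 3 attached to node 2.
adj = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]

print([round(c, 3) for c in clustering_coefficients(adj)])
```

Even for this simple measure, the user has to know the algebraic identity, decide which intermediate products to form, and handle the degenerate case of low-degree nodes before any parallel API is called.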

In this project we propose to develop a cloud-based software package for supporting brain network analysis. The package is designed to harness the power of Amazon cloud computing for HPC (Amazon EC2 and S3) and ease the use of HPC significantly by providing high-level APIs that directly support common computing tasks in brain network research. Specifically, we will leverage the existing parallel software packages to develop APIs for

(1) computing network measures used in graph theoretical analyses of neural networks,

(2) supporting common processes (e.g., finding clusters) in graph theoretical analyses, and

(3) supporting a variety of voxel parcellation approaches.

We anticipate that this tool will significantly lower the barrier to using cloud-based HPC in neural network research. A neuroscientist will be able to obtain any metric of interest for a neural network of any size by writing simple code that calls only high-level APIs, while the details of parallel computation and communication remain almost entirely hidden.

Project Members

Dr. Mengjun Xie
Dr. Jiang Bian