Problem
Research on human neural networks has been significantly advanced by
computational methods and tools. However, the large volume of data acquired by
brain imaging techniques (e.g., fMRI) poses a serious challenge to the
computation in voxel-wise processing. Standard FC analyses calculate a
correlation matrix of the relationship between timecourses of every pair of
voxels. Unfortunately, the matrix can be gigantic, making conventional
processing methods problematic. For example, a standard 7-minute 3T fMRI scan
can generate a timeseries of 210 time points with about 30,000 voxels for a
single subject. Even if we just use 32-bit float data type to store one
correlation coefficient, we need at least 1.8 GB (30K * 30K * 4 / 2) storage for
one correlation matrix. The sheer size unavoidably brings high memory pressure
on most of standalone machines and we also need space for further processing
based on the matrix and for intermediate results, which can quickly eat up the
whole memory.
Voluminous data can have an even larger effect on more sophisticated network
analysis approaches. Graph theory has been extensively applied to study the
characteristics of ROI-wise functional brain networks. A number of network
measures such as degree, strength, and betweenness centrality are computed in
this type of analysis. These measures are usually obtained through matrix
manipulations as a ROI network in general is represented as a connectivity
matrix. Recently, research interests in voxel-wise functional connectivity have
been sparked. However, graph theoretic analyses at voxel level become much more
computationally difficult due to the large number of voxels.
These challenges in computing space and time prompt the use of high performance
computing (HPC). HPC is usually achieved by a cluster of dedicated computing
machines that are interconnected through a high speed network. In recent years,
a new form of HPC has emerged with the maturity of virtual machine and cloud
computing technologies. Within a cloud, a user may execute any kind of
computational job that may be as large as the one running on a large dedicated
cluster within a cloud. The advantages of conducting HPC in cloud environments
are multifold. First, it offers the opportunity of exposing HPC-dependent
research to underdeveloped regions where resources are too limited to build and
maintain a dedicated HPC site. Second, it provides more flexibility in computing
time, location, and resources for researchers so that computation jobs can be
tailored to each user’s needs. Based on these considerations, we believe that
cloud-based high performance computing will become popular and should be
leveraged for conducting network analyses with massive volumes of brain imaging
data.
