Optimal compressed sensing reconstructions of fMRI using 2D deterministic and stochastic sampling geometries

Background: Compressive sensing provides a promising framework for accelerating fMRI image acquisition by allowing reconstructions from a limited number of frequency-domain samples. Unfortunately, the majority of compressive sensing studies are based on stochastic sampling geometries that cannot guarantee the fast acquisitions needed for fMRI. The purpose of this study is to provide a comprehensive optimization framework that can be used to determine the optimal 2D stochastic or deterministic sampling geometry, as well as optimal reconstruction parameter values that guarantee image quality in the reconstructed images.
Methods: We investigate the use of frequency-space (k-space) sampling based on: (i) 2D deterministic geometries of dyadic phase encoding (DPE) and spiral low pass (SLP) sampling, and (ii) 2D stochastic geometries based on random phase encoding (RPE) and random samples drawn from a probability density function (RSP). Overall, we consider over 36 frequency-sampling geometries at different sampling rates. For each geometry, we compute optimal reconstructions of single BOLD fMRI ON and OFF images, as well as BOLD fMRI activity maps based on the difference between the ON and OFF images. We also provide an optimization framework for determining the optimal parameters and sampling geometry prior to scanning.
Results: For each geometry, we show that reconstruction parameter optimization converged after just a few iterations, and that parameter optimization led to significant image quality improvements. For activity detection, retaining only 20.3% of the samples using SLP gave a mean PSNR value of 57.58 dB. We also validated this result with the structural similarity (SSIM) index image quality metric, which gave an excellent mean value of 0.9747 (max = 1), indicating that excellent reconstruction results can be achieved. Median parameter values also gave excellent reconstruction results for the ON/OFF images using the SLP sampling geometry (mean SSIM ≥ 0.93). Here, the median parameter values were obtained using mean-SSIM optimization, and the approach was also validated using leave-one-out.
Conclusions: We have found that compressive sensing parameter optimization can dramatically improve fMRI image reconstruction quality. Furthermore, 2D MRI scanning based on the SLP geometries consistently gave the best image reconstruction results. The implication of this result is that less complex sampling geometries will suffice over random sampling. We have also found that we can obtain stable parameter regions that can be used to achieve specific levels of image reconstruction quality when combined with specific k-space sampling geometries. Furthermore, median parameter values can be used to obtain excellent reconstruction results.


Keywords: Compressive sensing, MRI, fMRI, Numerical optimization

Background
Using compressive sensing (CS), we can recover certain noise-free signals and images exactly from a limited number of k-space samples. In theory, to reconstruct images from a limited number of samples, we require that the signal exhibits sparsity and that there is incoherence (lack of correlation) between the sensing basis and the transform basis used for reconstruction. For one-dimensional signals with N samples that are composed of T "spikes", perfect reconstruction can be obtained from O(T ⋅ log(N)) frequency-domain samples [1].
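As a toy illustration of this recovery principle (and not the reconstruction code used in this paper), the following sketch recovers a 3-spike signal from 48 randomly chosen rows of the unitary DFT via iterative soft-thresholding; the signal length, spike amplitudes, threshold, and iteration count are all illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, M = 128, 3, 48                      # signal length, spikes, Fourier samples

# T-sparse "spike" signal
x_true = np.zeros(N)
support = rng.choice(N, size=T, replace=False)
x_true[support] = [3.0, -2.0, 1.5]

# M randomly chosen rows of the unitary DFT act as the sensing operator
F = np.fft.fft(np.eye(N), norm="ortho")
rows = rng.choice(N, size=M, replace=False)
A = F[rows, :]
y = A @ x_true                            # observed frequency-domain samples

# iterative soft-thresholding (ISTA) for min 0.5*||Ax - y||^2 + lam*||x||_1;
# step size 1 is valid because ||A^H A|| <= 1 for rows of a unitary DFT
lam = 0.02
x = np.zeros(N, dtype=complex)
for _ in range(1500):
    g = x + A.conj().T @ (y - A @ x)      # gradient step on the data term
    mag = np.abs(g)
    x = g * np.maximum(1 - lam / np.maximum(mag, 1e-12), 0.0)  # soft threshold

recovered_support = set(int(i) for i in np.argsort(np.abs(x))[-T:])
```

With noise-free data and this sampling ratio, the three spike locations are recovered exactly, consistent with the O(T ⋅ log(N)) sample requirement.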
When the required conditions are met, perfect reconstruction is possible from a limited number of samples. For example, for piecewise constant signals, very impressive results have been obtained from a very limited number of Fourier samples. Unfortunately, such idealized models may not fit more complex, non-piecewise smooth images, such as standard magnetic resonance imaging (MRI) images [2]. To demonstrate the problem, we present a typical functional MRI (fMRI) slice in Figure 1, which depicts an fMRI data set containing a brain slice of a patient "at rest" (OFF) and of the same patient while performing a prescribed task or activity (ON). The k-space data is also depicted, revealing that the energy in k-space is concentrated around the center. The difference image of the ON and OFF images, depicting the region of activity within the brain, is shown in Figure 1(d).
In what follows, we demonstrate a reconstruction paradigm that explores several k-space sampling geometries. We consider both stochastic and deterministic Fourier-plane sampling geometries. A limitation of our study is that it is based on retrospectively downsampled data from an originally fully-sampled EPI sequence. Thus, for practical application of our findings in in-vivo experiments, the EPI sampling pattern will have to be modified to reflect the proposed downsampling patterns. Here, we note that the use of stochastic geometries that result in non-consecutive EPI phase encoding is prone to artifacts. This should not pose problems for the SLP geometry, which is found to perform best in our experiments. On the other hand, this observation can lead to reconstruction artifacts in stochastic geometries if they are implemented by moving around k-space randomly. We also note that the actual MRI imaging time depends on how many phase-encoding lines are acquired. Random geometries are included to comply with the conventional (sparsity and incoherence) compressive sensing requirements.
To measure the quality of the reconstructions, we use an optimization process that solves for the compressive sensing objective function parameters that maximize the peak signal-to-noise ratio (PSNR) or the structural similarity (SSIM) index obtained by each geometry. Here, we note that the SSIM index [3] provides a more recent (and arguably far more reliable) image reconstruction quality metric.
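For reference, both metrics can be computed as sketched below; this is a simplified illustration in which the mean SSIM is taken over non-overlapping 8 × 8 uniform windows, whereas the original SSIM definition of [3] uses an 11 × 11 Gaussian window:

```python
import numpy as np

def psnr(ref, img, peak=255.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((ref - img) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def mean_ssim(x, y, w=8, C1=(0.01 * 255) ** 2, C2=(0.03 * 255) ** 2):
    """Mean SSIM over non-overlapping w-by-w uniform windows
    (a simplification of the Gaussian-windowed SSIM of [3])."""
    scores = []
    H, W = x.shape
    for i in range(0, H - w + 1, w):
        for j in range(0, W - w + 1, w):
            a, b = x[i:i+w, j:j+w], y[i:i+w, j:j+w]
            mu_a, mu_b = a.mean(), b.mean()
            va, vb = a.var(), b.var()
            cov = ((a - mu_a) * (b - mu_b)).mean()
            s = ((2 * mu_a * mu_b + C1) * (2 * cov + C2)) / \
                ((mu_a ** 2 + mu_b ** 2 + C1) * (va + vb + C2))
            scores.append(s)
    return float(np.mean(scores))

img = np.random.default_rng(0).uniform(0, 255, (64, 64))
```

An identical image pair yields a mean SSIM of 1, and a uniform error of 1 intensity level on a 255-peak scale yields a PSNR of about 48 dB.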
Given that our approach is based on numerical simulations, we avoid making claims about actual acquisition times. We do, however, discuss how the most successful geometry presented here can be effectively implemented based on [4]. In [4], the authors describe a fast fMRI imaging method based on a spiral sampling geometry. As it turns out, the 2D spiral low pass (SLP) sampling geometry that performs best among all sampling geometries in this paper represents a 2D version, in the k_x k_y plane, of the 3D k_x k_y k_z spiral sampling geometry of [4] (see Table 1 for SLP plots). This is clearly demonstrated in Figure 1 of [4]. We note that we perform 2D reconstructions on a single fMRI slice, as described in the section titled "Data set and fMRI activity detection" in the "Methods" section. We also study parameter optimization for the best reconstructed image quality. We investigate the estimation of optimal parameter regions that can provide high-quality reconstructions (while avoiding outliers), as well as the relative impact of each parameter and the regularity of the optimal region. Therefore, we provide an optimization framework to help determine scanning parameters and scanning geometries that should work for the majority of cases. In addition to parameter ranges, we also provide specific parameter values based on the median of all optimal parameter values. This approach is validated using leave-one-out for optimization of the average SSIM index. Alternatively, an optimal regularization parameter can be determined by the L-curve method [5]. Here, the "L" refers to the shape of the curve obtained by plotting the solution norm versus the corresponding residual norm. The optimal regularization parameter is then determined by the corner of the "L"-shaped plot, which is the point that minimizes the solution norm while also minimizing the residual norm.
In this way, the L-curve approach avoids producing a regularization parameter that overly depends on the noise in the image [5].
We note that our parameter optimization approach operates directly on the mean structural similarity (SSIM) index [3] and the PSNR. Since the PSNR is related to the residual error, it appears that the L-curve method could also be applied to find robust estimates of the regularization parameters. However, the emphasis of this paper is on finding the optimal parameters with respect to the mean SSIM [3]. We refer to [6] for a description of the problems associated with the use of the mean-squared error for digital images. In contrast, the mean SSIM provides a robust metric that better correlates with human visual perception of the images being examined.
Our approach finds the optimal regularization parameters that maximize the mean SSIM over a training set of images. Here, we note that each image will produce a different set of optimal parameters, depending on the noise variance and on how the sampling geometry captures the structure of the activity region. In the section titled "A Fourier model for brain activity," we provide a frequency-domain model for brain activity. There, we show how brain activity favors the 2D spiral low pass (SLP) geometry, in the sense that it captures most of the image energy within the effective bandwidth while also allowing TV-norm optimization to reconstruct the activity region.
Estimates of the optimal regularization parameters are determined by taking the median of the optimal values estimated over the training set. The effectiveness of this approach is then evaluated over an independent testing set. As we describe in the Results section, the approach works very well, giving mean SSIM values above 0.93 for the SLP geometry. Furthermore, note that the median is a robust, non-parametric statistic that avoids outliers [7]; it provides robust estimates of the regularization parameters that maximize the mean SSIM of the reconstructed image. Here, we note the conflicting goals: the best reconstructions require the largest number of k-space samples, so we are interested in determining the minimum acceptable quality that still yields acceptable reconstructions with the minimum number of required frequency-domain samples. To this end, we determine effective PSNR ranges and associate them with different image reconstruction qualities. For the average SSIM, we consider fixed values (SSIM > 0.75 for good results, SSIM > 0.90 for excellent results).
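The pooling of per-image optima by the median, together with its leave-one-out validation, can be sketched as follows; the parameter values and the `evaluate` callback are hypothetical placeholders for the per-image optimization and SSIM scoring described above:

```python
import numpy as np

# hypothetical per-image optimal parameters (alpha, beta) from a training set
opt_params = np.array([
    [0.002, 0.010],
    [0.004, 0.008],
    [0.003, 0.050],    # an outlier in beta that the median ignores
    [0.002, 0.012],
    [0.005, 0.009],
])

# pooled estimate: componentwise median across the training images
alpha_med, beta_med = np.median(opt_params, axis=0)

def leave_one_out(params, evaluate):
    """For each held-out image i, pool the remaining optima by the median
    and score image i with `evaluate` (e.g., the mean SSIM of its
    reconstruction under the pooled parameters)."""
    scores = []
    for i in range(len(params)):
        rest = np.delete(params, i, axis=0)
        a, b = np.median(rest, axis=0)
        scores.append(evaluate(i, a, b))
    return scores

scores = leave_one_out(opt_params, lambda i, a, b: a + b)  # dummy scorer
```

Note how the outlier beta value of 0.050 does not perturb the pooled estimate, which is exactly the robustness property motivating the median here.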
We limit our study to optimizing for the best MRI image reconstructions, and for the best quality activity maps based on the difference image between the "ON" and "OFF" MRI images. For detecting activity regions, we focus on the parameters that lead to highest possible quality in the difference images (activation versus no-activation images). Here, we will avoid issues associated with post-processing the difference images to better delineate between active and inactive regions of the brain. Clearly though, when the difference image is of the highest possible quality, artifact removal will simply improve upon our results. Furthermore, for very large average SSIM values (SSIM > 0.93), reconstruction quality should be considered to be excellent. For the use of post-processing methods, we refer to the body of work described in [8][9][10][11][12][13][14][15], which utilize a wide variety of signal detection, correlation analysis, and statistical tests to infer brain activity in fMRI difference images.

Related Work
CS methods have been developed that result in improved spatial resolution from reduced k-space sampling. The work presented here falls into this category of k-space undersampling reconstruction methods and is closely related to that of Lustig, Donoho, and Pauly [16], who provide a general framework for k-space sampling and reconstruction using CS theory. While the sampling geometries presented here are designed to satisfy the CS criteria, random and pseudo-random sampling of k-space can result in slower acquisition due to scanner programming constraints.
Most of the subsequent literature on CS applications to MRI explores the merging of CS theory with other fast MRI acquisition techniques, unique k-space sampling methods, and novel reconstruction algorithms. Gamper, Boesiger, and Kozerke meet the sparsity condition of CS theory by applying the Fourier transform along the temporal dimension, assuming that some regions of the field-of-view change at a high temporal rate while other parts remain stationary or change slowly. Their methods show the effectiveness of CS reconstruction for accelerated dynamic (continuous-sampling) MRI by comparison to k-t BLAST reconstructions over the same data sets [17]. Their sampling scheme can be described as randomly skipping phase-encoding lines in each dynamic frame. Jung et al. developed k-t FOCUSS to provide a general framework beyond k-t SENSE and k-t BLAST for model-based dynamic MRI, by applying CS theory to randomly sampled reconstruction of prediction and residual encodings that are significantly sparse [18].
Recent studies have focused on extending the work in [16] to non-Fourier bases. Haldar, Hernando, and Liang use tailored spatially selective RF pulses to better satisfy the incoherence requirement of CS theory [19]. Liu, Zou, and Ying apply CS theory to parallel reconstruction using sensitivity encoding; their extension of SENSE to CS is based on a reconstruction method using Bregman iteration [20]. Trzasko, Manduca, and Borisch present a CS method that minimizes the L0 semi-norm using redescending M-estimator functions instead of the L1 norms typically found in the CS literature [21]. The extension of the sparsity measure to a multi-scale form allows for rapid reconstructions compared to the non-trivial solutions described in [17][18][19][20][21].
Methods that exploit the spatial and/or temporal redundancy of k-space data describe another class of k-space undersampling reconstructions. A recent study by Lindquist et al. describes methods for obtaining a 3D echo-volumar fMRI imaging sequence by sampling the small central portion of k-space at a high temporal rate [4]. The sampling trajectory is sampled successively across the fourth dimension in a 4D acquisition, instead of successively over each temporal slice, and is constrained to a spiral pattern. Other MRI processing techniques that utilize information across slices collected at successive time intervals can be found in [22][23][24][25][26]. Alternatively, the authors in [27] provide a simple iterative algorithm, termed deconvolution-interpolation gridding (DING), for reconstructing MRI images from arbitrarily sampled k-space. Compared to the compressive sensing methods discussed here, it is important to note that DING does not require regularization and can also work with k-space trajectories that are not known a priori. Thus, DING is also suitable for cases where patient motion occurs during the scan.

Contributions
The primary contributions of this paper can be summarized as follows: Comprehensive comparisons of 36 stochastic and deterministic frequency sampling geometries: While the theory of compressive sensing suggests the use of stochastic geometries, we show that superior results can be achieved with deterministic frequency sampling geometries. Furthermore, we investigate the use of different sampling rates in order to determine the minimum sampling density that can still provide acceptable reconstruction results. As a consequence of this research, we provide the optimal geometry that can lead to the fastest reconstruction time among all candidate sampling geometries. This represents a contribution over existing methods that are based on a limited number of mostly stochastic sampling geometries.
Optimal reconstruction parameter ranges for guaranteeing a minimum quality in the reconstructed images: Current compressive sensing methods do not provide sufficient details on how to set reconstruction parameters for different applications. As a result, in current methods, there is no guidance as to which parameter values will provide reconstructions of sufficient quality. Here, we first provide a direct-search optimization framework that is used to derive parameter ranges that can guarantee reconstructions of acceptable image quality. In particular, we provide parameter ranges for reconstructing BOLD fMRI ON & OFF images, as well as difference images that can be used for activity detection. Here, as evidence of the need for the proposed approach, we show that a completely different parameter range is needed for activity detection as opposed to reconstructing individual ON and OFF images. Furthermore, we found that the fast spiral low pass sampling geometry can reach PSNR levels above 40 dB after a few iterations. In addition, the spiral low pass sampling geometry gives superior results, with all reconstructions yielding mean SSIM values above 0.93.

Optimal TV Norm and Wavelet Transform Penalty
In what follows, we provide a summary of the proposed optimization method. As we describe below, the basic approach involves a nested optimization. Let z denote the original image reconstructed using the entire k-space, and let f denote the reconstructed image. Here, f is a function of the reconstruction parameters α, β (further explained below) and of the k-space sampling geometry, indexed by k. We seek the optimal reconstruction parameters and k-sampling geometry as given by

(α*, β*, k*) = arg max over (α, β, k) of A(f(α, β; k), z),   (1)

where A denotes the accuracy of the reconstructed image. Here, we consider two measures of accuracy: (i) the average SSIM index, and (ii) the PSNR. To solve (1), we need to solve the inner optimization problem of reconstructing f from a limited number of k-space samples.
Let F_k denote the k-th sampling geometry operator, and let y denote the k-space samples that are available. In our constrained optimization approach, the reconstructed signal needs to reproduce the samples as given by

‖F_k f − y‖_ℓ2 ≤ ε,   (2)

where ε denotes a small tolerance value, and ‖·‖_ℓ2 denotes the ℓ2 norm. In (2), F_k takes the 2D discrete Fourier transform of the input image and restricts it to the k-space samples available in the k-th sampling geometry. For reconstructing f, we follow the approach of [16,28] of seeking solutions that are sparse in the wavelet domain. This leads us to consider a wavelet reconstruction of the MRI image as given by

f = Ψx,   (3)

where x denotes the column vector of the wavelet coefficients and Ψ is the wavelet transform operator that contains the wavelet basis functions along its columns. For each reconstruction geometry, we seek to find the wavelet coefficients x that satisfy:

min over x of α ⋅ ‖x‖_1 + β ⋅ TV(Ψx),   (4)

such that:

‖F_k Ψx − y‖_ℓ2 ≤ ε.   (5)

In (4), TV(Ψx) denotes the total variation of the reconstructed signal. In (4)-(5), we have a constrained optimization problem in three parameters (α, β, ε). This is converted to an unconstrained optimization problem through the use of a Lagrange multiplier, as indicated in [16,28]. In what follows, we consider different values for α, β and use the software described in [16] to solve (4)-(5) for the optimal wavelet coefficients. We provide details for parameter optimization in a separate section.
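After the Lagrangian conversion, the resulting unconstrained objective can be evaluated as sketched below; here a one-level orthonormal Haar transform stands in for the wavelet operator Ψ, and anisotropic TV is used, both of which are simplifying assumptions rather than the paper's exact operators:

```python
import numpy as np

def haar2(im):
    """One-level orthonormal 2D Haar-type transform (even-sized input)."""
    a, b = im[0::2, :], im[1::2, :]
    rows = np.vstack([(a + b) / np.sqrt(2), (a - b) / np.sqrt(2)])
    a, b = rows[:, 0::2], rows[:, 1::2]
    return np.hstack([(a + b) / np.sqrt(2), (a - b) / np.sqrt(2)])

def tv(im):
    """Anisotropic total variation."""
    return np.abs(np.diff(im, axis=0)).sum() + np.abs(np.diff(im, axis=1)).sum()

def objective(f, mask, y, alpha, beta):
    """Unconstrained (Lagrangian) CS objective:
    ||mask * FFT(f) - y||_2^2 + alpha * ||Psi f||_1 + beta * TV(f)."""
    resid = mask * np.fft.fft2(f, norm="ortho") - y
    return (np.abs(resid) ** 2).sum() + alpha * np.abs(haar2(f)).sum() + beta * tv(f)

# toy example: a square phantom observed through a random k-space mask
f = np.zeros((16, 16)); f[4:12, 4:12] = 1.0
mask = np.random.default_rng(0).random((16, 16)) < 0.4
y = mask * np.fft.fft2(f, norm="ortho")
J = objective(f, mask, y, alpha=0.01, beta=0.01)
```

For the true image, the data-fidelity term vanishes and the objective reduces to the weighted sparsity and TV penalties; a reconstruction algorithm searches for the image minimizing this trade-off.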

Data set and fMRI activity detection
In blood oxygenation-level dependent (BOLD) fMRI, neural activity is detected by changes in the T2* relaxation time due to changes in blood oxygenation levels in response to local activation. All images were acquired on a 3 T Siemens TIM Trio system with a 12-channel radio frequency (RF) coil. The fMRI experiment used a standard Siemens gradient-echo EPI sequence modified so that it stored real and imaginary data separately. We used a field-of-view (FOV) = 240 mm, slice thickness = 3.5 mm, slice gap = 1 mm, number of slices = 32, matrix size = 64 × 64, TE = 29 ms, and TR = 2 s. The fMRI experiment used a block design with periods of 30 s OFF and 30 s ON. The subjects who participated in this study tapped the fingers of their right hand during the ON period. There were five and a half cycles, starting and ending with an OFF period.
The BOLD fMRI data were then preprocessed to account for motion artifacts and spatially normalized into the standard Montreal Neurological Institute space. The spatially normalized data were then sub-sampled to 3 × 3 × 4 mm, resulting in 53 × 63 × 46 voxels. An individual slice was then selected to ensure measurable regions of activity based on the task being performed by the test subjects.
Temporal smoothing tends to better localize detected activity across all temporal slices within a single run. Instead of utilizing the temporal information, however, our reconstructions are of two individual ON and OFF BOLD images. This less dense collection of temporal samples provides a worst-case scenario for detecting neural activity. We detected neural activity in fMRI by calculating the difference image of the reconstructed ON and OFF images, where the individual slices were reconstructed with the optimal parameters computed for the different sampling geometries. At this juncture, the neural activity detection problem becomes a segmentation problem. The pre-processed sum-of-squares imagery was used to generate the k-space data in this study by applying the 2D Fast Fourier Transform to the multi-coil data.
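A minimal sketch of this difference-image detection step follows, with a simple mean-plus-2σ threshold standing in for the segmentation stage (the threshold rule and the synthetic data are illustrative assumptions, not the paper's detection method):

```python
import numpy as np

def activity_map(on, off, k=2.0):
    """Difference image between reconstructed ON and OFF slices, followed
    by a simple mean + k*std threshold as a placeholder segmentation."""
    diff = on - off
    thr = diff.mean() + k * diff.std()
    return diff, diff > thr

rng = np.random.default_rng(1)
off = rng.normal(100.0, 1.0, (64, 64))   # synthetic OFF slice
on = off.copy()
on[20:28, 30:38] += 10.0                 # synthetic 8x8 activation blob
diff, act = activity_map(on, off)
```

On this synthetic pair, the detected region coincides exactly with the injected activation blob; with real reconstructions, the quality of `diff` is governed by the reconstruction accuracy, as discussed below.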
As such, we will not consider optimization over a variety of different activity detection algorithms. We do note, however, that activity detection algorithms that employ low-pass filtering will tend to favor zero-filling over interpolation by compressive sensing or any other method, because low-pass filtering attenuates the high-frequency components that are estimated by the interpolation/reconstruction method. Ultimately, any segmentation of BOLD fMRI images will be governed by the accuracy of the reconstructed images themselves. Thus, for considering activity detection, we present the reconstructed difference images for qualitative comparisons. A more practical implementation of our approach would be in studies involving structural MRI data, where the benefit of a reduced acquisition time provides relief from the anxiety and discomfort patients may experience within an MRI scanner.

A Fourier model for brain activity
In this section, we introduce an analytical model of the activity (ON) image. As we derive below, the lower-frequency components provide an accurate model of the Fourier-energy concentration, as well as a low-TV-norm image reconstruction.
To develop a mathematical model for the difference image, we refer to Figure 2. There are four regions of interest: (i) the stimulated brain region (ON region), which is characterized by larger image intensity values; (ii) the brain background region, which corresponds to regions that are not activated; (iii) the periphery of the brain; and (iv) the region outside the brain. This leads to the following approximation for the spatial-domain brain image:

f(x, y) = f_b(x, y), if (x, y) is on the brain background;
f(x, y) = f_ON(x, y), if (x, y) is on the stimulated region;
f(x, y) = f_p(x, y), if (x, y) is on the periphery of the brain; and
f(x, y) = 0, if (x, y) is outside the brain.   (6)

Alternatively, since the different functions do not overlap, we write:

f(x, y) = f_b(x, y) + f_ON(x, y) + f_p(x, y).   (7)

Thus, our objective in reconstructing the MRI image is to effectively reconstruct the stimulated region that differentiates f_ON(.) from the rest. Following Figure 2, we model each function using a rectangular approximation. The use of rectangular approximations allows us to develop a model that is separable in the xy coordinate system. This assumption allows us to provide analytical models and evaluate the properties of the resulting Fourier expansions for (7). Also, note that we can always express images as a sum of separable functions (e.g., via the singular value decomposition or the 2D Fourier transform).
First, we define the 1-D rectangular function rect(.) in terms of its width W and center C using:

rect(x; C, W) = 1, if |x − C| ≤ W/2; and 0, otherwise.   (8)

As mentioned earlier, for simplicity, we assume that the brain, its periphery, and the stimulated region are all defined over rectangular regions: the brain background is supported on rect(x; C_1, W_1) ⋅ rect(y; C_2, W_2), the stimulated region on rect(x; C_3, W_3) ⋅ rect(y; C_4, W_4), and the periphery consists of rectangular strips of intensities d_i, where A ≫ a, B ≫ b, and A, B ≫ d_i for i = 1, …, 4. Here, note that the background region has image intensity given by A, while the stimulated region is characterized by a larger image intensity B > A. Over the main brain region, f_b is given by

f_b(x, y) = [A + a ⋅ cos(ω_1 x) cos(ω_2 y)] ⋅ rect(x; C_1, W_1) ⋅ rect(y; C_2, W_2),

where A ≫ a signifies that the oscillatory components are much smaller than the average image intensity. Similarly, B ≫ b for the stimulated region. We can verify that both of these assumptions are valid by visual inspection of the corresponding regions in Figure 2. We next take the Fourier transform of our model in (7) using:

F(ω_x, ω_y) = F_b(ω_x, ω_y) + F_ON(ω_x, ω_y) + F_p(ω_x, ω_y).   (9)

To derive the final expression, note that the Fourier transform of the rectangular function is given by:

F{rect(x; C, W)} = W ⋅ sinc(W ω_x / (2π)) ⋅ exp(−j C ω_x) =: R(ω_x; C, W),   (10)

where sinc(ω_x) = sin(πω_x)/(πω_x), and F{.} denotes the Fourier-transform operation.
Given the Fourier transform of the rectangular function in (10), the Fourier transform of its product with the cosine function simply produces two shifted copies, as given by:

F{rect(x; C, W) ⋅ cos(ω_1 x)} = (1/2) ⋅ [R(ω_x − ω_1; C, W) + R(ω_x + ω_1; C, W)].   (11)

Using (11), we can now evaluate the Fourier transform of our model of the brain region as given by

F_b(ω_x, ω_y) = A ⋅ R(ω_x; C_1, W_1) ⋅ R(ω_y; C_2, W_2) + (a/4) ⋅ [R(ω_x − ω_1; C_1, W_1) + R(ω_x + ω_1; C_1, W_1)] ⋅ [R(ω_y − ω_2; C_2, W_2) + R(ω_y + ω_2; C_2, W_2)].   (12)

From (12), we note that the first term A ⋅ R(ω_x; C_1, W_1) ⋅ R(ω_y; C_2, W_2) dominates the rest of the terms assuming that ω_1, ω_2 ≫ 0. To see this, note that the spread of the shifted sinc(.)-function copies will not contribute significantly around the DC region, where the first term is concentrated. By inspection of Figure 2, the assumption that ω_1, ω_2 ≫ 0 appears to hold. On the other hand, in the case of ω_1, ω_2 ≈ 0, the first term still dominates, since A ≫ a. A similar argument applies to the Fourier transform of the stimulated region described by f_ON(.). In addition, the contributions from the periphery are limited, since A, B ≫ d_i for i = 1, …, 4 (refer to Figure 2 for verification). Based on these observations, the Fourier transform of the model is dominated by:

F(ω_x, ω_y) ≈ A ⋅ R(ω_x; C_1, W_1) ⋅ R(ω_y; C_2, W_2) + B ⋅ R(ω_x; C_3, W_3) ⋅ R(ω_y; C_4, W_4),   (13)

which allows us to focus on the Fourier transforms of the rectangular regions associated with the brain background and the stimulated brain region. From (10) and (13), we have the sum of two products of sinc(.) functions. Clearly, if we can identify the sinc(.)-function with the largest frequency-domain spread, we can capture most of the energy in (13). From the scaling property of the Fourier transform, the sinc(.) function with the smallest spatial-domain spread results in the largest frequency-domain spread. Since the brain region is larger than the stimulated region alone, it is natural to assume that one of the widths associated with the stimulated region is the smallest. Thus, without loss of generality, we assume that W_4 < W_1, W_2, W_3.
Furthermore, assuming that the effective spread of the sinc(.)-function extends to the three zero-crossings on either side, ω_spread ⋅ W_4/2 = 3π gives ω_spread = 6π/W_4. This argument generalizes to multiple stimulated regions: the smallest region gives the "effective" low-frequency spread as given by:

ω_spread = 6π / W_smallest,   (14)

where W_smallest refers to the smallest stimulated region that we wish to recover in the Fourier domain. The concept of width can easily be extended to non-rectangular regions. Now, assuming that these low-frequency components can be captured by the frequency-sampling geometry, the inverse Fourier transform of (13) yields:

f(x, y) ≈ A ⋅ rect(x; C_1, W_1) ⋅ rect(y; C_2, W_2) + B ⋅ rect(x; C_3, W_3) ⋅ rect(y; C_4, W_4).   (15)

In (15), it is important to note that we have a function with a very small TV-norm. In fact, this is an ideal case, since we expect the TV-optimized reconstruction to be well represented by the right-hand side of (15). The observation here is that we can use the low-frequency components to capture most of the Fourier-transform energy of the ideal model, as well as to provide a good initialization for low-TV optimization.
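The three-zero-crossing spread rule above can be checked numerically: for a discrete centered rect of width W, the fraction of spectral energy within the first three sinc zero-crossings (|k| ≤ 3N/W on an N-point grid) should be well above 90%:

```python
import numpy as np

N = 512

def lowfreq_energy_fraction(W, n_lobes=3):
    """Fraction of the spectral energy of a width-W centered rect that lies
    within the first n_lobes sinc zero-crossings (|k| <= n_lobes * N / W)."""
    x = np.zeros(N)
    x[N // 2 - W // 2 : N // 2 + W // 2] = 1.0
    X = np.fft.fftshift(np.fft.fft(x))
    k = np.arange(N) - N // 2
    band = np.abs(k) <= n_lobes * N / W
    return (np.abs(X[band]) ** 2).sum() / (np.abs(X) ** 2).sum()

frac = lowfreq_energy_fraction(32)
```

Narrower rects (smaller W) spread the same energy over a proportionally wider band, which is exactly why the smallest stimulated region dictates the required low-pass bandwidth.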
The expectation that the effective Fourier magnitude spread is centered around the low frequencies is also verified experimentally in two activity images, as shown in Figure 3. From Figures 3(c) and 3(d), it is quite clear that most of the energy is concentrated around the lower-frequency portion of the spectrum. The log-magnitude plots of Figures 3(e) and 3(f) show the attenuation of the spectrum at higher frequencies. Our discussion in this section clearly shows that the low-frequency components of Figures 3(c) and 3(d) also capture the elevated image intensity levels associated with activated brain regions.

The k-Space downsampling geometries
We consider both deterministic and stochastic sampling geometries. For deterministic geometries, we consider the spiral low-pass and dyadic downsampling along the phase encoding direction. Stochastic methods are based on random sampling.
First, we consider geometries that restrict downsampling to the phase-encoded dimension only. Here, we wish to compare k-space sampling geometries in just the phase-encoded dimension to k-space undersampling in both dimensions around the center.
As noted in the literature review, the central region of k-space is essential for obtaining acceptable reconstruction performance. Almost all non-CS reconstructions from k-space undersampling included the center of k-space in the sampled data. A dyadic sampling geometry class was developed that considers sparse sampling along the phase-encoded dimension of k-space. All geometries of this class are shown in the first row of Table 1. This geometry includes samples from a collection of contiguous frequency-encoded vectors centered over the center of k-space. The width of the central region can be varied to generate a number of unique masks that are members of this mask class. Nine unique geometries are generated based on central regions that sample 1/2, 1/3, 1/4, 1/6, 1/8, 1/10, 1/12, 1/16, and 1/32 of the phase-encoded samples. Beyond the bounds of the central region, additional frequency-encoded vectors are sampled at every 2nd, 4th, 8th, etc. phase-encoded sample, until the entire support of the fully sampled k-space has been included. The dyadic gap size between subsequent high-frequency samples was intentionally designed to sample more densely near the center of k-space, in the hope that such sampling might coincide with the wavelet transform our reconstruction method utilizes. This class is referred to as the dyadic phase encoded (DPE) geometry class in the remainder of this paper.
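One plausible reading of the DPE construction can be sketched as follows; the exact band widths and the handling of the outermost k-space lines are assumptions here, and the published masks in Table 1 are authoritative:

```python
import numpy as np

def dpe_mask(V, central_fraction):
    """Dyadic phase-encode (DPE) mask over V phase-encode lines: a fully
    sampled central band, then equal-width outer bands sampled with
    strides doubling (2, 4, 8, ...) toward the k-space edges."""
    mask = np.zeros(V, dtype=bool)
    half = max(1, int(round(V * central_fraction / 2)))
    c = V // 2
    mask[c - half:c + half] = True          # fully sampled central band
    offset, stride = half, 2
    while c + offset < V:
        for step in range(offset, min(offset + half, V)):
            if (step - offset) % stride == 0:
                for idx in (c + step, c - step - 1):
                    if 0 <= idx < V:
                        mask[idx] = True    # symmetric dyadic samples
        offset += half
        stride *= 2
    return mask

m = dpe_mask(64, 1 / 4)
```

For a 64-line grid with a 1/4 central fraction, this yields 16 contiguous central lines plus progressively sparser outer lines, roughly 47% of the phase encodes in total.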
We also consider two additional geometry classes, which attempt to increase the incoherence between the sensing and transform bases by introducing an element of randomness into how the samples are selected. The first class, referred to as the random phase encoded (RPE) geometry class, samples frequency-encoded vectors at random phase-encoded positions. This sampling method was selected as a comparison to the dynamic CS MRI in [20]. In a two-image, single-slice fMRI study, such sampling would not be expected to provide acceptable reconstructions, due to the possibility of excluding a portion of the central region of k-space. Each of the random phase encoded geometries contains the same number of samples as one of the dyadic phase encoded geometries. Due to this one-to-one correspondence across classes, there are nine total geometries in the RPE class. All of the random phase encoded geometries are shown in the second row of Table 1.
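The RPE masks are straightforward to generate; the sketch below draws full frequency-encoded lines at uniformly random phase-encode positions, matched in count to a DPE counterpart (the line count of 30 is an illustrative choice):

```python
import numpy as np

def rpe_mask(V, n_lines, rng):
    """Random phase-encode (RPE) mask: n_lines full frequency-encode
    vectors at uniformly random phase-encode positions (no replacement)."""
    mask = np.zeros(V, dtype=bool)
    mask[rng.choice(V, size=n_lines, replace=False)] = True
    return mask

m = rpe_mask(64, 30, np.random.default_rng(0))
```

Because the positions are uniform, nothing forces the central k-space lines to be included, which is the failure mode noted above.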
The second geometry class that incorporates randomness into the sampling scheme can be described as random sampling along a 1-D probability density function defined across the phase-encoded samples at each frequency-encoded sample. This geometry class will be referred to as the random sampling PDF (RSP) class.
The motivation for this geometry comes from [16], but we alter the distribution to be replicated at each frequency-encoded sample instead of being defined over both k-space dimensions. We use a fifth-order polynomial PDF, where U and V are the numbers of k-space samples in the frequency- and phase-encoded dimensions, respectively. This function mathematically defines the probability density from which the random samples are selected. Our motivation for this geometry was to attempt a compromise between including the center of k-space and imparting incoherence into the CS problem through a pseudo-random sampling geometry.
Representative geometries of the RSP class are depicted in the third row in Table 1.
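Since the paper's exact fifth-order polynomial is not reproduced in this excerpt, the sketch below substitutes a generic degree-5 polynomial peaked at the center of the phase-encode axis; it illustrates the replicated-1D-PDF idea, not the exact density.

```python
import numpy as np

def rsp_mask(U=64, V=64, frac=0.3, seed=0):
    """Random-sampling-PDF (RSP) mask: at every frequency-encode index
    u, draw phase encodes from the same 1-D PDF p(v).  The stand-in
    density (1 - |2v/(V-1) - 1|)**5 is an assumption; the paper uses a
    specific fifth-order polynomial in v."""
    rng = np.random.default_rng(seed)
    v = np.arange(V)
    p = (1.0 - np.abs(2.0 * v / (V - 1) - 1.0)) ** 5   # peaked at center
    p /= p.sum()
    n_keep = max(1, int(round(frac * V)))
    mask = np.zeros((U, V), dtype=bool)
    for u in range(U):
        mask[u, rng.choice(V, size=n_keep, replace=False, p=p)] = True
    return mask
```

Replicating the same 1-D density at every frequency-encode index keeps the sampling dense near the center of k-space while still randomizing the individual sample locations.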
Lastly, we include a geometry class that restricts k-space sampling in both the phase-encoded and frequency-encoded dimensions. Here, we generate the nine geometries by sampling along a Cartesian spiral emanating from the center of k-space. Such sampling geometries are more typical of the classical k-space undersampling reconstruction techniques found in the literature. This class is referred to as the spiral low pass (SLP) geometry class (see Table 1).
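A Cartesian spiral of this kind can be sketched as a square spiral walked outward from the center of the grid; the exact spiral parameterization used in the paper may differ, so this is only an illustration.

```python
import numpy as np

def slp_mask(n=64, frac=0.203):
    """Spiral low-pass (SLP) mask sketch: walk a Cartesian (square)
    spiral outward from the center of an n x n k-space grid and keep
    the first round(frac * n**2) visited samples."""
    mask = np.zeros((n, n), dtype=bool)
    n_keep = int(round(frac * n * n))
    r = c = n // 2
    mask[r, c] = True
    kept = 1
    dr, dc = 0, 1                      # start by moving right
    leg = 1                            # current leg length
    while kept < n_keep:
        for _ in range(2):             # two legs per leg length
            for _ in range(leg):
                r, c = r + dr, c + dc
                if 0 <= r < n and 0 <= c < n and kept < n_keep:
                    mask[r, c] = True
                    kept += 1
            dr, dc = dc, -dr           # turn 90 degrees
        leg += 1
    return mask
```

With n = 64 and frac = 0.203, this retains 831 of 4096 samples, matching the 20.3% rate reported for the best activity-detection results.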

Optimization of CS penalty parameters
Parameter optimization is performed using two separate approaches. In the first approach, we estimate the optimal parameter ranges for α, β that can provide a certain level of image quality; this approach is based on maximizing the PSNR of the reconstructed images. Our second approach is more direct: it is based on maximizing the average SSIM index, and its goal is to recommend specific parameter values for each sampling geometry. The second approach is validated using leave-one-out cross-validation. Both approaches assume that we can find optimal parameters when ground truth is available. When ground truth is not available, our expectation is that our training sets will reflect the new datasets to be collected. To initialize the optimization, we first zero-pad the images from their original size of 53 × 63 to 64 × 64. We then use the 2D FFT to generate k-space data for our experiments. A baseline reconstruction is obtained by applying the inverse Fourier transform to the undersampled k-space data. This initial reconstruction exhibits the aliasing artifacts that the methods presented in the introduction seek to suppress. We then use the Nelder-Mead search algorithm to estimate the optimal reconstruction parameters of (1) [29]. We note that the Nelder-Mead algorithm does not require derivative evaluations; it belongs to the class of direct search methods [30]. This approach generates a collection of optimal parameter values α, β associated with each image.
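The parameter search can be reproduced with any Nelder-Mead implementation, for example SciPy's. The sketch below minimizes a toy quadratic surrogate in place of the actual negative-PSNR objective, which would require running the CS reconstruction of (1) at every evaluation; the optimum location is invented for illustration.

```python
import numpy as np
from scipy.optimize import minimize

def neg_quality(params, opt=(1e-3, 5e-4)):
    """Stand-in for -PSNR(reconstruction(alpha, beta)): a quadratic
    bowl with a known (hypothetical) optimum at `opt`."""
    a, b = params
    return (a - opt[0]) ** 2 + (b - opt[1]) ** 2

# Derivative-free direct search, as in the paper's use of Nelder-Mead.
res = minimize(neg_quality, x0=[0.0, 0.0], method='Nelder-Mead',
               options={'xatol': 1e-9, 'fatol': 1e-12})
alpha_opt, beta_opt = res.x
```

In practice `neg_quality` would be replaced by a call to the CS solver followed by a PSNR computation against the fully sampled reference image.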
Then, in our first approach, we estimate the optimal parameter ranges for α, β that can provide a certain level of image quality. To see how this works, note that a minimum image-quality requirement can be expressed as a minimum PSNR level. We then determine the reconstructed images that exceed this minimum level. The optimal parameter region can then be estimated from the range of α, β values that correspond to these images. To avoid outliers, the optimal region need only be met by 75% of the total number of considered images. We also characterize the complexity of each optimal parameter region using inner and outer bounding boxes. The inner bounding box is the largest rectangle that can be fully contained inside the optimal parameter region, while the outer bounding box is the smallest rectangle that contains the optimal parameter region. A detailed example of our first approach is provided in the results section.
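On a discretized (α, β) grid where cells are marked True when the reconstruction meets the PSNR threshold, the two boxes can be computed as below; the inner box uses the classic largest-rectangle-in-a-binary-matrix sweep. The grid construction and names are illustrative, not the paper's implementation.

```python
import numpy as np

def outer_box(region):
    """Smallest axis-aligned rectangle containing the True region."""
    rows, cols = np.nonzero(region)
    return rows.min(), rows.max(), cols.min(), cols.max()

def inner_box_area(region):
    """Area of the largest axis-aligned rectangle fully inside the
    True region (histogram sweep, O(rows * cols))."""
    best = 0
    heights = np.zeros(region.shape[1], dtype=int)
    for row in region.astype(int):
        heights = (heights + 1) * row          # reset where row == 0
        stack = []                             # (start index, height)
        for i, h in enumerate(np.append(heights, 0)):
            start = i
            while stack and stack[-1][1] > h:
                start, hh = stack.pop()
                best = max(best, hh * (i - start))
            stack.append((start, h))
    return best
```

The area ratio reported in the results (e.g., 5.71 for the 48.5% DPE geometry) is the outer-box area divided by the inner-box area.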
Our second approach is more direct: our goal is to provide specific recommendations for α, β that can be expected to exceed a minimum level of image quality. For this direct approach, we use the average SSIM index value to determine the required level of image quality, and we consider an image with an average SSIM index above 0.9 to represent excellent reconstruction quality. In this second approach, the image reconstruction quality is estimated using a leave-one-out method. In leave-one-out, we construct training sets consisting of all but one of the ON/OFF image pairs, and we report the reconstruction image quality on the remaining ON/OFF image pair, which serves as our testing set. The method is then applied over all image pairs. Here, we use the median parameter values estimated over each training set for reconstruction of the testing set, and we report results for each image serving as a member of the testing set. Furthermore, for simplicity, we provide the median parameter values over the entire dataset and, for completeness, the optimal parameter values for each image pair. This allows readers to see whether the discrepancy between the median and the optimal parameter values affects the reconstructed image quality.
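The leave-one-out recommendation reduces to taking per-parameter medians over the training pairs; a minimal sketch (array layout is assumed):

```python
import numpy as np

def loo_median_params(per_image_optima):
    """For each held-out image pair i, return the per-parameter median
    of the optimal (alpha, beta) values over the remaining pairs."""
    per_image_optima = np.asarray(per_image_optima, dtype=float)
    medians = []
    for i in range(len(per_image_optima)):
        train = np.delete(per_image_optima, i, axis=0)  # leave pair i out
        medians.append(np.median(train, axis=0))
    return np.array(medians)
```

Each row of the result is then used to reconstruct the corresponding held-out ON/OFF pair.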

Results
We provide optimization results independently for the ON and OFF images, as well as for the difference image. Here, we consider optimization of the difference image for activity detection.
In Figure 4, we use the Shepp-Logan phantom to test both the classical TV optimization method of [1] and the proposed wavelet optimization framework based on [16]. As expected, Figure 4(d) shows that the standard TV method performs very well here: we obtain perfect reconstruction from just the 22 radial lines shown in Figure 4(b). We also applied all of the sampling geometries of Table 1 to this image. For sampling geometries using 20.3% of the total samples, reconstructions gave values well above 40 dB (50-60 dB), which are excellent (e.g., see Figure 5). It is interesting to note that the stochastic geometries perform significantly better than the deterministic geometries at higher sampling rates on this example. For instance, at the highest sampling density, random sampling on a PDF gave reconstructions well over 90 dB, about 30 dB above any reconstruction achieved by the SLP geometries (which stayed below 60 dB); still, all of these reconstructions are excellent. The lack of perfect reconstruction for our geometries comes from the use of the wavelet transform in our method; as is well known, and also demonstrated here, the standard TV norm is the best fit for this example. However, please note that this example is unrealistic in the sense that it does not capture the complexity of the activity region in fMRI.
For all sampling geometries, we provide optimal PSNR results for the ON and OFF images in Figure 6. The PSNR values obtained by simply zero-filling the missing k-space samples are displayed for comparison. A brief inspection reveals that the PSNR values for the DPE and SLP geometry classes are higher than those for the RPE and RSP geometry classes. The dyadic phase-encoded geometry class exhibited the most consistent PSNR values across images for the same geometry, but had a lower average PSNR across images than the spiral low-pass geometry class. The random classes exhibited a greater improvement over zero-filling, but did not reconstruct images as accurately as the deterministic classes (for the same number of samples). In terms of PSNR, the spiral low-pass sampling geometry gave the best results. Furthermore, for PSNR > 40 dB, most reconstructions were found to be of excellent quality.
To determine the minimum acceptable reconstruction quality, we consider PSNR values from 20 dB to 45 dB (see Figure 5). For a given sampling geometry and a required image quality, all parameter regions were intersected to determine the maximum number of images that achieve the desired reconstruction quality for a single range of parameters. Thus, for each quality level, we can have a maximum of 24 images, indicating that the same optimal region provided this minimum quality for all of them. To avoid outliers, we consider an optimization region to be successful in meeting the image quality criterion if 75% or more of the images maintain a level above what is required. We also provide bounding box results in Table 2. Our experiments reveal that the DPE and SLP geometry classes outperform the RPE and RSP classes. The RPE class was expected to exhibit poor reconstruction, since its geometries do not contain sufficient samples from the center of k-space. Only a single geometry, the RSP geometry retaining 62.5% of the samples, resulted in a suggested operating parameter space for achieving "acceptable" reconstruction. On the other hand, the DPE class achieved excellent reconstruction from two geometries (62.5% and 48.5%), and the SLP class achieved excellent reconstruction from the three geometries that retained the largest number of k-space samples (60.5%, 48.5%, and 40.6%). When lower-quality reconstructions are acceptable, we can use sampling geometries that retain fewer k-space samples. Twelve additional DPE geometries were found to satisfy the parameter-space threshold and intersection procedure. All eighteen (nine each from the ON and OFF imagery) SLP geometries included in this experiment satisfied our requirement for acceptable reconstruction. The combined parameter plots over all images indicate the number of images that meet the quality criteria. We present an example in Figure 7 for the 48.5% DPE geometry.
Figure 7(c) depicts the inner and outer bounding boxes, where 75% of the images meet the quality requirements. The complexity of this optimal region is reflected in an area ratio of 5.71, indicating that the area of the outer bounding box is 5.71 times larger than that of the inner bounding box.
For activity detection, we also compared the difference images computed from the original ON and OFF samples, from the CS-reconstructed ON and OFF samples, and from reconstructions obtained by zero-filling the missing k-space samples. Based on our investigations, we found that it is possible to obtain excellent reconstruction results with very fast k-space sampling geometries. In what follows, we present our best results, obtained with the SLP geometry using only 20.3% of k-space.
Here, it is interesting to note that activity detection required optimization of the reconstructed difference image. Separate optimization of the ON/OFF images did not translate into any improvements over simple zero-filling (see Figure 8). On the other hand, optimization of the difference images led to significant increases in PSNR and SSIM over zero-filling for eleven of the twelve samples. The mean values for the optimal difference-image parameters were 57.58 dB and 0.9747 for the PSNR and mean SSIM measurements, respectively. The plots in Figure 6 show the PSNR and mean SSIM values of each difference image for the optimal CS reconstruction of each slice, for the optimal CS reconstruction of the difference image itself, and for the zero-filled difference image. For activity detection, after removing outlier points, 75% of the optimal parameter region was bounded by: α_min = 2.407 × 10^−7, α_max = 2.881 × 10^−5, β_min = 9.638 × 10^−8, β_max = 1.084 × 10^−4. We also present results from parameter optimization based on maximizing the mean SSIM index value. For each sampling geometry and image pair, we present the best mean SSIM index in Tables 3, 4, 5, and 6. The corresponding α, β values are shown in Tables 7 through 14. Overall, the SLP geometry gave the best results. The mean SSIM results achieved by the SLP geometry are given in Table 3; the optimal α values are given in Table 8 and the optimal β values in Table 12.
For each sampling geometry, we provide the median α, β values in the last column of Tables 7 through 14. The recommendation to use the median α, β values for reconstruction was validated using leave-one-out. Here, note that the mean SSIM values reported in Tables 3, 4, 5, and 6 are based on performing the image reconstruction using the median values from the training set of the remaining 11 image pairs. There is thus a mismatch between the median α, β values and the optimal values, which can only be estimated when the entire image is available. Yet, despite this mismatch, we find that the second approach worked very well. For example, all of the SLP reconstructions for 20.3% sampling gave average SSIM index values above 0.93.
We next discuss the relationship between PSNR and mean SSIM values. Numerically, an SSIM value of 1 corresponds to an infinite PSNR, and an SSIM value of 0 corresponds to a PSNR of negative infinity. Ideally, an increase in the SSIM index will also correspond to an increase in PSNR. We have verified this for the mean SSIM values of Table 4 and their corresponding PSNR values. (The results in Tables 4, 5, and 6 are based on the leave-one-out method; see Table 3 for details. Tables 7 through 14 show the optimal parameter values for each image pair together with the median value for each sampling geometry; in the leave-one-out method, the median values were computed only over the training sets.) On the other hand, this property does not hold for individual images. For example, image #2 for sampling at 20.30% has a PSNR value above the median while its SSIM value is below the median. In such cases, we consider the SSIM trend superior to the PSNR. Overall, though, all of the SLP reconstructions gave mean SSIM values above 0.93, so the SLP geometry reconstructions are considered excellent.
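The PSNR/SSIM relationship discussed here can be checked numerically. The sketch below uses the standard PSNR definition and a single-window SSIM; the paper reports the mean of local SSIM values, so the global window is a simplification for illustration.

```python
import numpy as np

def psnr(ref, img, peak=1.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    mse = np.mean((ref - img) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def global_ssim(x, y, peak=1.0):
    """SSIM computed over one global window (simplified mean SSIM)."""
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cxy = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cxy + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

An SSIM of 1 coincides with infinite PSNR (identical images), but for distinct images the two metrics can rank reconstructions differently, as observed above for image #2.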

Discussion
We have found that parameter optimization achieved significant image-quality improvements using a relatively small number of iterations. Typically, five iterations were required to improve PSNR by over 10 dB from the initial values of α = 0, β = 0 and to achieve reconstructions within 5% of the optimal quality value. We take a closer look at the α and β ranges for selected geometries from the DPE and SLP classes. We selected the DPE and SLP geometries with the fewest samples that still provided reasonable optimal-quality regions at 30 dB, 35 dB, 40 dB, and 45 dB. Examining the range of the parameters provides a more detailed description of the parameter space, and of which constraint in (3) is essential for acceptable reconstructions. The minimum and maximum parameter values that define the vertices of the inner and outer bounding boxes are listed in Table 2.
Three instances in Table 2 stand out because the minimum α and/or β parameters are negative. These values are a by-product of the interpolation and intersection method used to calculate the parameter ranges. When negative parameter values were inserted into the optimization algorithm, the reconstruction algorithm broke down. In these cases, we replaced the negative value with zero to achieve an acceptable result.
We now turn to the effect of a zero parameter value for either constraint in the reconstruction problem. Typically, the effect of α = 0 is a more prominent presence of high-frequency errors in the spatial domain, as well as a greater amount of artifacts from the sampling geometry. Conversely, the effect of β = 0 is a loss of high-frequency spatial components in the reconstruction; in this case, the TV norm tends to drive the solution toward a result whose finite differences in each dimension are minimized. While removing one of the penalty terms in the CS objective function may produce a usable reconstruction, our observations indicate that this case will not be optimal.
When both parameters are zero, the proposed optimization framework breaks down. It is possible to achieve acceptable image quality when one of the two parameters is zero, but not both. While completely removing one of the penalty terms in (3) may result in acceptable reconstructions in terms of PSNR, an increased PSNR can be achieved by using the (non-zero) optimal values.
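The objective referred to as (3) is not reproduced in this excerpt; a typical unconstrained CS-MRI formulation consistent with the roles described above for α (TV penalty) and β (wavelet-sparsity penalty) would be the following, where this assignment of terms is an assumption based on the observed behavior at α = 0 and β = 0:

```latex
\hat{m} \;=\; \arg\min_{m}\;
  \underbrace{\lVert F_u\, m - y \rVert_2^2}_{\text{data fidelity}}
  \;+\; \alpha\, \mathrm{TV}(m)
  \;+\; \beta\, \lVert \Psi m \rVert_1
```

Here F_u denotes the undersampled Fourier operator defined by the sampling geometry, y the acquired k-space samples, and Ψ the wavelet transform. Under this assumed form, α = 0 removes the TV smoothing that suppresses sampling artifacts, while β = 0 removes the wavelet-sparsity term, leaving the TV norm to dominate the solution.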
For activity detection, we have found that optimization of the difference image led to excellent results. In particular, the SLP geometry gave excellent average SSIM values of 0.9747 (max = 1). The use of the leave-one-out validation with mean SSIM optimization also verified the fact that the SLP family of geometries gives the best result. Furthermore, recommended parameter values for the SLP geometries are given by the median values of Tables 8 and 12.

Conclusions
In conclusion, we have found that CS parameter optimization can dramatically improve fMRI image reconstruction quality. Furthermore, the deterministic SLP scanning geometries (similar to the 3D geometries of [4]) consistently gave the best image reconstruction results. The implication of this result is that less complex sampling geometries suffice in place of random sampling, provided the TV-norm and transform penalty parameters are selected from the ranges of values calculated with our proposed methodology. We have also found stable parameter regions that can be used to achieve specific levels of image reconstruction quality when combined with specific k-space sampling geometries. More importantly, median parameter values can be used to obtain excellent reconstructions. This observation has been validated using leave-one-out and optimization based on the mean SSIM index.