
# A super-resolution method-based pipeline for fundus fluorescein angiography imaging

Zhe Jiang^{1}, Zekuan Yu^{1}, Shouxin Feng^{2}, Zhiyu Huang^{1}, Yahui Peng^{3}, Jianxin Guo^{2} (corresponding author), Qiushi Ren^{1} and Yanye Lu^{4} (corresponding author)

**17**:125

https://doi.org/10.1186/s12938-018-0556-7

© The Author(s) 2018

**Received:** 18 July 2018 **Accepted:** 6 September 2018 **Published:** 19 September 2018

## Abstract

### Background

Fundus fluorescein angiography (FFA) imaging is a standard diagnostic tool for many retinal diseases such as age-related macular degeneration and diabetic retinopathy. High-resolution FFA images facilitate the detection of small lesions such as microaneurysms, and other landmark changes, in the early stages; this can help an ophthalmologist improve a patient’s cure rate. However, only low-resolution images are available in most clinical cases. Super-resolution (SR), which is a method to improve the resolution of an image, has been successfully employed for natural and remote sensing images. To the best of our knowledge, no one has applied SR techniques to FFA imaging so far.

### Methods

In this work, we propose an SR method-based pipeline for FFA imaging. The aim of this pipeline is to enhance the image quality of FFA by using SR techniques. Several SR frameworks, including neighborhood embedding, sparsity-based, locally-linear regression and deep learning-based approaches, are investigated. Using a clinical FFA dataset collected from the Second Affiliated Hospital of Xuzhou Medical University, each SR method is implemented and evaluated within the pipeline to improve the resolution of FFA images.

### Results and conclusion

As shown in our results, most SR algorithms have a positive impact on the enhancement of FFA images. Super-resolution forests (SRF), a random forest-based SR method, displayed remarkably high effectiveness and outperformed the other methods. Hence, SRF is a potential way to benefit ophthalmologists by providing high-resolution FFA images in a clinical setting.

## Keywords

- Fundus fluorescein angiography imaging
- Super-resolution
- Machine learning
- Random forest
- Convolutional network

## Background

Fundus fluorescein angiography (FFA) imaging is a valuable diagnostic tool for many ocular and systemic diseases, including malarial retinopathy, glaucoma, malignant hypertension and multiple sclerosis [1–4]. Ocular lesions can be detected in the early phases by examining FFA images, which improves the chances of early diagnosis of diseases such as age-related macular degeneration (AMD) [5] and diabetic retinopathy (DR) [6]. FFA images can therefore provide ophthalmologists with additional data that can ultimately improve a patient’s cure rate. For example, diabetic retinopathy, which is the leading cause of blindness in adults worldwide [7], can be treated effectively via laser surgery if diagnosed at an early stage. Although routine checkups with ophthalmoscopes have already promoted early diagnosis of ocular diseases, the overall diagnostic rate is still limited by the shortage of ophthalmologists and optometrists [8] in most countries. Hence, high-resolution (HR) FFA images are a vital tool in ophthalmopathy screening because they provide doctors with more indicators, such as microaneurysms, hemorrhages and small veins, to support their diagnostic decisions. However, only low-resolution (LR) FFA images are available in most clinical cases. Thus, super-resolution (SR) techniques, which aim to enhance the spatial resolution of images through image processing, have great potential to improve diagnosis rates for ophthalmic diseases.

SR techniques were pioneered by Tsai and Huang in 1984 [9] and are mainly used to improve natural and remote sensing images. Generally, there are two major categories of SR algorithms: multi-frame SR and single image SR (SISR). The early SR methods are mainly multi-frame SR [10–13]. The core idea of such methods is to combine the information from a sequence of LR images to construct a correlated HR image. SISR methods, on the other hand, learn the relationship between HR and LR images from a training set and recover a correlated HR image from a single LR image. Benefitting from the development of machine learning techniques, various SISR algorithms [14–25] have been proposed in recent years and have become the main research direction of SR algorithms.

In recent years, SR techniques have been successfully extended to medical imaging applications, where they provide an important preprocessing step that can improve the image quality of modalities such as ultrasound [26, 27], CT [28], PET [29] and MRI [30–35]. There is also relevant research on applying SR techniques to conventional fundus images. Thapa et al. evaluated several SR algorithms and demonstrated the promising impact of some SR methods [36]. However, the study of Thapa et al. is limited by the small amount of experimental data.

Moreover, FFA images provide more valuable clinical information than conventional fundus images, so applying super-resolution methods to FFA images can help ophthalmologists achieve better diagnostic results. To the best of our knowledge, no one has applied SR techniques to FFA imaging so far; thus, we propose an SR method-based pipeline for FFA imaging. The aim of this pipeline is to enhance the image quality of FFA by using super-resolution techniques. In this study, four types of SISR algorithms are analyzed: neighborhood embedding (NE) approaches [14, 15], sparsity-based approaches [16, 17], locally-linear regression approaches [18–21] and deep learning (DL)-based approaches [22, 23]. We investigate the effectiveness of each method using our clinical FFA datasets. The results of each algorithm are then quantitatively evaluated to assess the method’s feasibility and performance.

## Methods

### Super-resolution method-based pipeline

Before describing each process, we define the mathematical notation used in this paper. For the training phase, a set of *patch pairs* is extracted from the original training image pairs. This yields the training set \( {\mathcal{P}} = \left\{ {{\mathcal{X}}_{h} ,{\mathcal{X}}_{l} } \right\} = \{ \left( {x_{l}^{n} ,x_{h}^{n} } \right)|n = 1,2, \ldots ,N\} \), where \( {\mathcal{X}}_{h} = \left\{ {x_{h}^{1} ,x_{h}^{2} , \ldots ,x_{h}^{N} } \right\} \) and \( {\mathcal{X}}_{l} = \left\{ {x_{l}^{1} ,x_{l}^{2} , \ldots ,x_{l}^{N} } \right\} \) are the sets of HR and LR image patches (with *N* samples), respectively. Meanwhile, we denote \( X_{h} = \left[ {x_{h}^{1} ,x_{h}^{2} , \ldots ,x_{h}^{N} } \right] \in R^{{S_{h} \times N}} \) and \( X_{l} = \left[ {x_{l}^{1} ,x_{l}^{2} , \ldots ,x_{l}^{N} } \right] \in R^{{S_{l} \times N}} \) as the matrix representations of the two sets, where \( S_{h} \) and \( S_{l} \) are the dimensions of the HR and LR patches in vector form, respectively. For the testing phase, we denote the testing LR image patch by \( y_{l} \in R^{{S_{l} \times 1}} \) and the reconstructed HR image patch by \( y_{h} \in R^{{S_{h} \times 1}} . \)

Parameter setting for the patch-extraction of ten test SISR algorithms on the pipeline (M represents the upscaling factor of the SR task)

#### Neighborhood embedding approaches

**w** is the vector of \( \left\{ {w_{i} } \right\}_{i = 1}^{K} . \)

**w**. Therefore, the weight-calculation of the NE + LS and NE + NNLS methods can be expressed as

In the proposed pipeline, the size K of the nearest-neighbor set was set to 24 for both NE + LS and NE + NNLS.
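As a hedged illustration of the two weight calculations, the standard neighbor-embedding formulation can be sketched as follows. The symbols \( N_{l} \in R^{{S_{l} \times K}} \) and \( N_{h} \in R^{{S_{h} \times K}} \) are assumptions introduced here for the matrices that stack the K nearest LR training patches of \( y_{l} \) and their HR counterparts column-wise; the exact form used in the pipeline may differ.

```latex
% Sketch only: N_l and N_h are assumed names for the K-neighbor matrices.
% NE + LS: unconstrained least squares; NE + NNLS: least squares with w >= 0.
\mathbf{w}_{\text{LS}} = \arg\min_{\mathbf{w}} \left\| y_{l} - N_{l}\mathbf{w} \right\|_{2}^{2},
\qquad
\mathbf{w}_{\text{NNLS}} = \arg\min_{\mathbf{w} \ge 0} \left\| y_{l} - N_{l}\mathbf{w} \right\|_{2}^{2},
\qquad
y_{h} = N_{h}\mathbf{w}
```

In both cases the weights are estimated in the LR patch space and then reused unchanged to combine the corresponding HR patches.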

#### Sparsity-based approaches

The sparse representation α (\( \upalpha \in R^{B \times 1} \)) of \( y_{l} \) is first calculated by minimizing Eq. 7, where the regularization parameter λ balances the importance of the sparsity constraint. The reconstructed HR patch \( y_{h} \) is then obtained directly by multiplying the HR dictionary *D*_{h} with the sparse representation α.

*α* in Eq. 7. Finally, the HR patch \( y_{h} \) can be reconstructed based on Eq. 8.

In this work, we used SB-Yang and SB-Zeyde to signify the two sparsity-based approaches, respectively. Both approaches set the number of dictionary atoms and regularization parameter λ to 2048 and 0.1, respectively. The parameter *L* for the solution process of K-SVD is set to 24 atoms for each representation vector.
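The two steps described above (sparse coding on the LR dictionary, then reconstruction with the HR dictionary) can be summarized in a simplified form. This is a sketch of the usual \( L_{1} \)-regularized formulation, assuming the LR dictionary is written \( D_{l} \); it omits any feature extraction applied to the patches and is not necessarily identical to Eqs. 7 and 8.

```latex
% Simplified sketch: alpha is the B x 1 sparse code, D_l and D_h the coupled dictionaries.
\alpha^{*} = \arg\min_{\alpha} \left\| y_{l} - D_{l}\alpha \right\|_{2}^{2} + \lambda \left\| \alpha \right\|_{1},
\qquad
y_{h} = D_{h}\alpha^{*}
```

With λ = 0.1, as set in the pipeline, a larger λ would push more entries of α to zero at the cost of a larger LR reconstruction error.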

#### Locally-linear regression approaches

*L*_{2}-norm constraint of the coefficient matrix instead of the *L*_{1}-norm constraint for the sparse representation; this is done to simplify the optimization problem to a ridge regression [40], which can be solved in closed-form. Therefore, for the input LR patch \( y_{l} \) with the nearest dictionary atom \( d_{l}^{j} \), the optimization problem of Eq. 7 can be reformulated (as Eq. 11) by combining the sub-dictionaries and *L*_{2}-norm regularization

*j*-th dictionary atom, which can be calculated offline. For each input LR patch \( y_{l} \), the reconstruction procedure of ANR reduces to finding the nearest-neighbor atom \( d_{l}^{j} \) of \( y_{l} \) in the LR dictionary \( D_{l} \) and applying the corresponding projection matrix \( P_{j} \), so that the SR reconstruction is a single matrix multiplication of \( P_{j} \) and \( y_{l} \).
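The closed-form ridge solution behind the projection matrix can be written out as a short derivation. This is a sketch that follows the same pattern as the regressor formulas given later in this section; the symbols \( N_{l}^{j} \) and \( N_{h}^{j} \) (the LR and HR sub-dictionaries anchored at the *j*-th atom) and β (the coefficient vector) are assumed names.

```latex
% Ridge regression on the LR sub-dictionary anchored at atom j (assumed notation).
\beta^{*} = \arg\min_{\beta} \left\| y_{l} - N_{l}^{j}\beta \right\|_{2}^{2} + \lambda \left\| \beta \right\|_{2}^{2}
          = \left( (N_{l}^{j})^{T} N_{l}^{j} + \lambda I \right)^{-1} (N_{l}^{j})^{T} y_{l}
% Applying the same coefficients to the HR sub-dictionary gives the projection matrix.
y_{h} = N_{h}^{j}\beta^{*} = P_{j}\, y_{l},
\qquad
P_{j} = N_{h}^{j} \left( (N_{l}^{j})^{T} N_{l}^{j} + \lambda I \right)^{-1} (N_{l}^{j})^{T}
```

Because \( P_{j} \) depends only on the dictionaries and λ, one matrix per atom can be stored in advance, which is why the test-time cost is a nearest-atom search plus one matrix-vector product.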

Building on the simplified architecture of ANR, Timofte et al. further proposed adjusted anchored neighborhood regression (A+) [19]. A+ inherits various techniques of ANR, such as sub-dictionaries and *L*_{2}-norm regularization. Unlike ANR and most of the sparsity-based approaches, however, A+ does not discard the training samples after training the coupled dictionaries. Instead, these training samples are directly used in the reconstruction procedure through the sub-dictionaries. For each atom \( d_{l}^{j} \) from the LR dictionary \( D_{l} \), A+ searches for its k-nearest neighbors in the training pool \( {\mathcal{X}}_{l} \), instead of among the sparse dictionary atoms of \( D_{l} \). Therefore, the LR and HR sub-dictionaries of A+ can be denoted as \( N_{ls}^{j} = \left[ {x_{ls}^{1\left( j \right)} ,x_{ls}^{2\left( j \right)} , \ldots ,x_{ls}^{K\left( j \right)} } \right] \in R^{{S_{l} \times K}} \) and \( N_{hs}^{j} = \left[ {x_{hs}^{1\left( j \right)} ,x_{hs}^{2\left( j \right)} , \ldots ,x_{hs}^{K\left( j \right)} } \right] \in R^{{S_{h} \times K}} \), where \( x_{ls} \) and \( x_{hs} \) are training samples selected from \( {\mathcal{X}}_{l} \) and \( {\mathcal{X}}_{h} \) respectively. Based on the solved \( N_{ls}^{j} \) and \( N_{hs}^{j} \), A+ reconstructs the HR patch in the same way as ANR.

*O* groups and learns *O* regressors \( {\mathcal{F}} = \left\{ {f_{1} ,f_{2} , \ldots ,f_{O} } \right\} \), which collectively provide the least reconstruction error for all the training patches (*O* is a fixed number assigned manually). The problem can be expressed as follows:

*O*, otherwise \( c_{o,n} = 0 \). An iterative algorithm resembling the EM algorithm [41] is used to solve this problem. Two procedures (E-step and M-step) are implemented to update \( {\mathcal{F}} \) and \( {\text{C}} \) alternately until Eq. 14 converges. In the E-step, the clusters \( {\text{C}} \) are fixed and \( {\mathcal{F}} = \left\{ {f_{1} ,f_{2} , \ldots ,f_{O} } \right\} \) is estimated for each cluster. Once again, ridge regression (Eqs. 11–13) is used to learn the regressors. The SR-reconstructed HR patch of regressor \( f_{o} \) can be expressed as \( \widetilde{{x_{h}^{o,n} }} = f_{o} \left( {x_{l}^{n} } \right) = P_{o} x_{l}^{n} = [X_{h}^{o} \left( {(X_{l}^{o} )^{T} X_{l}^{o} + \lambda I} \right)^{ - 1} (X_{l}^{o} )^{T} ]x_{l}^{n} . \) Here, \( X_{l}^{o} \) and \( X_{h}^{o} \) are matrices formed by stacking all the LR patches and the corresponding HR patches from the *o*-th cluster column-wise. In the M-step, the regressors \( {\mathcal{F}} \) are fixed and the clusters \( {\text{C}} \) are updated. For each training sample pair \( \{ x_{l}^{n} ,x_{h}^{n} \} \), the SR reconstruction errors of all regressors \( \left\{ {f_{o} } \right\}_{o = 1}^{O} \) are calculated according to \( e_{o,n} = \;\parallel \widetilde{{x_{h}^{o,n} }} - x_{h}^{n} \parallel^{2} \); the sample pair is then reassigned to the *o*-th cluster with the minimum reconstruction error \( e_{o,n} \) to get the new clusters. Once Eq. 14 is solved, the training of JOR is finished. For the testing step, the input LR patch only needs to find its k-nearest neighbors among the training samples and use these neighbors to select the most suitable regressor \( f_{o} \) for SR reconstruction.
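The alternation between fitting regressors and reassigning samples can be illustrated with a deliberately small toy example. The sketch below is an assumption-laden simplification: patches are single numbers, so each regressor is a scalar slope with the closed-form ridge solution, and the loop starts from initial slopes rather than initial clusters. The function names `fit_ridge_scalar` and `jointly_optimize` are hypothetical and are not taken from the JOR implementation.

```python
def fit_ridge_scalar(pairs, lam):
    """Closed-form scalar ridge fit: the p minimizing sum((h - p*l)^2) + lam*p^2."""
    numerator = sum(l * h for l, h in pairs)
    denominator = sum(l * l for l, _ in pairs) + lam
    return numerator / denominator


def jointly_optimize(pairs, initial_slopes, lam=0.1, iterations=20):
    """Alternate cluster reassignment and regressor refitting (toy, scalar patches)."""
    slopes = list(initial_slopes)  # copy so the caller's list is not modified
    assignment = []
    for _ in range(iterations):
        # Regressors fixed: move each (LR, HR) pair to the regressor
        # with the smallest squared reconstruction error.
        assignment = [
            min(range(len(slopes)), key=lambda o: (slopes[o] * l - h) ** 2)
            for l, h in pairs
        ]
        # Clusters fixed: refit each regressor by ridge regression on its members.
        for o in range(len(slopes)):
            members = [p for p, a in zip(pairs, assignment) if a == o]
            if members:  # an empty cluster keeps its previous slope
                slopes[o] = fit_ridge_scalar(members, lam)
    return slopes, assignment


if __name__ == "__main__":
    # Toy data drawn from two different linear mappings (h = 2*l and h = -l).
    toy_pairs = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0),
                 (1.0, -1.0), (2.0, -2.0), (3.0, -3.0)]
    learned_slopes, clusters = jointly_optimize(toy_pairs, [1.5, -0.5])
    print(learned_slopes, clusters)
```

In the full method each slope would be a projection matrix \( P_{o} \) and each sample a patch vector; the test-time selection of a regressor via nearest neighbors is not shown here.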

*N* training samples \( \left\{ {x_{l}^{n} ,x_{h}^{n} } \right\}_{n = 1}^{N} \). Moreover, SRF adopts a novel regularized quality measure \( {\text{E}}\left( {X_{H} ,X_{L} } \right) \) for the evaluation of splitting functions

*le* with a linear regression model \( m_{le} \left( {x_{l}^{n} } \right) = w^{le} x_{l}^{n} \), SRF can use the training samples \( (X_{l}^{le} \;{\text{and}}\;X_{h}^{le} ) \) routed to the current leaf node to calculate the mapping \( w^{le} \) via local linear regression. Again, we obtain the closed-form solution \( w^{le} = X_{h}^{le} \left( {(X_{l}^{le} )^{T} X_{l}^{le} + \lambda I} \right)^{ - 1} (X_{l}^{le} )^{T} . \) The reconstruction of \( y_{h} \) is implemented by averaging the predictions over all T trees:

*t* that \( y_{l} \) is routed to.
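The averaging over trees described above can be written compactly as follows. This is a sketch in the notation of this section, where \( le(t) \) is an assumed shorthand for the leaf of tree *t* that \( y_{l} \) is routed to.

```latex
% Forest prediction: average of the leaf-wise linear models reached in each of the T trees.
y_{h} = \frac{1}{T} \sum_{t = 1}^{T} m_{le(t)}\left( y_{l} \right)
      = \frac{1}{T} \sum_{t = 1}^{T} w^{le(t)}\, y_{l}
```

Each tree therefore contributes one locally linear prediction, and the forest output is their mean.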

In our pipeline, ANR and A+ use the trained coupled dictionaries \( D_{h} \) and \( D_{l} \) from SB-Zeyde as the starting point of the algorithm. JOR and SRF, on the other hand, directly split the patch spaces without coupled dictionaries. For ANR and A+, the weight factors λ of the sparsity constraints were all set to 0.1 and the nearest neighbor size K was set to 40 and 2048 for ANR and A+, respectively. For JOR, the weight factor λ was fixed to 0.1 and the three main parameters (the number of regressors, the number of iterations of the E-M optimization and the nearest neighbor size K) were set to 32, 20 and 32, respectively. For SRF, the parameter settings were the number of trees \( {\text{T}} = 6 \), the max tree depth \( \upxi_{max} = 15 \), \( \uplambda = 0.1 \) and \( \upkappa = 1. \)

#### Deep learning-based approaches

In recent years, DL has achieved phenomenal success. Various computer vision tasks such as classification, object recognition, and segmentation have benefited from DL’s many functions. Inspired by successful DL models, especially convolutional neural networks (CNN) that are used for classification (such as VGG-Net [43] and ResNet [44]), several CNN-based methods [22–25] were proposed to handle the SISR problem. In this paper, two representative CNN networks for SR, SRCNN [22] and VDSR [23], are implemented in our experiment.

*d*-th convolutional layer. Given the training set \( {\mathcal{P}} = \{ \left( {x_{l}^{n} ,x_{h}^{n} } \right)|n = 1,2, \ldots ,N\} , \) the SRCNN model is estimated by minimizing the mean squared error (MSE) of ground truth HR images \( x_{h}^{n} \) and reconstructed HR images \( {\text{F}}\left( {x_{l}^{n} ;\varTheta } \right) \). The loss function is characterized by
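For reference, the MSE objective described in this paragraph takes the following standard form. It is a sketch written with the symbols defined above; the name \( L(\varTheta) \) for the loss is an assumption.

```latex
% Mean squared error between reconstructed and ground-truth HR images over N training pairs.
L(\varTheta) = \frac{1}{N} \sum_{n = 1}^{N} \left\| {\text{F}}\left( x_{l}^{n};\varTheta \right) - x_{h}^{n} \right\|^{2}
```

Minimizing the MSE also favors a high PSNR, since PSNR is a decreasing function of the MSE.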

The objective function can be minimized by using stochastic gradient descent (SGD) with standard backpropagation (BP) [46]. In Dong’s view, the three convolutional layers of SRCNN can be explained by analogy with the pipeline of sparse coding-based SR methods: they perform patch extraction and representation, non-linear mapping, and reconstruction, respectively. Relying on the highly expressive capability of CNNs, SRCNN can explore the nonlinear relationships between the LR and HR images and learn a general image representation, which can be applied to various datasets and tasks.

Following the overall trend in computer vision that “the deeper the better”, Kim et al. proposed a very deep convolutional network, termed VDSR. Figure 2b shows the structure of VDSR, which cascades 20 weight layers to form the deep network. Except for the first and last layers, all the weight layers include 64 filters of size 64 × 3 × 3, each followed by a ReLU on the filter responses. In this way, VDSR achieves a significantly larger receptive field than SRCNN (41 × 41 vs 13 × 13), which helps the network exploit more contextual information to model the SR mapping. For training, VDSR adopts the MSE as its loss function and uses SGD with BP to train the network. To accelerate the convergence of the deep network, Kim et al. also introduced several techniques, such as residual learning and adaptive gradient clipping, so that the deep network can be trained with a very high learning rate. Residual learning requires the convolutional layers of VDSR to predict only the difference between the LR image and the correlated HR image, i.e., the residual image; the LR input image is then added to the residual image via a skip connection to reconstruct the final HR image. Because the LR input image and the HR output image are similar in SR tasks, training a deep convolutional network to predict residual images instead of HR images should be easier. Hence, VDSR achieves good performance in both training time and reconstruction quality. Even though various new CNN models [24, 25] with more complicated and elaborate designs have since been proposed for SR tasks, VDSR remains an efficient DL model.
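The receptive-field figures quoted above (41 × 41 vs 13 × 13) can be checked with a short calculation. The sketch below assumes stride-1 convolutions without pooling, 3 × 3 kernels in all 20 VDSR layers, and the 9-1-5 kernel configuration of the original SRCNN paper; the kernel sizes for SRCNN are an assumption, since they are not stated in this section. The function name `receptive_field` is hypothetical.

```python
def receptive_field(kernel_sizes):
    """Side length of the receptive field of stacked stride-1 convolutions.

    Each layer with a k x k kernel enlarges the field by (k - 1) pixels.
    """
    field = 1
    for k in kernel_sizes:
        field += k - 1
    return field


if __name__ == "__main__":
    vdsr_kernels = [3] * 20     # 20 weight layers of 3 x 3 filters
    srcnn_kernels = [9, 1, 5]   # assumed 9-1-5 configuration
    print("VDSR:", receptive_field(vdsr_kernels))
    print("SRCNN:", receptive_field(srcnn_kernels))
```

Under these assumptions the two results agree with the 41 × 41 and 13 × 13 values given in the text.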

For the training of SRCNN in the pipeline, the batch size, momentum, and weight decay parameters were set to 128, 0.9 and 0, respectively. The learning rate was 10^{−4} for the first two convolutional layers and 10^{−4} for the third layer. The filter weights were initialized randomly from a Gaussian distribution \( (\upmu = 0, \delta^{2} = 0.001) \) and the biases were initialized to zero. For the training of VDSR, the batch size, momentum and weight decay parameters were set to 16, 0.9 and 0.0001, respectively. The learning rate was initially set to 0.1 and decreased by a factor of 10 every 30 epochs. Once the learning rate reached 0.0001, it stopped decreasing and was kept at that value for the remaining epochs. The filter weights were initialized by the method proposed in [47], and the biases were set to 0.

### Experimental setup

A simulation experiment on a clinical FFA dataset was carried out to quantitatively analyze and evaluate the SISR methods compared in this work within the SR method-based pipeline. All the experiments were run on a workstation (Intel i7-7700 CPU at 3.6 GHz, 32 GB RAM). The non-DL SISR methods were implemented in MATLAB. The DL-based SISR methods (SRCNN and VDSR) were trained using the Caffe package [48] on a GTX 1070 GPU and tested using the MatConvNet package [49].

#### Fundus fluorescein angiography dataset

#### Experimental protocol

The experimental study was performed in accordance with the workflow of the proposed pipeline. We used the original FFA images as the HR images and obtained the corresponding LR images by down-sampling the HR images in the spatial dimension. The down-sampling was performed with a bicubic downsampler at the chosen downsampling factor. In this way, the original 185 FFA images were converted into 185 FFA image pairs. The FFA image pairs were divided into a training set TR1 (115 FFA image pairs) and a testing set TE1 (70 FFA image pairs). The split was made so that both TR1 and TE1 contain all ten types of homologous images and keep a uniform 23:14 proportion (between TR1 and TE1) for each group of homologous images. Next, we used the HR-LR image pairs from TR1 as the input of the SISR algorithms to train the mapping models. The LR images from TE1 were then reconstructed using the trained SR models. Finally, the HR images from TE1 served as the ground truth for quantitative analysis of the reconstruction performance of the SR methods. In this paper, we performed the experiments using ten representative algorithms under two upscaling factors (2× and 4×) and chose the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) [50] as the quantitative evaluation indexes.
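As an illustration of the first evaluation index, PSNR can be computed from the mean squared error between a ground-truth HR image and its reconstruction. The sketch below assumes 8-bit images (peak value 255) passed as flat sequences of pixel values; the function name `psnr` and the `peak` parameter are hypothetical, and this is not the evaluation code used in the experiments.

```python
import math


def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    if len(reference) != len(reconstructed) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - s) ** 2 for r, s in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    ground_truth = [100, 120, 140, 160]
    estimate = [101, 119, 141, 159]  # every pixel is off by one gray level
    print(round(psnr(ground_truth, estimate), 2))
```

Higher values indicate a reconstruction closer to the ground truth; SSIM, the second index, additionally compares local structure and is not sketched here.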

## Results

Table 2 Average evaluation indexes of the SISR algorithms (trained on TR1) on the testing set TE1 (upscaling factors: 2× and 4×)

Method | PSNR (dB), 2× | PSNR (dB), 4× | SSIM, 2× | SSIM, 4×
---|---|---|---|---
Bicubic | 40.11 | 34.51 | 0.994 | 0.968
NE + LS | 42.83 | 36.16 | 0.998 | 0.981
NE + NNLS | 42.25 | 36.10 | 0.998 | 0.981
SB-Yang | 42.80 | 36.20 | 0.998 | 0.981
SB-Zeyde | 43.50 | 36.60 | 0.998 | 0.982
ANR | 42.98 | 36.02 | 0.998 | 0.982
A+ | 43.18 | 36.97 | 0.998 | 0.985
JOR | 43.76 | 37.76 | 0.997 | 0.985
SRF | 47.06 | 41.30 | 0.998 | 0.986
SRCNN | 44.40 | 37.76 | 0.997 | 0.984
VDSR | 44.46 | 38.84 | 0.998 | 0.986

Table 3 Training time and average reconstruction speed of the best three SISR methods

Metric | SRCNN | VDSR | SRF
---|---|---|---
Training time (h) | 300 | 41 | 10
Reconstruction speed (s/sample) | 37.2 | 43.2 | 57

## Discussion

Table 4 Average evaluation indexes of the SISR algorithms (trained on TR1 and TR2, respectively) on the test set TE2 (upscaling factor: 4×)

Method | PSNR (dB), TR1 | PSNR (dB), TR2 | SSIM, TR1 | SSIM, TR2
---|---|---|---|---
SRF (TE2) | 41.86 | 41.23 | 0.987 | 0.986
VDSR (TE2) | 39.27 | 39.00 | 0.986 | 0.986

From Table 4, we can see that, although the performance of the two SISR methods decreases without homologous images in the training set, both algorithms still achieve acceptable results with a model trained on non-homologous images. SRF keeps the reconstruction quality at a high level. In fact, to reduce the computational load, we had already sacrificed some of the performance of SRF by reducing the number of trees in the random forest from 15 (recommended in the original paper) to 6 in our experiments. Even with this trade-off, SRF “learned” a suitable number of data-dependent regression functions via the numerous leaf nodes in the forest for the SR reconstruction of FFA images.

On the other hand, the relatively stable performance of VDSR shown in Table 4 demonstrates that the training procedure of VDSR is less dependent on homologous images. One possible explanation is that the deep CNN’s strong capacity for learning and expression helps VDSR extract more general features from the FFA training images to complete the reconstruction. In fact, designing very deep CNN models is a recent trend for SISR algorithms. For example, Mao et al. [52] proposed a 30-layer residual encoder-decoder (RED) network with symmetric skip connections. Tai et al. later introduced recursive blocks in DRRN (52 layers) [53] and memory blocks in MemNet (80 layers) [54] to construct multi-path deep networks. Lim et al. [55] used modified residual blocks to construct EDSR (36 layers) and MDSR (165 layers) for single-scale and multi-scale SR tasks, respectively. All these networks have improved the reconstruction quality of SR images by exploiting the fine hierarchical features extracted by deep CNN models. However, to make full use of this advantage, large training datasets are usually necessary to avoid over-fitting and improve the final performance of the deep network. This requirement is not met in this work because the training datasets in our experiments contain a limited number of FFA images. In addition, high-performance GPUs are another key requirement for deep networks, which demand large storage and heavy computation. Considering that many computers in ophthalmology departments in China, especially in primary hospitals, do not have GPUs capable of fast computation on large datasets, the practical applications of the DL-based SISR methods remain limited. Hence, although we recognize the potential of DL-based SISR algorithms, we believe that SRF remains the more competitive option, with high generality and usability, for the resolution enhancement of FFA images in a clinical setting at the present stage, because it can be trained efficiently on a small training set and has relatively short training and testing times in a CPU environment.

Next, we discuss the degradation model of the LR images. In our experiments, we used the degradation model \( x_{l} = Gx_{h} \) to simulate the LR FFA images, where \( G \) is the downsampling matrix. This degradation model should be treated as a simplified version of the normal degradation model \( x_{l} = GB_{u} x_{h} \) (\( B_{u} \) represents the blur matrix). The simplification reflects the clinical practice of discarding images with obvious motion blur or out-of-focus blur, which are not used for subsequent diagnosis and analysis. In our experiments, we are concerned with whether an SR algorithm can recover information from spatially downsampled FFA images. In fact, insufficient spatial sampling is a persistent problem for clinical FFA imaging. On the one hand, due to budget constraints, high-performance sensors are not standard equipment in all clinical environments. In some primary hospitals, the sensors used for FFA imaging have relatively low spatial resolution and cannot meet the demands of HR imaging. On the other hand, the fluorescence signal of FFA imaging has a relatively lower intensity than the reflective signal of conventional fundus imaging. Because the exposure time cannot be markedly prolonged due to practical considerations (e.g., eye movement), pixel binning (a kind of downsampling) is often used in clinical settings to increase the signal-to-noise ratio (SNR) of FFA images. Thus, our degradation model of LR images reflects common clinical problems. Hence, our simulated LR FFA images should have a certain degree of similarity with the LR images typically acquired in clinics, which supports generalizing the experimental results to clinical practice.
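To make the link between pixel binning and the downsampling matrix \( G \) concrete, the sketch below bins a small image by averaging non-overlapping blocks. It is an illustration under stated assumptions: the experiments in this paper used bicubic downsampling rather than binning, real sensors typically sum charge rather than average it, and the function name `bin_pixels` is hypothetical.

```python
def bin_pixels(image, factor=2):
    """Average non-overlapping factor x factor blocks of a 2-D list of pixel values."""
    rows, cols = len(image), len(image[0])
    if rows % factor or cols % factor:
        raise ValueError("image dimensions must be divisible by the binning factor")
    binned = []
    for r in range(0, rows, factor):
        binned_row = []
        for c in range(0, cols, factor):
            block = [image[r + dr][c + dc]
                     for dr in range(factor) for dc in range(factor)]
            binned_row.append(sum(block) / len(block))
        binned.append(binned_row)
    return binned


if __name__ == "__main__":
    high_res = [
        [1, 1, 2, 2],
        [1, 1, 2, 2],
        [3, 3, 4, 4],
        [3, 3, 4, 4],
    ]
    print(bin_pixels(high_res, factor=2))
```

Each output pixel is a fixed linear combination of input pixels, so binning is one particular choice of \( G \) in \( x_{l} = Gx_{h} \).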

Finally, we believe that SR-enhanced FFA images are meaningful for ophthalmologists, even though novel imaging modalities such as optical coherence tomography (OCT) have gained great success in recent years. There are three main reasons for this view. First, FFA, as the current gold standard for evaluating the clinical fundus features of DR and AMD, is still widely used in ophthalmology for diagnosing and classifying related fundus disorders [56–59]. Second, FFA is still the most commonly used method for planning laser treatment (photocoagulation) in clinical settings [60]. Third, in clinical research involving multimodal imaging, OCT, FFA and other modalities are often used together [61]. Considering the characteristics of OCT images, we also wonder whether the proposed SR-based pipeline can be used to enhance OCT images, which is a potential research direction for future work.

## Conclusion

In conclusion, we have preliminarily explored the effects of resolution enhancement of FFA images using an SR-based pipeline. Ten SISR methods, divided into four groups, were tested within the proposed pipeline on our clinical FFA datasets, and the experimental results were analyzed and compared. From the results, we find that direct local regression-based approaches and DL-based approaches work well for our clinical datasets. As the representative algorithms of these two groups, SRF and VDSR were further evaluated on the reformed datasets to examine their dependency on the training set. Both sets of experimental results show that the super-resolution method-based pipeline has the potential to enhance FFA images. SRF displayed remarkably high effectiveness and outperformed the other tested algorithms. Hence, we believe that SRF is a feasible SR method that can be implemented on an ophthalmologist’s workstation as part of an SR-based pipeline to assist ophthalmologists in enhancing FFA images in clinical practice.

## Declarations

### Authors’ contributions

ZJ was involved in this work while doing his Ph.D. under the supervision of YP and QR. ZY and ZH participated in the training process of the proposed pipeline. SF, JG performed the acquisition of the clinical fundus images and consulted the obtained results. YL supervised the theoretical development and revised the manuscript. All authors read and approved the final manuscript.

### Acknowledgements

The authors would like to thank NVIDIA Corporation for its support through the donation of the Titan Xp GPU used for this research. Yanye Lu is a primary corresponding author.

### Competing interests

The authors declare that they have no competing interests.

### Ethics approval and consent to participate

The clinical images used in this research were provided by the affiliated hospital of Xuzhou Medical University, and we have been informed that the patients consented to participate in the study.

### Funding

This work was funded by the National Key Research and Development Program of China (2017YFE0104200); the National Natural Science Foundation of China (81421004); the National Key Instrumentation Development Project of China (2013YQ030651).

### Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

## References

- Zhao Y, Maccormick IJC, Parry DG, Leach S, Beare NAV, Harding SP, Zheng Y. Automated detection of leakage in fluorescein angiography images with application to malarial retinopathy. Sci Rep. 2015;5:10425.View ArticleGoogle Scholar
- Nanba K, Schwartz B. Nerve fiber layer and optic disc fluorescein defects in glaucoma and ocular hypertension. Ophthalmology. 1988;95(9):1227–33.View ArticleGoogle Scholar
- Gallasch G, Ritz E. The fundus in malignant hypertension. Nephrol Dial Transplant. 1997;12(7):1518–9.View ArticleGoogle Scholar
- Younge BR. Fluorescein angiography and retinal venous sheathing in multiple sclerosis. Can J Ophthalmol J Can Dophtalmol. 1976;11(1):31–6.Google Scholar
- Xiao M. Analysis on fundus fluorescein angiography in patients with age related macular degeneration. Int J Ophthalmol. 2010;10(5):962–3.Google Scholar
- Chen P, Zhao W, Wang N, Cai HJ. Clinical analysis of fundus fluorescence angiography on diabetic retinopathy. Int J Ophthalmol. 2007;7(3):863–4.Google Scholar
- Yau JWY, Rogers SL, Kawasaki R, Lamoureux EL, Kowalski JW, Bek T, Chen SJ, Dekker JM, Fletcher A, Grauslund J. Global prevalence and major risk factors of diabetic retinopathy. Diabetes Care. 2012;35(3):556–64.View ArticleGoogle Scholar
- Gibson DM. The geographic distribution of eye care providers in the United States: implications for a national strategy to improve vision health. Prev Med. 2015;73:30.View ArticleGoogle Scholar
- Tsai R. Multiframe image restoration and registration. In Adv Comput Vision Image Process. 1984;1:317–39.Google Scholar
- Zomet A, Rav-Acha A, Peleg S. Robust super-resolution. In: Proceedings of the 2001 IEEE computer society conference on computer vision and pattern recognition, vol. 641. CVPR 2001. 2001. p. I-645-I-650.
- Hardie R. A fast image super-resolution algorithm using an adaptive Wiener filter. IEEE Trans Image Process. 2007;16(12):2953–64.
- Babacan SD, Molina R, Katsaggelos AK. Total variation super resolution using a variational approach. In: IEEE international conference on image processing; 2008. p. 641–4.
- Cheeseman P. Super-resolved surface reconstruction from multiple images. Technical report. 1994.
- Chang H, Yeung DY, Xiong Y. Super-resolution through neighbor embedding. In: IEEE computer society conference on computer vision & pattern recognition; 2004. p. 275–82.
- Bevilacqua M, Roumy A, Guillemot C, Morel A. Low-complexity single image super-resolution based on nonnegative neighbor embedding. In: BMVC; 2012. p. 1–10.
- Yang J, Wright J, Huang TS, Ma Y. Image super-resolution via sparse representation. IEEE Trans Image Process. 2010;19(11):2861–73.
- Zeyde R, Elad M, Protter M. On single image scale-up using sparse-representations. Int Conf Curves Surf. 2010;6920:711–30.
- Timofte R, De V, Gool LV. Anchored neighborhood regression for fast example-based super-resolution. In: IEEE international conference on computer vision. 2013. p. 1920–7.
- Timofte R, Smet VD, Gool LV. A+: adjusted anchored neighborhood regression for fast super-resolution. In: Asian conference on computer vision. 2014. p. 111–26.
- Dai D, Timofte R, Gool LV. Jointly optimized regressors for image super-resolution. Comput Graph Forum. 2015;34(2):95–104.
- Schulter S, Leistner C, Bischof H. Fast and accurate image upscaling with super-resolution forests. In: Computer vision and pattern recognition, 2015. p. 3791–9.
- Dong C, Chen CL, He K, Tang X. Learning a deep convolutional network for image super-resolution. Berlin: Springer; 2014.
- Kim J, Lee JK, Lee KM. Accurate image super-resolution using very deep convolutional networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition. 2015. p. 1646–1654.
- Kim J, Lee JK, Lee KM. Deeply-recursive convolutional network for image super-resolution. In: Computer vision and pattern recognition. 2016. p. 1637–45.
- Lai WS, Huang JB, Ahuja N, Yang MH. Deep Laplacian pyramid networks for fast and accurate super-resolution. In: IEEE conference on computer vision and pattern recognition. 2017.
- Doyley MM. Post-processing multiple-frame super-resolution in ultrasound imaging. In: SPIE medical imaging. 2012. p. 51.
- Zhao N, Wei Q, Basarab A, Kouamé D, Tourneret JY. Single image super-resolution of medical ultrasound images using a fast algorithm. In: IEEE international symposium on biomedical imaging, 2016. p. 473–6.
- Umehara K, Ota J, Ishida T. Application of super-resolution convolutional neural network for enhancing image resolution in chest CT. J Digit Imaging. 2017;3:1–10.
- Mejia J, Mederos B, Ortega L, Gordillo N, Avelar L. Small animal PET image super-resolution using Tikhonov and modified total variation regularisation. J Photograp Sci. 2017;65(3):162–70.
- Bai Y, Han X, Prince JL. Super-resolution reconstruction of MR brain images. In: Proc of 38th annual conference on information sciences and systems. 2004.
- Greenspan H. Super-resolution in medical imaging. Oxford: Oxford University Press; 2009.
- Bhatia KK, Price AN, Shi W, Hajnal JV. Super-resolution reconstruction of cardiac MRI using coupled dictionary learning. In: IEEE international symposium on biomedical imaging. 2014. p. 947–50.
- Rueda A, Malpica N, Romero E. Single-image super-resolution of brain MR images using overcomplete dictionaries. Med Image Anal. 2013;17(1):113–32.
- Oktay O, Bai W, Lee M, Guerrero R, Kamnitsas K, Caballero J, Marvao AD, Cook S, O’Regan D, Rueckert D. Multi-input cardiac image super-resolution using convolutional neural networks. In: International conference on medical image computing and computer-assisted intervention. 2016. p. 246–54.
- Pham CH, Ducournau A, Fablet R, Rousseau F. Brain MRI super-resolution using deep 3D convolutional networks. In: IEEE international symposium on biomedical imaging. 2017. p. 197–200.
- Thapa D, Raahemifar K, Bobier WR, Lakshminarayanan V. Comparison of super-resolution algorithms applied to retinal images. J Biomed Optics. 2014;19(5):056002.
- Olshausen BA, Field DJ. Sparse coding with an overcomplete basis set: a strategy employed by V1? Vision Res. 1997;37(23):3311–25.
- Aharon M, Elad M, Bruckstein A. K-SVD: an algorithm for designing overcomplete dictionaries for sparse representation. IEEE Trans Signal Process. 2006;54(11):4311–22.
- Tropp JA, Gilbert AC. Signal recovery from random measurements via orthogonal matching pursuit. IEEE Trans Inf Theory. 2007;53(12):4655–66.
- Tikhonov AN, Arsenin VY. Solution of ill-posed problems. Math Comput. 1977;32(144):491.
- Debashis K. The EM algorithm and extensions. Technometrics. 1997;40(3):260.
- Cutler A, Cutler DR, Stevens JR. Random forests. Mach Learn. 2004;45(1):157–76.
- Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. 2014. arXiv:1409.1556.
- He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Computer vision and pattern recognition. 2016. p. 770–8.
- Nair V, Hinton GE. Rectified linear units improve restricted Boltzmann machines. In: International conference on machine learning. 2010. p. 807–14.
- LeCun Y, Bottou L, Bengio Y, Haffner P. Gradient-based learning applied to document recognition. Proc IEEE. 1998;86(11):2278–324.
- He K, Zhang X, Ren S, Sun J. Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. 2015. p. 1026–34.
- Vedaldi A, Lenc K. MatConvNet: convolutional neural networks for MATLAB. In: ACM international conference on multimedia. 2015. p. 689–92.
- Jia Y, Shelhamer E, Donahue J, Karayev S, Long J. Caffe: convolutional architecture for fast feature embedding. 2014. p. 675–8.
- Wang Z, Bovik AC, Sheikh HR, Simoncelli EP. Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process. 2004;13(4):600–12.
- De Boor C. Bicubic spline interpolation. J Math Phys. 1962;41(3):212–8.
- Mao XJ, Shen C, Yang YB. Image restoration using very deep convolutional encoder–decoder networks with symmetric skip connections. 2016.
- Tai Y, Yang J, Liu X. Image super-resolution via deep recursive residual network. In: IEEE conference on computer vision and pattern recognition. 2017. p. 2790–8.
- Tai Y, Yang J, Liu X, Xu C. MemNet: a persistent memory network for image restoration. 2017. p. 4549–57.
- Lim B, Son S, Kim H, Nah S, Lee KM. Enhanced deep residual networks for single image super-resolution. In: Computer vision and pattern recognition workshops. 2017. p. 1132–40.
- Stanga PE, Papayannis A, Tsamis E, Stringa F, Cole T, D’Souza Y, Jalil A. New findings in diabetic maculopathy and proliferative disease by swept-source optical coherence tomography angiography. Dev Ophthalmol. 2016;56:113–21.
- Cennamo G, Romano MR, Nicoletti G, Velotti N, de Crecchio G. Optical coherence tomography angiography versus fluorescein angiography in the diagnosis of ischaemic diabetic maculopathy. Acta Ophthalmol. 2017;95(1):E36–42. https://doi.org/10.1111/aos.13159.
- Coscas G, Lupidi M, Coscas F, Français C, Cagini C, Souied EH. Optical coherence tomography angiography during follow-up: qualitative and quantitative analysis of mixed type I and II choroidal neovascularization after vascular endothelial growth factor trap therapy. Ophthalmic Res. 2015;54(2):57–63.
- Mokwa NF, Ristau T, Keane PA, Kirchhof B, Sadda SR, Liakopoulos S. Grading of age-related macular degeneration: comparison between color fundus photography, fluorescein angiography, and spectral domain optical coherence tomography. J Ophthalmol. 2013;5:385915.
- Kylstra JA, Brown JC, Jaffe GJ, Cox TA, Gallemore R, Greven CM, Hall JG, Eifrig DE. The importance of fluorescein angiography in planning laser treatment of diabetic macular edema. Ophthalmology. 1999;106(11):2068–73.
- Attia S, Khochtali S, Kahloun R, Ammous D, Jelliti B, Ben YS, Zaouali S, Khairallah M. Clinical and multimodal imaging characteristics of acute Vogt–Koyanagi–Harada disease unassociated with clinically evident exudative retinal detachment. Int Ophthalmol. 2016;36(1):37–44.