A collective tracking method for preliminary sperm analysis

Background Total motile sperm count (TMSC) and curvilinear velocity (VCL) are two important parameters in preliminary semen analysis for male infertility. Traditionally, both parameters are evaluated manually by embryologists or automatically using an expensive computer-assisted sperm analysis (CASA) instrument. The latter applies a point-tracking method using an image processing technique to detect, recognize and classify each of the target objects, individually, which is complicated. However, as semen is dense, manual counting is exhausting while CASA suffers from severe overlapping and heavy computation. Methods We proposed a simple frame-differencing method that tracks motile sperms collectively and treats their overlapping with a statistical occupation probability without heavy computation. The proposed method leads to an overall image of all of the differential footprint trajectories (DFTs) of all motile sperms and thus the overall area of the DFTs in a real-time manner. Accordingly, a theoretical DFT model was also developed to formulate the overall DFT area of a group of moving beads as a function of time as well as the total number and average speed of the beads. Then, using the least square fitting method, we obtained the optimal values of the TMSC and the average VCL that yielded the best fit for the theoretical DFT area to the measured DFT area. Results The proposed method was used to evaluate the TMSC and the VCL of 20 semen samples. The maximum TMSC evaluated using the method is more than 980 sperms per video frame. The Pearson correlation coefficient (PCC) between the two series of TMSC obtained using the method and the CASA instrument is 0.946. The PCC between the two series of VCL obtained using the method and CASA is 0.771. As a consequence, the proposed method is as accurate as the CASA method in TMSC and VCL evaluations. Conclusion In comparison with the individual point-tracking techniques, the collective DFT tracking method is relatively simple in computation without complicated image processing. Therefore, incorporating the proposed method into a cell phone equipped with a microscopic lens can facilitate the design of a simple sperm analyzer for clinical or household use without advance dilution.

Background Semen analysis is essential in both clinical and research settings for investigating male fertility status, which affects approximately 15% of couples throughout the USA [1]. A normal sperm concentration of semen ranges from 15 × 10 6 to 259 × 10 6 /ml [2]. Healthy sperms are motile in semen. Excessive immotile sperms result in male infertility. The total sperm count (TSC) and the motility, the ratio of the total motile sperm count (TMSC) to the TSC within a same volume [3], are important parameters for measuring sperm fertility. According to the latest (2010) World Health Organization (WHO) standard for normal sperm [3], the minimal concentration and motility of total sperms have been lowered to 15.0 × 10 6 /ml and 40.0%, respectively [4]. The product of these two numbers, 15 × 10 6 /ml × 40.0% = 6 × 10 6 /ml , indicates the minimal concentration of total motile sperms for normal sperm because motility is equivalently the ratio of the concentration of total motile sperms to the concentration of total sperms of a semen [3].
Some researchers have reported that the TMSC shows a better correlation with the spontaneous ongoing pregnancy rate than the WHO 2010 classification system [5,6]. Consequently, the TMSC [7,8] and the curvilinear velocity (VCL) [9][10][11] of the motile sperms are two of the characteristic parameters of a semen sample in a preliminary semen analysis as well. Note that the VCL is the average speed of a sperm head through its real path.
Traditionally, dozens of individual sperm parameters can be evaluated using an automatic computer-assisted sperm analysis (CASA) instrument [12,13] that tracks the trajectory of every sperm individually using a point-based tracking technique [14]. However, the individual point-tracking technique requires heavy computation for complicated image processing to detect, recognize and classify each of the target objects, individually [15]. As a semen sample is dense, severe overlapping of sperms would degrade the accuracy of CASA [16,17]. The disadvantages make the CASA instrument expensive for clinical use and subject to semen concentration below 50 × 10 6 /ml , as suggested by the WHO 2010 standard [3]. In other words, a semen sample of higher concentration requires advance dilution. However, not only is dilution inconvenient to the users but every dilution also increases the inaccuracy of the measurement. Recently, the overlapping problem inherited from the point-tracking technique has been greatly alleviated by using multi-target tracking algorithms [17][18][19][20]. Nevertheless, the required powerful algorithms increase the computation load.
As alternatives, inexpensive sperm analyzers are also now available [21][22][23]. These commercial products can analyze dense semen samples without advance dilution. However, their working mechanisms do not distinguish the motile sperms from the immotile or measure the speeds of the sperms.
In practice, many embryologists who work in the reproductive centers of hospitals or clinics still use a microscope to count the TSC manually, because manual cell counting is relatively reliable and inexpensive. Thus, manual counting is still the present gold standard method for cell counting [24][25][26]. However, manual counting is exhausting and time-consuming. A dense semen sample would lower the repeatability and thus the accuracy of manual counting [27].
In this paper, we proposed a collective rather than individual tracking method for estimating the total number and average speed of up to 980 motile sperms without using complicated image processing. Using a frame-differencing technique [28][29][30], the proposed method leads to an overall image of all of the differential footprint trajectories (DFTs) of all motile sperms and thus the overall area of the DFTs in real time. Accordingly, a theoretical DFT model was developed to formulate the overall DFT area as a function of time as well as the total number and average velocity of a group of moving objects.
In the DFT model, the problem of overlapping is treated using a statistical occupation probability. Finally, the DFT method was examined in the evaluations of TMSC and VCL using 20 semen samples and 523 synthetic bead samples. Using the least square fitting method, we obtained the optimal values of the TMSC and the average VCL that yielded the best fit for the theoretical DFT area to the measured DFT area. The results were found to be strong correlated with those obtained using manual counting and the CASA instrument used in this study, separately. Therefore, it will be shown that the proposed DFT method is simple in computation and appropriate for preliminary sperm analysis.

Insignificant influence of Gaussian speed distribution on TMSC and VCL
From the results of the first type of simulation conducted using the first 400 synthetic bead samples, Table 1 shows the percentage errors E N _DFT , E v_DFT between the optimal N DFT , v ave_DFT and the preset N set = 1000, v ave_set = 60 µm/s for the four settings of δ set at 0, 18, 36 and 54 µm/s under the condition of N set = 1000 , separately. Note that the simulation for each δ set was conducted 100 times for the average of N DFT , v ave_DFT . As the table shows, the maximal percentage errors of E N _DFT and E v_DFT were 7.42% at δ set = 18 and 2.09% at δ set = 0 , respectively. As a whole, the overall averages of the percentage errors E N _DFT , E v_DFT were (7.10% and 1.79%), respectively.

DFT method as accurate as CASA in TMSC and VCL evaluations in simulation
From the results of the second type of simulation conducted using the remaining 123 synthetic bead samples, Fig. 1 shows the optimal N DFT , v ave_DFT evaluated using the DFT method, the N CASA , v ave_CASA measured by the CASA instrument, and the N set , v ave_set preset for the 41 settings of N set under the condition of v ave_set = 60 µm/s and δ set = 0 µm/s . Specifically, Fig. 1a shows the total numbers N DFT and N CASA versus Table 1 Results of the first kind of simulation conducted using the first 400 synthetic bead samples: the percentage errors E N_DFT and E v_DFT of the total number N DFT and the average speed v ave_DFT of a group of synthetic beads due to their Gaussian speed distributions set at four different standard deviations δ set δ set (µm/s) @ N set = 1000 v ave = 60 µm/s  18:112 the preset N set that serves as the reference. It can be seen that the three series of N DFT , N CASA and N set form a straight line. The linear correlation between any two series among N DFT , N CASA , and N set was measured by using the Pearson correlation coefficient (PCC) analysis [31,32]. The resulting PCCs were found to be greater than 0.999 with the corresponding P-values less than 0.001. In principle, a P-value less than 0.05 is considered as statistically significant [33]. This indicates that the three series obtained using the three different methods were strongly correlated with a high statistical significance. In addition, the overall average percentage error E N _DFT was 2.36% in comparison with N set .
As for the average speed, Fig. 1b shows the average speeds v ave_DFT and v ave_CASA versus the preset v ave_set = 60 µm/s . The overall average percentage error E v_DFT was 4.35% in comparison with v ave_set .

DFT method as accurate as CASA in TMSC and VCL evaluations in experiment
From the results of the experiment conducted using the 20 human semen samples, Fig. 2 shows the optimal N DFT , v ave_DFT evaluated using the DFT method, the N CASA , v ave_CASA measured using the CASA instrument, and the N manual obtained by manual counting, separately. Specifically, Fig. 2a shows the total numbers N DFT and N CASA versus the manual count N manual that serves as the gold standard. It can be seen that the maximum N DFT or TMSC evaluated using the DFT method is greater than 980 sperms per video frame. In addition, the series of N DFT and N CASA are close to the series of N manual . The PCC between the series of N DFT and N manual was found to be 0.996 with a P-value less than 0.001. While that between the series of N DFT and N CASA was found to be 0.946 with a P-value less than 0.001. As a whole, the overall average of the percentage errors E N _DFT was 15.33% in comparison with the manual counts. Yet, the overall average of the coefficient of variation (CV) of the manual counting results was 5.81%.
As for the average speed, Fig. 2b shows only the average speeds v ave_DFT and v ave_CASA versus the serial number of the 20 semen samples, since it is difficult to estimate the Average speeds v ave_DFT and v ave_CASA versus the preset v set . Red points: N DFT and v ave_DFT evaluated using the DFT method. Blue points: N CASA and v ave_CASA measured using the CASA instrument. Green points: preset N set and v set = 60 µm/s speeds of sperms manually. Accordingly, the PCC between the series of v ave_DFT and v ave_CASA was found to be 0.771 with a P-value less than 0.006. It can be seen that, in general, the series of v ave_CASA is slightly higher than that of v ave_DFT in Fig. 2b. The series of N CASA is slightly lower than that of N DFT in Fig. 2a. This tendency will be discussed in the discussion section. Figure 3 illustrates the computation times of the DFT method, using the collective tracking technique, and the CASA method, applying a point-tracking technique, as a function of the total numbers of synthetic beads and sperms. It can be seen that the DFT method demonstrated a larger count range of TMSC for both synthetic beads and sperms in less time than the CASA method. In addition, the computation time of the DFT method was found independent of the number of synthetic beads or sperms while that of the CASA method increased with the total numbers of synthetic beads and sperms. This supports that the collective DFT method is effective and simple in the evaluation of the total number and average velocity of a group of moving objects.

Discussion
As shown in Table 1, the results of the first type of simulation indicated that the influence of the Gaussian speed distribution of a group of beads on the evaluations of N DFT and v ave_DFT is insignificant in the DFT method. This can be explained in principle. Referring to Fig. 5, one can imagine that the DFT area contributed by the beads that are faster than the average speed just compensates for that contributed by the beads slower than the average speed. This suggests that the DFT method is also valid for a group of beads, such as healthy semen, with a symmetric speed distribution by assuming that they move at their average speed. As shown in Fig. 1, the results of the second type of simulation exhibited separately that the accuracy of the DFT method in the evaluation of N DFT is as high as that of the CASA instrument in comparison with the preset N set . The resulting PCCs were all greater than 0.999 with the corresponding P-values all less than 0.05. The overall averages of the percentage errors E N _DFT of 2.36% and E v_DFT of 4.35% were all less than 5.0%. This performance persisted until there were 3500 synthetic beads for which overlapping was severe. This justifies the statistical treatment of the occupation probability for the overlapping problem in the proposed DFT model. However, as shown in Fig. 2, the experimental results evaluated using the DFT method showed a relatively high percentage error E N _DFT of 15.33% in comparison with the manual counts N manual . The results also showed a relatively low PCC of 0.771 with a P-value less than 0.05, in comparison with the average speed v ave_CASA measured using the CASA instrument. The deterioration of the experimental results is likely due to the following three factors: First, a finite CV of 5.81% for the series of manual counts N manual existed because most semen samples had high concentrations. Second, sperms slower than the cutoff speed of 5 µm/s were not counted by the CASA instrument by default. Thus, the series of v ave_CASA tended to be slightly faster than that of v ave_DFT , as shown in Fig. 2b. The series of N CASA tended to be slightly less than that of N DFT , as shown in Fig. 2a. Third, the real sperms are different from the synthetic beads in size and speed distribution. The consequence is attributed to the limitation of the DFT method. As discussed above, any asymmetry on the size or speed distribution of a semen sample would degrade the accuracy of the DFT method since the TMSC and VCL of a semen sample are evaluated from the overall area of the DFT of all motile sperms in the sample. Therefore, removing the first two factors caused by the manual counting and the CASA instrument, separately, one might expect that the overall averages of the percentage errors E N _DFT and E v_DFT caused by the DFT method alone would be reduced. Nevertheless, the two PCCs of 0.946 and 0.771 are all greater than 0.7, and the results of the DFT method and those of the CASA instrument are considered strong correlated [31,32]. Consequently, the DFT method is as accurate as the CASA method in the evaluations of TMSC and VCL.
In this study, the unhealthy semen samples were found containing more slow sperms than fast ones. It can be imagined that a small or slow sperm contributes a smaller DFT area than a large one. The more the slow sperms, the smaller the overall DFT area and thus the less the TMSC in comparison with the manual count. The case of unhealthy semen elucidates the limitation of the DFT method on the symmetry of the size and speed distribution of a group of target objects. In this sense, it is recommended to carry out the simple semen analysis right after a semen sample is collected to prevent the motility of the sperms from dropping too fast [34], which will change the speed distribution of the semen sample.

Conclusion
A collective DFT tracking method was proposed for evaluating the total number and average speed of a group of moving objects. In the DFT model, the objects are assumed to be identical in size. Their speeds can be either the same or different with a symmetric distribution. The proposed DFT method tracks the moving objects collectively and treats any overlapping using a statistical occupation probability without heavy computation. In this study, the DFT method was used to evaluate the TMSC and VCL of 523 synthetic bead samples in the simulation and 20 semen samples in the experiment. Recall that both TMSC and VCL are two characteristic parameters in preliminary sperm analysis.
The maximal count range of TMSC obtained using the DFT method is up to 980 sperms per video frame in the experiment and 3500 synthetic beads per frame in the simulation. The experimental results showed that the PCC between the two series of TMSC obtained using the DFT method and the CASA instrument was 0.946. The PCC between the two sets of VCL obtained using the method and the CASA instrument was 0.771. Their corresponding P-values were all less than 0.05. This indicates that the results obtained using the DFT method and those of the CASA instrument were strongly correlated with a high statistical significance [31,32]. As a consequence, the proposed DFT method is as accurate as the CASA instrument in evaluating TMSC and VCL. This also verifies the validity of the DFT model.
In comparison with those individual point-tracking techniques, the collective DFT tracking method is relatively simple in computation without complicated image processing and provides a large count range of TMSC. Therefore, incorporating the proposed method into a microscope or a cell phone equipped with a microscopic lens can facilitate the design of a simple sperm analyzer as accurate as CASA for clinical or household use without advance dilution.
In future work, the accuracy of the TMSC and VCL can be further improved by using a 4 K (3840 × 2160) resolution image sensor instead of the 640 × 480 one since the accuracy also depends on the resolution of the video frames of a semen sample.

Methods
The algorithm used in the proposed collective tracking method is simple. Consider a 10-s-long video film recorded with the charge-coupled device (CCD) camera inside the CASA instrument used in this study. First, two consecutive video frames of a semen sample were subtracted pixel by pixel. A unique color was marked on those resulting pixels whose absolute values are above a given threshold value. The resulting image shows the footprints of all motile sperms in a differential manner. Then, the subtraction process was repeated over a series of frames and the resulting differential footprint images were superimposed one after the other. This results in an overall image of the entire DFTs of all motile sperms. Figure 4 illustrates the typical color-coded DFT images of three semen samples of different concentrations for 10 s, individually. Each trajectory in the images represents a motile sperm. The longer and more colorful a trajectory is, the faster the corresponding sperm moves. As time goes by, each DFT extends and the overall area of the DFT increases with time.

Color-coded DFT
In the following sections, the concept, model and procedure of the DFT method will be presented. The dependence of the overall DFT area on the total number and average speed of the moving objects is illustrated. An equation of the overall DFT area as a function of time as well as the total number and average velocity of a group of moving objects is developed. The optimal values of the total number and average speed of the moving objects can be derived using the least square fitting method. Subsequently, 20 semen samples and 523 synthetic bead samples were prepared and recorded into grayscale instead of color-coded videos for the test of the proposed DFT method in the evaluations of TMSC and VCL. The test process was conducted by comparing the results evaluated using the DFT method with those obtained using the manual count method and the CASA instrument used in the study.

Concept: one-dimensional DFT of a single moving square
For simplicity, we first considered a square of width d moving at a constant speed v in one dimension. Figure 5a shows the continuous displacement of the square in seven consecutive video frames during a period from t 0 = 0 to t 6 = 6 t . Herein, t i = i t.
where i is an integer representing the frame number and t is a unit time interval. In this case, d = 4v t is assumed such that the square moves a distance of a body length d at t 4 = 4 t . Figure 5b shows a series of six differential footprint images of the square in different colors. Each image displays a rear and a front rectangle. This is obtained by subtracting two consecutive frames and coding the resulting frame with a unique color for the pixels whose absolute values are above a given threshold value. Each pair of rectangles is called the differential footprint of the square from a single move. Figure 5c illustrates the growth of the color-coded DFT of the moving square as a function of time by superimposing the six differential footprint images in succession. Notice that only the first four footprints appear in pairs in Fig. 5c. Subsequently, only the front rectangle of each succeeding pair appears in the DFT since the rear one always overlaps the front of a previous pair. The peculiar feature of the DFT method is reflected in Fig. 5d, a plot of the area of the DFT as a function of time. It can be seen that the DFT area a 1 (t i ) increases with time t i at two slopes separated at a turning point. This feature can be expressed using the following equation for a 1 (t i ): where the subscript "1" indicates a single square, i is the video frame number, and b R as well as b F are the areas of the rear as well as the front rectangles of a differential footprint, respectively. Under the condition b R = b F = dv t , the coordinates of the turning point are d/v, 2d 2 . In other words, the turning point reflects the speed v of the moving square.

Model: two-dimensional DFT of a group of moving beads
To be realistic, we further considered a group of identical beads with each of diameter d moving randomly at the same speed v in two dimensions. Initially, the locations and directions of movement of the beads are randomly distributed over a video frame. Figure 6 illustrates the concept of the DFT method used in developing the equation of the overall DFT area A 3 (t 1 ) of three beads from two consecutive video frames at t 0 and t 1 in the presence of bead overlapping. As shown in Fig. 6, each pair of crescents in red is the differential footprint of a bead. The overlapping of the three footprints changes the effective footprint area of each bead. The idea of evaluating the overall footprint area of the three beads at t 1 is to add the effective footprint area of each bead successively as follows. First, Fig. 6a shows the differential footprint of a single bead from a move. Regardless of the direction of the movement of the bead, the corresponding DFT area A 1 (t 1 ) of the bead is given by A 1 (t 1 ) = B R + B F , where B R and B F are the areas of the rear and the front crescents, respectively. Here, the symbols A 1 , B R and B F are rewritten in upper case in the model of multiple beads. Note that B F (= dv�t) remains constant at all times whereas B R can be approximated to be the average area of the various rear crescents such that Fig. 6b shows a statistical treatment of the overlapping of two beads. In the presence of a second bead anywhere in the frame, the probability for the second bead to Fig. 6 Concept of the effective footprint area of each bead in the presence of bead overlapping. a Effective footprint of a single bead from a move. b Effective footprint area occupied by two beads. c Effective footprint area occupied by three beads. The effective footprint areas of the respective beads are different occupy a full differential footprint of the area (B R + B F ) depends on the percentage of the empty area over the full area of the video frame from a statistical point of view. Accordingly, the occupation probability for a bead to occupy a full differential footprint is defined as 1−A occupied /C , where C is the full area of the video frame and A occupied is the area that has been occupied. In this case, the occupied area A occupied is A 1 (t 1 ) and A 1 (t 1 )/C is the percentage of the occupied area over the frame. The resulting occupation probability is Thus, the effective area occupied by the differential footprint of the second bead becomes (B R + B F )[1 − A 1 (t 1 )/C] , which is the product of the full area (B R + B F ) and the occupation probability. Statistically, the overall DFT area A 2 (t 1 ) of the two beads is the sum of the two effective areas of the first and the second beads, as given by: Third, Fig. 6c shows a simple geometric progression formula for the overlapping of three beads. Similarly to the statistical treatment of two beads, the presence of the third bead creates an additional effective area (B R + B F )[1 − A 2 (t 1 )/C] . Thus, the overall DFT area A 3 (t 1 ) of the three beads can be expressed as: It is worth noting that A 3 (t 1 ) is in the form of a geometric series and thus can be expressed in compact form as 3 . Further, the overall DFT area A N (t 1 ) of N beads at t 1 can be expressed as Based on the same idea, for the succeeding video frame at time t 2 , the overall DFT area A N (t 2 ) of the same number N of beads can be expressed in the form: Similarly to the turning point for a single moving square, as shown in Eq. (1) for a 1 (t i ) , A N (t i ) also presents a turning point for N moving beads. It can be imagined that both crescent areas B R and B F of each bead contribute to the overall DFT area before the bead moves a distance of a body length d for t i ≤ d/v or i ≤ d/(v�t) . Subsequently, for t i > d/v or i > d/(v�t) , only the front crescent area B F of each bead contributes to the overall DFT area. As a consequence, it can be shown that the overall DFT area A N (t i ) of N identical beads in a video frame at time t i can be expressed as Eq. (6).
(2) Indeed, the overall DFT area A N (t i ) is a function of time t i , the total number N and the average velocity v. Here, N and v serve as implicit parameters. Under the condition (B R + B F )/C ≪ 1 for small beads, Eq. (6) for A N (t i ) can be simplified in the following form, by making use of Taylor's expansion technique, Notice that Eq. (6) resembles Eq. (1) as A N (t i ) ≈ Na 1 (t i ) when the square in Eq. (1) is replaced with a bead. The consequence of a linear relation between multiple beads and a single bead is reasonable since bead overlapping is negligible in a group of small beads that are sparsely distributed. The applicability of Eq. (7) to small beads supports the statistical treatment of occupation probability for the bead overlapping problem in deriving Eq. (6) for A N (t i ).

Effect of Gaussian speed distribution on TMSC and VCL
In reality, the motile sperms in a semen sample [35] move at various speeds that follow a Gaussian-like symmetric probability distribution [36]. It is to be noted that the probability density of a Gaussian speed distribution is given by: where v ave is the average speed and δ is the standard deviation of the Gaussian speed distribution. Thus, it is too complicated to develop further an analytic equation for the DFT area as a function of time in this case. Instead, through simulation, the overall DFT area A N (t i ) versus time t i was plotted for a group of N beads whose speeds follow a Gaussian distribution, as discussed above. For comparison, a graph of A N (t i ) versus t i was also plotted for the same N beads but with all beads moving at the same average speed v ave of the Gaussian speed distribution. The simulation results demonstrated later will show that the two plots are nearly the same. Consequently, this allows us to use Eq. (6) for A N (t i ) in the analysis of real semen samples.

Procedure of DFT method
Similarly to the procedure for a single moving square as illustrated in Fig. 5, the procedure of the DFT method for a target semen sample or synthetic bead sample was as follows. First, a series of 30 consecutive grayscale video frames of the target sample were acquired, as shown in Fig. 5a. In this study, the resolution of the video frames was 640 × 480 pixels. The gray levels of the video frames ranged from 0 to 255. Second, any two consecutive frames were subtracted pixel by pixel. A new number "255" was assigned on those resulting pixels whose absolute values were above a threshold value, (6)  20 in this case, and a new number "0" was assigned elsewhere. This marked a differential footprint image of all motile sperms or synthetic beads, as shown in Fig. 5b. Third, the subtraction process was repeated through the 30 consecutive frames in sequence and the resulting differential footprint images were superimposed in succession. This resulted in 29 images of all of the DFTs of all sperms or beads at different times, showing the extension of the entire DFTs as a function of time, as shown in Fig. 5c. The total number of the pixels whose values were not "0" on a DFT image at time t i was just the overall area A DFT (t i ) , in pixels, of all of the DFTs on that image. Fourth, the overall DFT area of each of the 29 DFT images, A DFT (t i ) for 1 ≤ i ≤ 29 , was calculated and plotted versus time, as shown in Fig. 5d. Last, the least square fitting method was used to obtain the optimal values of N DFT and v ave_DFT that yielded the best fit for Eq. (6) for the theoretical A DFT (t i ) to the above measured A DFT (t i ).
The above image process is collective rather than individual. It does not need to recognize and classify each of the target objects, individually. Thus, the DFT method is simple in computation without complicated image processing.

Manual counting
Each semen sample was counted three times for averaging by three researchers trained at Mackay Memorial Hospital Reproductive Medicine Center, New Taipei City, Taiwan. The use of semen for research had been reviewed and approved by the hospital's institutional review board with the approval number 14MMHIS280.

Experiments and simulations
Twenty semen samples and 523 synthetic bead samples were used to verify the validity of the proposed DFT method in the evaluations of TMSC and VCL. In particular, two types of simulations were conducted using the 523 synthetic bead samples for two purposes. One was to estimate the influence of the speed distribution of a group of synthetic beads on the evaluation of the total number of beads in the proposed DFT method. The other was to evaluate the influence of the total number of beads on the accuracies of the DFT method on the evaluations of the total number and average speed of the beads.
In the first type of simulation, δ set was set to 0, 18, 36 and 54 at the preset N set = 1000 and v ave_set = 60 µm/s , separately. For each setting of δ set , the optimal values of N DFT and v ave_DFT of the corresponding synthetic bead sample in a video were first evaluated following the procedure of the DFT method. Then, the optimal N DFT , v ave_DFT was compared with the preset (N set = 1000, v ave_set = 60 µm/s) and the percentage errors E N _DFT ≡ (N DFT − N set )/N set and E v_DFT ≡ v ave_DFT − 60 /60 were obtained, separately. Recall that δ set = 0 corresponds to all beads moving at a single speed v ave in the DFT model. As a consequence, the percentage errors E N _DFT , E v_DFT at various δ set reflect the influence of the speed distribution of the beads on the evaluations of N DFT and v ave_DFT in the DFT method.
In the second type of simulation, N set was set to 41 different values ranging from 1 to 3500 at δ set = 0 and v ave_set = 60 µm/s , separately. For each setting of N set , the optimal N DFT and v ave_DFT of the synthetic bead sample in a video were first evaluated following the procedure of the DFT method. Then, these synthetic videos were sent to the CASA instrument to measure the total number N CASA and the VCL v ave_CASA of the N set beads for comparison. Thus, the corresponding percentage errors E N _CASA ≡ (N CASA − N set )/N set and E v_CASA ≡ v ave_CASA − 60 /60 were obtained. As a consequence, the errors E N _DFT , E v_DFT and E N _CASA , E v_CASA reflect the accuracies of the DFT method and the CASA instrument on the evaluations of N DFT , v ave_DFT and N CASA , v ave_CASA , respectively. To reduce uncertainties, every simulation of the first type was conducted 100 times for the average of N DFT , v ave_DFT . Thus, 400 different bead samples were synthesized and analyzed for the four settings of δ set at N set = 1000 and v ave_set = 60 µm/s . In addition, every simulation of the second type was conducted three times for the averages of N DFT , v ave_DFT and N CASA , v ave_CASA . Thus, 123 different bead samples were synthesized and analyzed for the 41 settings of N set at δ set = 0 and v ave_set = 60 µm/s . In total, 523 synthetic bead samples were analyzed in the two types of simulations. On the other hand, every semen experiment was conducted three times for the averages of N DFT , v ave_DFT , N CASA , v ave_CASA , and N manual . Thus, the 20 different semen samples were analyzed 60 times in total.

Sperm preparation
Twenty human semen samples were selected from 31 subjects for a wide distribution of TMSCs and VCLs of the sperms from unhealthy to healthy. The semen samples were collected by masturbation after 2 to 3 days of sexual abstinence. After liquefaction for at least 30 min, routine semen analysis was conducted according to the WHO 2010 criteria [3]. The semen samples were assessed by loading 28 µl of each sample into a 20 µm deep Leja slide with no dilution in any buffer. Then, the slides were inserted into the CASA instrument for semen analysis and recording, individually. After the experiment, all sperm samples were destroyed by using high-temperature steam sterilization. The process of each semen sample from collection to sterilization lasted within an hour.

Videos of semen samples
The movements of the motile sperms within each slide in the CASA were simultaneously recorded into a grayscale video for 2 s at a frame rate of 60 Hz by using the charge-coupled device camera (CCD; Sony XC-75 CE N50) inside the CASA instrument. The pixel size of the CCD camera is 8.6 µm (horizontal) × 8.3 µm (vertical). The resolution of the video frame was 640 × 480 pixels. The magnification of the objective in the CASA was 10× . The field of view of the CCD was about 550 × 398 µm 2 . Thus, under the magnification of 10× , the image of a normal sperm of 4.3 ± 0.6 µm in length [38] was nearly 43.0 µm long upon the image sensor of the CCD and occupied an area

Simulation settings
Five hundred and twenty-three synthetic bead samples were arranged to cover a wide distribution of total numbers and speeds of the synthetic beads as follows. The diameters of the synthetic beads were fixed at 4.3 µm, as long as that of a normal sperm [38]. Correspondingly, the synthetic image sizes of the beads on the video frame were 43 µm in diameter under the magnification of 10× . The initial locations and directions of movement of the synthetic images of the beads were randomly distributed over an area of 5.5 × 4.0 mm 2 or 640 × 480 pixels, equivalently. Thus, the image of each synthetic bead occupied an area of 5 × 5 pixels. The speeds of the beads were set to follow a Gaussian speed distribution [36] with an average speed of v ave and a standard deviation of δ set (µm/s). In this study, v ave_set was fixed at 60 µm/s, close to the average curvilinear velocity of real sperms. Thus, N set and δ set were the two parameters to be set for each of the 523 synthetic bead samples.

Videos of synthetic bead samples
The movements of the moving beads in each synthetic bead sample were transformed into a video with a resolution of 640 × 480 pixels for 2 s at 60 Hz by using the Lab-VIEW 2014 and Vision Assistant 2014 programs. Afterward, the videos of the 523 synthetic bead samples were used to verify the validity of the proposed DFT method.