### Preliminary image analysis

*L*
_{
GRAY
} input image in DICOM format, which is the source image recorded in GE ultrasound machine with a resolution of *M*× *N* = 614×816 pixels, is filtered using a median filter whose mask size is *M*
_{
h
}×*N*
_{
h
} = 3×3. The filtered image *L*
_{
MED
} provides the basis for further image analysis. In this image, derived from the learning, validation and test groups, an expert marks manually the area of analysis – it is usually the upper part of the thyroid lobe. The selected area is the only area marked manually. In this study, the area is rectangular and, in particular, square. Papers [20, 21] and [22] describe a fully automatic way of marking an area of the thyroid. However, a clearly visible artery is the basis for its operation. In pathological cases the arterial lumen gets narrowed reducing the effectiveness of automatic operation of the algorithm. Thus, the algorithm described in papers [21] and [22] can be regarded as the operator support.

In the marked area *L*
_{
S
} with a resolution of *M*
_{
s
}×*N*
_{
s
}, a morphological and statistical analysis is performed with the use of well-known techniques of analysis and image processing. These techniques were also suggested by the authors.

### Statistical and morphological analysis of the image

The statistical and morphological image analysis was shown based on the example of Hashimoto’s disease which is characterized by [23, 24]:

decreased echogenicity,

sometimes heterogeneous structure,

sometimes fibrosis seen as hyperechogenic linear structures.

On the basis of medical evidence concerning images of the thyroid lobes, two measured characteristic parameters for Hashimoto’s disease were formulated:

an average brightness value after the removal of clear follicles of any diameter,

measurement of statistical parameters of follicles whose size and shape is not strictly defined, mainly the number of their occurrences.

The measurement result of these parameters should not be sensitive to the size of the analyzed image *L*
_{
S
} (normalization).

Such a general formulation of features is due to a very large individual variation and the dependence of results from the method of measurement. The texture analysis methods have been defined on the basis of these characteristics, taking into consideration the analysis of minimum values after the elimination of white follicles. A method for the analysis of clear textures, in general, objects, has also been suggested. Among the known methods for texture analysis [25] and on the basis of papers [12–19], the analysis of the input image *Ls* with statistical and morphological methods have been suggested, obtaining10 attributes defined as follows:

*w(1) – average image power spectrum defined as:*

w\left(1\right)=\frac{\left({\sum}_{M}^{p=1}{\sum}_{N}^{q=1}{L}_{f}\left(p,q\right)\right)}{\left(M*N\right)}

(1)

Where *L*
_{
f
} is Fourier transform:

{L}_{f}(p,q)={\left|\sum _{m=1}^{M}\sum _{n=1}^{N}{L}_{s}(m,n){e}^{-j2\pi pm/M}{e}^{-j2\pi qn/N}\right|}^{2}

(2)

*p,q* - row and column of the matrix *L*
_{
f
}
*,*

*m,n* -row and column of the matrix *Ls*.

Examples of results obtained for various textures of the thyroid lobes are presented in the figure below (Figure 1). The sample results obtained for various sample textures of the thyroid presented in Figure 1 show that the feature is a normalized sum of amplitudes occurring for individual harmonics. Diagnostically, the feature points to a systematic (with some assumed frequency) repetition of nodules, in general, artifacts on the thyroid lobe.

*w(2)- regional minimum value on the Ls image*

w\left(2\right)=\underset{m\in \{1,..,M\},n\in \{1,..,N\}\bigwedge Lo(m,n)0}{min}\left(Ls\left(m,n\right)*Lo(m,n)\right)

(3)

where *L*
_{
O
} – a binary image of local minima occurrence.

An operation of this function (3) is based on the calculation of the average of local minima. An example of the area *L*
_{
O
}, from which the values of local minima are calculated, is shown below (Figure 2 – an example of input image, Figure 3 –*L*
_{
O
} image).

The minima indicated as values „1” in the image *Lo* (Figure 3) indicate the pixels involved in the calculation of the average value and, therefore, the value *w*(2). This feature provides reliable information on the average brightness of an image after the removal of clear artifacts and distortions, understood both as nodules as well as small inclusions or isolated bright pixels.

*w(3) – smoothness* where

w\left(3\right)=1-1/(1+{w}_{\mathit{STD}}{2}^{})

(4)

{w}_{\mathit{STD}}=\sqrt{\frac{1}{M*N}\sum _{M}^{m=1}\sum _{N}^{n=1}\left(Ls\left(m,n\right)-\overline{Ls}\right)}

(5)

\overline{Ls} the average brightness *Ls*.

Smoothness defined by the formula (4) is relatively easy to interpret because it is a standardized measure based on the standard deviation of the average. If STD and, therefore, the value *w*
_{
STD
} increases, the value of the feature *w*(3) decreases. Thus, images of the thyroid, consistent in terms of changes in brightness of pixels, have the value of the feature *w*(3) close to zero. And vice versa, large changes in brightness and high values of STD result in an increase in the feature *w*(3) to the value of 1.

*w(4) – the minimum value of brightness on the L*
_{
S
}
*image* *, having removed all pixels, which number for the given brightness is smaller than 20% of the largest number of brightness pixels,* i.e. *:*

hist\left(i\right)=\sum _{m=1}^{M}\sum _{n=1}^{N}k(i,m,n)

(6)

where

k\left(i,m,n\right)=\{\begin{array}{l}1\phantom{\rule{1em}{0ex}}if\phantom{\rule{2em}{0ex}}Ls\left(m,n\right)=1\hfill \\ 0\phantom{\rule{1em}{0ex}}other\hfill \end{array}

(7)

for *i* = 1,2,3,…,254,255.

his{t}_{m}=\underset{i}{max}\left(hist(i)\right)

(8)

where

his{t}_{w}\left(i\right)=\{\begin{array}{l}hist\left(i\right)\phantom{\rule{1em}{0ex}}if\phantom{\rule{2em}{0ex}}hist\left(i\right)>0.2*his{t}_{m}\hfill \\ his{t}_{m}\phantom{\rule{2em}{0ex}}other\hfill \end{array}

(9)

his{t}_{w}({i}^{*})=\underset{i}{min}\left(his{t}_{w}(i)\right)

(10)

The value of *i* = w*(4) calculated in this way is the minimum value in the analyzed area of the thyroid after the removal of distortions in the form of individual pixels with very low brightness. The threshold value of 0.2 was established arbitrarily on the grounds of noise present in the image *Ls* and preliminary analyses.

This feature is seemingly similar to *w*(2). However, in the case of a large number of bright nodules present in the analyzed area of the thyroid, the value of the feature *w*(4) remains constant while the value of the feature *w*(2) increases. The same situation occurs in the event of distortions in the form of pixels with low or zero brightness. In this case, they will be counted when determining the feature *w*(2). However, they will not be taken into account in the calculations of *w*(4).

*w(5) – position of the centre of GLCM (Gray-Level Co-occurrence Matrix) matrix gravity*

According to the definition, the matrix *L*
_{
GLCM
} is determined on the basis of the number of neighborhoods of the closest pixels in the *x*-axis. Neighborhoods of pixels brightness are stored in rows and columns of the matrix *L*
_{
GLCM
}. For example, for the image in Figure 4, the neighborhood matrix can be written as shown in Figure 5.

For example, the neighbourhood 3 and 4 occurs three times, and therefore the value 3 is stored in the third row and fourth column of the matrix GLCM (*L*
_{
GLCM
} -Figure 5). The other fields of the matrix GLCM are filled in the same way. After this stage, the matrix is summed with its transposition (if 3 neighbours with 4 then also 4 with the number of 3), and then normalized (to become independent from the resolution of the image *Ls*) and binarized to remove the effect of a small number of neighborhoods on the result, ie:

{L}_{\mathit{GLCMB}}(m,n)=\{\begin{array}{ccc}\hfill 1\hfill & \hfill if\hfill & \hfill {L}_{\mathit{GLCM}}(m,n)>{p}_{r}\hfill \\ \hfill 0\hfill & \hfill if\hfill & \hfill {L}_{\mathit{GLCM}}(m,n)\le {p}_{r}\hfill \end{array}

(11)

where *p*
_{
r
} – the threshold value defined as

{p}_{r}=0.1*\underset{m,n}{max}{L}_{\mathit{GLCM}}(m,n)

(12)

For the *L*
_{
GLCMB
} image obtained this way the centre of gravity of the object formed and its area are calculated, i.e.:

x=\frac{{\sum}_{n}^{N}{\sum}_{m}^{M}{L}_{\mathit{GLCMB}}\left(m,n\right)*m}{{\sum}_{n}^{N}{\sum}_{m}^{M}{L}_{\mathit{GLCMB}}\left(m,n\right)}

(13)

y=\frac{{\sum}_{n}^{N}{\sum}_{m}^{M}{L}_{\mathit{GLCMB}}\left(m,n\right)*n}{{\sum}_{n}^{N}{\sum}_{m}^{M}{L}_{\mathit{GLCMB}}(m,n)}

(14)

w\left(5\right)=\sqrt{{x}^{2}+{y}^{2}}

(15)

w\left(6\right)=\sum _{n}^{N}\sum _{m}^{M}{L}_{\mathit{GLCMB}}(m,n)

(16)

Both the feature *w*(5) and *w*(6) determine the brightness and uniformity of brightness in the image. The feature *w*(5) provides information about the brightness of pixels and is not dependent on the contrast present in the analyzed texture of the thyroid. In the case of high contrast appearing in the image *Ls* as well as in the case of low contrast and homogeneous pixel values, the distance of the center of gravity from the origin of coordinates does not change, and so the value of the feature *w*(5). The range of contrast in different values of derivatives determines the value of the feature *w*(6). An increasing value of the feature *w*(6) means that in the image *Ls*, there are a lot of different combinations of pairs of pixels. The examples are the images with large changes in thyroid histological structure accompanied by various changes in brightness of adjacent pixels.

*w(7) to w(10) – the result of square-tree decomposition*
*– and exactly the percentage number of areas of size 1×1 – w(7), 2×2 – w(8), 4×4 – w(9), 8×8 – w(10) occurrences, obtained for a 10% threshold.*

The square-tree decomposition is usually associated with image compression [25]. However, due to its properties, it can also be used to determine global features of images [25]. The name of this decomposition comes from the specificity of its operation. Namely, the analyzed area is divided into smaller and smaller squares until at certain criterion related to the values of pixels in the analyzed square is met. As a result of such a course of action, a tree subdivision of the right size at each level is created. The relevant picture of the thyroid *L*
_{
S
} with the resolution *M*
_{
s
}×*N*
_{
s
} can be divided into “*i*” rectangular areas *L*
_{
i
} with the resolution *M*
_{
i
}×*N*
_{
i
} for the "*i*" coefficient value in the range 1 < = *i* < = *I*. Thus, the smallest area may have a pixel size, that is 1×1, and the largest - *M*
_{
s
}×*N*
_{
s
} pixels. In practice, the resolution of the image *L*
_{
S
} is adjusted to a resolution which is the power of 2. Then, assuming a suitable threshold, it is divided into areas such as 8×8, 4×4, 2×2 and 1×1 in accordance with the rules described. An example of a division is shown in the figure below (Figure 6, Figure 7)

Individual values of the features *w*(7) to *w*(10) are calculated as a percentage share against the whole image. For example, the image *Ls* shown in Figure 7 contains 0 areas of size 1×1 (*w*(7) = 0), 32 areas of size 2×2 ( *w*(8) = 32*2/( *M*
_{
s
}
**N*
_{
s
})), 152 areas of size 4×4 (*w*(9) = 152*4/( *M*
_{
s
}
**N*
_{
s
})) and 148 areas of size 8×8 (*w*(10) = 148*8/( *M*
_{
s
}
**N*
_{
s
})). The number of areas and, thus, the value of features *w*(7), *w*(8), *w*(9) and *w*(10) is highly dependent on the number of instances of objects (bright or dark) and their size. Large homogeneous objects enhance the value of the features *w*(10), *w*(9), medium-sized objects - *w*(8), and small objects - *w*(7). It should be noted here that the division into the smallest areas (size 1×1) also occur on contrasting edges. Thus, their number is not always strictly proportional to the number of small objects on the thyroid lobe.

#### Justification for the selection of features *w*(1) to *w*(10)

All 10 features were selected following medical conditions. The analyzed ROI areas (images *Ls*) contain numerous artifacts which have a high impact on the outcome. The features *w*(1) to *w*(10) were chosen in such a way that elimination of artifacts provide a new image feature. The artifacts associated with both small bright nodules and other bright inclusions were removed in *w*(2). If the structure of the image *Ls* is non-uniform, there are no strict medical reasons for its analysis. For this reason, the features *w*(7) to *w*(10), which divide the image *Ls* into areas *L*
_{
i
}, were suggested. The number of these areas (of particular sizes) indicates additional features of the image. The 10% threshold was chosen based on the difference in the minimum and maximum values of the analyzed group of images. For the threshold value of 20, 30 or 40% (or more), almost zero values of the features *w*(7) to *w*(10) were obtained. For values below the threshold of 10%, oversegmentation was obtained. To measure the homogeneity from the global, other tools such as GLCM and FFT were used. The features *w*(5) and *w*(6) determine the location and number of pixels in the area *L*
_{
GLCMB
}. They are, thus, a measure of the spread of contrast in an image (feature *w*(6)) and information about its center of gravity (feature *w*(5)). These features are necessary when the amount of colloid in follicles decreases and when the number of cellular elements increases. Then, the image *Ls* is characterized by varying degrees of contrast which is less capable of being detected by the features *w*(7) to *w*(10). The noise in the image caused by small artifacts does not exceed 10%. For this reason, the binarization threshold chosen for the matrix GLCM was 0.1. In extreme cases, the image *Ls* may have follicles with very mild edges. Then, *w*(1) is a reliable feature because the features *w*(5) to *w*(6) *w* ill not be effective here. The feature *w*(3) associated with smoothness and uniformity of the image *Ls* is comparable in its properties. Because of the afore-mentioned artifacts (usually in the form of clear follicles), a direct measurement of the average brightness of pixels in an image *Ls* may not produce satisfactory results. For this reason, the feature *w*(4) was defined as the one which treats 10% of the darkest pixels as noise and, thus, they are not taken into account. Depending on the form of a texture (of the image *Ls*), the defined features favour a selection of its features. The values of the features *w*(1) to *w*(10) are also, in varying degrees, sensitive to the image *Ls* specifity. They favour a low contrast, remove noise, are a measure of brightness, cyclicality in the image or its smoothness.

#### Initial verification of features

All of the features *w*(1) to *w*(10) are the basis for building an expert system. Initially, however, a verification of influence of each of the features from *w*(1) to *w*(10) was made. It is worth emphasizing that each one of them was considered separately. For this purpose, the measurements of *ROC* curves (Receiver Operating Characteristics) were carried out for all 376 images (29 healthy subjects and 65 patients with Hashimoto’s disease). *ROC* function was calculated as the dependence of sensitivity on specifity. Assuming *FN* as false negative cases, *FP* as false positive, *TP* as true positive and *TN* as true negative, sensitivity was calculated as *TP*/( *TP* + *FN*) and specifity as *TN*/( *FP* + TN). The obtained results are shown in Figure 8. Each of the measurement points was obtained by changing the value of individual features from the minimum to maximum value in increments of 10%. The area under the *ROC* curve - *AUC* (Area Under Curve), is also marked in the same figure. *AUC* values lie within the range of 0.56 to 0.77 for each of the features *w*(1) to *w*(10) measured separately. The maximum value ( *AUC* = 0.77) was marked with a darkened area and occurs for the feature *w*(5). A separate analysis of each feature *w*(1) to *w*(10) does not give a true picture of their impact on the result. Therefore, the features *w*(1) to *w*(10) will be considered together later in this paper. Their impact on the final result will also be further examined in a broader context.