The hyperspectral image analysis method involves three stages:

Preprocessing of images in which filtration is the main element,

Calibration linked to the automatic recognition of the pattern position,

Processing of images enabling proper segmentation of the skin areas.
Details of these three stages are described below.
Preprocessing
Image preprocessing concerns the correct reading and interpretation of the data recorded by the PFDV10E camera in dat and raw formats. The camera records information for each line (each row) N, registering the full spectral range at the same time. Here, λ∈(397, 1030) nm corresponds to the adopted spectral range with the registration of 800 lines. This process is shown in Figure 1. Depending on the data type, raw or dat, each pixel is recorded with 32 or 16 bits. The exact number of rows and columns is stored in a file with the hdr extension that contains all the typical header information: the particular frequencies of the spectral range, the type of data storage, the sensor type, and others. For the analysed data, the image resolution M × N varied from 15 × 1312 to 899 × 1312 pixels. The range of changes depended strictly on the scan area, which was limited mainly by the image acquisition time of 12 seconds for registration of the spectrum at the maximum resolution and in the full range. A dynamic error related to possible displacement of the scan area during measurements was minimized by mechanical stops and a skin area orientation ensuring the patient's comfort.
The images L_GRAY(m,n,k), where m is the row, n the column and k the index of the successive wavelength λ (k∈(1,K)), read from the files with the dat or raw extension, were further filtered. For each image in the sequence, a median filter with a mask h sized M_h × N_h = 3 × 3 pixels was used. The mask size depended on the amount of pollution and the level of noise. In the recorded images, noise and artefacts did not exceed 2 pixels per cluster, so a filter mask of 3 × 3 pixels was sufficient. The resulting noise-free image L_M(m,n,k) was subjected to calibration.
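This filtering step can be sketched as follows; a minimal illustration using NumPy and SciPy, in which reading the dat/raw files and the hdr header is omitted and the hypercube is simulated (the array sizes and function name are assumptions):

```python
import numpy as np
from scipy.ndimage import median_filter

def preprocess_hypercube(l_gray):
    """Apply a 3 x 3 median filter to every spectral band of the
    hypercube L_GRAY(m, n, k) of shape (M, N, K)."""
    # size=(3, 3, 1): a 3 x 3 spatial mask, with no coupling between
    # neighbouring wavelengths along the third axis.
    return median_filter(l_gray, size=(3, 3, 1))

# Simulated cube: noise clusters in the recorded images did not
# exceed 2 pixels, so a single-pixel impulse stands in for them.
cube = np.full((15, 20, 4), 0.5)
cube[7, 9, 0] = 1.0
l_m = preprocess_hypercube(cube)
```

The 3 × 3 window removes the isolated impulse while leaving the constant background untouched.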
Calibration
The acquired images L_M(m,n,k) are not calibrated. Calibration involves referring each pixel of the registered skin area to the white pattern [32–34]. For the registered cases, the pattern was a white stripe placed at the top (Figure 2). Automatic detection of the pattern position was implemented in the proposed algorithm. It involved recognition of one of the pattern contours using information about the brightness gradient of adjacent pixels in each column, i.e.:
L_G(m,n,k)=\begin{cases}0 & \text{if }\left(L_M(m,n,k)-L_M(m+1,n,k)\right)>p_r\\ 1 & \text{otherwise}\end{cases}
(1)
for m∈(1, M−1), where p_r is a binarization threshold determined automatically according to Otsu's formula [35].
The pattern in the form of a white stripe measured 20 × 400 mm, which corresponded to the number of rows m_w = 80 ± 5 for its set distance from the camera lens. The value of ±5 pixels accounts for a possible image shift or rotation. The pattern covered the full width of the columns. Therefore, the searched pattern boundary contour was designated as:
W_I(n,k)=\begin{cases}m & \text{if }\left(L_G(m,n,k)=1\right)\wedge\left(m<m_w\right)\\ 0 & \text{otherwise}\end{cases}
(2)
On this basis, the average brightness for each column is calculated, i.e.:
L_w(n,k)=\frac{1}{W_I(n,k)}\cdot\sum_{m=1}^{W_I(n,k)}L_M(m,n,k)
(3)
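Equations (1)–(3) can be sketched together as follows; a minimal NumPy illustration in which the threshold p_r is passed in rather than derived with Otsu's formula, and one possible reading of the contour search in equation (2) is adopted (the function name and the interpretation of W_I as the first strong-gradient row are assumptions):

```python
import numpy as np

def pattern_mean(l_m, p_r, m_w=80):
    """Average white-pattern brightness per column, eqs. (1)-(3).

    One reading of eq. (2) is used here: W_I(n, k) is taken as the
    first row whose gradient to the next row exceeds p_r (the lower
    contour of the stripe), searched only above the expected row m_w.
    """
    M, N, K = l_m.shape
    # Eq. (1): L_G = 0 where the gradient of adjacent rows exceeds p_r.
    edge = (l_m[:-1] - l_m[1:]) > p_r          # True where L_G = 0
    l_w = np.zeros((N, K))
    for n in range(N):
        for k in range(K):
            hits = np.flatnonzero(edge[:m_w, n, k])
            if hits.size:                       # contour row W_I (eq. 2)
                w_i = hits[0] + 1
                # Eq. (3): mean of rows 1..W_I in this column.
                l_w[n, k] = l_m[:w_i, n, k].mean()
    return l_w

# Synthetic cube: a bright stripe in the top 10 rows over darker skin
cube = np.full((30, 5, 2), 0.2)
cube[:10] = 0.9
l_w = pattern_mean(cube, p_r=0.3)
```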
Examples of graphs of L_w(n,k) for k = 400, 401 and 402 are shown in Figure 3. The image in Figure 3a) and its zoom in Figure 3b) show the differences in average brightness values. The image must be calibrated with respect to these changes, and calibration must be performed independently for each value of k. Calibration of individual images is carried out as:
L_K(m,n,k)=\frac{L_M(m,n,k)-\min_{m,n}\left(L_M(m,n,k)\right)}{L_w(n,k)-\min_{m,n}\left(L_M(m,n,k)\right)}
(4)
for: \left(L_w(n,k)-\min_{m,n}\left(L_M(m,n,k)\right)\right)\ne 0
In the case of pixels which exceed the value “1”, an adjustment is necessary:
L_K(m,n,k)=\begin{cases}L_K(m,n,k) & \text{if } L_K(m,n,k)\le 1\\ 1 & \text{otherwise}\end{cases}
(5)
The image L_K(m,n,k), with brightness values in the range from 0 to 1, is subjected to the next processing steps.
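The calibration of equations (4)–(5) can be sketched as follows (a minimal NumPy illustration; the function name and the handling of zero-denominator pixels, which are left at 0, are assumptions):

```python
import numpy as np

def calibrate(l_m, l_w):
    """White-pattern calibration, eqs. (4)-(5).

    l_m : filtered cube of shape (M, N, K)
    l_w : per-column pattern mean of shape (N, K)
    Returns L_K with brightness in [0, 1].
    """
    mn_min = l_m.min(axis=(0, 1))               # min over (m, n) per k
    denom = l_w - mn_min                        # (N, K), broadcast over m
    # Eq. (4); pixels with a zero denominator are excluded (left at 0).
    l_k = np.divide(l_m - mn_min, denom,
                    out=np.zeros_like(l_m, dtype=float),
                    where=denom != 0)
    # Eq. (5): cap values that exceed 1.
    return np.minimum(l_k, 1.0)

# Tiny 2 x 2 x 1 cube with a uniform pattern mean of 0.8
l_m = np.array([[[0.0], [0.4]], [[0.8], [1.0]]])
l_w = np.full((2, 1), 0.8)
l_k = calibrate(l_m, l_w)
```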
Image processing
The calibrated image L_K(m,n,k) is the basis for the segmentation process. For this purpose, a diagram of brightness changes was made for a sample ROI of human skin, which consists mainly of water, melanin and haemoglobin. The results for each k-th image (at different wavelengths) are shown in Figure 4a), i.e.:
L_{ROI}^{MED}(k)=\frac{1}{M_{ROI}\cdot N_{ROI}}\sum_{m,n\in ROI}L_K(m,n,k)
(6)
L_{ROI}^{MIN}(k)=\min_{m,n\in ROI}L_K(m,n,k)
(7)
L_{ROI}^{MAX}(k)=\max_{m,n\in ROI}L_K(m,n,k)
(8)
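A minimal NumPy sketch of equations (6)–(8) (the function name and the slice-based ROI selection are assumptions):

```python
import numpy as np

def roi_stats(l_k, rows, cols):
    """ROI brightness statistics per wavelength, eqs. (6)-(8).

    rows, cols : slices selecting the M_ROI x N_ROI region.
    Returns (mean, minimum, maximum), each of length K.
    """
    roi = l_k[rows, cols, :]
    return (roi.mean(axis=(0, 1)),   # eq. (6): L_ROI^MED(k)
            roi.min(axis=(0, 1)),    # eq. (7): L_ROI^MIN(k)
            roi.max(axis=(0, 1)))    # eq. (8): L_ROI^MAX(k)

# Usage on a small synthetic cube with a 2 x 2 ROI
cube = np.arange(2 * 3 * 2, dtype=float).reshape(2, 3, 2)
med, mn, mx = roi_stats(cube, slice(0, 2), slice(0, 2))
```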
The ROI was associated with the hand area shown in Figure 2 and covered M_ROI × N_ROI = 150 × 150 pixels. In this range, the values L_ROI^MED(k), L_ROI^MIN(k), L_ROI^MAX(k), which are the mean, minimum and maximum brightness values in the ROI respectively, were calculated. The results shown in Figure 4a) also depend on the individual variability of patients as well as on the lighting and the camera angle relative to the patient. The influence of these elements is revealed by the upward or downward shift of the curves shown in Figure 4a), which increases or decreases the mean brightness value. Therefore, normalization of the entire image sequence with respect to changes across the k-th images is necessary, i.e.:
L_O(m,n,k)=\frac{L_K(m,n,k)-\min_k L_K(m,n,k)}{\max_k\left(L_K(m,n,k)-\min_k L_K(m,n,k)\right)}
(9)
for \left(L_K(m,n,k)-\min_k L_K(m,n,k)\right)\ne 0
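Equation (9) can be sketched as follows (a minimal NumPy illustration; pixels with a constant spectrum, for which the denominator vanishes, are left at 0 here, which is an assumption):

```python
import numpy as np

def normalize_spectra(l_k):
    """Per-pixel spectral normalization, eq. (9): each pixel's
    spectrum is rescaled to span [0, 1] across the K wavelengths."""
    mn = l_k.min(axis=2, keepdims=True)
    rng = (l_k - mn).max(axis=2, keepdims=True)
    # Pixels with a constant spectrum (zero range) are left at 0.
    return np.divide(l_k - mn, rng,
                     out=np.zeros_like(l_k, dtype=float),
                     where=rng != 0)

# A single pixel whose spectrum is rescaled to [0, 1]
l_k = np.array([[[0.2, 0.4, 0.6]]])
l_o = normalize_spectra(l_k)
```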
For the images L_O(m,n,k) modified in this way, the mean, minimum and maximum values in the same sample ROI also change, i.e.: L_ROI^MMED(k), L_ROI^MMIN(k), L_ROI^MMAX(k). The results are shown in Figure 4b). The normalized images L_O(m,n,k) also enable automatic segmentation in accordance with the reference curve of melanin and haemoglobin content for each wavelength. The reference content of melanin and haemoglobin can be acquired from external sources, for example from literature data [36], or on the basis of the selected ROI. In the latter case, the result is the image L_D(m,n), i.e.:
L_D(m,n)=\frac{1}{K}\sum_{k=1}^{K}\left|L_O(m,n,k)-L_{ROI}^{MMED}(k)\right|
(10)
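A minimal NumPy sketch of equation (10), with the absolute deviation averaged over the K wavelengths (the function name and the reference passed as a 1-D array are assumptions):

```python
import numpy as np

def spectral_distance(l_o, ref):
    """Mean absolute deviation of each pixel's normalized spectrum
    from the reference skin spectrum L_ROI^MMED(k), eq. (10);
    the deviation is averaged over the K wavelengths."""
    k = l_o.shape[2]
    return np.abs(l_o - ref[None, None, :]).sum(axis=2) / k

# One pixel deviating by 0.1 at every wavelength, the rest matching
l_o = np.zeros((2, 2, 3))
l_o[0, 0] = [0.1, 0.1, 0.1]
ref = np.zeros(3)
l_d = spectral_distance(l_o, ref)
```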
Therefore, the image L_D(m,n) contains information about the average error (Figure 5), calculated for individual pixels relative to the reference waveform L_ROI^MMED(k). On this basis, binarization-based segmentation can be performed with the binarization threshold p_w determined manually or automatically (the aforementioned Otsu formula [35]). From a practical point of view, the effect of the manually selected binarization threshold on the segmentation results is of interest. For this purpose, the impact of changes in the threshold p_w on the surface area A(p_w) of the segmented object was investigated, i.e.:
A(p_w)=\frac{1}{M\cdot N}\sum_{m=1}^{M}\sum_{n=1}^{N}L_{DB}(m,n)
(11)
where:
L_{DB}(m,n)=\begin{cases}1 & \text{if } L_D(m,n)<p_w\\ 0 & \text{otherwise}\end{cases}
(12)
The results obtained are shown in Figure 6. The area optimal from the point of view of the segmentation results is marked in green; “optimal” here refers to the flat fragment of the curve A(p_w). For the threshold p_w = 0.2 ± 0.1, the segmentation result is correct, which was confirmed by comparison with the result obtained by an expert relying on manual marking (gold standard). The error is defined here as:
\delta_w(p_w)=\frac{A(p_w)-A_Z}{A_Z}
(13)
where A_Z is the surface area resulting from the expert’s work.
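Equations (11)–(13) can be sketched together as follows (a minimal NumPy illustration; the function name is an assumption):

```python
import numpy as np

def segment_and_error(l_d, p_w, a_z):
    """Binarization-based segmentation and its relative area error.

    l_d : distance image from eq. (10); p_w : binarization threshold;
    a_z : normalized area of the expert (gold standard) mask.
    """
    l_db = (l_d < p_w).astype(int)      # eq. (12)
    a = l_db.mean()                     # eq. (11): A(p_w)
    delta = (a - a_z) / a_z             # eq. (13): delta_w(p_w)
    return l_db, a, delta

# Two of four pixels fall below the threshold p_w = 0.2
l_d = np.array([[0.1, 0.3], [0.05, 0.5]])
l_db, a, delta = segment_and_error(l_d, 0.2, 0.5)
```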
For the results obtained from equation (13), it was assumed that the expert’s results are repeatable and do not differ from results obtained by other experts. However, this need not be the case in every instance; detailed considerations are presented in [37–39].
The results of changes in the error δ_w for varying p_w (p_w∈(0.1, 0.3)) are shown in Figure 7a). From the presented graph, for the case under consideration, the smallest error (δ_w ≈ 0) is obtained for p_w = 0.23. For automatic selection of the binarization threshold (Otsu’s formula), the error is 9%. Sample binarization results are shown in Figure 7b) for the thresholds p_w∈{0.1, 0.15, 0.2, 0.27}. Visual assessment gives the best results for p_w = 0.23. It should be noted, however, that the adopted binarization threshold values represent the acceptable standard deviation of the reference skin spectrum distribution from the measured pixels (formula (10)). In general, for any image containing human skin, the following approaches are possible:

automatic selection of the binarization threshold according to Otsu’s formula, which yields a binary image of the object,

manual selection of the binarization threshold p_w dependent on the acceptable tolerance of individual image pixels relative to the reference waveform L_ROI^MED: a 1% tolerance corresponds to p_w = 0.01, a 10% tolerance to p_w = 0.1, etc.,

automatic selection of the binarization threshold p_w depending on the location of the ‘flat’ area (Figure 6).
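The automatic variants rely on Otsu’s formula [35]; a compact histogram-based sketch in plain NumPy (the bin count and the brightness range (0, 1) are assumptions) might look like this:

```python
import numpy as np

def otsu_threshold(l_d, bins=256):
    """Otsu's threshold [35]: pick the level that maximizes the
    between-class variance of the brightness histogram."""
    hist, edges = np.histogram(l_d, bins=bins, range=(0.0, 1.0))
    centers = (edges[:-1] + edges[1:]) / 2
    total = hist.sum()
    cum_w = np.cumsum(hist)              # class-0 pixel counts
    cum_mu = np.cumsum(hist * centers)   # class-0 brightness sums
    best_t, best_var = 0.0, -1.0
    for i in range(bins - 1):
        w0, w1 = cum_w[i], total - cum_w[i]
        if w0 == 0 or w1 == 0:
            continue
        mu0 = cum_mu[i] / w0
        mu1 = (cum_mu[-1] - cum_mu[i]) / w1
        var = w0 * w1 * (mu0 - mu1) ** 2
        if var > best_var:
            best_var, best_t = var, centers[i]
    return best_t

# Bimodal brightness values: the threshold falls between the modes
l_d = np.repeat([0.05, 0.1, 0.15, 0.75, 0.8, 0.85], 50)
t = otsu_threshold(l_d)
```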
Depending on the desired end result, one of the above methods is selected by an operator. Figure 8 shows a sequence of images L_DB(m,n) for p_w∈(0,1) changed with a step of 0.2 for the area of the hand, forearm, finger (thumb) and tattoo.