Validating an infrared thermal switch as a novel access technology
© Memarian et al; licensee BioMed Central Ltd. 2010
Received: 1 June 2010
Accepted: 5 August 2010
Published: 5 August 2010
Recently, a novel single-switch access technology based on infrared thermography was proposed. The technology exploits the temperature differences between the inside and surrounding areas of the mouth as a switch trigger, thereby allowing voluntary switch activation upon mouth opening. However, for this technology to be clinically viable, it must be validated against a gold standard switch, such as a chin switch, that taps into the same voluntary motion.
In this study, we report an experiment designed to gauge the concurrent validity of the infrared thermal switch. Ten able-bodied adults participated in a series of 3 test sessions where they simultaneously used both an infrared thermal and conventional chin switch to perform multiple trials of a number identification task with visual, auditory and audiovisual stimuli. Participants also provided qualitative feedback about switch use. User performance with the two switches was quantified using an efficiency measure based on mutual information.
User performance (p = 0.16) and response time (p = 0.25) with the infrared thermal switch were comparable to those of the gold standard. Users reported preference for the infrared thermal switch given its non-contact nature and robustness to changes in user posture.
Thermal infrared access technology appears to be a valid single switch alternative for individuals with disabilities who retain voluntary mouth opening and closing.
An access technology is a system that senses a physical movement or physiological change from a person and uses this information to drive a user interface . This technology is particularly useful for individuals with severe motor impairments who lack a conventional means of access and communication. Examples of access technologies developed for this population are sip and puff switches , chin switches , computer vision-based systems [4, 5], or electromyography-based systems [6, 7]. While attempts to design and develop new access technologies are ongoing [8–11] an important challenge is to ensure that the new access technology is valid. Validity in the present context refers to the correlation between the switch activations of the access technology of interest and those of an established gold standard. This concept is similar to criterion validity . In engineering design, validity testing is usually performed with prototypes to verify system functionality . If an access technology fails to be valid, the user will find it frustrating to use, and hence the technology will quickly lose its appeal as an assistive device. On the other hand, if the system proves to be adequately valid, then poor performance of the access technology is most likely due to user error rather than technology error. In other words, validity testing is crucial to distinguish between user-dependent errors and algorithm-dependent errors.
In the present paper we describe and discuss empirical validity testing of an enhanced version of the infrared thermography based access technology proposed in . We report an experiment designed to gauge the validity of this access technology with respect to a conventional chin switch. The chin switch was chosen as the benchmark for two reasons. Firstly, the chin switch is guaranteed to activate upon the application of force, and secondly, the chin switch can be activated by exactly the same motion that triggers the infrared thermal switch, i.e., mouth opening.
In the following sections, first, we explain the infrared thermal switch and its underlying computational algorithm. Next, we detail the experimental setup for the validation study and the ensuing data analysis. The paper will be rounded out with a presentation and discussion of the empirical results.
A. Infrared Thermal Switch
The infrared thermography-based access technology used in this study is an enhanced version of the system proposed in . In particular, we employ a different motion estimation method and introduce a new anthropometric filter of non-mouth objects to reduce false positive activations. Also, we invoke additional software and hardware to realize a real time infrared thermographic single switch. This system captures the facial temperature distribution of a user with an infrared thermal camera and translates the user's voluntary mouth opening activity into a switch activation using a computerized algorithm.
A.2. Video Processing Algorithm
The video processing algorithm for the infrared thermal switch was implemented in SIMULINK R2008a. The algorithm consisted of three main modules, namely user face localization, motion and intensity analyses, and filtering of non-mouth objects. In the following, we describe an enhanced and online version of the original algorithm proposed in , with a new motion estimator and new anthropometric filters. We remind the reader that although enhancements to the original algorithm are presented below, the main focus of this study was the concurrent validity testing of the thermal switch, which is detailed in the subsequent section.
Given that the face is generally warmer than the surrounding environment, the face localization module began with an adaptive thresholding of the frame's grey scale intensities based on Otsu's method . Thresholding was followed by the selection of the user's face, i.e., the largest round or semi-round object within the black and white thresholded image. Zero padding was used at all four sides of the frame to account for user head motion which may have caused the segmented facial region to overlap with the frame borders. Figure 1(a) and 1(b) show the original input video frame and the localized face within that frame, respectively.
Motion and Intensity Analyses
This phase of the algorithm is based on two key observations. First, the inside of the human mouth is typically warmer than the surrounding areas of the face. Consequently, an open mouth appears as a bright patch on the thermal video. Second, opening and closing of the mouth involves motion.
In order to detect the warm regions within the face, a second Otsu thresholding  was performed on the segmented face, this time with double the previous threshold value. This thresholding yielded the warm region mask, as shown in Figure 1(c).
The greater the similarity between the two matrices, the smaller the SAD values. In contrast, motion results in dissimilarity and hence a bigger SAD value. The algorithm retained only those pixels in the frame whose SAD values fell in the motion range of mouth open/close activity. Each person opens his or her mouth at a different speed and so individualized SAD thresholds were selected during an initial calibration period. The motion analysis phase yielded a motion mask, which included all the pixels that have comparable motion to mouth opening and closing activity. Of course, this mask did not only include the mouth pixels, but also other areas that have similar motion (e.g., eyebrows, eye lids), hence, the need for a simultaneous warm, region mask. Figure 1(d) depicts the warm facial regions that also exhibited motion.
Filtering non-mouth objects
200 pixels <Area <1500 pixels
(Area of object)/(Area of bounding box) >0.4
(menton sellion length) <Object's centroid < (menton sellion length)
(head breadth) <Object's centroid < (head breadth)
The first condition is basically a size filter that eliminated objects that were too small or too large to be an open mouth. The minimum and maximum size limits in this equation are an order of magnitude larger than the values previously reported in , because in the current study we use a higher resolution infrared thermal camera, and also place the camera closer to the participant to minimize background artifacts. The second and third conditions are morphological filters, which removed respectively, regions that were too long to qualify as a mouth and areas that were too hollow as the mouth was expected to be solid. The fifth condition is a newly introduced anthropometric filter. Based on human face anthropometry, the mouth is located in the lower half of the menton-sellion length and in the middle third of the head breadth . The menton sellion length is the distance in the midsagittal plane between the menton landmark at the bottom of the chin and the sellion landmark at the deepest point of the nasal root depression. The head breadth is the maximum horizontal breadth of the head above the level of the ears [19, 20].
Objects that did not satisfy the expected size, shape, and location of an open mouth within the face were filtered out in this phase. Finally the only object that satisfied all the above conditions was considered to be an open mouth. The bounding box of this object is marked on the video frame as shown in Figure 1(e).
B. Validity Testing Experiment
The objective of this experiment was to validate the detection algorithm of the infrared thermal switch introduced above. The accuracy of the infrared thermal switch may be impacted by user error or algorithm error. An error is either a missed activation or a false alarm. For example a user-dependent miss is when the user does not open his/her mouth, when he/she is cued to do so, while an algorithm-dependent miss is when the user does voluntarily open his/her mouth upon cue, but the system does not detect it and thus does not activate the switch. A false activation can similarly be generated either because of the user opening his/her mouth at the wrong time or the algorithm erroneously picking up non-mouth objects as mouth opening. The infrared thermal switch validation study was carried out to focus on the validity of the presented computerized algorithm by isolating the algorithm-dependent errors from user-dependent errors.
The participants were ten able-bodied university students, (23.5 ± 2.12 years old, 6 females). None had prior experience with either a chin or infrared thermal switch. The purpose of the study was explained to the participants prior to the study and they provided written consent to participate. The participants' knowledge of the purpose of the study was unlikely to bias their test performance and their views on advantages and disadvantages of switches. Since the study objective was algorithm validation, we needed to focus on algorithm-dependent errors. Hence, able-bodied participants were recruited to minimize user-dependent errors.
The protocol consisted of 3 test sessions on separate days, with 6 trials per session. Each trial was 3 minutes in length. The task was to select a target number assigned by the experimenter from a pseudo-random sequence of numbers by opening and closing the mouth. Each sequence of numbers involved 60 cues out of which 10 were actionable cues, i.e., 10 target numbers for which the participant was expected to activate the switches. The numbers were presented every 2500 milliseconds, and in three stimulus modalities, namely, visual (i.e., number displayed on a computer screen), auditory (i.e., number articulated in a synthesized voice via the computer speakers), and audiovisual (i.e., both modalities simultaneously). In each session, there were two trials of each modality. The participants were asked to open and close their mouth upon observing and/or hearing the target number. The presentation modality order was randomized throughout the trials to minimize bias. To avoid learning or habituation effects, a different target number was used for each session. Overall, each participant received 180 actionable cues (3 sessions, 6 trials, 10 actionable cues). During each trial, the following data were automatically logged for both the chin and the infrared thermal switches: stimulus presentation time, response time, number of hits (true positives), true negatives, misses (false negatives), and false alarms (false positives). The protocol was approved by the Research Ethics Board of Holland-Bloorview Kids Rehabilitation Hospital and University of Toronto.
At the end of every session, the participants were asked to provide feedback about their experience with each switch. The participants were asked questions to solicit their subjective feedback and the research personnel took written notes of their answers. Specifically, they were asked to comment on their preference of switch type based on the postural requirements of switch use, the post-session level of fatigue, and the perceived disadvantages of using each switch over an extended period of time.
B.5. Data Analysis
To rigorously assess if user performance with the infrared thermal switch was significantly different from his/her performance with the chin switch, a two-sample Kolmogorov-Smirnov (KS) test with a 5% significance level was conducted. Two-way repeated measures analyses of variance investigated the effects of switch type, stimulus modality and their interaction on efficiency, and response time.
The two-way repeated measures ANOVA overwhelmingly indicated that there were no main effects of switch type (p = 0.98) or presentation modality (p = 0.78) on efficiency and no effect of their interaction on efficiency (p = 0.998).
The two-way repeated measures ANOVA also revealed no main effects of switch type (p = 0.25), or stimulus presentation modality (0.82) on the response time. Also, no interaction effect of switch type and stimulus presentation modality on the response time was found (p = 0.997).
All participants achieved comparably high efficiencies with both the chin and the infrared thermal switches. From Figure 3, all participants except one achieved an information transmission efficiency of greater than 80% with the infrared thermal switch. The two-sample KS test confirmed the validity of the infrared thermal switch.
From qualitative observations of participant behavior, we noticed that during the trials, participant 4 gradually tended to lean back in his seat, such that his chin became misaligned with the chin switch. As a result, his chin did not contact the switch when he opened his mouth. This resulted in many chin switch misses and thus the chin switch efficiency dropped as shown in Figure 3. Participant 10 reported difficulty in maintaining a posture that did not activate the chin switch unintentionally. This participant only opened her mouth slightly during the task, meaning that there was very limited chin excursion. As a consequence, she leaned forward to keep her chin very close to the chin switch. This physical arrangement generated many false positives as her chin contacted the chin switch even when her mouth was closed. Unsurprisingly, chin switch efficiency declined. Remarkably, in both these instances, the infrared thermal switch maintained a greater than 80% efficiency. These examples demonstrate that the infrared thermal switch is more robust than the conventional chin switch to user motion and changes in user posture. Thus, the infrared thermal switch may be a more appropriate option for people with severe and multiple motor disabilities who may have involuntary and spastic movements.
The absence of switch type and stimulus modality effects on performance efficiency implies that the infrared thermal switch can be as useful as a conventional chin switch, regardless of the sensory modality in which the information is presented to the switch user.
Most of the participants reported that the audiovisual stimulus modality was the easiest to follow and respond to, while the audio only stimulus was the most difficult. This agrees with the recent literature contending that the response time to bi-modal stimuli (e.g., audiovisual) is shorter than the response time to uni-modal stimuli (e.g., visual or auditory alone) . Some participants found the combination of chin switch and audio stimulus particularly challenging. Those participants closed their eyes and intently listened to the audio stimuli. While having their eyes closed did not affect their infrared thermal switch performance, it did cause participants to miss several chin switch activations simply because they lost track of the switch's physical location. This finding highlights the convenience of the infrared thermal switch for patients who may have vision impairments; they do not need to actively track the location of the switch relative to their chin.
Selected qualitative feedback (pros and cons) from participants.
Infrared Thermal Switch
Flexibility: switch works even in spite of user body motion and head rotation (as long as user face is in camera's field of view)
More sensitive switch
Possible to activate the switch with closed eyes
Reduced detection sensitivity immediately after taking cold drink or food (this problem is alleviated with time)
Low detection sensitivity due to considerable head rotation or change of posture
Reduced detection sensitivity if too much saliva in the mouth (this problem is alleviated with one swallow)
Too many false positives due to small head motion, talking or any activity that makes the chin touch the switch
Prolonged periods of switch operation may cause fatigue or jaw pain
May cause infection if it enters user mouth
May cause skin irritation
There are further advantages associated with the infrared thermal switch. Being a non-contact access technology, it is hygenic and does not involve the risk of contact injury. Infrared thermography is lighting and color invariant . Thus the infrared thermal switch can be used by people of all ethnicities, night or day, regardless of ambient lighting conditions. These benefits make the infrared thermal access technology an appealing access tool for people with severe motor disabilities. Any individual with voluntary control of his or her mouth movement may be a candidate for infrared thermal switch use. Indeed, like other binary switches, the infrared thermal switch can facilitate typing, game playing and communication, among other activities. Despite these benefits, there are limitations associated with this new technology, such as those indicated in Table 1. The findings in the present study are based on a modest sample of ten able-bodied adults. However, the small between-subject variation in efficiency values suggests that the sample was quite homogenous in terms of switch usage.
The practical issue of cost is worthy of mention. While fully radiometric infrared thermal imaging is considered a costly technology, the proposed infrared thermal switch can be affordably implemented given that only a grayscale intensity representation rather than the absolute temperature of the client's face is required. Hence, a less costly (approximately 2000 USD) non-radiometric camera can be used. In addition to the camera, very simple and inexpensive hardware (e.g. module latching relay and USB switch converter) have been used in the design of the infrared thermal switch. Infrared thermal cameras are usually designed to withstand harsh climates and rough handling . Nonetheless, repairing an infrared thermal camera that is beyond its warranty period can be a costly undertaking.
In addition to cost, the required in situ technical adjustments should also be considered. The client's orofacial temperature is likely to vary subsequent to the consumption of food or drink or with changes in ambient temperature. As a consequence, recalibration of the client's intensity and motion thresholds may be required at the start of each session.
In this paper, we presented an experiment with 10 able-bodied individuals to validate a novel infrared thermal switch algorithm. A previously proposed mutual information measure was used to quantify user performance. Users completed multiple trials of a number identification task using both infrared thermal and conventional chin switches concurrently. User performance was comparable between the proposed thermal switch and the gold standard chin switch, establishing the concurrent validity of the former. This is a clinically positive result given that the thermal switch offers the additional benefits of being non-contact and robust to user posture. The infrared thermal switch is valid with respect to a conventional chin switch and ought to be considered for clients with severe disabilities who retain voluntary mouth opening/closing.
The authors would like to acknowledge IBM Canada, Natural Sciences and Engineering Research Council of Canada, Ontario Centres of Excellence, Whipper Watson Scholarship, and Bloorview Children Hospital Foundation Scholarship for funding this research. We are grateful to Mr. Pierre Duez for programming the stimulus presentation software in Visual Basic, and Mr. Steven Bernstein for preparing Figure 2.
- Tai K, Blain S, Chau T: A review of emerging access technologies for individuals with severe motor impairments. Assist Technol 2008, 20: 204–219. 10.1080/10400435.2008.10131947View ArticleGoogle Scholar
- Jones M, Grogg K, Anschutz J, Fierman R: A sip-and-puff wireless remote control for the Apple iPod. Assist Technol 2008, 20(2):107–110. 10.1080/10400435.2008.10131937View ArticleGoogle Scholar
- Lancioni GE, O'Reilly MF, Singh NN, Sigafoos J, Tota A, Antonucci M, Oliva D: Children with multiple disabilities and minimal motor behavior using chin movements to operate microswitches to obtain environmental stimulation. Res Dev Disabil 2006, 27(3):290–298. 10.1016/j.ridd.2005.02.003View ArticleGoogle Scholar
- Betke M, Gips J, Fleming P: The camera mouse: visual tracking of body features to provide computer access for people with severe disabilities. IEEE Trans Neural Syst Rehabil Eng 1996, 10(1):1–10. 10.1109/TNSRE.2002.1021581View ArticleGoogle Scholar
- Perini E, Soria S, Prati A, Cucchiara R: FaceMouse: A human-computer interface for tetraplegic people. Lect Notes Comput Sci 2006, 3979: 99–108. full_textView ArticleGoogle Scholar
- Huang CN, Chen CH, Chung HY: Application of facial electromyography in computer mouse access for people with disabilities. Disabil Rehabil 2006, 28(4):231–237. 10.1080/09638280500158349View ArticleGoogle Scholar
- Chen Y, Lai JS, Luh JJ, Kuo TS: SEMG-controlled telephone interface for people with disabilities. J Med Eng Technol 2002, 26(4):173–176. 10.1080/03091900210127933View ArticleGoogle Scholar
- Leung B, Chau T: A multiple camera tongue switch for a child with severe spastic quadriplegic cerebral palsy. Disabil Rehabil Assist Technol 2010, 5(1):58–68. 10.3109/17483100903254561View ArticleGoogle Scholar
- Blain S, Mihailidis A, Chau T: Assessing the potential of electrodermal activity as an alternative access pathway. Med Eng Phys 2008, 30(4):498–505. 10.1016/j.medengphy.2007.05.015View ArticleGoogle Scholar
- Memarian N, Venetsanopoulos AN, Chau T: Infrared thermography as an access pathway for individuals with severe motor impairments. J Neuroeng Rehabil 2009, 6(11):1–8.Google Scholar
- Falk T, Chan J, Duez P, Teachman G, Chau T: Augmentative communication based on realtime vocal cord vibration detection. IEEE Trans Neural Syst Rehabil Eng 2010, 18(2):159–163. 10.1109/TNSRE.2009.2039593View ArticleGoogle Scholar
- American Psychological Association: Standards for Educational and Psychological Testing. 2nd edition. Washington, DC; 1992.Google Scholar
- Black R: Managing the Testing Process: Practical Tools and Techniques for Managing Hardware and Software Testing. 3rd edition. Hoboken, NJ: Wiley; 2009.Google Scholar
- ThermaCAM SC640 FLIR Systems Inc [http://www.flir.com/uploadedFiles/Thermography_APAC/Products/Product_Literture/SC640_Datasheet%20APAC.pdf]
- Otsu N: A threshold selection method from gray-level histograms. IEEE Trans Syst Man Cybern 1979, 9: 62–66. 10.1109/TSMC.1979.4310076View ArticleGoogle Scholar
- Koga T, Linuma K, Hirano A, Lijima Y, Ishiguro T: Motion-compensated interframe coding for video conferencing. In Proceedings of the National Telecommunication Conference: 29 November-03 December 1981. New Orleans, LA; 1981. G5.3.1–5.Google Scholar
- Wang Y, Ostermann J, Zhang Y: Video Processing and Communications. Upper Saddle River, NJ: Prentice Hall; 2002.Google Scholar
- Uematsu S: Symmetry of skin temperature comparing one side of the body to the other. Thermology 1986, 1: 4–7.Google Scholar
- Bailar JC, Meyer EA, Pool R, (Editors): Assessment of the NIOSH head and- face anthropometric survey of U.S. respirator users. Washington, DC: The National Academies Press; 2007.Google Scholar
- DeCarlo D, Metaxas D, Stone M: An anthropometric face model using variational techniques. Proceedings of SIGGRAPH:19–24 July 1998; Orlando, FL 1998, 67–74.Google Scholar
- Memarian N, Venetsanopoulos AN, Chau T: Mutual information as a measure of contextual effects on single switch use. The Open Rehabilitation Journal 2009, 2: 1–10. 10.2174/1874943700902010001View ArticleGoogle Scholar
- Hecht D, Reiner M, Halevy G: Multimodal virtual environments: response times, attention, and presence. Presence 2006, 15(5):515–523. 10.1162/pres.15.5.515View ArticleGoogle Scholar
- Steketee J: Spectral emissivity of the skin and pericardium. Phys Med Biol 1973, 18(5):686–694. 10.1088/0031-9155/18/5/307View ArticleGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.