- Open Access
A magnetic resonance imaging study on the articulatory and acoustic speech parameters of Malay vowels
© Zourmand et al.; licensee BioMed Central Ltd. 2014
- Received: 20 September 2013
- Accepted: 14 July 2014
- Published: 25 July 2014
The phonetic properties of six Malay vowels are investigated using magnetic resonance imaging (MRI) to visualize the vocal tract and obtain dynamic articulatory parameters during speech production. To resolve image blurring caused by tongue movement during the scanning process, a method based on active contour extraction is used to track tongue contours. The proposed method efficiently tracks tongue contours despite the partial blurring of MRI images. Consequently, the articulatory parameters are effectively measured as tongue movement is observed, and the specific shape and position of the tongue for all six uttered Malay vowels are determined.
Speech rehabilitation procedures demand a visually perceivable prototype of speech articulation. To validate the measured articulatory parameters against the acoustic theory of speech production, an acoustic analysis of the vowels uttered by the subjects was performed. When the acoustic speech and articulatory parameters of the uttered speech were examined, a correlation between formant frequencies and articulatory parameters was observed. The experiments revealed a positive correlation between the constriction location of the tongue body and the first formant frequency, as well as a negative correlation between the constriction location of the tongue tip and the second formant frequency. The results demonstrate that the proposed method is an effective tool for the dynamic study of speech production.
- Vocal tract shape
- Articulators’ movements
- Malay vowel sounds
- Active contour
- Acoustic parameters
- Formant frequencies
The investigation of articulator shape during speech production can facilitate the understanding of the mechanisms of speech production. According to the acoustical theory of speech production, understanding speech production requires treating the vocal tract as an acoustical tube whose cross-sectional area changes during the speech production process. Various studies were performed to support this theory when it was first suggested. From the 1940s to the 1970s, a large number of radiography experiments were conducted to collect data revealing the shape of the vocal tract during speech production. In subsequent research on acoustic speech production, the collected data were employed to develop early analog models of articulation. In the succeeding decades, continued research coupled with the advent of computers resulted in remarkable advancements in modeling the articulatory and acoustic processes. In addition, articulation models have been used to study more complex aspects of modeling, such as the three-dimensional shape of the tongue and its movements [3–10].
Different instruments have been used by researchers to measure the shapes of the vocal tract and articulators. The X-ray CT method is a powerful tool for this purpose. Because a considerable part of the entire vocal tract length can be observed by X-ray CT imaging, 3D information indicating the shape of the vocal tract, as well as tongue shape and movement patterns, is obtainable. Nevertheless, this method presents certain drawbacks that limit its use, chiefly the harmful effects of X-ray radiation on human subjects. Dynamic data on tongue movement in the oral cavity can be provided by the X-ray microbeam and electromagnetic articulography [12, 13], which are categorized as point-tracking tools. Ultrasound scanners can supply dynamic images of moving structures in the oral cavity, such as the tongue surface in both midsagittal and transverse planes [14, 15]. Nevertheless, ultrasound transmission properties limit the use of such devices to mapping anterior airway surfaces.
The disadvantages of the aforementioned methods motivated us to employ an MRI system in this study. One of the most significant advantages of MRI for non-medical purposes is its ability to provide images similar to those obtained by X-ray CT but without any side effects from radiation. MRI imposes no constraints on the positioning of a subject and can obtain images in different directions and at different angles. Images of each slice of the vocal tract are obtainable with a quality acceptable for the study of speech production.
Many studies have investigated the vocal tract, statically or dynamically, using MRI. Developments in magnetic resonance imaging technology have made the investigation of articulators during speech production feasible. Real-time MRI for speech production has been studied in different languages [16–18], such as French, German, Swedish, European Portuguese [22, 23], Finnish, Czech and Japanese. For the Malay language, however, no such research has been performed. Here, a dynamic study of prolonged Malay vowels is performed. Investigating the production of Malay vowels would be helpful in diagnosing articulation disorders. In particular, such data could serve as a reference for normal vowel pronunciation against which other data can be compared to detect disorders. K-space acquisition techniques, including partial Fourier and spiral acquisition, are frequently used to increase temporal MR resolution [16, 27, 28]. Information provided by different vocal tract measurement techniques has been used to develop biomechanical simulation tools that model the movements of the vocal tract muscles [29, 30]. One such simulation tool has been employed in further studies to determine the functions of vocal tract organs [32–35].
However, blurring of some parts of the acquired image remains a drawback of this technology, because the subject must stay stationary during the scanning time (see [36, 37] for the challenges in the MRI study of articulation). As a remedy for the blurring problem, some studies employ a stroboscopic method to recapture images of the same speech in different periods in order to produce a reliable MR sequence. However, this method has apparent limitations. For example, the speaker needs to repeat the utterances several times, and not all speaker mistakes can be avoided because exact repetition is impossible. In other words, a main bottleneck for this research is that many factors affecting articulation change from one speaker to another, which is referred to as interspeaker variability. This variability can be categorized into anatomical and psychological features [28, 40–42]. In addition, no study of Malay speech production based on dynamic MRI has been performed so far. Consequently, this study is considered a pioneer in the dynamic study of speech articulation in the Malay language based on MRI.
MRI, however, presents certain disadvantages, such as the duration of the scanning process. Scanning sometimes takes several minutes, which can be tedious for the subject. In the study of the pronunciation of phonemes, the subject is required to utter the speech sound several times, which can be strenuous. Additionally, because of partial blurring, the images obtained by MRI are sometimes of unacceptable quality. Another drawback of MRI is the low image contrast between airways and tissues with low hydrogen content. Consequently, segmenting the scanned image to determine the regions occupied by airways (such as the oral cavity) can result in errors. In MRI, the quality of an object in an image depends on the thickness of the scanned tissue; MRI usually provides clear, undistorted images only for structures at least 3 mm thick. Moreover, the loud sound produced by the gradient coils during scanning interferes with the subject's voice during the recording process. Despite these drawbacks, an MRI system provides valuable information on the vocal tract shape formed as the subject utters speech. To address the image-blurring problem during the scanning process, this study proposes image processing techniques, including active contours, for the use of MRI in studying articulation. The results indicate that these techniques enable the efficient measurement of articulation parameters.
Research was previously conducted using a 3D reconstruction of the vocal tract (from MR images) for speech simulation. That study employed the region growing method to obtain axial slices of the vocal tract. However, as slices of the vocal tract are acquired, the tongue performs several partial movements while the subject pronounces a phoneme, because staying absolutely still for a prolonged time is difficult. Consequently, scanned images of certain regions on the tongue boundaries may be of insufficient quality, given that even minor tongue movement blurs the scanned images. Thus, the accuracy of evaluating the vocal tract slices by region growing techniques decreases. As a remedy, researchers have suggested the use of human operators to trace the boundaries of the oral cavity and region growing methods that require the determination of initial seeds in the growing regions. Most of the relevant methods in the literature [22, 40, 43, 46–48] are semi-automatic and consequently require human intervention, making the process tiresome for specialists and, therefore, prone to error. In this paper, we employ an active contour that focuses on tongue tracking. By determining the number of control points of the active contour with an automated method, we control its degrees of freedom, thereby enabling a smooth and relatively accurate evaluation of the tongue boundary even when this boundary is partially blurred in MR images.
Active contours, or “snakes”, are mathematical models that define deformable curves on the image domain. These methods, categorized as deformable models, are of special interest for medical image segmentation [23, 47, 49]. In this framework, internal and external forces influence the deformation of the curves. Internal forces are dynamically defined and computed from the curve characteristics, and external forces are obtained mostly from the image in which the active contour is applied.
According to the literature, active contours are divided into two categories: geometric [50–55] and parametric active contours [56, 57]. In 1987, Kass et al. were the first to develop an active contour based on the energy minimization of splines subject to external constraints, including energies defined by the image edges that deform the curves. To smooth the curves, the authors defined an internal energy based on curvature. However, the weak points of their active contour model, including its sensitivity to the selection of initial points and its inability to track non-convex objects, motivated modifications to the model.
Williams and Shah introduced the greedy snake algorithm. They employed a fully discrete method to compute the movement of the snake: the neighborhood pixels of each snake point are examined to identify the move with the minimum obtainable energy. Furthermore, an efficient method for evaluating the curvature of discrete curves was employed.
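The greedy scheme can be sketched as follows. This is an illustrative minimal implementation under our own naming, not the authors' code: each control point moves to the pixel in its 3 × 3 neighborhood that minimizes a weighted sum of a continuity term (even spacing), a curvature term (smoothness), and an image term (e.g., a negative gradient magnitude, so that edges attract the snake).

```python
import numpy as np

def greedy_snake_step(points, energy_img, alpha=1.0, beta=1.0, gamma=1.0):
    """One greedy-snake iteration: each control point moves to the pixel in
    its 3x3 neighborhood minimizing continuity + curvature + image energy."""
    n = len(points)
    # mean inter-point spacing of the closed contour, used by the continuity term
    d_mean = np.mean(np.linalg.norm(np.diff(points, axis=0, append=points[:1]), axis=1))
    new_pts = points.copy()
    moved = 0
    for i in range(n):
        prev_pt = new_pts[(i - 1) % n]      # already-updated predecessor
        next_pt = points[(i + 1) % n]
        best, best_e = points[i], np.inf
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                cand = points[i] + np.array([dy, dx])
                y, x = cand
                if not (0 <= y < energy_img.shape[0] and 0 <= x < energy_img.shape[1]):
                    continue
                e_cont = (np.linalg.norm(cand - prev_pt) - d_mean) ** 2     # even spacing
                e_curv = np.linalg.norm(prev_pt - 2 * cand + next_pt) ** 2  # smoothness
                e = alpha * e_cont + beta * e_curv + gamma * energy_img[y, x]
                if e < best_e:
                    best_e, best = e, cand
        if not np.array_equal(best, points[i]):
            moved += 1
        new_pts[i] = best
    return new_pts, moved
```

Iterating this step until only a few points move per pass gives the usual greedy-snake stopping behavior.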
In our experiments to investigate tongue shape and movement, the materials considered comprise the pronunciation of a preselected set of Malay vowels. To this end, our subjects were asked to lie in an MRI scanner and pronounce the Malay vowels. The mouth region of the head, including the oral cavity, tongue, and lips, was examined during the experiments. The active contour employed in this approach is required to track the tongue in the MRI frames. To avoid the lengthy computations of more sophisticated active contour algorithms, the greedy active contour model was employed. Image preprocessing techniques, including morphological filtering, were applied to the MR images to ensure effective performance despite partial image blurring.
MRI scanning parameters and image acquisition protocol
MR parameters for vocal tract image acquisition:
- MR echo, using an 8-channel cardiac coil
- ETL (echo train length)
- Image matrix: 256 × 256 pixels
Anatomical information of subjects
To reduce image blurring during image acquisition, the subjects were required to maintain the vocal tract shape (i.e., hold the mouth position constant for a certain period) as they pronounced the vowels. Prior to the scanning, the subjects performed phonation practice. The scanning protocol included the following measures. To reduce the intensity of the sounds heard during the imaging process, the subjects were asked to use earplugs.
Afterwards, they were positioned comfortably on the MRI table. Pieces of cloth were placed under their heads to keep head movement to a minimum. We positioned the heads of the subjects in the center of the magnet. An essential experimental condition is that the head, particularly the upper jaw of the subject, must not move during the experiments. Prior to each image acquisition session, a sagittal localizer was used to provide an appropriate field of view for the scanning location. Subject utterances during the scanning were recorded, but owing to environmental noise, the recordings were not reliable.
To conduct a dynamic study of vowel production, we asked the subjects to pronounce several repetitions of six prolonged Malay vowel sequences (/a/, /e/, /ә/, /i/, /o/ and /u/) during the scanning process. In addition to the MRI scanning process, for the acoustical analysis of the speech, the subjects were asked to pronounce the same Malay vowels for 5 s each at a comfortable pitch and loudness level. The speech sounds were recorded using a Shure SM58 microphone in a regular room environment. The mouth-to-microphone distance was fixed at 2–3 cm. GoldWave digital audio editor software was used to record the speech sounds at a sampling rate of 20 kHz with 16-bit resolution. There was no co-articulation in either the speech recording or the MRI scanning process. To date, no dynamic MRI-based study has been performed on the production of prolonged Malay vowels.
Formant frequency extraction
Besides the MRI data for the study of the articulatory parameters, the Praat software was used to determine the formant frequencies of the prolonged vowels from the recordings of the subjects. The following standard formant settings were used: a maximum formant frequency of 5500 Hz for female and 5000 Hz for male subjects, five formants, a window length of 25 milliseconds, and a dynamic range of 30 dB. Formant frequencies can be extracted in Praat in two ways, namely, manual extraction and automatic extraction using Praat scripting. In this study, the formant frequencies were obtained with the automatic method, and average values were used instead of the middle-point value; this decreased the possible error of the Praat formant calculation because, instead of one point per sample, several points were extracted from each sample and then averaged. The number of points used for each sample depended on the sample length and was equal to the length of the sample divided by the length of the window frame (25 milliseconds).
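Praat's formant tracker is based on linear predictive coding (LPC). The sketch below is not Praat's algorithm, but it illustrates the same principle with assumed parameter choices: fit an LPC model to an analysis frame and read formant candidates off the pole angles, discarding wide-bandwidth poles. On a synthetic two-resonance signal it recovers the resonance frequencies.

```python
import numpy as np
from scipy.linalg import solve_toeplitz
from scipy.signal import lfilter

def lpc_formants(frame, fs, order=8, max_bw=400.0):
    """Estimate formant frequencies (Hz) of one analysis frame from the angles
    of the LPC poles (a full tracker would also pre-emphasize and window)."""
    # autocorrelation method: solve the Yule-Walker (Toeplitz) equations
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:len(frame) + order]
    a = solve_toeplitz(r[:order], -r[1:order + 1])
    poles = np.roots(np.concatenate(([1.0], a)))
    poles = poles[poles.imag > 0]                 # keep one of each conjugate pair
    freqs = np.angle(poles) * fs / (2 * np.pi)
    bws = -np.log(np.abs(poles)) * fs / np.pi     # pole 3 dB bandwidths
    return np.sort(freqs[(freqs > 90) & (bws < max_bw)])

# synthetic "vowel": impulse response of two resonators (500 Hz and 1500 Hz)
fs = 10000
a_poly = np.array([1.0])
for f, bw in [(500.0, 60.0), (1500.0, 90.0)]:
    r_pole = np.exp(-np.pi * bw / fs)
    a_poly = np.convolve(a_poly, [1.0, -2 * r_pole * np.cos(2 * np.pi * f / fs), r_pole ** 2])
x = lfilter([1.0], a_poly, np.r_[1.0, np.zeros(1023)])
f1, f2 = lpc_formants(x, fs)[:2]
```

For real recordings, one would analyze successive 25 ms frames and average the per-frame estimates, mirroring the averaging described above.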
Instrumentation and data collection
In a large number of MRI studies [27, 43, 45, 59], the authors give little attention to contour extraction from MRI frames. The reason may be an implicit assumption that high-resolution MR images with acceptable contrast and quality are collected, so that image processing software can extract contours for the quantitative investigation of articulatory parameters. In general, however, this supposition does not hold. As the tongue moves during imaging, blurring is unavoidable. Under these circumstances, extracting tongue contours beforehand is a challenging task.
Numerous methods are used to enhance the acquisition of MR image sequences, and appropriate trigger systems have been proposed. In clinical practice, a triggering method based on electrocardiogram monitoring is used in some studies [43, 59]. To increase the temporal resolution for real-time imaging, researchers have put forward other techniques [16, 27, 28]. In these methods, images are acquired at different speeds on the basis of ultrafast imaging sequences, and multiple echoes are employed during the imaging process. However, because of partial subject motion during the scanning process, motion artifacts are observed in the resulting images.
To resolve blurring in MR images, we propose an active contour-based method for extracting tongue contours in MRI frames. By determining the control points of active contours, the tongue contours can be traced even when the tongue is partially blurred. If the blurring is not severe, the traced contours are reliable for the experiments. Otherwise, the blurred frames are ignored and other frames are used for analysis.
E_ext(x, y) = −|∇(G_σ(x, y) ∗ I(x, y))|²

where G_σ(x, y) stands for a two-dimensional Gaussian blurring filter with a standard deviation of σ, I(x, y) is the image intensity, and ∗ denotes convolution. The filter is used to blur the image gradient, thereby allowing the image gradient to influence the snake from a larger distance.
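The Gaussian-blurred gradient energy described here can be computed directly with standard tools; a minimal sketch (our own helper, not the paper's code):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def external_energy(image, sigma=3.0):
    """E_ext = -|grad(G_sigma * I)|^2: most negative (most attractive) along
    strong edges; a larger sigma lets edges pull the snake from farther away."""
    smoothed = gaussian_filter(image.astype(float), sigma)
    gy, gx = np.gradient(smoothed)
    return -(gx ** 2 + gy ** 2)
```

A snake point that minimizes this energy is therefore drawn toward intensity boundaries such as the tongue surface.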
(1/n) Σ_{s=1}^{n} ‖v(s)_t − v(s)_{t−1}‖ < th_stop

where the vector v(s)_t contains the snake points at time step t and v(s)_{t−1} contains the snake points at time step t − 1; n and th_stop denote the total number of control points in the snake and a threshold for the stopping criterion, respectively.
Tongue properties from an articulatory perspective
As mentioned, the upper boundary of the tongue is critical for producing vowel sounds. As a result, the active contour aims at tracking the upper boundary of the tongue. For this purpose, some preprocessing steps such as dilation and erosion operations are performed to obtain the initial points for the active contour. The initial points of the active contour employed in this study are divided into two groups: upper initial points and lower initial points.
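As an illustration of such preprocessing (a sketch with a synthetic mask standing in for a segmented tongue region, not the study's exact pipeline), the morphological gradient of a binary mask, i.e., its dilation minus its erosion, gives a thin boundary band from which upper and lower initial points can be read off per image column:

```python
import numpy as np
from scipy.ndimage import binary_dilation, binary_erosion

def boundary_initial_points(mask):
    """Morphological gradient (dilation minus erosion) of a binary mask gives
    the boundary band; split it into upper/lower points per image column."""
    boundary = binary_dilation(mask) & ~binary_erosion(mask)
    upper, lower = {}, {}
    for y, x in zip(*np.nonzero(boundary)):
        if x not in upper or y < upper[x]:
            upper[x] = y        # smallest row index in this column
        if x not in lower or y > lower[x]:
            lower[x] = y        # largest row index in this column
    return sorted(upper.items()), sorted(lower.items())

# hypothetical blob standing in for a segmented tongue region
mask = np.zeros((20, 30), dtype=bool)
mask[8:15, 5:25] = True
upper_pts, lower_pts = boundary_initial_points(mask)
```

The upper points then serve to initialize the active contour that tracks the upper tongue boundary.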
Obtaining initial points for the active contour
At first glance, the oral cavity area is largest when the vowel /a/ is pronounced. Comparing the tongue structure for the vowels /a/ and /ә/, raising of the tongue tip and tongue body is observable for /ә/. In contrast, the tongue back is raised in the articulation of /e/ compared with /a/ and /ә/. Front raising of the tongue in /i/ is considerable. The tongue shapes for /o/ and /u/ are quite similar; both vowels show considerable raising of the tongue back, but the observable difference is the lip aperture, which is larger for /o/. In summary, back raising dominates in the vowels /o/, /u/, and /e/, while front raising dominates in /i/ and /ә/.
Tongue tip constriction location (TTCL),
Tongue body constriction location (TBCL), and
Lip aperture (LA; distance between the upper and lower lips).
The measurements use a coordinate system based on the palatal plane, a standard anatomical plane in the midsagittal slice drawn as a line from the anterior nasal spine to the posterior nasal spine.
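This convention can be made concrete as a change of coordinates: express each tongue point in a frame whose origin is the anterior nasal spine (ANS) and whose x-axis points toward the posterior nasal spine (PNS); the second coordinate is then the signed offset from the palatal plane. A minimal sketch with hypothetical landmark coordinates:

```python
import numpy as np

def to_palatal_frame(points, ans, pns):
    """Express points in a frame with origin at ANS and x-axis along ANS->PNS;
    the second coordinate is the signed offset from the palatal plane (its
    sign convention follows the image's y-axis direction)."""
    points, ans, pns = (np.asarray(p, dtype=float) for p in (points, ans, pns))
    x_axis = (pns - ans) / np.linalg.norm(pns - ans)
    y_axis = np.array([-x_axis[1], x_axis[0]])   # x_axis rotated by 90 degrees
    rel = points - ans
    return np.stack([rel @ x_axis, rel @ y_axis], axis=1)

# hypothetical landmarks (mm, midsagittal image coordinates) and one tongue point
ans, pns = (10.0, 40.0), (60.0, 40.0)
coords = to_palatal_frame([(30.0, 25.0)], ans, pns)
```

Distances such as TTCL, TBCL, and LA can then be measured consistently across frames in this landmark-anchored frame.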
Articulatory parameters obtained in this study
Measurement (mm) of articulatory parameters by frame number
As Figure 7(a) shows, the TTCL values of vowels /i/ and /ә/ are the lowest among the vowels because they are front vowels, and the TTCL parameter is lower for front vowels. Conversely, the back vowels /o/ and /a/ have the highest TTCL. In addition, Figure 7(b) presents the TBCL, which represents the height of the tongue in the articulation of the different vowels. Given that /i/ is a high vowel, its TBCL value is the lowest, while the vowel /a/, which is a low vowel, has the highest TBCL. Moreover, Figure 7(c) presents the LA value, which represents the lip aperture for the different vowels. The highest LA value is generated for the vowel /a/, while the lowest is observed for the vowel /u/. This result is attributed to the requirement that the lips be completely open when the vowel /a/ is articulated, but closer together when the vowel /u/ is produced.
Formant frequency values with STD (Hz; one row per vowel, in the order listed in the text):
- /a/: 420.48 ± 13.57, 767.95 ± 12.25, 1231.05 ± 25.17, 1562.41 ± 27.54
- /e/: 358.12 ± 6.03, 692.99 ± 7.03, 1578.04 ± 24.33, 2059.27 ± 22.54
- /ә/: 333.12 ± 2.72, 654.43 ± 2.34, 1324.39 ± 50.29, 1711.98 ± 48.65
- /i/: 228.99 ± 0.60, 490.61 ± 0.04, 1738.69 ± 4.63, 2624.81 ± 4.21
- /o/: 371.59 ± 5.26, 620.58 ± 5.65, 918.09 ± 2.03, 1245.83 ± 2.38
- /u/: 326.49 ± 5.21, 502.72 ± 4.86, 754.98 ± 25.44, 1157.55 ± 23.76
In the study of speech articulation, MR imaging yields helpful and precise information on the shape of the articulators, as well as their position during speech production. Moreover, their dynamics can be appropriately investigated to study their temporal functions during articulation. However, the movement of the articulators demands a higher temporal imaging resolution for more accurate quantification. In this study, an approach to this problem was proposed and examined, based on an image processing technique that uses active contours. After applying preprocessing methods to the MR images, we obtained the initial points for the active contours. Afterwards, the active contour was applied to the MRI frames. Consequently, the tongue contour was appropriately traced for the study of the speech articulation parameters.
In the experiments, six Malay vowels were produced by the male and female subjects, and the articulatory parameters were measured using the proposed algorithm. The specific tongue shape and position for all six Malay vowels were also obtained. The experiments demonstrated correlations between the acoustic speech and articulatory parameters. Specifically, the first formant frequency (F1) was positively correlated with TBCL, whereas the second formant frequency (F2) was negatively correlated with TTCL. The observations from this study can inform research on speech synthesis techniques. Furthermore, they can improve the understanding of speech articulation in the Malay language, which is useful for clinical purposes such as the diagnosis of speech disorders and speech rehabilitation procedures.
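Relationships of this kind are Pearson correlations between the formant and articulatory measurement series. A toy illustration with made-up numbers (the TBCL values below are invented for the example, not measured data):

```python
import numpy as np

# illustrative values only: F1-like frequencies (Hz) and invented TBCL (mm)
f1 = np.array([768.0, 693.0, 654.0, 491.0, 621.0, 503.0])
tbcl = np.array([23.0, 19.5, 18.0, 12.0, 17.0, 13.5])

# Pearson correlation coefficient; a value near +1 indicates a positive
# F1-TBCL relationship of the kind reported in the text
r = np.corrcoef(f1, tbcl)[0, 1]
```

The same computation with F2 and TTCL series would yield a negative coefficient under the reported relationship.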
The authors would like to thank the Ministry of Science, Technology and Innovation, Malaysia (MOSTI) for funding this study under the Science Fund (Project No: 06-01-03-SF0516) and University of Malaya under UMRG grant (RP016A/13AET). The authors would like to thank Mr. Mohd Yushafizal Mohd Yusof, University Malaya Medical Center (UMMC) for obtaining the MRI images.
- Fant G: Acoustic theory of speech production. 's-Gravenhage, The Netherlands: Mouton; 1960.
- Perkell JS: Physiology of Speech Production: Results and Implications of a Quantitative Cineradiographic Study. Cambridge, Mass: MIT Press; 1969.
- Kim H: Gradual tongue movements in Korean Palatalization as coarticulation: new evidence from stroboscopic cine-MRI and acoustic data. J Phon 2012, 40: 67–81.
- Takano S, Honda K: An MRI analysis of the extrinsic tongue muscles during vowel production. Speech Comm 2007, 49: 49–58.
- Kim H, Honda K, Maeda S: Stroboscopic-cine MRI study of the phasing between the tongue and the larynx in the Korean three-way phonation contrast. J Phon 2005, 33: 1–26.
- Kim H, Maeda S, Honda K: Invariant articulatory bases of the features [tense] and [spread glottis] in Korean plosives: New stroboscopic cine-MRI data. J Phon 2010, 38: 90–108.
- Chiba T, Kajiyama M: The Vowel: Its Nature and Structure. Tokyo: Phonetic Society of Japan, Kaiseikan; 1941.
- Stevens KN, Kasowski S, Fant CGM: An electrical analog of the vocal tract. J Acoust Soc Am 1953, 25: 734–742.
- Mermelstein P: Articulatory model for the study of speech production. J Acoust Soc Am 1973, 53(4): 1070–1082.
- Kelly JL, Lochbaum CC: Speech synthesis. Paper G42. In Proceedings of the Fourth International Congress on Acoustics. Copenhagen, Denmark; 1962: 1–4.
- Iskarous K: Patterns of tongue movement. J Phon 2005, 33: 363–381.
- Schonle PW, Grabe K, Wenig P, Hohne J, Schrader J, Conrad B: Electromagnetic articulography: use of alternating magnetic fields for tracking movements of multiple points inside and outside the vocal tract. Brain Lang 1987, 31: 26–35.
- Iskarous K: Vowel constrictions are recoverable from formants. J Phon 2010, 38: 375–387.
- Shawker TH, Sonies BC: Tongue movement during speech: a real-time ultrasound evaluation. J Clin Ultrasound 1984, 12: 125–133.
- Stone M: A three-dimensional model of tongue movement based on ultrasound and x-ray microbeam data. J Acoust Soc Am 1990, 87: 2207–2217.
- Demolin D, Hassid S, Metens T, Soquet A: Real-time MRI and articulatory coordination in speech. Comptes Rendus Biologies 2002, 325: 547–556.
- Engwall O: From real-time MRI to 3D tongue movements. Interspeech 2004.
- Baer T, Gore J, Boyce S, Nye P: Application of MRI to the analysis of speech production. Magn Reson Imaging 1987, 5: 1–7.
- Badin P, Serrurier A: Three-dimensional modeling of speech organs: Articulatory data and models. Trans Tech Commit Psychol Physiol Acoustics 2006, 36: 421–426.
- Behrends J, Hoole P, Leinsinger GL, Tillmann HG, Hahn K, Reiser M, Wismüller A: A segmentation and analysis method for MRI data of the human vocal tract. In Bildverarbeitung für die Medizin. Berlin Heidelberg: Springer; 2003: 186–190.
- Engwall O: A 3D tongue model based on MRI data. Interspeech 2000, 901–904.
- Ventura SMR, Vasconcelos MJM, Freitas DRS, Ramos IMA, Tavares JMR: Speaker-specific articulatory assessment and measurements during Portuguese speech production based on magnetic resonance images. Language Acquisition 2012.
- Rua Ventura SM, Freitas DRS, Ramos IMA, Tavares JMR: Morphologic differences in the vocal tract resonance cavities of voice professionals: an MRI-based study. J Voice 2013, 27(2): 132–140.
- Palo P, Aalto D, Aaltonen O, Happonen R-P, Malinen J, Vainio M: Articulating Finnish vowels: results from MRI and sound data. Linguistica Uralica 2012, 3: 194–199.
- Vampola T, Horacek J, Svec JG: FE modeling of human vocal tract acoustics. Part I: Production of Czech vowels. Acta Acustica United Acustica 2008, 94: 433–447.
- Takemoto H, Honda K, Masaki S, Shimada Y, Fujimoto I: Measurement of temporal changes in vocal tract area function during a continuous vowel sequence using a 3D cine-MRI technique. 6th Int Seminar on Speech Production 2003, 284–289.
- Narayanan S, Nayak K, Lee S, Sethy A, Byrd D: An approach to real-time magnetic resonance imaging for speech production. J Acoust Soc Am 2004, 115: 1771.
- Mády K, Sader R, Zimmermann A, Hoole P, Beer A, Zeilhofer H, Hannig C: Assessment of consonant articulation in glossectomee speech by dynamic MRI. In Proceedings of the 7th International Conference on Spoken Language Processing (ICSLP), Denver, CO; 2002: 961–964.
- Story BH: Comparison of magnetic resonance imaging-based vocal tract area functions obtained from the same speaker in 1994 and 2002. J Acoust Soc Am 2008, 123: 327–335.
- Story BH, Titze IR, Hoffman EA: Vocal tract area functions from magnetic resonance imaging. J Acoust Soc Am 1996, 100: 537–554.
- Stavness I, Lloyd JE, Payan Y, Fels S: Coupled hard–soft tissue simulation with contact and constraints applied to jaw–tongue–hyoid dynamics. Int J Numerical Methods Biomed Eng 2011, 27: 367–390.
- Gick B, Stavness I, Chiu C, Fels S: Categorical variation in lip posture is determined by quantal biomechanical-articulatory relations. Canadian Acoustics 2011, 39: 178–179.
- Story BH, Bunton K: Simulation and identification of vowels based on a time-varying model of the vocal tract area function. In Vowel Inherent Spectral Change. Berlin Heidelberg: Springer; 2013: 155–174.
- Guzman M, Laukkanen A-M, Krupa P, Horáček J, Svec JG, Geneid A: Vocal tract and glottal function during and after vocal exercising with resonance tube and straw. J Voice 2013, 27: 523.e519–523.e534.
- Kivelä A, Kuortti J, Malinen J: Resonances and mode shapes of the human vocal tract during vowel production. Proceedings of 26th Nordic Seminar on Computational Mechanics, to appear, 2013.
- Aalto D, Malinen J, Palo P, Aaltonen O, Vainio M, Happonen R-P, Parkkola R, Saunavaara J: Recording speech sound and articulation in MRI. Biodevices 2011, 168–173.
- Aalto D, Aaltonen O, Happonen R-P, Jääsaari P, Kivelä A, Kuortti J, Luukinen J-M, Malinen J, Murtola T, Parkkola R: Measurement of acoustic and anatomic changes in oral and maxillofacial surgery patients. arXiv preprint arXiv:1309.2811, 2013.
- Mathiak K, Klose U, Ackermann H, Hertrich I, Kincses WE, Grodd W: Stroboscopic articulography using fast magnetic resonance imaging. Int J Lang Commun Disord 2000, 35: 419–425.
- Vasconcelos MJ, Ventura SM, Freitas DR, Tavares JMR: Inter-speaker speech variability assessment using statistical deformable models from 3.0 Tesla magnetic resonance images. Proc Inst Mech Eng H J Eng Med 2012, 226: 185–196.
- Crary MA, Kotzur IM, Gauger J, Gorham M, Burton S: Dynamic magnetic resonance imaging in the study of vocal tract configuration. J Voice 1996, 10: 378–388.
- Di Girolamo M, Corsetti A, Laghi A, Ferone E, Iannicelli E, Rossi M, Pavone P, Passariello R: Assessment with magnetic resonance of laryngeal and oropharyngeal movements during phonation. La Radiologia Medica 1996, 92: 33.
- Engwall O: A revisit to the application of MRI to the analysis of speech production: testing our assumptions. Proc of 6th International Seminar on Speech Production 2003, 43–48.
- Ventura SMR, Freitas DRS, Tavares JMR: Toward dynamic magnetic resonance imaging of the vocal tract during speech production. J Voice 2011, 25: 511–518.
- Baer T, Gore JC, Gracco LC, Nye PW: Analysis of vocal-tract shape and dimensions using magnetic-resonance-imaging: vowels. J Acoust Soc Am 1991, 90: 799–828.
- Xiaofeng L, Murano E, Stone M, Prince JL: HARP tracking refinement using seeded region growing. 4th IEEE International Symposium on Biomedical Imaging: From Nano to Macro (ISBI 2007); 12–15 April 2007: 372–375.
- Stone M, Davis E, Douglas A, Ness Aiver M, Gullapalli R, Levine W, Lundberg A: Modeling tongue surface contours from cine-MRI images. J Speech Lang Hear Res 2001, 44: 1026–1040.
- Ma Z, Tavares JMR, Jorge RN, Mascarenhas T: A review of algorithms for medical image segmentation and their applications to the female pelvic cavity. Comput Methods Biomech Biomed Engin 2010, 13: 235–246.
- Vasconcelos MJM, Ventura SR, Freitas DRS, Tavares JMR: Using statistical deformable models to reconstruct vocal tract shape from magnetic resonance images. Proc Inst Mech Eng H J Eng Med 2010, 224: 1153–1163.
- Ventura S, Freitas D, Tavares JMR: Application of MRI and biomedical engineering in speech production study. Comput Methods Biomech Biomed Engin 2009, 12: 671–681.
- Osher SJ, Sethian JA: Fronts propagating with curvature-dependent speed: algorithms based on Hamilton-Jacobi formulations. J Comput Phys 1988, 79: 12–49.
- Malladi R, Sethian JA, Vemuri BC: Shape modeling with front propagation: a level set approach. IEEE Trans Pattern Anal Mach Intell 1995, 17: 158–175.
- Caselles V, Kimmel R, Sapiro G: Geodesic active contours. Proceedings of the Fifth International Conference on Computer Vision; 20–23 June 1995: 694–699.
- Kichenassamy S, Kumar A, Olver P, Tannenbaum A, Yezzi A: Conformal curvature flows: from phase transitions to active vision. Arch Rational Mech Anal 1996, 134: 275–301.
- Siddiqi K, Lauziere YB, Tannenbaum A, Zucker SW: Area and length minimizing flows for shape segmentation. IEEE Trans Image Process 1998, 7: 433–443.
- Xu CY, Prince JL: Snakes, shapes, and gradient vector flow. IEEE Trans Image Process 1998, 7: 359–369.
- Kass M, Witkin A, Terzopoulos D: Snakes: active contour models. Int J Comput Vis 1987, 1: 321–331.
- Williams DJ, Shah M: A fast algorithm for active contours and curvature estimation. CVGIP: Image Understanding 1992, 55: 14–26.
- Boersma P, Weenink D: Praat: doing phonetics by computer (Version 5.1.05) [Computer program]. 2009.
- Browman CP, Goldstein L: Articulatory gestures as phonological units. Phonology 1989, 6: 201–251.
- Tiilikainen NP: A Comparative Study of Active Contour Snakes. Copenhagen University, Denmark; 2007.
- Johnson MH, Pizza S, Alwan A, Cha JS: Vowel category dependence of the relationship between palate height, tongue height, and oral area. J Speech Lang Hear Res 2003, 46: 738–753.
- Mokhtari P, Kitamura T, Takemoto H, Honda K: Principal components of vocal-tract area functions and inversion of vowels by linear regression of cepstrum coefficients. J Phon 2007, 35: 20–39.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.