Article type: Research article
Author
Assistant Professor of Linguistics, Department of Linguistics, University of Isfahan, Isfahan, Iran
Abstract
Previous studies have shown that speech rhythm parameters can distinguish speakers of different languages with differing phonotactic structures. In Persian specifically, two categories of speech rhythm parameters have been examined so far: duration-based parameters and intensity-based parameters. Building on that earlier work, the present study undertakes a broader examination of the speaker-specific capabilities of these parameters. To this end, using a multinomial logistic regression model, we examined various speech rhythm parameters in a corpus of 20 male Persian speakers, each of whom produced 100 Persian sentences at a normal speaking rate. The findings indicated that duration-based parameters performed somewhat better than intensity-based parameters. It is possible that this advantage stems from the simple syllable structure of Persian and its greater reliance on duration to signal lexical stress. The findings matter because, on the one hand, they help identify the parameters best suited to identifying Persian speakers and, on the other hand, they emphasize that language-specific features must be taken into account in speaker identification studies.
Article title [English]
Comparative Analysis of Speech Rhythm Measures for Persian Speaker Identification: Duration vs. Intensity
Author [English]
- Homa Asadi
Assistant Professor of Linguistics, University of Isfahan, Isfahan, Iran
Abstract [English]
Previous studies have demonstrated the efficacy of speech rhythm measures in speaker identification across various languages with different phonotactic structures. In Persian in particular, two categories of speech rhythm metrics have been examined: duration-based and intensity-based. Building on this prior work, the current study examines more closely the speaker-discriminating capabilities of these two measure types in Persian. To this end, a multinomial logistic regression model was applied to a dataset comprising 20 male Persian speakers, each producing 100 sentences at a normal speaking rate. The findings revealed that, when distinguishing between Persian speakers, duration-based measures outperform intensity-based ones, although the advantage is slight. This observation is significant, as it sheds light on the suitability of specific rhythm metrics for Persian speaker identification. I postulate that this difference in performance may be attributed to the simple syllable structure of Persian and its lesser reliance on intensity as a primary indicator of lexical stress. This research offers insight into the choice of rhythm metrics for Persian speaker identification and underscores the importance of considering language-specific features when developing speaker recognition systems.
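To make the notion of a duration-based rhythm measure concrete, the following is a minimal sketch of three measures that are common in the rhythm literature: %V, ΔC and nPVI. The abstract does not say which measures were actually computed, so the choice of these three, the function names, the use of the population standard deviation, and the toy interval durations are all assumptions made for illustration only. They are not the author's procedure.

```python
from statistics import mean, pstdev


def percent_v(vocalic, consonantal):
    """%V: share of total duration taken up by vocalic intervals, in percent."""
    total = sum(vocalic) + sum(consonantal)
    return 100.0 * sum(vocalic) / total


def delta(intervals):
    """Standard deviation of interval durations (ΔC when given consonantal intervals).

    Assumption: the population standard deviation is used here.
    """
    return pstdev(intervals)


def npvi(intervals):
    """Normalized pairwise variability index over successive interval durations."""
    pairs = list(zip(intervals, intervals[1:]))
    # Each term is the absolute difference of a neighbouring pair,
    # normalized by the mean duration of that pair.
    return 100.0 * mean(abs(a - b) / ((a + b) / 2.0) for a, b in pairs)


if __name__ == "__main__":
    # Hypothetical interval durations in seconds for one sentence.
    vocalic = [0.10, 0.20]
    consonantal = [0.05, 0.15]
    print(percent_v(vocalic, consonantal))
    print(delta(consonantal))
    print(npvi(vocalic))
```

In a study of this kind, one such set of values per sentence would presumably serve as the predictors of a multinomial logistic regression whose outcome is speaker identity. The model fitting itself is not sketched here.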
Keywords [English]
- forensic phonetics
- speaker identification
- speech rhythm measures
- Persian language