A Comparative Study of Speech Rhythm Parameters in Identifying Persian-Speaking Speakers: Duration versus Intensity

Asadi, Homa

doi:10.22051/jlr.2023.45448.2370

Article type: Research article

Author

Homa Asadi

Assistant Professor of Linguistics, Department of Linguistics, University of Isfahan, Isfahan, Iran

Abstract

Previous studies have shown that speech rhythm parameters can distinguish speakers of various languages with different phonotactic structures. In Persian specifically, two classes of speech rhythm parameters have been examined so far: duration-based parameters and intensity-based parameters. Building on that earlier work, the present study undertakes a broader investigation of the speaker-specific capabilities of these parameters. To this end, using a multinomial logistic regression model, we examined various speech rhythm parameters in a corpus of 20 male Persian speakers, each of whom had produced 100 Persian sentences at a normal speaking rate. The findings indicated that duration-based parameters performed somewhat better than intensity-based parameters. This advantage may stem from the simple syllable structure of Persian and from its greater reliance on duration to signal lexical stress. The findings matter because, on the one hand, they help identify the parameters best suited to identifying Persian speakers and, on the other, they underline that language-specific properties must be taken into account in speaker identification studies.
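As an illustration of what a duration-based rhythm parameter looks like in practice, the sketch below computes three measures that are widely used in the rhythm literature: %V, Varco, and nPVI. The abstract does not list which parameters the study used, so the choice of these three, the function names, and the interval durations are assumptions made for illustration only. The same variability formulas could in principle be applied to per-syllable intensity values instead of durations.

```python
from statistics import mean, stdev


def percent_v(vocalic, consonantal):
    """%V: share of total utterance duration taken up by vocalic intervals."""
    total = sum(vocalic) + sum(consonantal)
    return 100.0 * sum(vocalic) / total


def varco(intervals):
    """Varco: sample standard deviation divided by the mean, times 100."""
    return 100.0 * stdev(intervals) / mean(intervals)


def npvi(intervals):
    """Normalized pairwise variability index over successive intervals."""
    pairs = zip(intervals, intervals[1:])
    terms = [abs(a - b) / ((a + b) / 2.0) for a, b in pairs]
    return 100.0 * mean(terms)


# Hypothetical interval durations in milliseconds for one sentence.
vocalic_ms = [80, 120, 95, 110]
consonantal_ms = [60, 70, 55, 90, 65]

print("%V     :", round(percent_v(vocalic_ms, consonantal_ms), 1))
print("VarcoV :", round(varco(vocalic_ms), 1))
print("nPVI-V :", round(npvi(vocalic_ms), 1))
```

Each sentence yields one value per measure, so a corpus of sentences per speaker gives a table of per-sentence features that a classifier can then use to tell speakers apart.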

Subjects

Phonetics and phonology

Article title [English]

Comparative Analysis of Speech Rhythm Measures for Persian Speaker Identification: Duration vs. Intensity

Author [English]

Homa Asadi

Assistant Professor of Linguistics, University of Isfahan, Isfahan, Iran

Abstract [English]

Previous studies have demonstrated that speech rhythm measures are effective for speaker identification across languages with different phonotactic structures. In Persian, two categories of speech rhythm measures have been examined: duration-based and intensity-based. Building on this prior work, the current study takes a closer look at how well each category discriminates between Persian speakers. To this end, a multinomial logistic regression model was fitted to a dataset of 20 male Persian speakers, each of whom produced 100 sentences at a normal speaking rate. The results showed that duration-based measures outperform intensity-based ones in distinguishing Persian speakers, although the advantage is slight. This finding sheds light on which rhythm measures are better suited to Persian speaker identification. I suggest that the difference in performance may be attributed to the simple syllable structure of Persian and to its lesser reliance on intensity as a primary cue to lexical stress. The study informs the choice of rhythm measures for Persian speaker identification and underscores the importance of considering language-specific features when developing speaker recognition systems.
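The classification step described in the abstract can be sketched as follows. This is a minimal, dependency-free multinomial logistic regression trained by gradient descent on synthetic data; it is not the author's implementation. The three simulated speakers, the two standardized features per sentence, the learning rate, and the number of epochs are all assumptions chosen to keep the example small.

```python
import math
import random


def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fit_multinomial_logit(X, y, n_classes, lr=0.1, epochs=300):
    """Fit one weight vector (bias stored last) per class by gradient descent."""
    n_features = len(X[0])
    W = [[0.0] * (n_features + 1) for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * (n_features + 1) for _ in range(n_classes)]
        for xi, yi in zip(X, y):
            xb = xi + [1.0]  # append constant 1 for the bias term
            scores = [sum(w * v for w, v in zip(W[k], xb)) for k in range(n_classes)]
            probs = softmax(scores)
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == yi else 0.0)
                for j, v in enumerate(xb):
                    grad[k][j] += err * v
        for k in range(n_classes):
            for j in range(n_features + 1):
                W[k][j] -= lr * grad[k][j] / len(X)
    return W


def predict(W, x):
    """Return the index of the speaker with the highest score."""
    xb = x + [1.0]
    scores = [sum(w * v for w, v in zip(Wk, xb)) for Wk in W]
    return scores.index(max(scores))


# Synthetic data: 3 hypothetical speakers, 40 sentences each,
# 2 standardized rhythm features per sentence (values are made up).
random.seed(0)
centers = [(-2.0, 0.0), (2.0, 0.0), (0.0, 2.5)]
X, y = [], []
for speaker, (cx, cy) in enumerate(centers):
    for _ in range(40):
        X.append([random.gauss(cx, 0.5), random.gauss(cy, 0.5)])
        y.append(speaker)

W = fit_multinomial_logit(X, y, n_classes=3)
accuracy = sum(predict(W, x) == t for x, t in zip(X, y)) / len(X)
print(f"training accuracy: {accuracy:.2f}")
```

To compare two families of measures in the spirit of the study, one would fit the model once on duration-based features and once on intensity-based features for the same speakers and sentences, then compare classification accuracy, ideally on held-out sentences rather than on the training data used here.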

Keywords [English]

forensic phonetics
speaker identification
speech rhythm measures
Persian language


Volume 15, Issue 49 - Serial Number 49
Esfand 1402 (February/March 2024)
Pages 61-82

Article views: 312
Full-text downloads: 258
