Previous studies have demonstrated the efficacy of speech rhythm measures in speaker identification across languages with different phonotactic structures. For Persian in particular, two categories of speech rhythm metrics have been examined: duration-based and intensity-based. Building on this prior work, the current study examines more closely how well these two types of measure discriminate between Persian speakers. To this end, a multinomial logistic regression model was fitted to a dataset of 20 male Persian speakers, each reciting 100 sentences at a normal speaking pace. The findings show that duration-based measures outperform intensity-based ones in distinguishing between Persian speakers, although the advantage is very slight. This observation is significant because it sheds light on the suitability of specific rhythm metrics for Persian speaker identification. I postulate that the difference in performance may be attributed to the simple syllable structure of Persian and to its lesser reliance on intensity as a primary indicator of lexical stress. This research offers insight into the choice of rhythm metrics for Persian speaker identification and underscores the importance of considering linguistic features when developing speaker recognition systems.
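As a rough illustration of the classification step, the sketch below fits a multinomial logistic regression (softmax regression) that predicts speaker identity from per-sentence feature vectors. It is not the study's implementation: the data are synthetic, the three speakers and two features are hypothetical stand-ins for rhythm metrics, and the function names, learning rate and epoch count are assumptions chosen for the example.

```python
import math
import random

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(W, x):
    """Linear score per speaker; the last weight in each row is the bias."""
    xb = x + [1.0]
    return [sum(w * v for w, v in zip(row, xb)) for row in W]

def fit_multinomial_logreg(X, y, n_classes, lr=0.2, epochs=200):
    """Fit by full-batch gradient descent on the mean cross-entropy loss."""
    n_cols = len(X[0]) + 1
    W = [[0.0] * n_cols for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * n_cols for _ in range(n_classes)]
        for x, label in zip(X, y):
            probs = softmax(class_scores(W, x))
            xb = x + [1.0]
            for k in range(n_classes):
                # Gradient of cross-entropy w.r.t. score k: p_k - 1[k == label]
                err = probs[k] - (1.0 if k == label else 0.0)
                for j, v in enumerate(xb):
                    grad[k][j] += err * v
        for k in range(n_classes):
            for j in range(n_cols):
                W[k][j] -= lr * grad[k][j] / len(X)
    return W

def predict(W, x):
    """Return the index of the speaker with the highest score."""
    scores = class_scores(W, x)
    return max(range(len(scores)), key=scores.__getitem__)

def accuracy(W, X, y):
    """Proportion of sentences assigned to the correct speaker."""
    return sum(predict(W, x) == label for x, label in zip(X, y)) / len(y)

# Synthetic stand-in data (NOT the study's data): 3 hypothetical speakers,
# 50 "sentences" each, 2 made-up features per sentence.
rng = random.Random(0)
centres = [(-2.0, 0.0), (0.0, 2.0), (2.0, 0.0)]
X_train, y_train, X_test, y_test = [], [], [], []
for speaker, (cx, cy) in enumerate(centres):
    for i in range(50):
        point = [rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)]
        if i < 40:  # first 40 sentences per speaker for training
            X_train.append(point)
            y_train.append(speaker)
        else:       # remaining 10 held out for evaluation
            X_test.append(point)
            y_test.append(speaker)

W = fit_multinomial_logreg(X_train, y_train, n_classes=3)
acc = accuracy(W, X_test, y_test)
print(f"held-out accuracy on synthetic data: {acc:.2f}")
```

Under these assumptions, a comparison like the one described above could be approximated by fitting the model once on duration-based features and once on intensity-based features, then comparing held-out accuracy between the two fits.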
Asadi, H. (2024). Comparative Analysis of Speech Rhythm Measures for Persian Speaker Identification: Duration vs. Intensity. ZABANPAZHUHI (Journal of Language Research), 15(49), 61-82. doi: 10.22051/jlr.2023.45448.2370