Providing a suitable method for allophonic labeling of speech corpuses according to the IPA system

Document Type: Research

Authors

1 MA, Computational Linguistics, Department of Linguistics, Faculty of Foreign Languages, University of Isfahan, Isfahan, Iran

2 PhD, Linguistics, Associate Professor and Faculty Member in Department of Linguistics, Faculty of Foreign Languages, University of Isfahan, Isfahan, Iran

3 PhD, Artificial Intelligence, Assistant Professor and Faculty Member in Department of Artificial Intelligence, Faculty of Computer, University of Isfahan, Isfahan, Iran

4 PhD, Artificial Intelligence, Assistant Professor and Faculty Member, Faculty of Mathematics, Statistics and Computer Science, University of Tehran, Tehran, Iran

Abstract

A corpus is a collection of spoken and/or written texts that can be used for linguistic analysis. More precisely, these texts are purposefully labeled and categorized according to specific rules, allowing users to carry out various studies. Corpus linguistics is a branch of applied linguistics that examines and compares different aspects of linguistic data, and corpora are integral tools of this branch. Because of the growing role and importance of corpus linguistics in the development of various sciences in recent decades, the production and development of linguistic corpora has been a priority for scientists and researchers working on different languages during these years.
Since the creation of speech processing systems about two decades ago, context-dependent methods have become particularly prominent, both in efforts to increase the accuracy of these systems and in some specialized linguistic studies. One of the best ways to achieve this is to use corpora that, in addition to segmentation at the phoneme level, carry special labels that distinguish the various allophones. These allophones can only be obtained by deriving the necessary phonological rules. In linguistics, this process can be called allophonic labeling of a corpus.
About 10 years after the introduction of allophonic corpora worldwide, no allophonic labeling has yet been performed for any Persian corpus. The small Farsdat corpus is the main spoken corpus of Persian. Hence, there is a clear need to equip this corpus with allophonic labels in order to increase the accuracy and improve the performance of speech processing systems, and to support specific studies, research programs, and tools in linguistics. To illustrate the method proposed in the present study for allophonic labeling of phonemic corpora, and at the same time to equip Persian with at least one allophonic corpus, the steps of the task are carried out on the small Farsdat phonemic corpus. This corpus, one of the Persian corpora of the last two decades, consists of 6080 sentences spoken by 304 Persian speakers. Its speakers use one of the most widely spoken dialects of Persian, and all of its sentences are segmented at different levels. Segmentation at the word and phoneme levels makes the sentences useful in various speech processing systems, such as speech recognition systems, broad transcription systems, and text-to-speech systems. The small Farsdat corpus thus has the potential to be used in these systems.
The proposed solution for preparing an allophonic corpus is to implement a rule-based program and apply it to the phonemic corpus to add allophonic labels. The rule-based method in this research rests on access to rules for converting phonemes into allophones. After these rules are compiled from the resources available for each language and the appropriate settings for implementation are prepared, the program is implemented. Finally, applying this program to the phonemic corpus yields an allophonic corpus.
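The rule-based idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program used in the study: the `Rule` structure, the `label_allophones` function, and the two sample rules (a nasal becoming velar before a velar stop, and a voiceless stop being aspirated at the start of a word) are hypothetical placeholders for the rules the authors compiled from the literature.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# A context test receives the previous and next phoneme (None at the word edges).
Context = Callable[[Optional[str], Optional[str]], bool]


@dataclass(frozen=True)
class Rule:
    phoneme: str      # phoneme the rule rewrites
    allophone: str    # IPA label to assign when the context matches
    applies: Context  # context test


VELAR_STOPS = {"k", "g"}
VOICELESS_STOPS = {"p", "t", "k"}

# Illustrative rules only (assumptions, not the rule set of the study).
RULES: List[Rule] = [
    Rule("n", "ŋ", lambda prev, nxt: nxt in VELAR_STOPS),
    *[Rule(p, p + "ʰ", lambda prev, nxt: prev is None)
      for p in sorted(VOICELESS_STOPS)],
]


def label_allophones(phonemes: List[str], rules: List[Rule] = RULES) -> List[str]:
    """Return one allophone label per phoneme; the first matching rule wins."""
    labels = []
    for i, ph in enumerate(phonemes):
        prev = phonemes[i - 1] if i > 0 else None
        nxt = phonemes[i + 1] if i + 1 < len(phonemes) else None
        label = ph  # default: keep the phoneme symbol unchanged
        for rule in rules:
            if rule.phoneme == ph and rule.applies(prev, nxt):
                label = rule.allophone
                break
        labels.append(label)
    return labels
```

With these placeholder rules, `label_allophones(["t", "a", "n", "g"])` returns `["tʰ", "a", "ŋ", "g"]`. Keeping the rules in a plain list separate from the labeling loop means new rules can be added later without touching the loop.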
As noted, special phonological rules are required to convert phonemes into allophones in Persian and to add allophonic labels to the small Farsdat corpus. The purpose of this research is not to study phonemes through acoustic and laboratory approaches in order to obtain Persian allophones, but rather to collate and reconcile the phonemes identified in various studies and then adapt them to the International Phonetic Alphabet (IPA) system. This ultimately provides, as far as possible, a standard set of allophones and yields the phonological rules necessary for converting phonemes into allophones in Persian (based on existing studies).
Although one limitation of this study is that its extraction of the different Persian allophones is incomplete, the implemented program can be updated. If future studies on allophones supplement the existing theoretical resources, the program can be modified or its performance enhanced accordingly. The present study may also highlight the need for more recent linguistic experiments and for more accurate tools and facilities to identify Persian phonemes. This may in turn motivate phonetics and phonology researchers to take more practical steps in this field.
After the necessary preparations are made in the phonemic corpus (such as syllable segmentation) and the above rules are implemented, the allophonic labels can be added to the phonemic corpus by running the program on it.
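Syllable segmentation is a useful preparation because many allophonic rules depend on where a sound falls in the syllable. The sketch below is a hedged illustration and not the syllabifier used in the study. It assumes the CV(C)(C) syllable template commonly described for Persian, in which every syllable begins with exactly one consonant. The vowel set and the `syllabify` function are illustrative names chosen here.

```python
from typing import List

# Simplified vowel inventory for illustration; a real corpus defines its own symbols.
VOWELS = {"a", "e", "o", "i", "u", "ɒ"}


def syllabify(phonemes: List[str]) -> List[List[str]]:
    """Split a word into syllables, assuming every syllable is CV(C)(C)."""
    if not phonemes:
        return []
    # Under this template, a syllable starts at the consonant right before each vowel.
    starts = [i - 1 for i, ph in enumerate(phonemes) if ph in VOWELS and i > 0]
    if not starts or starts[0] != 0:
        starts = [0] + starts  # keep any leading material in the first syllable
    bounds = starts + [len(phonemes)]
    return [phonemes[a:b] for a, b in zip(bounds, bounds[1:])]
```

For example, `syllabify(list("ketɒb"))` returns `[["k", "e"], ["t", "ɒ", "b"]]`. Once syllable boundaries are known, a labeling rule can test whether a phoneme is in onset or coda position rather than only inspecting its neighbors.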

Keywords

