An overview of prosodic transcription systems: a comparison of "Tones and Break Indices" (ToBI) and "Rhythm and Pitch" (RaP)

Document Type: Research

Authors

1 Ph.D. Candidate in Linguistics, Allameh Tabataba’i University, Tehran, Iran.

2 Associate Professor of Linguistics, Allameh Tabataba’i University, Tehran, Iran.

3 Associate Professor, Dept. of Communicative Sciences and Disorders, College of Communication Arts & Sciences, Michigan State University, Detroit, USA.

Abstract

The choice of an appropriate labeling system in a prosodic study depends on the research purpose. This study reviews the Tones and Break Indices (ToBI) labeling system (Pierrehumbert & Hirschberg, 1990) and its alternatives, including Rhythm and Pitch (RaP) (Dilley, 2005; Dilley & Brown, 2005; Dilley et al., 2006), and summarizes the problems of the ToBI system. Furthermore, a review of studies of Persian intonation that used ToBI within the framework of Autosegmental-Metrical (AM) theory showed that the general problems of this system are also observable when it is applied to the analysis of Persian intonation patterns (e.g., Eslami, 2005; Sadeghi, 2018).
The original goal of ToBI was to provide a standard transcription tool for labeling intonational features, including the prominence patterns and prosodic structure of an utterance, so that users working in different fields could use and interpret each other's linguistic data. In the ToBI transcription system, L and H represent low and high tones, respectively; the diacritic * marks a pitch accent, and % marks a boundary tone (Beckman & Ayers Elam, 1997). The system was initially designed for transcribing the intonation and prosodic structure of English utterances (Silverman et al., 1992; Beckman & Hirschberg, 1994; Beckman & Ayers Elam, 1997; Beckman, Hirschberg, & Shattuck-Hufnagel, 2005) and was also applied to a few typologically different languages, for example GToBI for German (Grice & Benzmüller, 1995), K-ToBI for Korean (Beckman & Jun, 1996; Jun, 2000), J_ToBI for Japanese (Venditti, 1997), and Persian (Eslami, 2005). Jun (2022), Ladd (2022), and Dilley and Breen (2022) have identified shortcomings of ToBI as a phonetic labeling system in connection with efforts to create an International Prosodic Alphabet (IPrA) (Hualde & Prieto, 2016).
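As an illustration of the notation described above, the following minimal Python sketch classifies a ToBI-style label using only the three conventions named in this abstract (L and H tones, * for pitch accents, % for boundary tones). The function name and the returned dictionary layout are assumptions made for illustration and are not part of the ToBI conventions; the full system includes further labels that are not modeled here.

```python
def classify_tobi_label(label: str) -> dict:
    """Classify one ToBI-style tonal label (illustrative sketch only).

    Assumption: only the conventions named in the abstract are modeled:
    L/H for low/high tones, '*' for a pitch accent, '%' for a boundary tone.
    """
    # Collect the tones in the order they appear, e.g. "L+H*" -> ["L", "H"].
    tones = [ch for ch in label if ch in ("L", "H")]
    if "*" in label:
        kind = "pitch accent"
    elif "%" in label:
        kind = "boundary tone"
    else:
        kind = "other"
    return {"label": label, "kind": kind, "tones": tones}


for example in ("H*", "L+H*", "L%"):
    print(classify_tobi_label(example))
```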
The Rhythm and Pitch (RaP) system, based on the enhanced Autosegmental-Metrical theory (AM+), was proposed by Dilley and her colleagues (Dilley, 2005; Dilley & Brown, 2005; Dilley et al., 2006; Dilley & Breen, 2012) to overcome the difficulties ToBI has in representing variation and gradience within categories and to emphasize the importance of distinguishing rhythmic or metrical prominence from pitch prominence. In this system, pitch information is labeled with three tonal targets (H, L, E), each defined by comparison with the preceding pitch level in the speech signal (higher than, lower than, or equal to it). Labels in RaP therefore have a phonetic interpretation. Metrical prominence (at three levels: strong, weak, and none) and prosodic structure (at two levels: intonational phrase (IP) and intermediate phrase (ip)) are labeled in the rhythm layer. Although RaP was presented as a “method of transcribing rhythm and related pitch in English” (Dilley & Brown, 2005, p. 2), the concepts and principles of this system can be applied to other languages.
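The relative nature of the H, L, and E labels can be sketched in a few lines of Python. This is only an illustration of the comparison described above, not the RaP procedure itself: RaP labels are assigned by human transcribers, whereas the sketch assumes that f0 values at tonal targets are already available and uses a hypothetical tolerance (`tolerance_hz`) to decide when two targets count as equal.

```python
from typing import List


def relative_tone_labels(f0_targets_hz: List[float],
                         tolerance_hz: float = 5.0) -> List[str]:
    """Label each tonal target after the first as H, L, or E.

    Each target is compared with the one before it: higher -> "H",
    lower -> "L", within the tolerance -> "E".
    Assumption: tolerance_hz is an illustrative threshold, not a RaP value.
    """
    labels = []
    for previous, current in zip(f0_targets_hz, f0_targets_hz[1:]):
        difference = current - previous
        if difference > tolerance_hz:
            labels.append("H")
        elif difference < -tolerance_hz:
            labels.append("L")
        else:
            labels.append("E")
    return labels


# Four hypothetical tonal targets (in Hz) yield three relative labels.
print(relative_tone_labels([180.0, 220.0, 218.0, 150.0]))  # ['H', 'E', 'L']
```

The first target receives no label in this sketch because it has no preceding target to be compared with.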

References


  1. Abolhasani Zadeh, V., Bijankhan, M., & Gussenhoven, C. (2012). The Persian pitch accent and its retention after focus. Lingua, 122, 1380–1394. https://doi.org/10.1016/j.lingua.2012.06.002
  2. Arbisi-Kelm, T. (2006). An intonational analysis of disfluency patterns in stuttering [Doctoral dissertation, University of California]. https://doi.org/10.21437/Interspeech.2005-47
  3. Arvaniti, A., & Baltazani, M. (2005). Intonational analysis and prosodic annotation of Greek corpora. In S.-A. Jun (Ed.), Prosodic typology: The phonology of intonation and phrasing (pp. 84–117). Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199249633.003.0004
  4. Arvaniti, A., Ladd, D. R., & Mennen, I. (1998). Stability of Tonal Alignment: The Case of Greek Prenuclear Accents. Journal of Phonetics, 26, 3–25. https://doi.org/10.1006/jpho.1997.0063
  5. Bartels, C., & Kingston, J. (1994). Salient pitch cues in the perception of contrastive focus. The Journal of the Acoustical Society of America, 95(5_Supplement), 2973. https://doi.org/10.1121/1.408967
  6. Beckman, M. E., & Ayers Elam, G. M. (1997). Guidelines for ToBI Labeling [Unpublished manuscript]. Ohio State University. https://www.ling.ohio-state.edu/research/phonetics/E_ToBI/
  7. Beckman, M. E., & Hirschberg, J. (1994). The ToBI Annotation Conventions [Unpublished manuscript]. Ohio State University. http://www.ling.ohio-state.edu/research/phonetics/E_ToBI/ToBI/ToBI.6.html
  8. Beckman, M. E., & Pierrehumbert, J. (1986). Intonational Structure in Japanese and English. Phonology Yearbook, 3, 255–309. https://doi.org/10.1017/S095267570000066X
  9. Beckman, M. E., Hirschberg, J., & Shattuck-Hufnagel, S. (2005). The Original ToBI System and the Evolution of the ToBI Framework. In S.-A. Jun (Ed.), Prosodic Models and Transcription: Towards Prosodic Typology (pp. 9–54). Oxford University Press. https://doi.org/10.7916/D87P97T5
  10. Birch, S., & Clifton, C., Jr. (1995). Focus, accent, and argument structure: Effects on language comprehension. Language and Speech, 38(4), 365–392. https://doi.org/10.1177/002383099503800403
  11. Breen, M., Dilley, L. C., Kraemer, J., & Gibson, E. (2012). Inter-Transcriber Reliability for Two Systems of Prosodic Annotation: ToBI (Tones and Break Indices) and RaP (Rhythm and Pitch). Corpus Linguistics and Linguistic Theory, 8, 277–312. https://doi.org/10.1515/cllt-2012-0011
  12. Brugos, A., & Shattuck-Hufnagel, S. (2012, July 31). A proposal for labelling prosodic disfluencies in ToBI [Conference presentation]. Advancing Prosodic Transcription for Spoken Language Science and Technology, Stuttgart, Germany. https://blogs.bu.edu/prosodylab/2012/08/14/poster-a-proposal-for-labelling-prosodic-disfluencies-in-tobi/
  13. Dilley, L. C., & Breen, M. (2022). An enhanced autosegmental-metrical theory (AM+) facilitates phonetically transparent prosodic annotation: A reply to Jun. In J. Barnes & S. Shattuck-Hufnagel (Eds.), Prosodic Theory and Practice (pp. 182–203). MIT Press. https://doi.org/10.21437/TAL.2018-14
  14. Dilley, L. C., & McAuley, J. D. (2008). Distal prosodic context affects word segmentation and lexical processing. Journal of Memory and Language, 59, 294–311. https://doi.org/10.1016/j.jml.2008.06.006
  15. Dilley, L. C., Ladd, D. R., & Schepman, A. (2005). Alignment of L and H in Bitonal Pitch Accents: Testing Two Hypotheses. Journal of Phonetics, 33, 115–119. https://doi.org/10.1016/j.wocn.2004.02.003
  16. Dilley, L. C., & Brown, M. (2005). The RaP (Rhythm and Pitch) labeling system (Version 1.0). Massachusetts Institute of Technology. https://pdfs.semanticscholar.org/5f73/1dbcafb2b64da6eb15daa67718866bc74cc9.pdf
  17. Dilley, L. C., Breen, M., Bolivar, M., Kraemer, J., & Gibson, E. (2006). A comparison of inter-transcriber reliability for two systems of prosodic annotation: RaP (Rhythm and Pitch) and ToBI (Tones and Break Indices). In INTERSPEECH (pp. 317–320). http://hdl.handle.net/1721.1/88539
  18. Eslami, M. (2006). PToBI: A phonological system in transcribing the intonation of Persian. In R. Hoffmann (Ed.), Elektronische Sprachsignalverarbeitung (pp. 45–53). TU press.
  19. Eslami, M. (2011). Phonology: Analysis of the Persian Intonation System (2nd ed.). SAMT Publication, Tehran, Iran. https://samt.ac.ir/en/book/4125/phonology-analyzing-the-intonation-system-of-persian [In Persian]
  20. Eslami, M., & Bijankhan, M. (2000). Pitch accent placement and its use in speech processing. In Proceedings of 5th Annual International CSI Computer Conference (CSICC’2000). The University of Shahid Beheshti, Tehran, Iran.
  21. Grice, M., Baumann, S., & Benzmüller, R. (2005). German intonation in Autosegmental Metrical Phonology. In S.-A. Jun (Ed.), Prosodic typology: The phonology of intonation and phrasing (pp. 55–83). Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199249633.003.0003
  22. Hekmati, R., & Bijankhan, M. (2019). Prosodic Analysis of Ezafe Construction in the Framework of Prosodic Phonology. Scientific Journal of Language Research, 11(31), 127–128. https://doi.org/10.22051/JLR.2019.16223.1369 [In Persian]
  23. Hirst, D., & Di Cristo, A. (1999). A Survey of Intonation Systems. In D. Hirst & A. Di Cristo (Eds.), Intonation Systems (pp. 1–44). Cambridge University Press. https://doi.org/10.1353/lan.2000.0088
  24. Hualde, J., & Prieto, P. (2016). Towards an International Prosodic Alphabet (IPrA). Laboratory Phonology, 7(1), 1–25. https://doi.org/10.5334/labphon.11
  25. Jun, S.-A. (2022). The ToBI Transcription System: Conventions, Strengths, and Challenges. In J. Barnes & S. Shattuck-Hufnagel (Eds.), Prosodic Theory and Practice (pp. 151–181). MIT Press. https://doi.org/10.7551/mitpress/10413.003.0007
  26. Jun, S.-A. (Ed.). (2005). Prosodic Typology: The Phonology of Intonation and Phrasing. Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199249633.001.0001
  27. Ladd, D. R. (2022). The Trouble with ToBI. In J. Barnes & S. Shattuck-Hufnagel (Eds.), Prosodic Theory and Practice (pp. 247–257). MIT Press. https://doi.org/10.7551/mitpress/10413.003.0007
  28. Ladd, D. R. (2008). Intonational Phonology (2nd ed.). Cambridge University Press. https://doi.org/10.1017/CBO9780511808814
  29. Mahjani, B. (2003). An instrumental study of prosodic features and intonation in Modern Farsi (Persian) [Master’s thesis, Linguistics and Social Sciences, University of Edinburgh]. Edinburgh, Scotland. https://www.timasearch.com/bm/edinburgh/behzad_mahjani.pdf
  30. Morrill, T., Dilley, L., & McAuley, J. D. (2014). Prosodic patterning in distal speech context: Effects of list intonation and f0 downtrend on perception of proximal prosodic structure. Journal of Phonetics, 46, 68–85. https://psycnet.apa.org/record/2014-36055-006
  31. Pierrehumbert, J. (1980). The Phonology and Phonetics of English Intonation [Doctoral dissertation, MIT]. Massachusetts, USA. http://dspace.mit.edu/handle/1721.1/16065
  32. Pierrehumbert, J., & Hirschberg, J. (1990). The Meaning of Intonational Contours in the Interpretation of Discourse. In P. R. Cohen, J. Morgan, & M. E. Pollack (Eds.), Intentions in Communication (pp. 271–311). MIT Press. https://doi.org/10.7551/mitpress/3839.003.0016
  33. Pitrelli, J., Beckman, M., & Hirschberg, J. (1994). Evaluation of prosodic transcription labeling reliability in the ToBI framework. In Proceedings of the International Conference on Spoken Language Processing (pp. 123–126). https://doi.org/10.21437/ICSLP.1994-34
  34. Price, P. J., Ostendorf, M., Shattuck-Hufnagel, S., & Fong, C. (1991). The Use of Prosody in Syntactic Disambiguation. The Journal of the Acoustical Society of America, 90, 2956–2970. https://doi.org/10.1121/1.401770
  35. Sadeghi, V., & Sheykhi, S. (2018). A corpus-based study of Persian intonation. Persian Language & Iranian Dialects, 3(2), 35–54. https://doi.org/10.22124/PLID.2018.9926.1252
  36. Sadeghi, V. (2018). The Prosodic Structure of the Persian language: Lexical Stress and Intonation. SAMT. https://samt.ac.ir/en/book/2664/the-prosodic-structure-of-the-persian-language [In Persian]
  37. Sadat-Tehrani, N. (2007). The Intonational Grammar of Persian [Doctoral dissertation, University of Manitoba]. Manitoba, Canada. http://hdl.handle.net/1993/2839
  38. Sadat-Tehrani, N. (2009). The alignment of L+H* pitch accents in Persian intonation. Journal of the International Phonetic Association, 39, 205–230. https://doi.org/10.1017/S0025100309003892
  39. Sadat-Tehrani, N. (2011). The intonation patterns of interrogatives in Persian. Linguistic Discovery, 9(1), 105–136. https://doi.org/10.1349/PS1.1537-0852.A.389
  40. Scarborough, R. (2007). The intonation of focus in Farsi. UCLA Working Papers in Phonetics, 105, 19–34. https://escholarship.org/uc/item/83k7q53v
  41. Silverman, K., Beckman, M., Pitrelli, J., Ostendorf, M., Wightman, C., Price, P., Pierrehumbert, J., & Hirschberg, J. (1992). ToBI: A Standard for Labeling English Prosody. https://doi.org/10.21437/ICSLP.1992-260
  42. Taheri-Ardali, M., Rahmani, H., & Xu, Y. (2014). The perception of prosodic focus in Persian. In N. Campbell, D. Gibbon, & D. Hirst (Eds.), Proceedings of the 7th International Conference on Speech Prosody (pp. 515–519). Dublin: Trinity College. https://doi.org/10.21437/SpeechProsody.2014-90
  43. Tilsen, S., & Arvaniti, A. (2013). Speech rhythm analysis with decomposition of the amplitude envelope: Characterizing rhythmic patterns within and across languages. The Journal of the Acoustical Society of America, 134, 628. https://doi.org/10.1121/1.4807565
  44. Venditti, J. (2005). The J_ToBI model of Japanese intonation. In S.-A. Jun (Ed.), Prosodic typology: The phonology of intonation and phrasing (pp. 172–200). Oxford University Press. https://doi.org/10.1093/acprof:oso/9780199249633.003.0007
  45. Weber, A., Braun, B., & Crocker, M. W. (2006). Finding referents in time: Eye-tracking evidence for the role of contrastive accents. Language and Speech, 49, 367–392. https://doi.org/10.1177/00238309060490030301
  46. Wightman, C., Shattuck-Hufnagel, S., Price, P., & Ostendorf, M. (1992). Segmental Durations in the Vicinity of Prosodic Phrase Boundaries. The Journal of the Acoustical Society of America, 91, 1707–1717. https://psycnet.apa.org/record/1992-38606-001