Article type: Research article
Authors
1 Assistant Professor, Iranian Research Institute for Information Science and Technology (IranDoc), Tehran, Iran
2 Iranian Research Institute for Information Science and Technology (IranDoc), Tehran, Iran
Abstract
Keywords
Article title [English]
Authors [English]
Corpora are categorized as monolingual, bilingual, or multilingual according to the languages of their constituent texts. A comparable corpus is a bilingual or multilingual corpus that brings together similar texts from the same subject areas. In other words, it is a collection of documents in two or more languages that cover similar topics without necessarily being translations of one another. Comparable corpora can be composed of general texts, which opens up possibilities for discourse analysis, pragmatics, genre analysis, and sociolinguistics. Examples of such corpora include collections of encyclopedia entries or literary texts from a particular period. However, the most common comparable corpora, and those with the widest audience, cover specialized fields and contain a high density of technical vocabulary and terms. Such a corpus is called a specialized comparable corpus. In this study, a specialized comparable corpus was built from the Persian and English abstracts of theses and dissertations registered in IranDoc. The corpus is named PARSA.
Keywords [English]