Ajou University repository

Towards robust complexity indices in linguistic typology: A corpus-based assessment
Citations (SCOPUS): 3

DC Field | Value | Language
dc.contributor.author | Oh, Yoon Mi | -
dc.contributor.author | Pellegrino, François | -
dc.date.issued | 2023-09-22 | -
dc.identifier.issn | 1569-9978 | -
dc.identifier.uri | https://aurora.ajou.ac.kr/handle/2018.oak/33731 | -
dc.identifier.uri | https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=85174411875&origin=inward | -
dc.description.abstract | There is high hope that corpus-based approaches to language complexity will contribute to explaining linguistic diversity. Several complexity indices have consequently been proposed to compare different aspects among languages, especially in phonology and morphology. However, their robustness against changes in corpus size and content hasn’t been systematically assessed, thus impeding comparability between studies. Here, we systematically test the robustness of four complexity indices estimated from raw texts and either routinely utilized in crosslinguistic studies (Type-Token Ratio and word-level Entropy) or more recently proposed (Word Information Density and Lexical Diversity). Our results on 47 languages strongly suggest that traditional indices are more prone to fluctuation than the newer ones. Additionally, we confirm with Word Information Density the existence of a cross-linguistic trade-off between word-internal and across-word distributions of information. Finally, we implement a proof of concept suggesting that modern deep-learning language models can improve the comparability across languages with non-parallel datasets. | -
dc.description.sponsorship | Yoon Mi Oh was funded by Ajou University (S-2019-G0001-00088). | -
dc.language.iso | eng | -
dc.publisher | John Benjamins Publishing Company | -
dc.title | Towards robust complexity indices in linguistic typology: A corpus-based assessment | -
dc.type | Article | -
dc.citation.endPage | 829 | -
dc.citation.number | 4 | -
dc.citation.startPage | 789 | -
dc.citation.title | Studies in Language | -
dc.citation.volume | 47 | -
dc.identifier.bibliographicCitation | Studies in Language, Vol.47 No.4, pp.789-829 | -
dc.identifier.doi | 10.1075/sl.22034.oh | -
dc.identifier.scopusid | 2-s2.0-85174411875 | -
dc.identifier.url | http://www.ingentaconnect.com/content/jbp/sl | -
dc.subject.keyword | complexity metric robustness | -
dc.subject.keyword | complexity trade-off | -
dc.subject.keyword | linguistic typology | -
dc.subject.keyword | morphological complexity | -
dc.subject.keyword | non-parallel corpus | -
dc.type.other | Article | -
dc.identifier.pissn | 0378-4177 | -
dc.description.isoa | true | -
dc.subject.subarea | Language and Linguistics | -
dc.subject.subarea | Communication | -
dc.subject.subarea | Linguistics and Language | -
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher

OH, YOON MI (오윤미)
Department of French Language and Literature