
Phondi Meeting: Yan Dong presents

Friday, November 9, 2012
12:00 AM
Lorch 473

on Elastic Word Length in Chinese

Phondi this Friday will have graduate student Yan Dong present her recent research into the phenomenon of so-called "elastic word length" in Chinese. This topic has recently received quite a bit of attention in Chinese linguistics, partly because of the work done on it by our very own San Duanmu. The title and abstract of Yan's presentation are given below.

Family size and elastic word length in Chinese

Many Chinese words have elastic length (long and short forms), such as mei (coal) and meitan (coal-coal), ya (duck) and yazi (duck-affix), dian (electricity) and dianshi (electricity-vision). The short form can be either the left-side or the right-side member of the compound. For example, rong (melt) is the short form for both ronghua (melt-melt) and jinrong (gold-melt).

This raises a question: what determines which member is deleted in the short form? I pursue the hypothesis that “informativity” influences the choice of the short form, and I investigate one measure of informativity, family size, in large corpora. Family size refers to the number of compounds that share the same left-side or right-side member, such as ronghua and ronghe, which share the left-side member, and fanrong and guangrong, which share the right-side member. Since words with smaller family size carry less information about the compound, I expect words with smaller family size to be deleted in the short form. To investigate this hypothesis, I extracted all senses that have equivalent disyllabic forms from 3000 senses of monosyllabic words in the Modern Chinese Dictionary (2005). Family size was first calculated based on the character, then recalculated based on the pronunciation for comparison.
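
As a rough illustration of the family size measure described above, here is a minimal Python sketch. It is not the procedure used in the study: the function name `family_sizes` and the toy compound list are assumptions made for illustration, and the list contains only the compounds mentioned in this abstract. Because the keys are romanized forms, the sketch corresponds to a pronunciation-based count; a character-based count would key on the written characters instead.

```python
from collections import Counter


def family_sizes(compounds):
    """Count, for each member, how many distinct disyllabic compounds
    contain it as the left-side member and as the right-side member.

    `compounds` is an iterable of (left_member, right_member) pairs.
    Returns two Counters: left-side and right-side family sizes.
    """
    left, right = Counter(), Counter()
    for first, second in set(compounds):  # set() drops duplicate entries
        left[first] += 1
        right[second] += 1
    return left, right


# Toy data only: compounds named in the abstract, in romanized form.
# A real count would run over a full dictionary or corpus.
compounds = [
    ("rong", "hua"),
    ("rong", "he"),
    ("fan", "rong"),
    ("guang", "rong"),
    ("jin", "rong"),
]

left, right = family_sizes(compounds)
print("left-side family size of rong:", left["rong"])
print("right-side family size of rong:", right["rong"])
```

On this toy list, rong has a left-side family of two compounds (ronghua, ronghe) and a right-side family of three (fanrong, guangrong, jinrong). Comparing the two members of a given compound would then mean comparing the left-side family size of its first member with the right-side family size of its second member.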

Another type of family size that may be crucial for member selection is the family size of the members at the time the compound was created. This could be checked with Google Ngram, which provides the first occurrence of the members and the compound, as well as their frequency by year. This calculation is expected to predict the choice of the short form more accurately.