Sound symbolism is a well-researched topic in psycholinguistics that seeks to explain the connection between the sound of a word and its meaning. The Bouba-Kiki effect, one form of sound symbolism, holds that people perceive the pronunciation of “Kiki” as pointier than that of “Bouba.” There is no research that focuses on modeling such perception, i.e., how pointy a pronunciation sounds to humans, through computational and data-driven approaches. To address this, this paper first proposes the novel concept of “phonetic pointiness,” defined as how pointy a shape humans are most likely to associate with a given pronunciation. We then model this phonetic pointiness using computational and data-driven approaches to calculate a score for an arbitrary pronunciation. We propose three models: a referential model, an expressive model, and a combined model, which integrates the previous two. The idea comes from an existing psycholinguistic classification of sound symbolism into two types: referential symbolism and expressive symbolism, where the former relates to vocabulary knowledge, while the latter is based on pure human intuition. The proposed models are constructed only with image and language data available on the Web and therefore do not require task-specific human annotations. We evaluate these models through a crowd-sourced user study, finding a promising correlation between human perception and the phonetic pointiness calculated by the proposed models. The results indicate that human perception can be modeled better by combining both types of sound symbolism. Furthermore, by observing the behaviors of the models, we show several possible use cases, such as product naming and psycholinguistic research, which can offer useful insights for further studies and applications.
Type: Journal paper in Multimedia Tools and Applications (MTAP), ?(?), ?????-?????
Date: April 2023