Marc A. Kastner

Ph.D. (Informatics)

A multi-modal dataset for analyzing the imageability of concepts across modalities


Authors: Marc A. Kastner, Chihaya Matsuhira, Ichiro Ide, Shin'ichi Satoh

Abstract:

Recently, multi-modal applications have brought a need for a human-like understanding of the perception differences across modalities. For example, while something might have a clear image in a visual context, it might be perceived as too technical in a textual context. Such differences, related to a semantic gap, make the transfer between modalities or the combination of modalities in multi-modal processing a difficult task. Imageability, as a concept from Psycholinguistics, gives promising insight into the human perception of vision and language. In order to understand cross-modal differences of semantics, we create and analyze a cross-modal dataset for imageability. We estimate three imageability values grounded in 1) a visual space from a large set of images, 2) a textual space from Web-trained word embeddings, and 3) a phonetic space based on word pronunciations. A subset of the corpus is evaluated with an existing imageability dictionary to ensure a basic generalization, but the dataset otherwise targets finding cross-modal differences and outliers. We visualize the dataset and analyze it regarding outliers and differences for each modality. As an additional source of knowledge, the part-of-speech and etymological origin of all words are estimated and analyzed in the context of the modalities. The dataset of multi-modal imageability values and an interactive browser will be made publicly available.
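The textual estimation described above (grounding imageability in a word-embedding space) is commonly done by propagating human ratings from a small seed dictionary to unseen words through embedding similarity. The following is a minimal, hypothetical sketch of that idea; the toy 2-D "embeddings", the seed ratings, and the function names are invented for illustration and are not the paper's actual method or data.

```python
# Hypothetical sketch: propagate imageability ratings from a small seed
# dictionary to an unseen word via similarity in an embedding space.
import math

def cosine(u, v):
    # Cosine similarity between two dense vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def estimate_imageability(word_vec, seeds, k=2):
    """Similarity-weighted average of the k most similar seeds' ratings."""
    sims = sorted(((cosine(word_vec, vec), rating)
                   for vec, rating in seeds.values()), reverse=True)[:k]
    total = sum(s for s, _ in sims)
    return sum(s * r for s, r in sims) / total

# Toy seed dictionary: word -> (embedding, human imageability rating in [1, 7]).
seeds = {
    "apple":  ([0.9, 0.1], 6.5),   # concrete, highly imageable
    "theory": ([0.1, 0.9], 2.0),   # abstract, hardly imageable
}

# An unseen word whose vector lies close to "apple" gets a high estimate.
print(round(estimate_imageability([0.8, 0.2], seeds), 2))
```

Real pipelines would use large pretrained embeddings and a much bigger seed lexicon, but the propagation step itself stays this simple.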

Venue: 4th IEEE International Conference on Multimedia Information Processing and Retrieval (MIPR2021)

Date: September 2021

DOI: 10.1109/MIPR51284.2021.00039

External links: [ github ] [ supplemental visualizations ]



