Marc Aurel Kastner

Ph.D. (Informatics)

Estimating the visual variety of concepts by referring to Web popularity


Authors: Marc A. Kastner, Ichiro Ide, Yasutomo Kawanishi, Takatsugu Hirayama, Daisuke Deguchi, Hiroshi Murase

Abstract:

Increasingly sophisticated methods for data processing demand knowledge of the semantic relationship between language and vision. New fields of research such as Explainable AI call for stepping away from black-box approaches and understanding how the underlying semantics of datasets and AI models work. Advances in psycholinguistics suggest that there is a relationship between language perception and how language production and sentence creation work. In this paper, a method to measure the visual variety of concepts is proposed to quantify the semantic gap between vision and language. For this, an image corpus is recomposed using ImageNet and Web data. Web-based metrics for measuring the popularity of sub-concepts are used as weights to ensure that the image composition of the dataset is as natural as possible. Using clustering methods, a score describing the visual variety of each concept is determined. A crowd-sourced survey is conducted to create ground-truth values applicable to this research. The evaluations show that the recomposed image corpus largely improves the measured variety compared to previous datasets. The results are promising and provide additional knowledge about the relationship between language and vision.
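
The two steps named in the abstract (popularity-weighted recomposition, then a clustering-based variety score) can be illustrated with a rough sketch. This is not the paper's implementation: the function names, the popularity numbers, the toy 2-D features, and the radius are all assumptions made for illustration, and a simple greedy leader clustering stands in for whatever clustering method the paper actually uses.

```python
from math import dist, floor


def allocate_images(popularity, total):
    """Split `total` image slots across sub-concepts in proportion to popularity."""
    weight_sum = sum(popularity.values())
    if weight_sum <= 0:
        raise ValueError("popularity values must sum to a positive number")
    exact = {name: total * w / weight_sum for name, w in popularity.items()}
    counts = {name: floor(x) for name, x in exact.items()}
    # Hand out slots lost to rounding, largest fractional remainder first.
    leftover = total - sum(counts.values())
    by_remainder = sorted(exact, key=lambda n: exact[n] - counts[n], reverse=True)
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


def count_clusters(features, radius):
    """Greedy leader clustering (a stand-in, not the paper's method).

    A point starts a new cluster when no existing leader lies within `radius`.
    The number of clusters serves as a crude proxy for visual variety.
    """
    leaders = []
    for point in features:
        if all(dist(point, leader) > radius for leader in leaders):
            leaders.append(point)
    return len(leaders)


# Hypothetical Web popularity values for sub-concepts of "car".
popularity = {"sedan": 600, "sports car": 300, "limousine": 100}
counts = allocate_images(popularity, total=50)

# Toy 2-D "image features"; a real pipeline would use image descriptors.
features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 4.9), (9.0, 0.0)]
variety = count_clusters(features, radius=1.0)
```

In this sketch, more popular sub-concepts contribute more images to the recomposed corpus, and a concept whose image features spread over more clusters receives a higher variety proxy. The choice of features, clustering method, and score definition would need to follow the paper itself.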

Type: Journal paper in Multimedia Tools and Applications (MTAP), 78(7), 9463-9488

Date: April 2019

DOI: 10.1007/s11042-018-6528-x

