Recent applications in image processing often take a multimodal approach that combines text and imagery. This is prone to semantic gap issues when converting between image and language, yet there has been little research quantifying visual differences when assessing semantic relationships. In this research, we analyze datasets composed of logically related concepts. By visualizing a Bag-of-Visual-Words (BoVW) model spatially, the visual semantics of logically related sub-concepts are revealed. To uncover hidden semantics among related concepts, the most common visual words of an image relative to its neighbors are highlighted. This provides additional semantic knowledge on how subordinate concepts visually relate to one another. It is thought to give insight into the human perception of these concepts, and can be used in future research to estimate psycholinguistic ratings.
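As a minimal sketch of the BoVW representation referenced above (not the authors' implementation; the codebook, descriptors, and function name are illustrative assumptions), each image's local descriptors are quantized to their nearest visual word in a learned codebook, and the image is summarized as a normalized histogram of word counts:

```python
import numpy as np

def bovw_histogram(descriptors, codebook):
    """Quantize local descriptors to the nearest visual word and
    return an L1-normalized histogram over the codebook (BoVW vector)."""
    # Pairwise squared distances, shape (n_descriptors, n_words)
    d = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    words = d.argmin(axis=1)  # index of nearest visual word per descriptor
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()  # normalize so histograms are comparable

# Toy example: a 3-word codebook and synthetic 2-D "descriptors".
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
descs = np.array([[0.1, 0.0], [0.9, 1.1], [1.0, 0.9], [2.1, 2.0]])
print(bovw_histogram(descs, codebook))  # → [0.25 0.5  0.25]
```

Comparing such histograms between an image and its neighbors is one way the most common (or most distinctive) visual words of a sub-concept could be identified for spatial visualization.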
Type: Poster at MIRU Symposium (画像の認識・理解シンポジウム)
Publication date: August 2018