Dr. Marc A. Kastner

Tell as You Imagine: Sentence Imageability-aware Image Captioning

Authors: Kazuki Umemura, Marc A. Kastner, Ichiro Ide, Yasutomo Kawanishi, Takatsugu Hirayama, Keisuke Doman, Daisuke Deguchi, Hiroshi Murase

Abstract:

Image captioning as a multimedia task is advancing in terms of performance in generating captions for general purposes. However, tailoring generated captions to different applications remains difficult. In this paper, we propose a sentence imageability-aware image captioning method to generate captions tailored to various applications. Sentence imageability describes how easily a caption can be mentally imagined. This concept is applied to the captioning model to get a better understanding of how a generated caption is perceived. First, we extend an existing image caption dataset by augmenting the diversity of its captions. Then, a sentence imageability score is calculated for each augmented caption. A modified image captioning model is trained on this extended dataset to generate captions tailored to a specified imageability score. Experiments show promising results in generating imageability-aware captions. In particular, results from a subjective experiment show that the perception of the generated captions correlates with the specified score.
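The idea of attaching an imageability score to each caption and then asking for a caption at a chosen score can be illustrated with a small sketch. This is not the method from the paper: the word ratings in `WORD_IMAGEABILITY` are made-up values, averaging word ratings is only an assumed way to score a sentence, and `pick_caption` selects among given candidates instead of training a captioning model conditioned on the score. All names here are hypothetical.

```python
from statistics import mean

# Hypothetical word-level imageability ratings on a 0-1 scale.
# These values are illustrative only, not taken from any real dictionary.
WORD_IMAGEABILITY = {
    "dog": 0.9,
    "ball": 0.85,
    "animal": 0.6,
    "plays": 0.5,
    "object": 0.3,
}


def sentence_imageability(caption, ratings=WORD_IMAGEABILITY):
    """Score a caption as the mean rating of its rated words.

    Averaging is an assumption made for this sketch; unrated words
    (articles, prepositions, unknown words) are simply skipped.
    """
    words = [w.strip(".,!?").lower() for w in caption.split()]
    scores = [ratings[w] for w in words if w in ratings]
    if not scores:
        raise ValueError("caption contains no rated words")
    return mean(scores)


def pick_caption(candidates, target):
    """Return the candidate whose score is closest to the target score."""
    return min(candidates, key=lambda c: abs(sentence_imageability(c) - target))


candidates = [
    "A dog plays with a ball.",
    "An animal plays with an object.",
]
high = pick_caption(candidates, target=0.8)  # favors the concrete wording
low = pick_caption(candidates, target=0.4)   # favors the abstract wording
```

Under these assumed ratings, the caption built from concrete words ("dog", "ball") scores higher than the one built from more abstract words ("animal", "object"), so changing the target score changes which caption is returned.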

Type: MultiMedia Modelling (MMM) 2021. Lecture Notes in Computer Science, vol. 12573

Publication date: January 2021

DOI: 10.1007/978-3-030-67835-7_6

Presentation

Files

preprint

slides

If you have questions or comments about this research, please leave a comment below or send me an email. I will respond quickly.