Fig. 8 | ROBOMECH Journal

Fig. 8

From: Lifelogging caption generation via fourth-person vision in a human–robot symbiotic environment

The top-20 most frequent semantic tuples in our dataset, for each of three tuple types. As in SPICE [30], described in Sect. “Evaluation metrics”, the “object”, “attribute”, and “relation” elements are parsed from a set of reference captions; they form the three types of tuples listed as the bins. As the “relation” frequencies (right) show, our dataset contains several phrases related to the interaction between a person and household commodities.
