From: Lifelogging caption generation via fourth-person vision in a human–robot symbiotic environment
Input perspective | Method | BLEU-1 | BLEU-2 | BLEU-3 | BLEU-4 | ROUGE-L | METEOR | CIDEr-D | SPICE
---|---|---|---|---|---|---|---|---|---
First | UpDown [20] | 51.20 | 33.47 | 20.41 | 11.25 | 38.85 | 17.45 | 21.44 | 12.19
Second | UpDown [20] | 60.86 | 43.24 | 31.12 | 21.19 | 45.60 | 19.46 | 16.94 | 12.08
Third | UpDown [20] | 42.80 | 26.56 | 16.17 | 9.70 | 31.34 | 13.73 | 6.79 | 6.28
Second + Third | Ensemble | 59.14 | 41.97 | 30.45 | 21.06 | 44.09 | 19.13 | 15.18 | 11.40
Second + Third | KMeans | 62.31 | 45.34 | 33.16 | 22.91 | 46.22 | 20.19 | 17.76 | 12.21
First + Third | Ensemble | 59.06 | 42.78 | 30.47 | 20.28 | 45.16 | 20.33 | 27.71 | 14.37
First + Third | KMeans | 60.83 | 44.71 | 32.03 | 21.48 | 46.27 | 21.16 | 30.10 | 15.02
First + Second | Ensemble | 62.08 | 45.37 | 32.82 | 22.47 | 47.67 | 21.68 | 30.03 | 15.04
First + Second | KMeans | 62.43 | 45.78 | 32.90 | 22.19 | 47.61 | 21.87 | 30.76 | 15.24
First + Second + Third | Ensemble | 63.12 | 46.37 | 34.08 | 23.71 | 47.92 | 21.72 | 29.52 | 14.99
First + Second + Third | KMeans | 65.09 | 48.93 | 36.02 | 24.78 | 49.13 | 22.79 | 33.41 | 15.72