A Cross-Modal Variational Framework For Food Image Analysis

T. Theodoridis, V. Solachidis, K. Dimitropoulos, and P. Daras

 

Food analysis resides at the core of modern nutrition recommender systems, providing the foundation for a high-level understanding of users’ eating habits. This paper focuses on the sub-task of ingredient recognition from food images using a variational framework. The framework consists of two variational encoder-decoder branches, aimed at processing information from different modalities (images and text), as well as a variational mapper branch, which aligns the latent distributions of the individual branches. Experimental results on the Yummly-28K dataset show that the proposed framework performs better than similar variational frameworks, while it surpasses current state-of-the-art approaches on the large-scale Recipe1M dataset.

https://ieeexplore.ieee.org/document/9190758 https://zenodo.org/record/4249315
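The abstract only outlines the architecture (two variational encoder-decoder branches for image and text features, plus a mapper branch that aligns their latent distributions). Below is a minimal, illustrative PyTorch sketch of how such a cross-modal variational setup could be wired up. All module names (`VariationalBranch`, `CrossModalVAE`), feature dimensions, and the exact form of the mapper and alignment objective are assumptions made for illustration; they are not taken from the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class VariationalBranch(nn.Module):
    """Encoder-decoder branch: maps an input feature vector to a latent Gaussian
    (mu, logvar) and reconstructs the input from a sampled latent code."""

    def __init__(self, in_dim, latent_dim=128, hidden_dim=512):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
        self.to_mu = nn.Linear(hidden_dim, latent_dim)
        self.to_logvar = nn.Linear(hidden_dim, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(), nn.Linear(hidden_dim, in_dim)
        )

    def forward(self, x):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterisation trick
        return self.decoder(z), mu, logvar


def kl_between_gaussians(mu_q, logvar_q, mu_p, logvar_p):
    """KL( N(mu_q, var_q) || N(mu_p, var_p) ) for diagonal Gaussians, averaged over the batch."""
    return 0.5 * torch.sum(
        logvar_p - logvar_q
        + (logvar_q.exp() + (mu_q - mu_p) ** 2) / logvar_p.exp()
        - 1.0,
        dim=-1,
    ).mean()


class CrossModalVAE(nn.Module):
    """Two variational branches (image features, ingredient-text features) plus a
    mapper that aligns the image latent distribution with the text one."""

    def __init__(self, img_dim=2048, txt_dim=1024, latent_dim=128):
        super().__init__()
        self.image_branch = VariationalBranch(img_dim, latent_dim)
        self.text_branch = VariationalBranch(txt_dim, latent_dim)
        # Mapper branch (assumed form): predicts a Gaussian over the text latent
        # space from the image latent mean.
        self.mapper_mu = nn.Linear(latent_dim, latent_dim)
        self.mapper_logvar = nn.Linear(latent_dim, latent_dim)

    def forward(self, img_feat, txt_feat):
        img_rec, img_mu, img_logvar = self.image_branch(img_feat)
        txt_rec, txt_mu, txt_logvar = self.text_branch(txt_feat)
        map_mu = self.mapper_mu(img_mu)
        map_logvar = self.mapper_logvar(img_mu)

        # Per-branch reconstruction terms.
        loss = F.mse_loss(img_rec, img_feat) + F.mse_loss(txt_rec, txt_feat)
        # Standard-normal priors on both branches.
        zeros = torch.zeros_like(img_mu)
        loss += kl_between_gaussians(img_mu, img_logvar, zeros, zeros)
        loss += kl_between_gaussians(txt_mu, txt_logvar, zeros, zeros)
        # Cross-modal alignment: the mapped image distribution should match the
        # text-branch distribution, so ingredients can be decoded from images.
        loss += kl_between_gaussians(map_mu, map_logvar, txt_mu, txt_logvar)
        return loss


# Toy usage with random tensors standing in for CNN image features and
# ingredient-text embeddings.
model = CrossModalVAE()
loss = model(torch.randn(8, 2048), torch.randn(8, 1024))
loss.backward()
```

At inference time, a framework of this kind would encode the food image, pass the latent through the mapper, and decode ingredients with the text-branch decoder; the loss weights and decoder targets used here are placeholders.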
