Ensemble CNN Model for Food Image Classification on Small Datasets

Abstract

This study addresses the challenge of food image classification using small-scale datasets by developing an ensemble Convolutional Neural Network (CNN) model. Focusing on a curated subset of the Food-101 dataset (5,000 images across 5 categories), the project evaluates three CNN architectures: ResNet, MobileNetV2, and InceptionV3. Through transfer learning and ensemble techniques, the model achieves 92.7% accuracy, demonstrating that CNNs can perform well even with limited data. The research implements advanced training strategies, including learning rate scheduling and data augmentation, and provides model interpretability through Grad-CAM, LIME, and SHAP visualizations. A user-friendly GUI enables practical deployment, showcasing applications in dietary management and food quality assessment. The project highlights how carefully designed CNN ensembles can overcome data scarcity challenges in food recognition tasks.

Key Innovations

Model Performance Comparison

MobileNetV2: 87% accuracy | ResNet: 87% accuracy | InceptionV3: 85% accuracy

Ensemble Model: 92.7% accuracy (Adam optimizer, learning rate 0.001, batch size 32)
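The 92.7% figure comes from combining the three fine-tuned networks rather than from any single model. The sketch below illustrates one way the weighted soft-voting ensemble could be assembled in PyTorch, using the accuracy-based weights listed in the Technical Implementation section; the specific ResNet variant (ResNet50 here), the model-building code, and the function names are assumptions, since the report does not publish its implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

NUM_CLASSES = 5  # five Food-101 categories used in this project

def build_members():
    """Build the three ensemble members with 5-class heads on ImageNet backbones.

    Minimal reconstruction for illustration; the ResNet variant is assumed.
    """
    resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
    resnet.fc = nn.Linear(resnet.fc.in_features, NUM_CLASSES)

    mobilenet = models.mobilenet_v2(weights=models.MobileNet_V2_Weights.DEFAULT)
    mobilenet.classifier[1] = nn.Linear(mobilenet.classifier[1].in_features, NUM_CLASSES)

    inception = models.inception_v3(weights=models.Inception_V3_Weights.DEFAULT)
    inception.fc = nn.Linear(inception.fc.in_features, NUM_CLASSES)

    return {"resnet": resnet, "mobilenetv2": mobilenet, "inceptionv3": inception}

# Accuracy-based weights reported in the Technical Implementation section.
WEIGHTS = {"resnet": 0.35, "mobilenetv2": 0.33, "inceptionv3": 0.32}

@torch.no_grad()
def ensemble_predict(members, batch):
    """Weighted soft voting: sum softmax outputs scaled by each member's weight."""
    combined = torch.zeros(batch.size(0), NUM_CLASSES)
    for name, model in members.items():
        model.eval()
        combined += WEIGHTS[name] * F.softmax(model(batch), dim=1)
    return combined.argmax(dim=1)  # predicted class index for each image
```

Because the weights sum to 1.0 and are proportional to each member's test accuracy, the ensemble defers slightly more to the stronger models while still letting the weaker one break ties.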

Technical Implementation

The system combines multiple advanced CNN architectures with careful preprocessing and augmentation strategies. Images are resized to 299×299 pixels and normalized using ImageNet statistics. The ensemble model weights predictions from each architecture according to its individual test accuracy (ResNet: 0.35, MobileNetV2: 0.33, InceptionV3: 0.32). Training uses the Adam optimizer with a learning rate scheduler that reduces the learning rate by a factor of 10 (multiplying it by 0.1) when validation loss plateaus. Data augmentation includes random resized crops, horizontal flips, rotations, and color jitter to improve generalization from the limited dataset.
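A minimal sketch of this pipeline is shown below, assuming torchvision transforms and PyTorch's ReduceLROnPlateau scheduler. The resize target, ImageNet statistics, Adam learning rate, and plateau factor come from the description above; the exact augmentation magnitudes and the scheduler patience are assumptions for illustration.

```python
import torch
from torch.optim import Adam
from torch.optim.lr_scheduler import ReduceLROnPlateau
from torchvision import transforms

# ImageNet statistics used for normalization, as described above.
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]

# Training-time augmentation: random resized crop, horizontal flip, rotation,
# and color jitter. Parameter values here are assumptions, not from the report.
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(299),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(15),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    transforms.ToTensor(),
    transforms.Normalize(IMAGENET_MEAN, IMAGENET_STD),
])

# Evaluation uses a deterministic 299x299 resize plus the same normalization.
eval_transform = transforms.Compose([
    transforms.Resize((299, 299)),
    transforms.ToTensor(),
    transforms.Normalize(IMAGENET_MEAN, IMAGENET_STD),
])

def make_optimizer_and_scheduler(model):
    """Adam at lr=0.001; cut the learning rate to one tenth on a validation-loss plateau."""
    optimizer = Adam(model.parameters(), lr=1e-3)
    scheduler = ReduceLROnPlateau(optimizer, mode="min", factor=0.1, patience=3)  # patience assumed
    return optimizer, scheduler

# Inside the training loop, step the scheduler with the validation loss:
#   scheduler.step(val_loss)
```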

Conclusion

This project successfully demonstrates that carefully designed CNN ensembles can achieve excellent classification performance even with small datasets. The combination of transfer learning, strategic data augmentation, and model interpretability techniques provides a robust framework for food image analysis. The implemented GUI makes these advanced capabilities accessible for practical applications in dietary monitoring and food quality assessment.

Limitations

The current model is limited to five food categories from the Food-101 dataset. Performance may decrease when applied to more diverse or complex food images not represented in the training data. The ensemble approach also requires more computational resources than single-model solutions. Future work could explore more efficient ensemble methods and expand the category coverage.

Future Directions

Future enhancements could include: expanding to more food categories while maintaining performance, developing mobile-friendly versions of the model, incorporating real-time classification capabilities, and integrating nutritional information with classification results. Additional work could also explore few-shot learning techniques to further reduce data requirements.

🔬 Student Innovator: Emil Chen is passionate about applying deep learning to solve real-world problems in food technology and dietary health. Under the mentorship of Dr. Happy Nkanta Monday, Emil developed this comprehensive food classification system that combines cutting-edge computer vision with practical application development. His work demonstrates how AI can transform our relationship with food through intelligent recognition systems.