LSTM-Based Multi-Head Attention Lightweight CNN Models for Pneumothorax Classification

📖 Abstract

This study develops a lightweight Convolutional Neural Network (CNN) model for pneumothorax detection in chest X-rays, targeting resource-limited healthcare environments. The proposed architecture integrates EfficientNetB2 for feature extraction, Long Short-Term Memory (LSTM) layers for sequential spatial modeling, and a Multi-Head Attention mechanism to enhance focus on critical regions. Trained on the SIIM-ACR Pneumothorax dataset (2,027 images), the model employs data augmentation, balancing, and preprocessing to address class imbalance and variability. The evaluation results demonstrate strong performance, achieving 86% accuracy, 93% recall, and a 0.91 AUC-ROC score, outperforming baseline models like ResNet-50 and MobileNet-v2. The model’s clinical applicability is further validated through Gradient-weighted Class Activation Mapping (Grad-CAM) visualizations, highlighting lesion-specific regions, and a user-friendly GUI for real-world deployment. By optimizing computational efficiency while maintaining diagnostic accuracy, this work bridges the gap between deep learning and practical medical applications, particularly in underserved regions. Ethical, legal, and environmental considerations, including GDPR compliance and energy-efficient design, are systematically addressed to ensure responsible AI deployment.
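The architecture described above can be pictured roughly as follows. This is a minimal Keras sketch under stated assumptions, not the study's actual implementation: the function name `build_model`, the 260x260 input size, the LSTM width, the number of attention heads, the pooling step, and the single sigmoid output are all illustrative choices. `weights=None` is used so the sketch builds without downloading pretrained weights.

```python
import tensorflow as tf
from tensorflow.keras import layers


def build_model(input_shape=(260, 260, 3), lstm_units=128, num_heads=4):
    """Hypothetical EfficientNetB2 + LSTM + Multi-Head Attention classifier."""
    inputs = layers.Input(shape=input_shape)

    # Feature extractor: EfficientNetB2 without its classification head.
    backbone = tf.keras.applications.EfficientNetB2(
        include_top=False, weights=None, input_tensor=inputs
    )
    fmap = backbone.output                      # (batch, H, W, C)
    channels = fmap.shape[-1]

    # Flatten the spatial grid into a sequence of H*W feature vectors.
    seq = layers.Reshape((-1, channels))(fmap)  # (batch, H*W, C)

    # LSTM reads the spatial positions as a sequence (assumed row-major order).
    seq = layers.LSTM(lstm_units, return_sequences=True)(seq)

    # Self-attention lets every position weigh every other position.
    attn = layers.MultiHeadAttention(
        num_heads=num_heads, key_dim=lstm_units // num_heads
    )(seq, seq)

    # Pool over positions and predict the pneumothorax probability.
    pooled = layers.GlobalAveragePooling1D()(attn)
    outputs = layers.Dense(1, activation="sigmoid")(pooled)
    return tf.keras.Model(inputs, outputs)


model = build_model()
print(model.output_shape)
```

The order LSTM then attention mirrors the order in which the abstract lists the components; the original model may wire them differently.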

Student Contributor: Xu Danyue (Esme) – a dedicated machine learning enthusiast known for her precision and creativity in medical AI innovation.

🔑 Key Contributions

Designed a novel hybrid architecture combining EfficientNetB2, LSTM, and Multi-Head Attention mechanisms tailored for medical imaging.
Implemented a robust training pipeline using the SIIM-ACR Pneumothorax dataset, incorporating balancing strategies to tackle class imbalance.
Achieved 86% accuracy, 93% recall, and an AUC-ROC of 0.91, outperforming the ResNet-50 and MobileNet-v2 baselines.
Integrated Grad-CAM to visualize the model’s focus on critical regions, enhancing transparency and trust in clinical environments.
Developed an intuitive, interactive GUI for hospital-friendly deployment of the model.
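The Grad-CAM step listed above reduces to a small amount of arithmetic once the activations of a convolutional layer and the gradients of the class score with respect to them are available. The NumPy sketch below assumes those two arrays have already been extracted from the network; the function name `grad_cam_heatmap` and the random inputs are illustrative and not taken from the project's code.

```python
import numpy as np


def grad_cam_heatmap(activations, gradients):
    """Compute a Grad-CAM heatmap for one image.

    activations: (H, W, K) feature maps of the chosen conv layer.
    gradients:   (H, W, K) gradients of the class score w.r.t. those maps.
    Returns an (H, W) array scaled to [0, 1].
    """
    # Channel importance: global-average-pool the gradients over H and W.
    weights = gradients.mean(axis=(0, 1))  # shape (K,)

    # Weighted sum of feature maps, then ReLU to keep positive evidence only.
    cam = np.tensordot(activations, weights, axes=([2], [0]))  # shape (H, W)
    cam = np.maximum(cam, 0.0)

    # Normalise for display; an all-zero map is returned unchanged.
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Placeholder inputs standing in for real layer outputs and gradients.
rng = np.random.default_rng(0)
acts = rng.random((9, 9, 16))
grads = rng.standard_normal((9, 9, 16))
heatmap = grad_cam_heatmap(acts, grads)
print(heatmap.shape)
```

In practice the heatmap is upsampled to the X-ray's resolution and overlaid on the image so a clinician can see which regions drove the prediction.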

📌 Conclusion

This work underscores the feasibility of lightweight deep learning models for high-accuracy pneumothorax detection, particularly in resource-constrained settings. By leveraging attention-driven spatial modeling, the study provides a clinically relevant and ethically sound solution with great deployment potential.

⚠️ Limitations

While the model generalizes well in the reported evaluation, it was trained solely on the SIIM-ACR dataset, which may not reflect the full diversity of global clinical cases.
The sensitivity (~70%) leaves room for improvement, particularly for early-stage or subtle pneumothorax signs.
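For reference, sensitivity (recall on the positive class) is TP / (TP + FN), and it depends on the decision threshold applied to the model's output probability. The sketch below uses made-up labels and scores, not results from this study, and a hypothetical helper `sensitivity_specificity` to show how lowering the threshold trades specificity for sensitivity.

```python
def sensitivity_specificity(y_true, y_score, threshold=0.5):
    """Return (sensitivity, specificity) for binary labels and scores."""
    tp = fn = tn = fp = 0
    for label, score in zip(y_true, y_score):
        predicted = 1 if score >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted == 0:
            tn += 1
        else:
            fp += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Illustrative values only, not data from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_score = [0.9, 0.7, 0.45, 0.3, 0.6, 0.4, 0.2, 0.1]

for t in (0.5, 0.35):
    sens, spec = sensitivity_specificity(y_true, y_score, threshold=t)
    print(f"threshold={t:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

Whether threshold adjustment is appropriate here depends on the clinical cost of false alarms, which the summary does not specify.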

🚀 Future Work

Incorporate federated learning across hospital datasets to enhance model robustness and generalizability.
Integrate voice-guided assistance into the GUI for visually impaired clinicians.
Expand classification beyond binary to include severity grading of pneumothorax.
