Since the COVID-19 outbreak, face masks have played a crucial role in limiting virus transmission in public spaces. This project applies Faster R-CNN with a ResNet-50 + FPN backbone to build a face mask detection system. The model was trained on a custom dataset with three classes (mask, no mask, and false mask) using tuned hyperparameters (SGD optimizer, learning rate 0.005). Evaluated on mAP, F1-score, and log-average miss rate (LAMR), the model achieved 98.40% mAP and near-perfect detection for the “false mask” class. The system was deployed with a PyQt GUI that supports both image and video inputs.
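As a rough illustration of two of the metrics named above, the sketch below computes an F1-score from detection counts and a log-average miss rate from a miss-rate curve. It assumes the common LAMR formulation (geometric mean of the miss rate sampled at nine false-positives-per-image values spaced evenly in log space between 0.01 and 1); the project does not state which formulation it used, and the function names here are hypothetical.

```python
import math


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall from detection counts."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def log_average_miss_rate(fppi, miss_rate, num_points=9):
    """Geometric mean of miss rates sampled at log-spaced FPPI references.

    fppi:      false positives per image, sorted in ascending order
    miss_rate: miss rate (1 - recall) at each corresponding fppi value
    Assumption: references span 1e-2 .. 1e0, as in the usual definition.
    """
    refs = [10 ** (-2 + 2 * i / (num_points - 1)) for i in range(num_points)]
    sampled = []
    for ref in refs:
        mr = 1.0  # no operating point at or below this FPPI: everything is missed
        for x, y in zip(fppi, miss_rate):
            if x <= ref:
                mr = y  # keep the last curve point not exceeding the reference
            else:
                break
        sampled.append(max(mr, 1e-10))  # clamp to avoid log(0)
    return math.exp(sum(math.log(m) for m in sampled) / len(sampled))
```

Lower LAMR is better, while higher F1 and mAP are better, so the three metrics are best read side by side for each class.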
This project demonstrates that Faster R-CNN with a ResNet-50 + FPN backbone can accurately detect mask-wearing status across various conditions. With robust performance and practical interpretability tools, the system shows real-world potential for public health monitoring.
Video detection runs at only ≈1.5 fps, which limits real-time deployment. In addition, the model may need to generalize further to perform well under diverse camera angles and lighting conditions.
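To put the ≈1.5 fps figure in perspective, the short sketch below works out the per-frame latency and how many source frames would have to be skipped for detection to keep pace with playback. The 30 fps source rate is an assumption chosen for illustration, and the helper names are hypothetical.

```python
import math


def per_frame_latency_ms(detector_fps):
    """Average time spent on one frame, in milliseconds."""
    return 1000.0 / detector_fps


def frame_stride(source_fps, detector_fps):
    """Process every k-th frame so the detector keeps up with the source."""
    return max(1, math.ceil(source_fps / detector_fps))


# Assumed 30 fps input video with the reported ~1.5 fps detector.
print(round(per_frame_latency_ms(1.5)))  # 667 (ms per frame)
print(frame_stride(30, 1.5))             # 20 (run detection on every 20th frame)
```

Under these assumptions the detector sees only one frame in twenty, which is why a lighter model or faster hardware matters for live video.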
Planned improvements include speeding up real-time video processing with lighter models (e.g., YOLOv7), deploying on edge devices, and extending the system to detect additional personal protective equipment (PPE) such as gloves and face shields using multi-label object detection.