Hybrid Attention-Enhanced CNNs for Small Object Detection in Mammography, CT, and Fundus Imaging

Hewa Majeed Zangana; Marwan Omar; Shuai Li; Jamal N. Al-Karaki; Anik Vega Vitianingsih

doi:10.12928/biste.v7i3.14015

Authors

Hewa Majeed Zangana Duhok Polytechnic University https://orcid.org/0000-0001-7909-254X
Marwan Omar Illinois Institute of Technology https://orcid.org/0000-0002-3392-0052
Shuai Li University of Oulu
Jamal N. Al-Karaki Zayed University
Anik Vega Vitianingsih Universitas Dr. Soetomo https://orcid.org/0000-0002-1651-1521

DOI:

https://doi.org/10.12928/biste.v7i3.14015

Keywords:

Hybrid CNN, Multi-Scale Feature Fusion, Dilated Convolutions, Attention Mechanisms, Computational Efficiency, Dataset Bias, Early Disease Screening

Abstract

Early detection of subtle pathological features in medical images is critical for improving patient outcomes but remains challenging due to low contrast, small lesion size, and limited annotated data. The research contribution is a hybrid attention-enhanced CNN specifically tailored for small object detection across mammography, CT, and retinal fundus images. Our method integrates a ResNet-50 backbone with a modified Feature Pyramid Network, dilated convolutions for contextual scale expansion, and combined channel–spatial attention modules to preserve and amplify fine-grained features. We evaluate the model on public benchmarks (DDSM, LUNA16, IDRiD) using standardized preprocessing, extensive augmentation, and cross-validated training. Results show consistent gains in detection and localization: ECNN achieves an F1-score of 88.2% (95% CI: 87.4–89.0), mAP@0.5 of 86.8%, IoU of 78.6%, and a low false positives per image (FPPI = 0.12) versus baseline detectors. Ablation studies confirm the individual contributions of dilated convolutions, attention modules, and multi-scale fusion. However, these gains involve higher computational costs (≈2× training time and increased memory footprint), and limited dataset diversity suggests caution regarding generalizability. In conclusion, the proposed ECNN advances small-object sensitivity for early disease screening while highlighting the need for broader clinical validation and interpretability tools before deployment.

References

C. Chen, M.-Y. Liu, O. Tuzel, and J. Xiao, “R-CNN for small object detection,” in Computer Vision–ACCV 2016: 13th Asian Conference on Computer Vision, Taipei, Taiwan, November 20-24, 2016, Revised Selected Papers, Part V 13, pp. 214–230, 2017, https://doi.org/10.1007/978-3-319-54193-8_14.

J. Wang, S. Jiang, W. Song, and Y. Yang, “A comparative study of small object detection algorithms,” in 2019 Chinese control conference (CCC), pp. 8507–8512, 2019, https://doi.org/10.23919/ChiCC.2019.8865157.

W. Sun, L. Dai, X. Zhang, P. Chang, and X. He, “RSOD: Real-time small object detection algorithm in UAV-based traffic monitoring,” Applied Intelligence, pp. 1–16, 2021, https://doi.org/10.1007/s10489-021-02893-3.

J. Deng, X. Xuan, W. Wang, Z. Li, H. Yao, and Z. Wang, “A review of research on object detection based on deep learning,” in Journal of Physics: Conference Series, p. 012028, 2020, https://doi.org/10.1088/1742-6596/1684/1/012028.

L. Zhao and S. Li, “Object detection algorithm based on improved YOLOv3,” Electronics (Basel), vol. 9, no. 3, p. 537, 2020, https://doi.org/10.3390/electronics9030537.

M. Haris and A. Glowacz, “Road object detection: A comparative study of deep learning-based algorithms,” Electronics (Basel), vol. 10, no. 16, p. 1932, 2021, https://doi.org/10.3390/electronics10161932.

J. Ren and Y. Wang, “Overview of object detection algorithms using convolutional neural networks,” Journal of Computer and Communications, vol. 10, no. 1, pp. 115–132, 2022, https://doi.org/10.4236/jcc.2022.101006.

A. Bouguettaya, A. Kechida, and A. M. Taberkit, “A survey on lightweight CNN-based object detection algorithms for platforms with limited computational resources,” International Journal of Informatics and Applied Mathematics, vol. 2, no. 2, pp. 28–44, 2019, https://dergipark.org.tr/en/pub/ijiam/issue/52418/654318.

R. Zhao, X. Niu, Y. Wu, W. Luk, and Q. Liu, “Optimizing CNN-based object detection algorithms on embedded FPGA platforms,” in Applied Reconfigurable Computing: 13th International Symposium, ARC 2017, Delft, The Netherlands, April 3-7, 2017, Proceedings 13, pp. 255–267, 2017, https://doi.org/10.1007/978-3-319-56258-2_22.

X. Zou, “A review of object detection techniques,” in 2019 International conference on smart grid and electrical automation (ICSGEA), pp. 251–254, 2019, https://doi.org/10.1109/ICSGEA.2019.00065.

M. Li, H. Zhu, H. Chen, L. Xue, and T. Gao, “Research on object detection algorithm based on deep learning,” in Journal of Physics: Conference Series, p. 012046, 2021, https://doi.org/10.1088/1742-6596/1995/1/012046.

R. Huang, J. Pedoeem, and C. Chen, “YOLO-LITE: a real-time object detection algorithm optimized for non-GPU computers,” in 2018 IEEE international conference on big data (big data), pp. 2503–2510, 2018, https://doi.org/10.1109/BigData.2018.8621865.

L. Galteri, M. Bertini, L. Seidenari, and A. Del Bimbo, “Video compression for object detection algorithms,” in 2018 24th International Conference on Pattern Recognition (ICPR), pp. 3007–3012, 2018, https://doi.org/10.1109/ICPR.2018.8546064.

H. M. Zangana, F. M. Mustafa, and M. Omar, “A Hybrid Approach for Robust Object Detection: Integrating Template Matching and Faster R-CNN,” EAI Endorsed Transactions on AI and Robotics, vol. 3, 2024, https://doi.org/10.4108/airo.6858.

R. Padilla, S. L. Netto, and E. A. B. Da Silva, “A survey on performance metrics for object-detection algorithms,” in 2020 international conference on systems, signals and image processing (IWSSIP), pp. 237–242, 2020, https://doi.org/10.1109/IWSSIP48289.2020.9145130.

L. Peng, H. Wang, and J. Li, “Uncertainty evaluation of object detection algorithms for autonomous vehicles,” Automotive Innovation, vol. 4, no. 3, pp. 241–252, 2021, https://doi.org/10.1007/s42154-021-00154-0.

Y. Amit, P. Felzenszwalb, and R. Girshick, “Object detection,” in Computer Vision: A Reference Guide, pp. 875–883, 2021, https://doi.org/10.1007/978-3-030-63416-2_660.

L. Du, R. Zhang, and X. Wang, “Overview of two-stage object detection algorithms,” in Journal of Physics: Conference Series, IOP Publishing, 2020, p. 012033, 2020, https://doi.org/10.1088/1742-6596/1544/1/012033.

B. Mahaur, N. Singh, and K. K. Mishra, “Road object detection: a comparative study of deep learning-based algorithms,” Multimed Tools Appl, vol. 81, no. 10, pp. 14247–14282, 2022, https://doi.org/10.1007/s11042-022-12447-5.

H. M. Zangana and F. M. Mustafa, “Hybrid Image Denoising Using Wavelet Transform and Deep Learning,” EAI Endorsed Transactions on AI and Robotics, vol. 3, no. 1, 2024, https://doi.org/10.4108/airo.7486.

A. John and D. Meva, “A comparative study of various object detection algorithms and performance analysis,” International Journal of Computer Sciences and Engineering, vol. 8, no. 10, pp. 158–163, 2020, https://doi.org/10.26438/ijcse/v8i10.158163.

K. Li and L. Cao, “A review of object detection techniques,” in 2020 5th International Conference on Electromechanical Control Technology and Transportation (ICECTT), pp. 385–390, 2020, https://doi.org/10.1109/ICECTT50890.2020.00091.

P. Malhotra and E. Garg, “Object detection techniques: a comparison,” in 2020 7th International Conference on Smart Structures and Systems (ICSSS), IEEE, 2020, pp. 1–4, 2020, https://doi.org/10.1109/ICSSS49621.2020.9202254.

C. Cuevas, E. M. Yáñez, and N. García, “Labeled dataset for integral evaluation of moving object detection algorithms: LASIESTA,” Computer Vision and Image Understanding, vol. 152, pp. 103–117, 2016, https://doi.org/10.1016/j.cviu.2016.08.005.

Y. Xiao et al., “A review of object detection based on deep learning,” Multimed Tools Appl, vol. 79, pp. 23729–23791, 2020, https://doi.org/10.1007/s11042-020-08976-6.

H. Luo And H. Chen, “Survey of object detection based on deep learning,” Acta Electonica Sinica, vol. 48, no. 6, p. 1230, 2020 F. Neha, D. Bhati, D. K. Shukla and M. Amiruzzaman, "From classical techniques to convolution-based models: A review of object detection algorithms," 2025 IEEE 6th International Conference on Image Processing, Applications and Systems (IPAS), pp. 1-6, 2025, https://doi.org/10.1109/IPAS63548.2025.10924494.

W. Chen, Y. Li, Z. Tian, and F. Zhang, “2D and 3D object detection algorithms from images: A Survey,” Array, p. 100305, 2023, https://doi.org/10.1016/j.array.2023.100305.

Z. Li, Y. Du, M. Zhu, S. Zhou, and L. Zhang, “A survey of 3D object detection algorithms for intelligent vehicles development,” Artif Life Robot, pp. 1–8, 2022, https://doi.org/10.1007/s10015-021-00711-0.

A. Raghunandan, P. Raghav, and H. V. R. Aradhya, “Object detection algorithms for video surveillance applications,” in 2018 International Conference on Communication and Signal Processing (ICCSP), pp. 563–568, 2018, https://doi.org/10.1109/ICCSP.2018.8524461.

Y. Zhou et al., “Mmrotate: A rotated object detection benchmark using pytorch,” in Proceedings of the 30th ACM International Conference on Multimedia, pp. 7331–7334, 2022, https://doi.org/10.1145/3503161.3548541.

K. S. Chahal and K. Dey, “A survey of modern object detection literature using deep learning,” arXiv preprint arXiv:1808.07256, 2018, https://doi.org/10.48550/arXiv.1808.07256.

P. Rajeshwari, P. Abhishek, P. Srikanth, and T. Vinod, “Object detection: an overview,” Int. J. Trend Sci. Res. Dev.(IJTSRD), vol. 3, no. 1, pp. 1663–1665, 2019, https://doi.org/10.31142/ijtsrd23422.

A. Kumar, Z. J. Zhang, and H. Lyu, “Object detection in real time based on improved single shot multi-box detector algorithm,” EURASIP J Wirel Commun Netw, vol. 2020, pp. 1–18, 2020, https://doi.org/10.1186/s13638-020-01826-x.

P. Kumar, A. Singhal, S. Mehta, and A. Mittal, “Real-time moving object detection algorithm on high-resolution videos using GPUs,” J Real Time Image Process, vol. 11, pp. 93–109, 2016, https://doi.org/10.1007/s11554-012-0309-y.