A Lightweight Hybrid Template-Matching–CNN Framework with Attention-Guided Fusion for Robust Small Object Detection

Hewa Majeed Zangana; Marwan Omar; Mohammed Aquil Mirza; Xinwei Cao; Sharyar Wani

doi:10.12928/biste.v8i1.14751

Authors

Hewa Majeed Zangana Duhok Polytechnic University https://orcid.org/0000-0001-7909-254X
Marwan Omar Illinois Institute of Technology
Mohammed Aquil Mirza The Hong Kong Polytechnic University (PolyU)
Xinwei Cao Jiangnan University
Sharyar Wani International Islamic University Malaysia (IIUM)

Keywords:

Hybrid Object Detection, Template Matching, Feature Fusion, Attention Mechanisms, Small Object Detection, Deep Learning

Abstract

Small object detection in aerial and surveillance imagery remains challenging because of low resolution, occlusion, and background clutter. This study introduces a hybrid detection framework that fuses template matching with a deep learning detector (Faster R-CNN) through an attention-guided decision fusion mechanism. Its contributions are (i) a dual-stage fusion pipeline that integrates precise structural cues from template matching with deep semantic features, and (ii) a custom scale-aware focal loss, adapted from Focal Loss, that emphasizes hard and small objects by dynamically increasing the penalty for low-confidence predictions. Evaluated on a Pascal VOC subset (1000 images, 5 classes), the proposed system improves mAP by 3.5% over the Faster R-CNN baseline and surpasses YOLO-Lite and R-CNN variants in precision and recall. The hybrid design adds little computational overhead (0.45 s/image vs. 0.42 s/image for Faster R-CNN, roughly 7% more), a favorable efficiency–accuracy trade-off for scalable deployment. These findings highlight the framework’s robustness, particularly in scenes containing occlusion, clutter, or visually small targets. Limitations arising from template dependency are discussed, along with future directions for automatic template generation and real-time video adaptation.
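
The abstract does not give the exact form of the scale-aware focal loss, so the following is only a minimal sketch of one way such a loss could be built. It takes the standard focal loss term, -alpha * (1 - p)^gamma * log(p), and multiplies it by a weight that grows as the bounding box shrinks. The weighting function `scale_weight`, its parameters `ref_area` and `beta`, and the 32 x 32 pixel reference area (borrowed from the COCO convention for "small" objects) are assumptions made for illustration, not the authors' formulation.

```python
import math


def focal_loss(p_true: float, gamma: float = 2.0, alpha: float = 0.25) -> float:
    """Standard focal loss for the probability assigned to the correct class."""
    # Clamp the probability so that log() never receives 0 or 1 exactly.
    p = min(max(p_true, 1e-7), 1.0 - 1e-7)
    # (1 - p)^gamma shrinks the loss of confident predictions,
    # so low-confidence (hard) examples dominate.
    return -alpha * (1.0 - p) ** gamma * math.log(p)


def scale_weight(box_area: float, ref_area: float = 32.0 * 32.0, beta: float = 1.0) -> float:
    """Hypothetical size weight in [1, 1 + beta]; smaller boxes get larger weights."""
    # Boxes at or above the reference area receive weight 1 (no extra penalty).
    ratio = min(box_area / ref_area, 1.0)
    return 1.0 + beta * (1.0 - ratio)


def scale_aware_focal_loss(
    p_true: float,
    box_area: float,
    gamma: float = 2.0,
    alpha: float = 0.25,
    ref_area: float = 32.0 * 32.0,
    beta: float = 1.0,
) -> float:
    """Focal loss scaled up for small objects (illustrative assumption only)."""
    return scale_weight(box_area, ref_area, beta) * focal_loss(p_true, gamma, alpha)


if __name__ == "__main__":
    # Same confidence, different object sizes: the 16x16 box is penalized more.
    print(scale_aware_focal_loss(0.3, box_area=16.0 * 16.0))
    print(scale_aware_focal_loss(0.3, box_area=64.0 * 64.0))
```

Under these assumptions, a 16 x 16 box receives a weight of 1.75 and a box of 32 x 32 pixels or larger receives a weight of 1, so the loss reduces to the ordinary focal loss for large objects while hard, small objects contribute most to the total.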
