Estimation of Crowd Density Using Image Processing Techniques with Background Pixel Model and Visual Geometry Group

Ha Duyen Trung

doi:10.12928/biste.v6i2.10785

Authors

Ha Duyen Trung, Hanoi University of Science and Technology

DOI:

https://doi.org/10.12928/biste.v6i2.10785

Keywords:

Crowd Density Estimation, Bayesian Loss, Visual Geometry Group, CNN, Background Pixel Model

Abstract

Crowd density estimation from a single image with a complex background has attracted significant attention in automatic monitoring systems. In this paper, we propose an approach that improves crowd estimation by applying the Bayesian Loss algorithm in conjunction with monitoring points, using the UCF-QNRF, UCF_CC_50, and ShanghaiTech datasets. The proposed method is evaluated with two standard metrics, Mean Square Error (MSE) and Mean Absolute Error (MAE). Experimental results show that the proposed method is more accurate than existing estimation techniques. Specifically, it achieves a reduction of 106.0 in MSE and 91.6 in MAE relative to state-of-the-art methods, which supports its effectiveness in challenging crowd density estimation scenarios.
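The two evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes that each test image yields one predicted count and one ground-truth count; the function name `count_errors` and the sample counts are hypothetical and are not taken from the paper. The abstract does not give the metric formulas, and crowd-counting papers often report the square root of the mean squared error under the name "MSE", so the sketch returns both forms.

```python
import math


def count_errors(predicted, actual):
    """Return (MAE, MSE, RMSE) for per-image crowd counts.

    predicted, actual: equal-length sequences of counts, one entry per image.
    """
    if len(predicted) != len(actual) or len(predicted) == 0:
        raise ValueError("need two non-empty sequences of equal length")

    # Signed per-image counting error.
    diffs = [p - a for p, a in zip(predicted, actual)]
    n = len(diffs)

    mae = sum(abs(d) for d in diffs) / n   # mean absolute error
    mse = sum(d * d for d in diffs) / n    # mean squared error
    rmse = math.sqrt(mse)                  # root form, often reported as "MSE"
    return mae, mse, rmse


# Hypothetical counts for three images (illustrative only).
mae, mse, rmse = count_errors([105, 48, 310], [100, 50, 300])
print(f"MAE={mae:.2f}  MSE={mse:.2f}  RMSE={rmse:.2f}")
```

MAE weights every image's error equally, while the squared form penalises large misses more heavily, which is why the two are usually reported together for dense-crowd benchmarks.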
