HUMAN POSE ESTIMATION SYSTEM USING DEEP LEARNING ALGORITHMS
DOI:
https://doi.org/10.26906/SUNZ.2023.2.075
Keywords:
human pose estimation, classification of objects, object detection, convolutional neural networks
Abstract
The purpose of this work is the software implementation of a neural network that solves the problem of human pose estimation. Rapid improvements in neural network models and computing resources over the last 10 years have made it possible to automate many processes, carry out research, and improve quality of life. One such direction is computer vision, which enables object recognition, motion tracking, image segmentation, facial recognition, etc. Human pose estimation is a part of the computer vision research area. It captures a human pose from a video or an image and has many uses in medicine, sport, augmented reality, video games, etc. Therefore, the goal of this work is to find and optimize a relatively accurate algorithm for identifying and classifying the joints of the human body. To achieve this goal, the following tasks were solved: current methods and technologies commonly used to solve the problem of human pose estimation were reviewed and analyzed; artificial neural networks were used as the mathematical apparatus for the model; a software implementation for human pose estimation was developed and tested; the model's outputs were analyzed and evaluated; and the results and conclusions were formulated.