An approach to temporal concept localization in videos

https://doi.org/10.53730/ijhs.v6nS1.7524

Authors

  • Dilipkumar A. Borikar Shri Ramdeobaba College of Engineering and Management, Nagpur, Maharashtra, India
  • Sushant Kumar Ola Cabs, Bengaluru, Karnataka, India
  • Rakshit Bhagwat Kathawate VMware India Pvt. Ltd., Bengaluru, Karnataka, India
  • Sarthak Prakash Baiswar Compass Inc., Hyderabad, Telangana, India
  • Sourav Jagannath Roy p360 Solutions, Mumbai, Maharashtra, India

Keywords:

Video Understanding, NLP, Video Segmentation, Bi-LSTM, Random Forest, Concept Localization

Abstract

Localizing moments in videos is an emerging and challenging task in computer science, enabling faster search for video retrieval, query processing, and behavioral analysis. The process involves stages such as video understanding, video segmentation, query processing using NLP, and localization of the concepts in the video. Although video understanding has attracted many efforts in NLP and computer vision in recent years, existing approaches do not adequately cover the large untrimmed videos found in real-life scenarios. We propose a deep learning-based solution that uses Random Forest and Bi-LSTM models to localize labels within segments, along with the times at which they pertain to particular segments. We trained our frame-based model on the YouTube-8M dataset released by YouTube for its Kaggle challenge, and used it to classify segments with sliding windows of size 5. Our approach provides a simple and robust way to model this problem and a path toward tackling it at scale. Further improvements to the Bi-LSTM-based models and to Random Forest models with VLAD descriptors should lead to better results.
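A minimal sketch of the sliding-window segmentation step described in the abstract: per-frame features are sliced into overlapping windows of 5 frames, each of which would then be scored by a segment classifier (a Bi-LSTM or Random Forest in the paper's pipeline). The function name, stride default, and toy feature dimensions are illustrative assumptions, not the authors' code.

```python
import numpy as np

def make_segments(frame_features: np.ndarray, window: int = 5,
                  stride: int = 1) -> np.ndarray:
    """Slice a (num_frames, feat_dim) array into overlapping segments.

    Returns an array of shape (num_segments, window, feat_dim);
    each segment is scored independently by the downstream classifier.
    """
    n, dim = frame_features.shape
    if n < window:
        # Too few frames for even one segment.
        return np.empty((0, window, dim))
    starts = np.arange(0, n - window + 1, stride)
    return np.stack([frame_features[i:i + window] for i in starts])

# Toy example: 12 frames of 4-dim features -> 8 segments of 5 frames each.
feats = np.random.rand(12, 4)
segments = make_segments(feats, window=5)
print(segments.shape)  # (8, 5, 4)
```

With stride 1 a video of N frames yields N - 4 candidate segments, so a label predicted for a segment can be mapped back to its start time directly from the segment index.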

References

Gao J., Sun C., Yang Z., and Nevatia R., "TALL: Temporal Activity Localization via Language Query", ICCV, 2017, pp. 5267-5275.

Hendricks L. A., et al., "Localizing Moments in Video with Temporal Language", EMNLP, 2018.

Simonyan K. and Zisserman A., "Two-stream convolutional networks for action recognition in videos", NIPS, 2014.

Tran D., Bourdev L., Fergus R., Torresani L., and Paluri M., "Learning spatiotemporal features with 3D convolutional networks", ICCV, 2015.

Karpathy A., Toderici G., Shetty S., Leung T., Sukthankar R., and Fei-Fei L., "Large-Scale Video Classification with Convolutional Neural Networks", CVPR, 2014, pp. 1725-1732.

Oneata D., Verbeek J., and Schmid C., "Action and Event Recognition with Fisher Vectors on a Compact Feature Set", ICCV, 2013.

Shou Z., Wang D., and Chang S. F., "Temporal action localization in untrimmed videos via multi-stage CNNs", CVPR, 2016.

Sun C., et al., "Temporal Localization of Fine-Grained Actions in Videos by Domain Transfer from Web Images", MM '15: Proceedings of the 23rd ACM International Conference on Multimedia, 2015, pp. 371-380.

Wu J., Yin B., and Qi W., "Video Motion Segmentation Based on Double Sliding Window", IEEE, November 2011.

Hochreiter S. and Schmidhuber J., "Long Short-Term Memory", Neural Computation, 9(8), November 1997, pp. 1735-1780.

Donahue J., Hendricks L. A., Guadarrama S., Rohrbach M., Venugopalan S., Saenko K., and Darrell T., "Long-term recurrent convolutional networks for visual recognition and description", CVPR, 2015.

Srivastava N., Mansimov E., and Salakhutdinov R., "Unsupervised learning of video representations using LSTMs", ICML, 2015.

Wang H. and Schmid C., "Action Recognition with Improved Trajectories", ICCV, 2013.

Ng J. Y.-H., Hausknecht M., Vijayanarasimhan S., Vinyals O., Monga R., and Toderici G., "Video Understanding with Deep Networks", arXiv, 2015.

Published

18-05-2022

How to Cite

Borikar, D. A., Kumar, S., Kathawate, R. B., Baiswar, S. P., & Roy, S. J. (2022). An approach to temporal concept localization in videos. International Journal of Health Sciences, 6(S1), 10473–10483. https://doi.org/10.53730/ijhs.v6nS1.7524

Section

Peer Review Articles