Correlation based feature fusion for the temporal video scene segmentation task (2018)

  • Authors: KISHI, Rodrigo Mitsuo; TROJAHN, Tiago Henrique; GOULARTE, Rudinei
  • USP affiliated authors: GOULARTE, RUDINEI - ICMC
  • USP Schools: ICMC
  • DOI: 10.1007/s11042-018-6959-4
  • Keywords: Temporal scene segmentation; Early fusion
  • Funding agencies:
  • Language: English
  • Imprint: Amsterdam, Springer, 2018
  • Source: Multimedia Tools and Applications
  • Online access to the document

    DOI information: 10.1007/s11042-018-6959-4 (Source: oaDOI API)
    • This journal is subscription-based
    • This article is NOT open access

    How to cite
    The citation is generated automatically and may not fully comply with the citation standards

    • ABNT

      KISHI, Rodrigo Mitsuo; TROJAHN, Tiago Henrique; GOULARTE, Rudinei. Correlation based feature fusion for the temporal video scene segmentation task. Multimedia Tools and Applications, Amsterdam, Springer, 2018. Disponível em: < > DOI: 10.1007/s11042-018-6959-4.
    • APA

      Kishi, R. M., Trojahn, T. H., & Goularte, R. (2018). Correlation based feature fusion for the temporal video scene segmentation task. Multimedia Tools and Applications. doi:10.1007/s11042-018-6959-4
    • NLM

      Kishi RM, Trojahn TH, Goularte R. Correlation based feature fusion for the temporal video scene segmentation task [Internet]. Multimedia Tools and Applications. 2018 ;Available from:
    • Vancouver

      Kishi RM, Trojahn TH, Goularte R. Correlation based feature fusion for the temporal video scene segmentation task [Internet]. Multimedia Tools and Applications. 2018 ;Available from:

    References cited in the work
    Atrey PK, Hossain MA, El Saddik A, Kankanhalli MS (2010) Multimodal fusion for multimedia analysis: a survey. Multimedia Syst 16(6):345–379.
    Baraldi L, Grana C, Cucchiara R (2015) A deep siamese network for scene detection in broadcast videos. In: Proceedings of the 23rd ACM international conference on multimedia, MM ’15, pp 1199–1202. ACM, New York.
    Baraldi L, Grana C, Cucchiara R (2015) Measuring scene detection performance, pp 395–403, Springer International Publishing, Cham
    BBC: Planet Earth. (2006). [Online; accessed 25-May-2018]
    Bromley J, Guyon I, LeCun Y, Säckinger E, Shah R (1993) Signature verification using a “siamese” time delay neural network. In: Proceedings of the 6th international conference on neural information processing systems, NIPS’93, pp 737–744. Morgan Kaufmann Publishers Inc., San Francisco.
    Chasanis V, Kalogeratos A, Likas A (2009) Movie segmentation into scenes and chapters using locally weighted bag of visual words. In: Proceedings of the ACM international conference on image and video retrieval, CIVR ’09, pp 35:1–35:7. ACM, New York
    Csurka G, Dance CR, Fan L, Willamowski J, Bray C (2004) Visual categorization with bags of keypoints. In: Workshop on statistical learning in computer vision, ECCV, pp 1–22
    Davis SB, Mermelstein P (1980) Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences. IEEE transactions on acoustics, speech and signal processing, pp 357–366
    Del Fabro M, Böszörmenyi L (2013) State-of-the-art and future challenges in video scene detection: a survey. Multimedia Syst 19(5):427–454.
    Ellouze M, Boujemaa N, Alimi AM (2010) Scene pathfinder: unsupervised clustering techniques for movie scenes extraction. Multimedia Tools Appl 47(2):325–346.
    Gao G, Ma H (2012) Multi-modality movie scene detection using kernel canonical correlation analysis. In: 2012 21st International Conference on Pattern recognition (ICPR), pp 3074–3077
    Gauch JM, Gauch S, Bouix S, Zhu X (1999) Real time video scene detection and classification. Inf Process Manag 35(3):381–400
    Haghighat M, Abdel-Mottaleb M, Alhalabi W (2016) Discriminant correlation analysis: Real-time feature level fusion for multimodal biometric recognition. IEEE Trans Inf Forensic Secur 11(9):1984–1996.
    Han B, Wu W (2011) Video scene segmentation using a novel boundary evaluation criterion and dynamic programming. In: 2011 IEEE International conference on multimedia and expo, pp 1–6.
    Hardoon DR, Szedmak S, Shawe-Taylor J (2004) Canonical correlation analysis: an overview with application to learning methods. Neural Comput 16(12):2639–2664.
    Hare J, Samangooei S, Dupplaw D (2011) OpenIMAJ and ImageTerrier: Java libraries and tools for scalable multimedia analysis and indexing of images. In: ACM Multimedia 2011, pp 691–694. ACM.
    Jhuo IH, Ye G, Gao S, Liu D, Jiang YG, Lee DT, Chang SF (2014) Discovering joint audio–visual codewords for video event detection. Mach Vis Appl 25 (1):33–47
    Kender JR, Yeo BL (1998) Video scene segmentation via continuous video coherence. In: Proceedings of the IEEE computer society conference on computer vision and pattern recognition, CVPR ’98, pp 367–. IEEE Computer Society, Washington, DC, USA
    Koprinska I, Carrato S (2001) Temporal video segmentation: a survey. In: Signal processing: image communication, pp 477–500
    Kurcius JJ, Breckon TP (2014) Using compressed audio-visual words for multi-modal scene classification. In: 2014 International workshop on computational intelligence for multimedia understanding (IWCIM), pp 1–5.
    LeCun Y, Bengio Y (1998) The handbook of brain theory and neural networks. MIT Press, Cambridge.
    Lloyd SP (1982) Least squares quantization in PCM. IEEE Trans Inf Theory 28:129–137
    Lopes BL, Trojahn TH, Goularte R (2014) Video scene detection by multimodal bag of features. J Inf Data Manag 5(2):194
    Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60(2):91–110
    Mikolov T, Sutskever I, Chen K, Corrado G, Dean J (2013) Distributed representations of words and phrases and their compositionality. In: Proceedings of the 26th international conference on neural information processing systems - Volume 2, NIPS’13, pp 3111–3119. Curran Associates Inc., USA.
    Rabiner L, Juang BH (1993) Fundamentals of speech recognition. Prentice-Hall, Inc., Upper Saddle River, NJ, USA
    Rao KS, Koolagudi SG (2012) Emotion recognition using speech features. Springer Publishing Company, Incorporated, New York
    Rasheed Z, Shah M (2003) Scene detection in Hollywood movies and TV shows. In: Proceedings of the 2003 IEEE computer society conference on computer vision and pattern recognition, 2003. vol 2, pp II–343–8 vol 2.
    Rasiwasia N, Mahajan D, Mahadevan V, Aggarwal G (2014) Cluster canonical correlation analysis. In: Kaski S, Corander J (eds) Proceedings of the seventeenth international conference on artificial intelligence and statistics, Proceedings of machine learning research, vol 33, pp 823-831. PMLR, Reykjavik, Iceland
    Saraceno C, Leonardi R (1997) Audio as a support to scene change detection and characterization of video sequences. In: 1997 IEEE international conference on acoustics, speech, and signal processing, 1997. ICASSP-97. vol 4, pp 2597–2600 vol 4.
    Sidiropoulos P, Mezaris V, Kompatsiaris I, Meinedo H, Bugalho M, Trancoso I (2011) Temporal video segmentation to scenes using high-level audiovisual features. IEEE Trans Cir Sys Video Technol 21(8):1163–1177.
    Smeulders AWM, Worring M, Santini S, Gupta A, Jain R (2000) Content-based image retrieval at the end of the early years. IEEE Trans Pattern Anal Mach Intell 22(12):1349–1380.
    Snoek CGM, Worring M (2002) A review on multimodal video indexing. In: Proceedings of the 2002 IEEE International Conference on Multimedia and expo, 2002. ICME ’02. vol 2, pp 21–24 vol 2.
    Swain MJ, Ballard DH (1991) Color indexing. Int J Comput Vis 7(1):11–32.
    Vendrig J, Worring M (2002) Systematic evaluation of logical story unit segmentation. IEEE Trans Multimedia 4(4):492–499.
    Wang X, Gao L, Song J, Shen H (2017) Beyond frame-level CNN: saliency-aware 3-D CNN with LSTM for video action recognition. IEEE Sig Process Lett 24(4):510–514.
    Wang X, Gao L, Song J, Zhen X, Sebe N, Shen HT (2018) Deep appearance and motion learning for egocentric activity recognition. Neurocomputing 275:438–447.
    Wang X, Gao L, Wang P, Sun X, Liu X (2018) Two-stream 3-d convnet fusion for action recognition in videos with arbitrary size and length. IEEE Trans Multimedia 20(3):634–644.
    Wu S, Jin M (2015) Study on a new video scene segmentation algorithm. Appl Math Inf Sci 9(1):361–368.
    Xi W, Fox EA, Fan W, Zhang B, Chen Z, Yan J, Zhuang D (2005) Simfusion: Measuring similarity using unified relationship matrix. In: Proceedings of the 28th annual international ACM SIGIR conference on research and development in information retrieval, SIGIR ’05, pp 130–137. ACM, New York.
    Xie L, Shen J, Han J, Zhu L, Shao L (2017) Dynamic multi-view hashing for online image retrieval. In: Proceedings of the 26th international joint conference on artificial intelligence, IJCAI’17, pp 3133–3139. AAAI Press.
    Xie L, Shen J, Zhu L (2016) Online cross-modal hashing for web image retrieval. In: Proceedings of the thirtieth AAAI conference on artificial intelligence, AAAI’16, pp 294–300. AAAI Press.
    Xu S, Feng B, Ding P, Xu B (2012) Graph-based multi-modal scene detection for movie and teleplay. In: 2012 IEEE International Conference On Acoustics, Speech and Signal Processing (ICASSP), pp 1413–1416.
    Xu S, Feng B, Xu B (2013) Temporal video segmentation to scene based on conditional random fileds. In: Li S, El Saddik A, Wang M, Mei T, Sebe N, Yan S, Hong R, Gurrin C (eds) 2013 Proceedings of the 19th international conference on advances in multimedia modeling, MMM 2013, Huangshan, China, January 7-9, Part II, pp 374–384. Springer, Berlin.
    Yeung M, Yeo BL, Liu B (1998) Segmentation of video by clustering and graph analysis. Comput. Vis. Image Underst 71(1):94–109.
    Yu SX, Shi J (2001) Grouping with bias. In: Proceedings of the 14th international conference on neural information processing systems: natural and synthetic, NIPS’01, pp 1327–1334. MIT Press, Cambridge
    Zhu L, Shen J, Xie L, Cheng Z (2017) Unsupervised topic hypergraph hashing for efficient mobile image retrieval. IEEE Trans Cybern 47(11):3941–3954.