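Article Abstract
CHEN Rui(陈瑞)*, TONG Ying*, ZHANG Yiye**, XU Bo**. Video expression recognition based on frame level attention mechanism[J]. High Technology Letters, 2023, 29(2): 130-139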
Video expression recognition based on frame level attention mechanism
  
DOI: 10.3772/j.issn.1006-6748.2023.02.003
Keywords (Chinese):
Keywords (English): facial expression recognition (FER), video sequence, attention mechanism, feature extraction, enhanced feature, VGG network, image classification, neural network
Funding:
Author Name    Affiliation
CHEN Rui (陈瑞)*    College of Information & Communication Engineering, Nanjing Institute of Technology, Nanjing 211167, P. R. China
TONG Ying*    College of Information & Communication Engineering, Nanjing Institute of Technology, Nanjing 211167, P. R. China
ZHANG Yiye**    Jiangsu Future Network Innovation Research Institute, Nanjing 211111, P. R. China
XU Bo**    Jiangsu Future Network Innovation Research Institute, Nanjing 211111, P. R. China
Abstract (Chinese):

Abstract (English):
      Facial expression recognition (FER) in video has attracted increasing interest, and many approaches have been proposed. The crucial problem in classifying a given video sequence into one of several basic emotions is how to fuse the facial features of individual frames. In this paper, a frame-level attention module is integrated into an improved VGG-based framework, and a lightweight facial expression recognition method is proposed. The proposed network takes a sub-video cut from an experimental video sequence as its input and generates a fixed-dimension representation. The VGG-based network with an enhanced branch embeds face images into feature vectors. The frame-level attention module learns weights that are used to adaptively aggregate the feature vectors into a single discriminative video representation. Finally, a regression module outputs the classification results. Experimental results on the CK+ and AFEW databases show that the proposed method achieves state-of-the-art recognition rates.
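
The aggregation step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the form of the attention module, so the code below assumes a generic softmax attention pooling: each frame's feature vector is scored against a hypothetical learned vector `query`, the scores are normalized over the frames, and the weighted sum forms the fixed-dimension video representation. The function name `attention_pool`, the `query` parameter, and the toy 2-dimensional features are illustrative assumptions, not the authors' implementation.

```python
import math


def attention_pool(frame_features, query):
    """Aggregate per-frame feature vectors into one video-level vector.

    frame_features: list of T vectors, each a list of D floats
                    (stand-ins for the per-frame embeddings).
    query: hypothetical learned vector of length D used to score frames
           (an assumption; the abstract does not specify the scoring form).
    """
    # One scalar relevance score per frame: dot product with the query.
    scores = [sum(f * q for f, q in zip(frame, query)) for frame in frame_features]

    # Softmax over frames, shifted by the max score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # The weighted sum of frame vectors has dimension D regardless of T.
    dim = len(frame_features[0])
    video = [
        sum(w * frame[d] for w, frame in zip(weights, frame_features))
        for d in range(dim)
    ]
    return video, weights


# Toy input: three frames with 2-dimensional features.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.5, 0.5]
video_vec, frame_weights = attention_pool(frames, query)
```

Because the weights sum to one over the frames, the output keeps the dimension of a single frame vector however many frames the sub-video contains, which is what allows a downstream classifier to take a fixed-size input.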