ISSN 1671-3710
CN 11-4766/R

Advances in Psychological Science ›› 2024, Vol. 32 ›› Issue (1): 1-13. doi: 10.3724/SP.J.1042.2024.00001

• Conceptual Framework •



Cross-modal analysis of facial EMG in micro-expressions and data annotation algorithm

WANG Su-Jing1,2(), WANG Yan1,2, LI Jingting1,2, DONG Zizhao1,2, ZHANG Jianhang3, LIU Ye2,4   

  1. 1CAS Key Laboratory of Behavioral Science, Institute of Psychology, Beijing 100101, China
    2Department of Psychology, University of Chinese Academy of Sciences, Beijing 100049, China
    3School of Computer Science, Jiangsu University of Science and Technology, Zhenjiang 212003, China
    4State Key Laboratory of Brain and Cognitive Science, Institute of Psychology, Chinese Academy of Sciences, Beijing 100039, China
  • Received:2023-06-25 Online:2024-01-15 Published:2023-10-25
  • Contact: WANG Su-Jing


For a long time, the small-sample problem has constrained the development of micro-expression analysis, and this problem ultimately stems from the great difficulty of annotating micro-expression data. Using facial electromyography (EMG) as a technical means, this research proposes solutions along three lines: automatic annotation, semi-automatic annotation, and annotation-free learning for micro-expression data. For automatic annotation, we propose a scheme based on distal facial EMG; for semi-automatic annotation, we propose automatic labeling of the onset and offset frames of a micro-expression based on a single manually annotated frame; for annotation-free learning, we propose a cross-modal self-supervised learning algorithm based on EMG signals. In addition, this research aims to use the EMG modality to extend the study of mechanistic characteristics of micro-expressions, such as their presentation time and amplitude.



Micro-expression analysis combined with deep learning has become a major trend. However, the small-sample-size problem has long hindered the further development of deep-learning-based micro-expression analysis. Micro-expressions are brief, subtle facial expressions, so annotating micro-expression data is extremely costly in both time and labor, which is the root cause of the small-sample problem. To further improve the performance of micro-expression spotting and recognition, a large number of micro-expression samples is still needed to train deep learning models, so the field urgently needs a solution to the annotation problem. To address this issue, our research uses facial electromyographic (EMG) signals as a technical means and proposes a set of solutions to micro-expression annotation from three aspects: automatic annotation, semi-automatic annotation, and unsupervised annotation of micro-expression data.

First, we use methods from physiological psychology, combining facial EMG signals with behavioral experiments from cognitive psychology, to explore the physiological characteristics of micro-expressions. In this study, we record the frequency and amplitude of the EMG signal during the contraction of facial muscles or muscle groups, and use the relevant EMG metrics to quantify, accurately and objectively, three defining features of micro-expressions: short presentation time, small movement amplitude, and asymmetry. These measurements provide a theoretical basis for the subsequent research on annotation and intelligent analysis of micro-expressions.
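As an illustration of how such EMG metrics might be computed, the sketch below extracts the duration and RMS amplitude of a supra-threshold activity burst from a single EMG channel. The rectification threshold, sampling rate, and function name are assumptions for illustration, not parameters taken from the study.

```python
import numpy as np

def emg_burst_metrics(emg, fs, threshold):
    """Duration (s) and RMS amplitude of the supra-threshold burst in one EMG channel.

    emg       : 1-D array of EMG samples
    fs        : sampling rate in Hz
    threshold : rectified-amplitude threshold marking muscle activity
    """
    rectified = np.abs(emg)                 # full-wave rectification
    active = rectified > threshold
    if not active.any():
        return 0.0, 0.0                     # no detectable burst
    idx = np.flatnonzero(active)
    onset, offset = idx[0], idx[-1]         # first and last supra-threshold samples
    duration = (offset - onset + 1) / fs    # presentation time in seconds
    rms = np.sqrt(np.mean(emg[onset:offset + 1] ** 2))  # movement amplitude proxy
    return duration, rms
```

Asymmetry could then be quantified by comparing such metrics between electrodes on the left and right sides of the face.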

Second, for automatic annotation, this study proposes an automatic annotation scheme for micro-expressions based on distal facial electromyography. Specifically, we place EMG electrodes around the face so that they do not obscure the facial expression being produced, which allows micro-expression data to be annotated automatically by combining the EMG information with the video. Meanwhile, we design a psychological paradigm for inducing facial muscle movements and, based on the EMG signal pattern of micro-expressions, develop an algorithm for automatic micro-expression annotation. Finally, we integrate the automatic annotation pipeline into an interactive annotation software tool, which can greatly reduce the time and workload of micro-expression coding and alleviate, to a certain extent, the small-sample problem of micro-expression databases.
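A minimal sketch of the EMG-to-video mapping idea: supra-threshold EMG activity is converted to onset, apex, and offset frame indices by aligning the EMG and video time axes. The threshold, the sampling rates, and the function name are hypothetical simplifications of the proposed annotation algorithm.

```python
import numpy as np

def annotate_from_emg(emg, fs_emg, fps_video, threshold):
    """Map supra-threshold EMG activity to video frame indices.

    Returns (onset_frame, apex_frame, offset_frame), or None if no activity
    exceeds the threshold. Assumes EMG and video recordings start together.
    """
    rectified = np.abs(emg)
    active = np.flatnonzero(rectified > threshold)
    if active.size == 0:
        return None
    onset_t = active[0] / fs_emg            # seconds from recording start
    offset_t = active[-1] / fs_emg
    apex_t = np.argmax(rectified) / fs_emg  # peak muscle activity
    to_frame = lambda t: int(round(t * fps_video))
    return to_frame(onset_t), to_frame(apex_t), to_frame(offset_t)
```

Because EMG is typically sampled far faster than video (kHz vs. hundreds of fps), the muscle signal can localize the expression more finely than frame-by-frame visual inspection.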

Third, for semi-automatic annotation, we focus on the temporal action localization of micro-expressions (METL), i.e., inferring the onset and offset frames of a micro-expression segment from the manual annotation of a single frame within that segment. In particular, we propose a Micro-Expression Contrastive Identification Annotation (MECIA) method as a solution to METL. The backbone of MECIA is a deep learning network containing three modules: a contrastive module, an identification module, and an annotation module, corresponding to the three steps of manual annotation. The network's outputs infer the temporal localization of micro-expression clips. Experiments show that the inferred micro-expression intervals correspond well to the ground-truth intervals, demonstrating the potential of this approach to improve the efficiency of vision-based micro-expression annotation.
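The single-frame-to-interval step can be illustrated with a toy version of the boundary inference: starting from the one manually annotated frame, neighbouring frames are absorbed into the interval while their feature similarity to the anchor stays high. The cosine-similarity rule and the fixed threshold are stand-ins for the learned contrastive module; this is not the MECIA network itself.

```python
import numpy as np

def expand_interval(features, anchor, sim_threshold):
    """Infer (onset, offset) frame indices around one annotated frame.

    features      : (num_frames, dim) per-frame feature vectors
    anchor        : index of the single manually annotated frame
    sim_threshold : minimum cosine similarity to the anchor to stay in the clip
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sims = f @ f[anchor]                    # cosine similarity to the anchor frame
    onset = anchor
    while onset > 0 and sims[onset - 1] >= sim_threshold:
        onset -= 1                          # grow backward toward the onset
    offset = anchor
    while offset < len(sims) - 1 and sims[offset + 1] >= sim_threshold:
        offset += 1                         # grow forward toward the offset
    return onset, offset
```

In the full method, the contrastive module would learn which frames belong to the same micro-expression instead of relying on a hand-set threshold.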

Fourth, for unsupervised annotation, given the limited number of annotated micro-expression samples, we propose a self-supervised micro-expression analysis algorithm trained on massive unannotated face and expression videos. Specifically, we provide time-domain supervision for unannotated face videos based on the correspondence between facial EMG and facial expressions, and we design a Transformer-based self-supervised model for cross-modal contrastive learning, which uses EMG signals to guide the network toward features that capture the change patterns of micro-expression actions. The introduction of EMG signals helps the contrastive learning model capture weak dynamic facial changes in the time domain and thus strengthens the model's understanding of visual features. In addition, cross-modal learning allows the model to learn more generalized features and enhances the robustness of the system.
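The cross-modal contrastive objective can be sketched as a symmetric InfoNCE loss over paired video and EMG embeddings: within a batch, embeddings from the same clip are positives and all other pairings are negatives. This NumPy sketch stands in for the Transformer encoders; the temperature value is an assumed hyperparameter.

```python
import numpy as np

def cross_modal_info_nce(video_emb, emg_emb, temperature=0.1):
    """Symmetric InfoNCE loss for paired (video, EMG) embeddings.

    Row i of each matrix comes from the same clip, so matching pairs lie
    on the diagonal of the similarity matrix.
    """
    v = video_emb / np.linalg.norm(video_emb, axis=1, keepdims=True)
    e = emg_emb / np.linalg.norm(emg_emb, axis=1, keepdims=True)
    logits = (v @ e.T) / temperature        # cosine similarities, sharpened
    labels = np.arange(len(v))              # positive index for each row

    def ce(l):
        # cross-entropy with the diagonal as the target class
        l = l - l.max(axis=1, keepdims=True)            # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[labels, labels].mean()

    # average over video-to-EMG and EMG-to-video directions
    return 0.5 * (ce(logits) + ce(logits.T))
```

Minimizing this loss pulls each video clip's representation toward the EMG recorded during the same clip, which is how the muscle signal can inject time-domain supervision into otherwise unlabeled video.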

Key words: image annotation, micro-expression analysis, distal facial electromyography, micro-expression data annotation