Neural mechanism of speech imagery
Received: 2022-06-14
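Speech imagery not only plays an important role in the brain's preprocessing mechanisms but is also a current research focus in the field of brain-computer interfaces. Compared with normal speech production, speech imagery shares many similarities in theoretical models, activated brain regions, and neural conduction pathways. However, the neural mechanisms of speech imagery in people with speech disorders, and of imagining meaningful words and sentences, differ from those of normal speech production. Given the complexity of the human speech system, research on the neural mechanisms of speech imagery still faces a series of challenges. Future research could further explore tools for evaluating speech imagery quality and neural decoding paradigms, brain control circuits, activation pathways, speech imagery mechanisms in people with speech disorders, and the neural signals of word and sentence imagery, in order to provide a basis for effectively improving the recognition rate of brain-computer interfaces and to facilitate communication for people with speech disorders.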
Speech imagery is an internal, quasi-perceptual experience of one's own or others' speech. It not only plays an important role in the brain's preprocessing mechanisms but is also a current focus of brain-computer interface (BCI) research. First, the theoretical models, activated brain regions, and neural conduction pathways of speech imagery have much in common with those of speech production, although some points remain controversial. Theoretical models of speech production, such as the Directions Into Velocities of Articulators (DIVA) model and the State Feedback Control (SFC) model, suggest that speech imagery and speech production overlap heavily in the motor planning of the articulators and in the prediction of somatosensory and auditory consequences. The difference is that speech imagery does not engage the final step of speech production, the execution of articulatory movement. The brain regions activated by speech imagery and by speech production are also similar, mainly in regions responsible for speech motor planning and in the auditory centers, but the activation elicited by speech imagery is weaker. The neural pathways of the two are similar as well. However, current research covers only the connections between some brain regions, such as the connection and integration of the auditory and motor cortices. Whether speech imagery involves direct or indirect activation pathways consistent with those of speech production needs further study. Second, in people with speech disorders, the degree of impairment in speech imagery is not perfectly positively correlated with the severity of the speech production impairment. For example, some people with aphasia or stuttering show no limitation in speech imagery ability.
Because the neural mechanisms of speech imagery and speech production are similar, speech imagery therapy is regarded as a new technique for rehabilitating people with speech impairments caused by brain injury. However, research on the mechanisms and applications of speech imagery in dysarthria caused by damage to speech motor planning regions, the auditory cortex, and their neural conduction pathways is still lacking. Third, when meaningful words and sentences are imagined, the EEG signals differ from those during speech production. The complexity, length, and meaning of the speech material affect brain activation during speech imagery. Moreover, the identification of weakly activated brain regions, the stimulation method, and the brain signal detection technique all affect the recognition of neural signals. The neural mechanisms of speech imagery are important both for the brain's preprocessing mechanisms and for BCI technology. Given the complexity of the speech system, research on the neural mechanisms of speech imagery still faces a series of challenges. Further research could focus on tools for evaluating speech imagery quality, neural decoding paradigms, brain control circuits, activation pathways, and speech imagery mechanisms in speech disorders. Further exploration of the neural signals of word and sentence imagery would provide a basis for effectively improving the recognition rate of BCI and would facilitate communication for people with speech disorders.
Cite this article as:
WANG Yongli, GE Shengnan, Lancy Lantin Huang, WAN Qin, LU Haidan.
1 Introduction
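Speech imagery is an internal quasi-perceptual experience of speech events produced by oneself or others (Orpella et al., 2022), a state of inner articulation without any sound or facial movement (Torres-García et al., 2016). Speech imagery is "the little voice in our head", a top-down cognitive process with a long history in science and philosophy (MacKay, 1992; Solms, 2000; Weber & Bach, 1969). Because speech imagery is not contaminated by muscle movement, which makes signal acquisition with functional brain imaging techniques easier, it is regarded as a potentially valuable tool for studying the brain's preprocessing mechanisms (Keller & Mrsic-Flogel, 2018).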
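Research on sensory and motor imagery, including visual imagery (Pearson, 2019), auditory imagery (Gu et al., 2019), tactile imagery (Yoo et al., 2003), olfactory imagery (Djordjevic et al., 2005), and motor imagery (白学军 et al., 2016; Souto et al., 2020), shows that the neural representations of imagery overlap heavily with those of actual perception or execution. Speech imagery, however, comprises both motor imagery of the articulators (imagined speaking) and imagery of speech perception (imagined hearing); it is a special, dual-mechanism form of imagery distinct from purely perceptual or purely motor imagery (Tian et al., 2016). In recent years, neuropsychologists and linguists have explored the neural mechanisms of speech imagery. Some argue that speech imagery activates brain regions similar to those involved in articulatory motor planning during actual speech production (Cooney et al., 2018; Naito et al., 2002; Tian & Poeppel, 2012) and shares features with self-monitoring (Tian & Poeppel, 2015) and with sensory prediction and feedback mechanisms (Hickok et al., 2011; Jones & Fernyhough, 2007) in speech execution; some even propose that speech imagery resembles speech production from the level of high-level linguistic representation onward (Oppenheim & Dell, 2010). In addition, studies of the neural conduction pathways of speech imagery have detected weak surface electromyographic signals from facial muscles (Ma et al., 2019; Orpella et al., 2022), suggesting that the conduction pathways of speech imagery and speech execution are also similar. Based on this similarity in neural mechanisms, speech imagery therapy is regarded as a new technique for the rehabilitation of people with speech disorders caused by brain injury (窦智 et al., 2021; Durand et al., 2018; Laures-Gore et al., 2021). Other studies, however, hold that speech imagery and overt speech production have different neural signal characteristics, so speech imagery cannot simply be treated as a sub-process of speech execution (Proix et al., 2021; Stark et al., 2017). Clarifying the neural mechanisms of speech imagery helps us understand the mechanisms of speech production more clearly (Keller & Mrsic-Flogel, 2018; Orpella et al., 2022) and can also provide more efficient rehabilitation techniques for people with speech disorders. The first main aim of this article is therefore to review the similarities and differences between the neural mechanisms of speech imagery and speech production.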
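The neural mechanisms underlying different kinds of speech imagery content are a technical challenge that current speech imagery research needs to overcome, and work on this question is concentrated in the field of brain-computer interfaces (BCI) (Cooney et al., 2018). A BCI is a communication system that acquires and processes signals of brain activity so that a human or animal brain can control an external device, and it has greatly facilitated communication with the outside world for people with speech or physical disabilities (陈霏 & 潘昌杰, 2020; Cooney et al., 2018). Speech imagery offers better recognition rates and comfort in BCI applications than motor imagery and visual imagery (Nguyen et al., 2017), making it a new type of communicative BCI. BCI research on speech imagery emphasizes accurate decoding of the neural signals of complex imagined speech. Improving the decoding rate requires knowing which activated brain regions have the best decoding power during speech imagery (刘艳鹏 et al., 2022) and using techniques or methods that extract the maximum amount of information (Cooney et al., 2018; Proix et al., 2021). Electrocorticography (ECoG), electroencephalography (EEG) (Cooney et al., 2018), magnetoencephalography (MEG) (Orpella et al., 2022), and functional magnetic resonance imaging (fMRI) have advanced the acquisition and decoding of neural signals during speech imagery. Existing reviews point out that the content of speech imagery may also be one of the main factors affecting its neural signals. Common imagery content includes vowel imagery, syllable imagery, word imagery, and sentence imagery (陈霏 & 潘昌杰, 2020; Cooney et al., 2018); the categorical features of the content and the linguistic information attached to it produce different neural signals. The second main aim of this article is to review the neural signal characteristics associated with different imagery content.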
2 Similarities and differences between the neural mechanisms of speech imagery and speech production
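Speech production is an extremely complex process, and most research on the neural mechanisms of speech imagery uses the neural mechanisms of normal speech production as a reference (Cooney et al., 2018). Researchers have compared the two in terms of theoretical models of speech production, brain activation and neural conduction pathways, and populations with speech disorders.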
2.1 Speech imagery and speech production share similar theoretical models
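Currently popular models of speech production are the DIVA (Directions Into Velocities of Articulators) model (Guenther, 1994, 1995, 2006) and the state feedback control (SFC) model (Heremans et al., 2011; Hickok, 2012; Houde & Nagarajan, 2011). The DIVA model describes human speech production in three components: motor commands from the feedforward system (top-down), error-correction commands from the auditory and somatosensory feedback system (bottom-up), and internal forward models that link the feedforward and feedback systems (Guenther, 1994, 1995, 2006). The internal forward model uses a copy of the motor command to let the central nervous system predict the somatosensory and auditory input it is about to receive, and passes this prediction to the feedback system (蔡笑 & 张清芳, 2020; Guenther & Vladusich, 2012). Many studies indicate that speech imagery partially overlaps with actual speech production, with especially high overlap in the motor planning of the articulators and the prediction of somatosensory and auditory consequences. One study using MEG and fMRI proposed that speech imagery resembles a state of the DIVA model in which both the feedforward and feedback systems are suspended and only the internal forward model is at work (Tian & Poeppel, 2010), that is, the somatosensory and auditory consequences of speech are predicted (Kilteni et al., 2018). This prediction proceeds serially in the order "motor planning → motor efference copy → first forward model, motor estimation (parietal cortex) → perceptual efference copy → second forward model, auditory estimation (sensory cortex)" (Tian & Poeppel, 2010; Tian et al., 2016). The motor hypothesis holds that speech imagery is an attenuated version of actual speech production with a fully specified articulatory plan (Cooney et al., 2018). Behavioral studies likewise show that speech imagery closely resembles speech production and reflects the motor system's prediction of sensory consequences (Scott et al., 2013). The SFC model emphasizes that speech production involves an internal speech-planning feedback loop and an external error-monitoring feedback loop (Hickok et al., 2011; Hickok, 2012; Houde & Nagarajan, 2011). Orpella et al. (2022) used MEG to decode brain activity during speech imagery and detected activity in premotor and temporoparietal regions at the onset of imagery, consistent with the internal speech-planning feedback loop of SFC theory; after 500 ms of imagery, auditory and bilateral motor areas became clearly active, consistent with the external error-monitoring feedback loop of SFC. Notably, these studies emphasize the similarities between speech imagery and speech execution, and none addresses whether speech imagery includes the higher-level cognitive processes of normal speech production, such as conceptualization and speech formulation.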
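All of the above studies indicate that speech imagery resembles or partially overlaps with actual speech production, the difference being that speech imagery does not engage the final step, articulatory motor execution. These theories support the feasibility of applying speech imagery to BCI and provide the rationale for its replacing motor imagery and visual imagery as a main BCI approach (Nguyen et al., 2017). However, few theories currently address the differences between the two. In applications with speech-disordered populations and in BCI research, researchers have proposed, on the basis of behavioral and EEG differences, that the two may involve different processing mechanisms. For example, people with aphasia have marked impairments in actual speech output but retain relatively good speech imagery ability (Fama et al., 2017; Stark et al., 2017). A study comparing the recognition rates of brain signals during speech imagery and speech production found that broadband high-frequency activity (BHA) gave the highest recognition rate during actual speech production, whereas low theta, low beta, and low gamma activity gave the highest rates during speech imagery, even exceeding those for speech execution, indicating that speech execution and speech imagery differ in some processing mechanisms (Proix et al., 2021). One study even suggests that speech imagery is more susceptible than actual speech output to the influence of the left pars opercularis (Geva et al., 2011).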
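The band-wise comparison described above (theta, beta, gamma, and broadband high-frequency activity) rests on estimating how much signal power falls in each frequency band. The following is a minimal sketch of that idea, not the pipeline of Proix et al. (2021): the band edges, the sampling rate `FS`, and the synthetic signal are all illustrative assumptions, and a plain discrete Fourier transform stands in for the spectral estimators used in practice.

```python
import math

# Illustrative band edges in Hz. Conventions vary between studies; these
# are assumptions and NOT the values used by Proix et al. (2021).
BANDS = {
    "theta": (4.0, 8.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 70.0),
    "high_frequency": (70.0, 150.0),
}


def power_spectrum(signal, fs):
    """Return (frequencies, power) from a plain discrete Fourier transform."""
    n = len(signal)
    freqs, power = [], []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(signal))
        freqs.append(k * fs / n)
        power.append((re * re + im * im) / n)
    return freqs, power


def band_powers(signal, fs, bands=BANDS):
    """Sum spectral power inside each half-open band [low, high)."""
    freqs, power = power_spectrum(signal, fs)
    return {
        name: sum(p for f, p in zip(freqs, power) if low <= f < high)
        for name, (low, high) in bands.items()
    }


# Synthetic one-second "recording": a strong 6 Hz rhythm plus a weak 90 Hz one.
FS = 400  # hypothetical sampling rate in Hz
toy_signal = [
    math.sin(2 * math.pi * 6 * t / FS) + 0.2 * math.sin(2 * math.pi * 90 * t / FS)
    for t in range(FS)
]
toy_powers = band_powers(toy_signal, FS)
dominant_band = max(toy_powers, key=toy_powers.get)
```

With real recordings, per-band powers such as `toy_powers` would typically be computed per trial and per channel and then passed to a classifier; the sketch only shows why a low-frequency rhythm and a high-frequency component end up in different features.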
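These controversies may stem from the many challenges that speech imagery research still faces. Methodologically, because speech imagery has no explicit behavioral manifestation, comparative behavioral studies are limited (Alderson-Day & Fernyhough, 2015; Martin et al., 2018); future work could explore behavioral methods for evaluating the quality of speech imagery. In terms of neural signal decoding, some patients with brain injury have damage to the cortical language network, so decoding can rely only on neural signals from the remaining regions (Guenther et al., 2009; Wilson et al., 2020). In some cases these regions may take on compensatory functions, or their anatomical location may make neural signals hard to decode, so that the signals differ from those of healthy populations.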
2.2 Speech imagery and speech production activate similar brain regions
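The brain regions activated by speech imagery are largely similar to those activated by actual speech production, chiefly in regions for articulatory motor planning and in the auditory centers, but the activation elicited by speech imagery is weaker.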
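fMRI, EEG, and MEG studies show that speech imagery activates brain regions similar to those for articulatory motor planning, including Broca's area, the supplementary motor area (SMA), the premotor cortex, the insula, and the cerebellum. Early on, to reduce fMRI artifacts caused by facial muscle movement during articulation, researchers adopted event-related fMRI (Birn et al., 1999; Palmer et al., 2001); the results showed that brain activation when participants imagined saying words resembled activation when they read the words aloud (Palmer et al., 2001). Many subsequent fMRI studies showed that speech imagery activates the posterior left inferior frontal gyrus (Broca's area) (André et al., 2005; Hurlburt et al., 2016; Shergill et al., 2001) as well as the SMA, premotor cortex, and insula (André et al., 2005; Cooney et al., 2018; Shergill et al., 2001). MEG studies also show that speech imagery activates these regions (Tian & Poeppel, 2010, 2012), which are the regions chiefly responsible for articulatory motor planning in normal speech production (Bohland & Guenther, 2006; Guenther et al., 2006; Guenther & Vladusich, 2012; Kearney & Guenther, 2019). In addition, both speech imagery and actual speech production activate the cerebellum (Ackermann & Hertrich, 2003; Naito et al., 2002), which also plays an important role in articulatory motor planning (Kearney & Guenther, 2019; Parrell et al., 2017). These findings can be explained by the explicit articulatory motor plan that the theoretical models posit for speech imagery. Both the "first and second forward models" account and the study by Orpella et al. emphasize that speech imagery involves predicting the somatosensory consequences of articulator movement, and the motor hypothesis and the abstraction hypothesis likewise hold that speech imagery involves the articulatory motor planning stage of speech production.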
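In normal speech production, the temporal cortex is mainly responsible for integrating auditory feedback with motor information (Hickok, 2012). Speech imagery comprises both somatosensory-motor-based "imagined speaking" and memory-retrieval-based "imagined hearing", and it likewise activates the temporal cortex. Riaz et al. (2015) examined EEG during overt vowel articulation and vowel imagery under different data analysis methods and found that the two were highly similar in signals over Wernicke's area. The meta-analysis by Courson and Tremblay (2020) indicated that the posterior middle temporal gyrus in particular is involved in speech imagery. A functional near-infrared spectroscopy (fNIRS) study also showed that the left temporal and temporoparietal cortices are activated when speech imagery is used to operate a BCI (Sereshkeh et al., 2018). Tian and colleagues used MEG and fMRI with a finer-grained experimental design that distinguished articulation imagery (AI) from hearing imagery (HI). Within the frontal-parietal sensorimotor system, AI more strongly activated regions related to articulatory motor planning, and bilateral temporal activity was observed 150-170 ms after parietal activation (Tian & Poeppel, 2012); HI more strongly activated regions related to auditory memory retrieval, such as the middle frontal region, inferior parietal cortex, and intraparietal sulcus (Tian et al., 2016). The bilateral temporal activity patterns in AI and HI resemble the auditory activation evoked by actually hearing sounds (Tian & Poeppel, 2012). The neural representations of AI and HI are thus partly dissociated and partly overlapping, which Tian and colleagues termed the "dual stream prediction model" of speech imagery (Tian & Poeppel, 2012). The activation of auditory cortex by speech imagery is consistent with the auditory estimation of the second forward model in the "first and second forward models" account and with the auditory cortex activation after 500 ms of imagery reported by Orpella et al.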
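Speech imagery can activate regions for articulatory motor planning and the auditory centers, but the activation is weaker than in actual speech production (Alderson-Day & Fernyhough, 2015; Pei et al., 2011; Proix et al., 2021). Jahangiri and Sepulveda (2018) showed that the EEG signals elicited by speech imagery are weaker than those of speech production and insufficient to account for the complexity of speech information. Moreover, speech imagery produces no visible movement of the articulators, and whether it activates the primary motor cortex (M1), which executes movement in speech production, remains highly controversial. Some researchers hold that speech imagery activates M1, but too weakly to cause actual movement (Ehrsson et al., 2003). Others hold that the SMA inhibits M1 during imagery and report no detectable M1 activation (高晴 & 陈华富, 2010; Tak et al., 2015). Several factors may explain the discrepancy. First, whether M1 activation is detected may depend on the detection technique (Martin et al., 2018), since techniques differ in sensitivity; controlled comparisons across techniques are needed. Second, the evidence for M1 inhibition may reflect participants deliberately controlling movement: because they are instructed not to move, they may intentionally suppress motor execution during the experiment, so that M1 is inhibited. Third, speech imagery includes both imagined hearing and imagined speaking; imagining both at once may divide participants' attention and thereby affect M1 activation, so future studies may need rigorous designs that control for attention (Bruno et al., 2018).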
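In summary, speech imagery and actual speech production have much in common in brain activation, including regions for articulatory motor planning and the auditory cortex, so experimental studies of speech imagery and BCI recognition should consider covering these regions. Compared with actual speech production, speech imagery elicits weaker activation, and activation of the primary motor cortex is especially disputed; controlled comparisons across detection techniques and rigorous experimental designs that control extraneous variables may be needed.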
2.3 Speech imagery shares part of the neural conduction pathways of speech production
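In the motor execution stage of speech production, information from the cerebral cortex must be transmitted downward along the corresponding neural pathways to produce coordinated movement of the speech organs (Bohland & Guenther, 2006; 蔡笑 & 张清芳, 2020; Hickok, 2012). Researchers have also explored the neural conduction pathways of speech imagery, mostly by using EEG or MEG to measure functional connectivity between cortical regions (Sandhya et al., 2015), electromyography (EMG) to measure muscle electrical signals (Lebon et al., 2008), and diffusion tensor imaging (DTI) to trace white matter fiber tracts (Li et al., 2018).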
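An MEG study examined the conduction pathways between brain regions during speech imagery. Activity progressed from the left temporal lobe to the parietal lobe, the frontal insular region, and bilateral premotor areas, indicating that speech imagery involves auditory-motor cortical functional connectivity and integration similar to that in speech production (Orpella et al., 2022). Sandhya et al. (2015) used EEG to examine the conduction pathways between brain regions during speech imagery and found interhemispheric interactions between frontal and temporal regions similar to those in speech production: information flow from the left hemisphere was again greater than from the right. Cross-electrode correlation was higher over the left temporal lobe during imagery, whereas during speech execution it was higher over the left frontal lobe. Studies using surface electromyography (sEMG) show that weak electrical signals from facial muscles can be reliably recorded during speech imagery (Ma et al., 2019; Orpella et al., 2022), which seems to indicate that speech imagery has neural conduction pathways similar to those of articulatory motor execution. Some researchers, however, argue that the micro-movements detected by sEMG are a common phenomenon in inner speech, a by-product of motor execution that cannot be fully suppressed (Perrone-Bertolotti et al., 2014), so whether they can be explained by the neural pathways of articulatory execution remains in question. Orpella et al. (2022), in contrast, argue that in speech imagery the articulatory motor plan is also executed but is suppressed on reaching the peripheral neuromuscular system. The neural conduction pathways of speech imagery therefore need further study.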
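The notion of cross-electrode correlation used above can be made concrete with a small sketch. It computes the mean absolute Pearson correlation over all electrode pairs within a region. This is only an illustration of the general idea, not the analysis of Sandhya et al. (2015); the electrode labels and the synthetic time series are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_cross_correlation(channels):
    """Mean absolute Pearson r over all pairs of channels in one region."""
    pairs = list(combinations(channels.values(), 2))
    return sum(abs(pearson(a, b)) for a, b in pairs) / len(pairs)


N = 100  # samples in the toy epoch
t = range(N)

# Hypothetical left temporal electrodes sharing a common 5-cycle rhythm.
left_temporal = {
    "T7": [math.sin(2 * math.pi * 5 * i / N) for i in t],
    "TP7": [0.8 * math.sin(2 * math.pi * 5 * i / N)
            + 0.1 * math.cos(2 * math.pi * 20 * i / N) for i in t],
}
# Hypothetical left frontal electrodes carrying unrelated rhythms.
left_frontal = {
    "F7": [math.sin(2 * math.pi * 7 * i / N) for i in t],
    "F3": [math.cos(2 * math.pi * 11 * i / N) for i in t],
}

temporal_r = mean_cross_correlation(left_temporal)
frontal_r = mean_cross_correlation(left_frontal)
```

In this toy data `temporal_r` is high because both temporal channels follow the same rhythm, while `frontal_r` is near zero. Connectivity studies usually go further and use directed measures to estimate information flow, which a symmetric correlation cannot capture.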
2.4 Neural damage mechanisms of speech imagery in people with speech disorders
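Speech imagery is a cognitive process, and damage to the neuroanatomical structures discussed above (regions for articulatory motor planning, the auditory cortex, and the neural conduction pathways) may impair it. Studies show, however, that in people with speech disorders the degree of speech imagery impairment is not perfectly positively correlated with the degree of speech production impairment.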
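Aphasia is an acquired language disorder usually caused by damage to networks spanning the left temporal, parietal, and frontal lobes (Klingbeil et al., 2019; Schumacher et al., 2019); the damage affects not only the function of the lesion site but also its connections with other brain networks (姚婧璠 et al., 2021). Studies show that speech imagery is impaired to some degree in people with aphasia, especially for imagining homophones or rhyming words, but that much other imagery content poses no difficulty (Fama et al., 2019; Geva et al., 2011; Langland-Hassan et al., 2015). For example, among people with aphasia after damage in the territory of the left middle cerebral artery, those with more severe speech impairment rely more on speech imagery (inner speech) for naming and for written picture description, suggesting that speech imagery may share mechanisms with working memory in speech production (Geva et al., 2011). Clinical intervention studies have used speech imagery to improve verb naming (Durand et al., 2018) and noun naming (Laures-Gore et al., 2021) in people with aphasia, as well as negative psychological states (Barrows et al., 2021), fluency of language expression (Barrows et al., 2021; Haire et al., 2021), and speech fluency (窦智 et al., 2021; 吴金香, 2019). Langland-Hassan et al. (2015), in contrast, found that people with aphasia (mainly Broca's and anomic aphasia) performed better at overt noun naming than at silent naming (speech imagery), with no clear correlation between the two. Fama et al. (2019) found that self-reported speech imagery in people with aphasia was related to aphasia severity, and MRI showed that speech imagery ability was associated with blood-flow-related lesions in ventrolateral brain regions, particularly the sensorimotor cortex. These discrepancies between speech imagery and actual speech output in aphasia may reflect differences in the location and severity of the damaged neuroanatomical structures (Fama et al., 2017); studies of how speech imagery ability and its mechanisms differ across aphasia types and severities are still lacking (Langland-Hassan et al., 2015).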
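The neuropathological mechanism of stuttering is a deficit in functional coupling between Brodmann area 44 and the left parietal lobe (Neef et al., 2016). In a survey, 65% of respondents who stutter reported no stuttering when imagining speech internally (Netsell & Bakker, 2017). Clinical application studies show that imagining humming a melody can promote speech fluency in people who stutter (Neef et al., 2016; Neef et al., 2018), and that imagining fluent speech can increase their reading span (Arongna et al., 2020). Research on how the mechanisms of speech imagery and speech production differ in people who stutter is still lacking, and these application studies do not explain the mechanisms of speech imagery well. Moreover, most of them combine speech imagery with other therapies, and stronger evidence from randomized controlled trials and evidence-based medicine for the efficacy of speech imagery therapy is lacking.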
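Taken together, the findings in speech-disordered populations and the fact that speech imagery involves multiple brain regions and neural conduction pathways suggest that damage to the relevant neural structures may slow or weaken speech imagery; yet because speech imagery is unimpaired in some people with aphasia, focal neural damage does not abolish speech imagery altogether. Imagery can reshape brain function (吴拾瑶 et al., 2021), and speech imagery therapy has already been used for speech disorders in people with aphasia and people who stutter. Speech imagery involves speech motor planning regions, the auditory cortex, and neural conduction pathways, and damage to these regions leads to motor speech disorders; research on the mechanisms and applications of speech imagery in this population is still scarce and merits further exploration.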
3 Neural signals for different speech imagery content
3.1 Neural signals when imagining vowels and syllables
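Vowels and syllables are relatively simple phonetic units in normal language learning. Much early speech imagery research used vowels or meaningless syllables as imagery content and found neural signals similar to those of normal speech production. DaSalla et al. (2009) reported that at the onset of imagery of the vowels /a/ and /u/, a negative-going wave appeared at electrodes C3, Cz, and C4 (international 10-20 system), followed by a positive wave at around 300 ms, and that these waveforms closely resembled event-related potentials (ERPs) during actual speech production. 杨晓芳 and 江铭虎 (2014) found that ERP waveforms during imagery of four vowels and four consonants resembled the time course of intracranial and scalp potentials elicited by actual articulator movement. Orpella et al. (2022) used MEG to examine brain signals during imagery of meaningless syllables (/pa/, /ta/, /ka/) and found little difference from normal speech production. These findings agree with the theories of speech imagery described above and with the activation of speech planning and auditory regions. Because vowels and meaningless syllables involve no semantic processing, the EEG signals during brain activity are less complex (Ramirez-Quintana et al., 2021), which makes the EEG signals of speech imagery similar to those of normal speech production.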
3.2 Neural signals when imagining Chinese words and English words
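When meaningful Chinese or English words are imagined, the neural signals differ to some extent from those of normal speech production. Proix et al. (2021) maximized variability in phonetic representation, semantic category, and number, using words that included concrete verbs, abstract verbs, concrete nouns, and abstract nouns as imagery material. They found clear BHA activation in the sensorimotor and temporal cortices during normal speech production, whereas during speech imagery theta, beta, and gamma activity was prominent and BHA was not. Chengaiyan et al. (2020) studied imagery of consonant-vowel-consonant (CVC) words and found that theta activity in the temporal lobe dominated during speech imagery, whereas gamma activity in the frontal lobe dominated during speech production (articulation). Nguyen et al. (2017) compared brain signals during imagery of long words, short words, and vowels to identify factors affecting the classification of imagined speech, including the time-frequency characteristics of articulation and the complexity of the material. Short and long words showed time-frequency differences in Broca's and Wernicke's areas, and classification performance among short words was similar to that among phonemes, suggesting that classification of imagined speech is driven by the time-frequency characteristics of articulation. Short words were more readily suppressed than long words in the high-frequency band (31-70 Hz) and in Broca's area, but long words yielded higher Kappa coefficients than short words, indicating that more complex words are easier to discriminate from neural signals. 郭苗苗 et al. (2018) examined EEG during imagery of four Chinese characters differing in part of speech and meaning (喝 "drink", 右 "right", 吃 "eat", 冷 "cold") and found different dynamic changes over time in alpha and beta activity, reflecting differences in brain activity when characters with different meanings are imagined. Table 1 summarizes the differences in neural signals across speech imagery content. These differences may arise partly because the complexity and part of speech of the material engage complex language processing, and partly because researchers recorded from different brain regions. For BCI signal acquisition in speech imagery, some researchers cover mainly the left frontal lobe, temporal lobe, and bilateral sensorimotor areas (Jahangiri & Sepulveda, 2017; Pei et al., 2011; Proix et al., 2021; Sereshkeh et al., 2018; Wang et al., 2013), while many others record from the whole brain so as not to miss regions with weaker signals (Dash et al., 2020; 郭苗苗 et al., 2018; Orpella et al., 2022). One study showed that a BCI system using only left-hemisphere channels performs no worse than one using whole-brain channels, and that left-hemisphere channels suffice to extract the EEG features of imagined Chinese character pronunciation (Wang et al., 2013). It is worth noting, however, that most studies focus on strongly activated regions and often do not report results for regions with weaker signals, even though weakly activated regions may also be important for distinguishing semantic processing.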
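Table 1 Differences in neural signals across speech imagery content
| Study | Participants | Comparison | Content | Brain regions covered | Technique | Signal | Results |
|---|---|---|---|---|---|---|---|
| Proix et al. (2021) | 13 patients with epilepsy | Speech imagery vs. speech production (articulation) | 6 words: spoon, cowboy, battlefield, swimming, python, telephone | Sensorimotor cortex, superior temporal gyrus, middle temporal gyrus, inferior temporal gyrus, inferior frontal lobe | ECoG (invasive) | BHA, theta, beta, gamma | ① During speech imagery, BHA was not significant, whereas theta, beta, and gamma activity was significant; ② during speech execution, the regions above showed significant activity. |
| Chengaiyan et al. (2020) | 6 healthy adults | Speech imagery vs. speech production (articulation) | 50 CVC (consonant-vowel-consonant) words containing /a/, /i/, /u/, /e/, /o/: can, car, cat, bad, dad, gas, lab, man, rat, tap; did, fit, kit, lip, pig, pin, rip, sim, sit, zip; bun, bus, cup, gum, hug, hut, jug, pip, sum, sun; bed, den, hen, her, jet, led, let, net, red, vex; box, cop, dog, fog, jog, lot, not, pot, rod, sob. | Whole brain (frontal, temporal, parietal, occipital lobes) | EEG | Alpha, delta, theta, beta, gamma | ① During speech imagery, theta activity in the temporal lobe dominated; ② during speech production (articulation), gamma activity in the frontal lobe dominated. |
| Nguyen et al. (2017) | 15 healthy adults | Long words, short words, and vowels during speech imagery | Long words: cooperate, independent; short words: in, out, up; vowels: /a/, /i/, /u/ | Whole brain | EEG | EEG rhythms | Brain activity was concentrated in the left prefrontal region above Broca's area, the middle frontal and parietal lobes, the motor cortex, and Wernicke's area. During imagery, short and long words showed time-frequency differences in Broca's and Wernicke's areas; short words were more readily suppressed than long words in the high-frequency band (31-70 Hz) and in Broca's area. |
| 郭苗苗 et al. (2018) | 9 healthy adults | Speech imagery vs. idle period | 4 characters: 喝, 右, 吃, 冷 | Whole brain | EEG | Alpha, delta, theta, beta, gamma | Imagining 喝: EEG energy at electrodes F5 and F6 in the 9-16 Hz range increased markedly relative to baseline from 500 ms to 2100 ms. Imagining 右: energy in the 8-14 Hz range increased markedly relative to baseline after 1500 ms. Imagining 吃: energy in the 8-14 Hz range decreased markedly relative to baseline from 300 to 1800 ms. Imagining 冷: energy in the 6-13 Hz and 16-22 Hz ranges decreased markedly relative to baseline from 300 to 2000 ms. |
| Orpella et al. (2022) | 21 healthy adults | Speech imagery vs. reading | 3 syllables: /pa/, /ta/, /ka/ | Whole brain | MEG, facial sEMG | Neuromagnetic signals | ① In the initial stage of visual information processing, the two conditions share the occipital visual cortex; ② during speech imagery, activation unfolds as visual area activity in the first 120 ms, followed at 180 ms by activity mainly in the (left) lateral temporal cortex, consistent in location and timing with the expected phonological encoding. At 260-300 ms activity reaches the parietal lobe (auditory memory cortex), the insula, and bilateral premotor areas, reflecting auditory-motor integration and articulatory motor planning. After 440 ms there is widespread left auditory cortex activity. Reading is dominated by decoding of visual information; ③ weak facial muscle electrical signals appear during both speech imagery and reading, with no difference in facial signals among the syllables /pa/, /ta/, and /ka/. |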
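The activation patterns associated with meaningful words show that the brain processing underlying speech imagery is highly complex. The goal of BCI is to recognize imagined speech as accurately as possible, and weakly activated brain regions may also play an important role in BCI recognition. The stimulation mode also affects the acquisition of neural signals: Zhang et al. (2020) studied imagery of the four Mandarin tones and found that the decoding rate with combined audiovisual stimulation (80.1%) was higher than with visual stimulation alone (67.7%). Future research could compare the brain regions used for feature extraction and the stimulation modes.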
3.3 Neural signals when imagining sentences
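Sentences are the form of language most used in everyday communication, so decoding sentences has far-reaching significance. Research on the neural mechanisms of sentence imagery is still scarce and mostly reports recognition rates of brain signals during imagery. Dash et al. (2020) used MEG to examine recognition of brain signals during imagery of five simple sentences (for example, "How are you?" and "I need help.") and reported a decoding accuracy of up to 93% with convolutional neural networks (CNN). Lee et al. (2019) used whole-brain EEG to recognize imagined sentences with an accuracy of 34.2%. Neither study reported the activity of specific brain regions. The difference between them may be due to differences in signal acquisition and decoding techniques. MEG is most sensitive to signals from cortex lying perpendicular to the scalp surface (sulci), whereas EEG is sensitive to signals from cortex lying parallel to it (gyri); MEG also offers a good balance of temporal and spatial resolution. Both properties help in acquiring and decoding neural signals (Fyshe, 2020; Orpella et al., 2022). In terms of decoding, Lee et al. used common spatial pattern feature extraction with linear discriminant analysis classification and reported accuracy for a 13-class sentence paradigm (Lee et al., 2019), whereas Dash et al. used wavelet-transform feature extraction and a CNN for a 5-class paradigm (Dash et al., 2020); 13-class classification is more complex than 5-class classification. Because MEG is expensive and not portable, however, ECoG and EEG remain the main techniques for speech imagery research, although the experimental paradigms and decoding methods of MEG studies are worth borrowing.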
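Because the two paradigms differ in the number of classes, the raw accuracies above are measured against different chance levels. The sketch below rescales each accuracy so that 0 means chance and 1 means perfect, in the spirit of a kappa coefficient. It assumes balanced classes (chance = 1/number of classes), which the source does not state, and it is not a substitute for a proper comparison: the two studies also differ in recording technique, decoder, and participants.

```python
def chance_level(n_classes):
    """Expected accuracy of random guessing, assuming balanced classes."""
    return 1.0 / n_classes


def chance_corrected(accuracy, n_classes):
    """Kappa-style score: 0 at chance level, 1 at perfect accuracy."""
    p_chance = chance_level(n_classes)
    return (accuracy - p_chance) / (1.0 - p_chance)


# Accuracies as reported in the text; class counts from the same passage.
reported = {
    "Lee et al. (2019), EEG, 13 classes": (0.342, 13),
    "Dash et al. (2020), MEG, 5 classes": (0.93, 5),
}

for label, (accuracy, n_classes) in reported.items():
    score = chance_corrected(accuracy, n_classes)
    print(f"{label}: chance={chance_level(n_classes):.3f}, corrected={score:.3f}")
```

Under these assumptions the 13-class result sits well above its chance level of about 7.7%, so the gap between the two studies is smaller after correction than the raw percentages suggest, although it remains large.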
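In summary, brain activity during imagery of vowels and syllables resembles that of actual speech production, whereas imagery of meaningful words and sentences produces EEG signals that differ from those of actual speech production; the complexity of the material, word length, and meaning affect brain activity during imagery. In addition, the identification of weakly activated regions, the stimulation mode, and the brain signal detection technique all affect the recognition of neural signals.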
4 Summary and outlook
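This article has reviewed the neural mechanisms of speech imagery from two perspectives: the similarities and differences between the neural mechanisms of speech imagery and speech production, and the neural signal characteristics of different imagery content. Speech imagery and actual speech production are similar in many respects, including theoretical models, activated brain regions, and neural conduction pathways. Nevertheless, research on the neural mechanisms of speech imagery still faces many challenges. In BCI research, decoding rates for vowel and syllable imagery are relatively high, whereas research on words, consonants, and sentences is still at an early stage; brain activity when imagining meaningful words differs from that of actual speech production, and processing also differs when words of different lengths and meanings are imagined. In addition, damage to the relevant neural structures affects speech imagery ability to some extent, and speech imagery interventions can improve patients' speech function. Although research on the neural mechanisms of speech imagery has made notable progress, the following directions merit further exploration.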
4.1 Tools for evaluating speech imagery quality and neural decoding paradigms
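Given the shortcomings of research on theoretical models of speech imagery described above, tools for evaluating speech imagery quality and neural decoding paradigms need further exploration. Because speech imagery has no explicit behavioral manifestation, future studies could assess the quality of participants' imagery with imagery quality questionnaires or transcription of imagined content. Various imagery quality questionnaires are already widely used in basic and behavioral research on motor imagery (白学军 et al., 2016; 刘华 et al., 2017), with an emphasis on rating the vividness of imagined actions. Speech imagery includes both imagined hearing and imagined speaking and also involves semantic content, so a quality questionnaire for speech imagery may need to cover all of these aspects. In research with neurologically impaired populations, tools that allow participants to perform speech imagery successfully need to be provided. In neural signal decoding, existing research has mostly focused on strongly activated functional regions and overlooked activity in weakly activated regions, whose decoding might contribute to explaining how speech imagery is processed. Drawing on theories of neural plasticity, decoding could also be extended to regions that may compensate after brain injury (Wilson et al., 2020). The main goal of BCI is to improve communication with and control of the outside world for patients after neural injury, and attention to decoding from compensatory regions during patients' speech imagery would benefit BCI research. For example, improved decoding algorithms could be used to decode neural signals from the temporo-parieto-occipital junction and the anterior ventral temporal region in people with aphasia (Gajardo-Vidal et al., 2021); although these regions are not clearly activated in healthy people, decoding them could help improve BCI decoding accuracy for people with aphasia.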
4.2 Extending research on the neural conduction pathways of speech imagery to brain control circuits and activation pathways
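The neural conduction pathways of speech production comprise the control circuits, the direct activation pathway, the indirect activation pathway, and the final common pathway of the peripheral nerves (Duffy, 2013), whereas current work on the conduction pathways of speech imagery covers only connections between the frontal and temporal lobes and facial EMG. Studies using DTI to trace white matter fiber tracts have shown that fractional anisotropy (FA) of white matter tracts in the affected hemisphere increases in stroke patients after motor imagery therapy, indicating that motor imagery can promote recovery of damaged white matter fibers (杨帆 et al., 2017; Li et al., 2018). Future studies could combine DTI with sound experimental designs to explore the brain control circuits and activation pathways involved in speech imagery. Although current BCI technology involves only the recognition of cortical signals, research on the conduction pathways of speech imagery may open new avenues for BCI. In addition, speech imagery elicits weak activation, yet weakly activated regions may also influence BCI recognition. In particular, whether and how strongly the primary motor cortex is activated remains disputed, possibly because studies use different functional imaging techniques; controlled comparisons across techniques and rigorous experimental designs that control extraneous variables may be needed. Although MEG outperforms EEG in recognizing neural signals of speech imagery (Dash et al., 2020), it is too expensive for large-scale use. Functional near-infrared spectroscopy (fNIRS), a noninvasive technique that has become popular in recent years, is less sensitive to movement than fMRI, PET, and EEG and offers relatively high temporal resolution (安娟 & 牟海荣, 2019; 白学军 et al., 2016; 高晨阳 et al., 2022), and could be considered as a detection technique for speech imagery research.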
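For readers unfamiliar with the DTI measure mentioned above, fractional anisotropy is computed per voxel from the three eigenvalues of the diffusion tensor and ranges from 0 (diffusion equal in all directions) to 1 (diffusion along a single axis). The sketch below implements the standard textbook definition; the example eigenvalues are hypothetical and are not taken from any of the cited studies.

```python
import math


def fractional_anisotropy(l1, l2, l3):
    """Standard FA from the three diffusion tensor eigenvalues.

    FA = sqrt(3/2) * sqrt(sum((l_i - mean)^2)) / sqrt(sum(l_i^2))
    """
    mean = (l1 + l2 + l3) / 3.0
    numerator = (l1 - mean) ** 2 + (l2 - mean) ** 2 + (l3 - mean) ** 2
    denominator = l1 ** 2 + l2 ** 2 + l3 ** 2
    if denominator == 0:
        return 0.0  # no diffusion at all: define FA as 0
    return math.sqrt(1.5 * numerator / denominator)


# Hypothetical eigenvalues (in 10^-3 mm^2/s) for a strongly directional voxel,
# as might be found in a coherent white matter tract, and for an isotropic one.
fa_directional = fractional_anisotropy(1.7, 0.3, 0.2)
fa_isotropic = fractional_anisotropy(0.8, 0.8, 0.8)
```

A rise in FA after therapy, as reported in the motor imagery studies, therefore indicates that diffusion in the measured tract has become more directional, which is commonly read as improved fiber integrity.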
4.3 Speech imagery mechanisms in people with speech disorders
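Most current findings on the neural mechanisms of speech imagery come from healthy adults and cannot account for speech imagery in people with speech disorders. As noted above, brain activity during speech imagery differs from that during speech production in different types of aphasia, and speech imagery and speech production differ behaviorally in people who stutter; both suggest that speech imagery mechanisms in people with neural damage differ from those in healthy people, yet research on these mechanisms in disordered populations remains scarce. Besides aphasia and stuttering, motor speech disorders are another common group of speech disorders caused by neural damage, but few studies have addressed them. Motor speech disorders result from neural damage that disrupts the planning, programming, and execution of speech production (Duffy, 2013), impairing the muscle control required for respiration, phonation, resonance, articulation, and speech sound production and reducing speech intelligibility. They commonly accompany neurological conditions such as stroke, Parkinson's disease, traumatic brain injury, cerebral palsy, amyotrophic lateral sclerosis, and myasthenia gravis. These patients are one of the main target groups for BCI technology. Because earlier findings indicate that speech imagery is partly preserved after focal brain damage, future work could explore speech imagery mechanisms in specific speech-disordered groups and thereby provide theoretical guidance for speech imagery BCI research. Future work could also conduct randomized controlled studies of speech imagery in these groups and compare changes in brain activation and neural conduction pathways before and after treatment, providing theoretical and practical support for new rehabilitation techniques for speech disorders.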
4.4 Feature extraction for the neural signals of imagined meaningful words and sentences
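Research on imagery of meaningful speech has focused mainly on BCI interaction accuracy. Speech imagery is affected by the complexity and meaning of the imagined content (Nguyen et al., 2017), and research on the brain mechanisms of word and sentence imagery is still at an early stage, for several reasons. First, feature extraction techniques for neural signals need further development in the information technology field. Once neural signals of speech imagery have been recorded with EEG or fNIRS, the core of BCI technology is data processing: feature extraction, classification, and decoding (刘艳鹏 et al., 2022). Algorithms for two-class and three-class vowel imagery currently achieve relatively high accuracy, but few algorithms target meaningful words or sentences, and multi-class accuracy is low (陈霏 & 潘昌杰, 2020); advances in these techniques will undoubtedly promote research on complex speech imagery. Second, the stimulation mode affects the imagery of semantically complex speech. Current stimulation is mostly auditory or visual; because the processing of complex meaning is itself complex, future studies could add other modalities such as touch (Yoo et al., 2003), smell (Djordjevic et al., 2005), and taste to facilitate extraction of brain signals for complex imagery content. The memory processing theories relevant to speech imagery require that participants have experience of the correct content to be imagined. People with congenital neurogenic speech disorders such as cerebral palsy have no experience of correct articulation and may need clear, explicit demonstrations of movements or speech sounds to help them perform speech imagery (Lust et al., 2016). This may require first building dynamic articulation models in the sagittal and coronal planes to show patients the correct articulatory movements so that they can imagine them.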
5 Conclusion
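The neural mechanisms of speech imagery are important both for the brain's preprocessing mechanisms and for BCI technology. Speech imagery has much in common with normal speech production in theoretical models, activated brain regions, and neural conduction pathways, and the mechanisms of imagining simple vowels and meaningless syllables in particular have advanced BCI technology. Speech imagery of complex content and in people with neural damage differs considerably from normal speech production, but current work in this direction is concentrated in the BCI field, with little research from a neuropsychological perspective. Overall, research on the neural mechanisms of speech imagery has made some progress. Future research should further explore tools for evaluating speech imagery quality and neural decoding paradigms, brain control circuits, activation pathways, the neural signals of word and sentence imagery, and speech imagery mechanisms in people with speech disorders, to provide a basis for effectively improving BCI recognition rates and to facilitate communication for people with speech disorders.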
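References
Reliability of the Chinese version of the Kinesthetic and Visual Imagery Questionnaire in stroke patients
Effects of motor imagery therapy on motor function rehabilitation in stroke patients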
DOI:10.3969/j.issn.1006-9771.2017.09.019
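Objective: To investigate the efficacy of motor imagery therapy for motor function rehabilitation in stroke patients. Methods: From May 2015 to October 2016, 40 stroke patients with hemiplegia were randomly assigned to a control group (conventional rehabilitation) or a motor imagery group (motor imagery plus conventional rehabilitation), with 20 patients in each group. Before treatment and after 6 weeks of treatment, patients were assessed with the Fugl-Meyer Assessment (FMA) and the modified Barthel Index (MBI), and fractional anisotropy (FA) was measured with magnetic resonance diffusion tensor imaging (DTI). Results: After treatment, FMA and MBI scores in both groups were significantly higher than before treatment (t>5.088, P<0.001), and the motor imagery group scored better than the control group (t>2.124, P<0.05). Before treatment, FA on the lesioned side was significantly lower than on the contralateral side in both groups (t>3.892, P<0.01), with no significant difference between groups (t<1.144, P>0.05); after treatment, FA increased in more patients in the motor imagery group (5/5 vs. 2/4). Conclusion: Motor imagery therapy can promote recovery of motor function and activities of daily living in stroke patients during the recovery phase and may help the recovery of damaged white matter fibers.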
Inner speech: Development, cognitive functions, phenomenology, and neurobiology
DOI:10.1037/bul0000021
PMID:26011789
Inner speech-also known as covert speech or verbal thinking-has been implicated in theories of cognitive development, speech monitoring, executive function, and psychopathology. Despite a growing body of knowledge on its phenomenology, development, and function, approaches to the scientific study of inner speech have remained diffuse and largely unintegrated. This review examines prominent theoretical approaches to inner speech and methodological challenges in its study, before reviewing current evidence on inner speech in children and adults from both typical and atypical populations. We conclude by considering prospects for an integrated cognitive science of inner speech, and present a multicomponent model of the phenomenon informed by developmental, cognitive, and psycholinguistic considerations. Despite its variability among individuals and across the life span, inner speech appears to perform significant functions in human cognition, which in some cases reflect its developmental origins and its sharing of resources with other cognitive processes.(c) 2015 APA, all rights reserved).
The functional neuroanatomy of metrical stress evaluation of perceived and imagined spoken words
We hypothesized that areas in the temporal lobe that have been implicated in the phonological processing of spoken words would also be activated during the generation and phonological processing of imagined speech. We tested this hypothesis using functional magnetic resonance imaging during a behaviorally controlled task of metrical stress evaluation. Subjects were presented with bisyllabic words and had to determine the alternation of strong and weak syllables. Thus, they were required to discriminate between weak-initial words and strong-initial words. In one condition, the stimuli were presented auditorily to the subjects (by headphones). In the other condition the stimuli were presented visually on a screen and subjects were asked to imagine hearing the word. Results showed activation of the supplementary motor area, inferior frontal gyrus (Broca's area) and insula in both conditions. In the superior temporal gyrus (STG) and in the superior temporal sulcus (STS) strong activation was observed during the auditory (perceptual) condition. However, a region located in the posterior part of the STS/STG also responded during the imagery condition. No activation of this same region of the STS was observed during a control condition which also involved processing of visually presented words, but which required a semantic decision from the subject. We suggest that processing of metrical stress, with or without auditory input, relies in part on cortical interface systems located in the posterior part of STS/STG. These results corroborate behavioral evidence regarding phonological loop involvement in auditory-verbal imagery.
Disfluencies and strategies used by people who stutter during a working memory task
Assessing self-reported mood in aphasia following stroke: Challenges, innovations and future directions
DOI:10.1016/j.jstrokecerebrovasdis.2020.105425
Event-related fMRI of tasks involving brief motion
The assessment of brain function by blood oxygenation level dependent (BOLD) functional magnetic resonance imaging (fMRI) for tasks involving motion near the field of view is compromised by artifacts arising from the motion. The aim of this study is to demonstrate that these artifacts can be reduced by acquiring the average response from a brief stimulus (a "single-trial," or "event-related," paradigm) as opposed to alternating blocks of repeated task with rest (a "block-trial" paradigm). The basis of this technique is that the NMR signal changes from neuronal activation are delayed relative to the motion due to a slow hemodynamic response. By acquiring the average response from a brief stimulus, motion-induced signal changes occur prior to neuronal activation-induced signal changes, and the two can thus be distinguished. This technique is applied to the tasks of speaking out loud, swallowing, jaw clenching, and tongue movement. Functional activation maps derived from the single-trial paradigm contain significantly less artifact than functional activation maps derived from a more traditional block-trial paradigm.
An fMRI investigation of syllable sequence production
DOI:10.1016/j.neuroimage.2006.04.173
PMID:16730195
Fluent speech comprises sequences that are composed from a finite alphabet of learned words, syllables, and phonemes. The sequencing of discrete motor behaviors has received much attention in the motor control literature, but relatively little has been focused directly on speech production. In this paper, we investigate the cortical and subcortical regions involved in organizing and enacting sequences of simple speech sounds. Sparse event-triggered functional magnetic resonance imaging (fMRI) was used to measure responses to preparation and overt production of non-lexical three-syllable utterances, parameterized by two factors: syllable complexity and sequence complexity. The comparison of overt production trials to preparation only trials revealed a network related to the initiation of a speech plan, control of the articulators, and to hearing one's own voice. This network included the primary motor and somatosensory cortices, auditory cortical areas, supplementary motor area (SMA), the precentral gyrus of the insula, and portions of the thalamus, basal ganglia, and cerebellum. Additional stimulus complexity led to increased engagement of the basic speech network and recruitment of additional areas known to be involved in sequencing non-speech motor acts. In particular, the left hemisphere inferior frontal sulcus and posterior parietal cortex, and bilateral regions at the junction of the anterior insula and frontal operculum, the SMA and pre-SMA, the basal ganglia, anterior thalamus, and the cerebellum showed increased activity for more complex stimuli. We hypothesize mechanistic roles for the extended speech production network in the organization and execution of sequences of speech sounds.
Inhibition or facilitation? Modulation of corticospinal excitability during motor imagery
DOI:S0028-3932(18)30077-0
PMID:29462639
Motor imagery (MI) is the mental simulation of an action without any overt movement. Functional evidences show that brain activity during MI and motor execution (ME) largely overlaps. However, the role of the primary motor cortex (M1) during MI is controversial. Effective connectivity techniques show a facilitation on M1 during ME and an inhibition during MI, depending on whether an action should be performed or suppressed. Conversely, Transcranial Magnetic Stimulation (TMS) studies report facilitatory effects during both ME and MI. The present TMS study shed light on MI mechanisms, by manipulating the instructions given to the participants. In both Experimental and Control groups, participants were asked to mentally simulate a finger-thumb opposition task, but only the Experimental group received the explicit instruction to avoid any unwanted fingers movements. The amplitude of motor evoked potentials (MEPs) to TMS during MI was compared between the two groups. If the M1 facilitation actually pertains to MI per se, we should have expected to find it, irrespective of the instructions. Contrariwise, we found opposite results, showing facilitatory effects (increased MEPs amplitude) in the Control group and inhibitory effects (decreased MEPs amplitude) in the Experimental group. Control experiments demonstrated that the inhibitory effect was specific for the M1 contralateral to the hand performing the MI task and that the given instructions did not compromise the subjects' MI abilities. The present findings suggest a crucial role of motor inhibition when a "pure" MI task is performed and the subjects are explicitly instructed to avoid overt movements.Copyright © 2018 Elsevier Ltd. All rights reserved.
Identification of vowels in consonant-vowel-consonant words from speech imagery based EEG signals
DOI:10.1007/s11571-019-09558-5
PMID:32015764
Retrieval of unintelligible speech is a basic need for speech impaired and is under research for several decades. But retrieval of random words from thoughts needs a substantial and consistent approach. This work focuses on the preliminary steps of retrieving vowels from Electroencephalography (EEG) signals acquired while speaking and imagining of speaking a consonant-vowel-consonant (CVC) word. The process, referred to as Speech imagery is imagining of speaking to oneself silently in the mind. Speech imagery is a form of mental imagery. Brain connectivity estimators such as EEG coherence, Partial Directed Coherence, Directed Transfer Function and Transfer Entropy have been used to estimate the concurrency and causal dependence (direction and strength) between different brain regions. From brain connectivity results it has been observed that the left frontal and left temporal electrodes were activated for speech and speech imagery processes. These brain connectivity estimators have been used for training Recurrent Neural Networks (RNN) and Deep Belief Networks (DBN) for identifying the vowel from the subject's thought. Though the accuracy level was found to be varying for each vowel while speaking and imagining of speaking the CVC word, the overall classification accuracy was found to be 72% while using RNN whereas a classification accuracy of 80% was observed while using DBN. DBN was found to outperform RNN in both the speech and speech imagery processes. Thus, the combination of brain connectivity estimators and deep learning techniques appear to be effective in identifying the vowel from EEG signals of subjects' thought.© Springer Nature B.V. 2019.
Neurolinguistics research advancing development of a direct-speech brain- computer interface
DOI:10.1016/j.isci.2018.09.016
Neural correlates of manual action language: Comparative review, ALE meta- analysis and ROI meta-analysis
DOI:10.1016/j.neubiorev.2020.06.025
Single-trial classification of vowel speech imagery using common spatial patterns
DOI:10.1016/j.neunet.2009.05.008
PMID:19497710
[Cited in this article: 1]
With the goal of providing a speech prosthesis for individuals with severe communication impairments, we propose a control scheme for brain-computer interfaces using vowel speech imagery. Electroencephalography was recorded in three healthy subjects for three tasks, imaginary speech of the English vowels /a/ and /u/, and a no action state as control. Trial averages revealed readiness potentials at 200 ms after stimulus and speech related potentials peaking after 350 ms. Spatial filters optimized for task discrimination were designed using the common spatial patterns method, and the resultant feature vectors were classified using a nonlinear support vector machine. Overall classification accuracies ranged from 68% to 78%. Results indicate significant potential for the use of vowel speech imagery as a speech prosthesis controller.
Decoding imagined and spoken phrases from non-invasive neural (MEG) signals
DOI:10.3389/fnins.2020.00290
PMID:32317917
[Cited in this article: 4]
Speech production is a hierarchical mechanism involving the synchronization of the brain and the oral articulators, where the intention of linguistic concepts is transformed into meaningful sounds. Individuals with locked-in syndrome (fully paralyzed but aware) lose their motor ability completely including articulation and even eyeball movement. The neural pathway may be the only option to resume a certain level of communication for these patients. Current brain-computer interfaces (BCIs) use patients' visual and attentional correlates to build communication, resulting in a slow communication rate (a few words per minute). Direct decoding of imagined speech from the neural signals (and then driving a speech synthesizer) has the potential for a higher communication rate. In this study, we investigated the decoding of five imagined and spoken phrases from single-trial, non-invasive magnetoencephalography (MEG) signals collected from eight adult subjects. Two machine learning algorithms were used. One was an artificial neural network (ANN) with statistical features as the baseline approach. The other was convolutional neural networks (CNNs) applied on the spatial, spectral and temporal features extracted from the MEG signals. Experimental results indicated the possibility to decode imagined and spoken phrases directly from neuromagnetic signals. CNNs were found to be highly effective with an average decoding accuracy of up to 93% for the imagined and 96% for the spoken phrases. Copyright © 2020 Dash, Ferrari and Wang.
Functional neuroimaging of odor imagery
We used positron emission tomography (PET) to investigate brain regions associated with odor imagery. Changes in regional cerebral blood flow (CBF) during odor imagery were compared with changes during nonspecific expectation of olfactory stimuli and with those during odor perception. Sixty-seven healthy volunteers were screened for their odor imagery (with a paradigm developed in a previous study), and 12 of them, assessed to be "good odor imagers," participated in the neuroimaging part of the study. Imagination of odors was associated with increased activation in several olfactory regions in the brain: the left primary olfactory cortical (POC) region including piriform cortex, the left secondary olfactory cortex or posterior orbitofrontal cortex (OFC), and the rostral insula bilaterally. Furthermore, blood flow in two regions within the right orbitofrontal cortex correlated significantly with the behavioral measure of odor imagery during scanning. Overall, the findings indicated that neural networks engaged during odor perception and imagery overlap partially.
The neural and behavioral correlates of anomia recovery following personalized observation, execution, and mental imagery therapy: A proof of concept
Imagery of voluntary movement of fingers, toes, and tongue activates corresponding body-part-specific motor representations
DOI:10.1152/jn.01113.2002
PMID:14615433
[Cited in this article: 1]
We investigate whether imagery of voluntary movements of different body parts activates somatotopical sections of the human motor cortices. We used functional magnetic resonance imaging to detect the cortical activity when 7 healthy subjects imagine performing repetitive (0.5-Hz) flexion/extension movements of the right fingers or right toes, or horizontal movements of the tongue. We also collected functional images when the subjects actually executed these movements and used these data to define somatotopical representations in the motor areas. In this study, we relate the functional activation maps to cytoarchitectural population maps of areas 4a, 4p, and 6 in the same standard anatomical space. The important novel findings are (1) that imagery of hand movements specifically activates the hand sections of the contralateral primary motor cortex (area 4a) and the contralateral dorsal premotor cortex (area 6) and a hand representation located in the caudal cingulate motor area and the most ventral part of the supplementary motor area; (2) that when imagining making foot movements, the foot zones of the posterior part of the contralateral supplementary motor area (area 6) and the contralateral primary motor cortex (area 4a) are active; and (3) that imagery of tongue movements activates the tongue region of the primary motor cortex and the premotor cortex bilaterally (areas 4a, 4p, and 6). These results demonstrate that imagery of action engages the somatotopically organized sections of the primary motor cortex in a systematic manner as well as activating some body-part-specific representations in the nonprimary motor areas. Thus the content of the mental motor image, in this case the body part, is reflected in the pattern of motor cortical activation.
Subjective experience of inner speech in aphasia: Preliminary behavioral relationships and neural correlates
DOI:S0093-934X(16)30015-3
PMID:27694017
[Cited in this article: 2]
Many individuals with aphasia describe anomia with comments like "I know it but I can't say it." The exact meaning of such phrases is unclear. We hypothesize that at least two discrete experiences exist: the sense of (1) knowing a concept, but failing to find the right word, and (2) saying the correct word internally but not aloud (successful inner speech, sIS). We propose that sIS reflects successful lexical access; subsequent overt anomia indicates post-lexical output deficits. In this pilot study, we probed the subjective experience of anomia in 37 persons with aphasia. Self-reported sIS related to aphasia severity and phonological output deficits. In multivariate lesion-symptom mapping, sIS was associated with dorsal stream lesions, particularly in ventral sensorimotor cortex. These preliminary results suggest that people with aphasia can often provide meaningful insights about their experience of anomia and that reports of sIS relate to specific lesion locations and language deficits. Copyright © 2016 Elsevier Inc. All rights reserved.
The subjective experience of inner speech in aphasia is a meaningful reflection of lexical retrieval
Studying language in context using the temporal generalization method
Damage to Broca’s area does not contribute to long-term speech production outcome after stroke
DOI:10.1093/brain/awaa460
PMID:33517378
[Cited in this article: 1]
Broca's area in the posterior half of the left inferior frontal gyrus has long been thought to be critical for speech production. The current view is that long-term speech production outcome in patients with Broca's area damage is best explained by the combination of damage to Broca's area and neighbouring regions including the underlying white matter, which was also damaged in Paul Broca's two historic cases. Here, we dissociate the effect of damage to Broca's area from the effect of damage to surrounding areas by studying long-term speech production outcome in 134 stroke survivors with relatively circumscribed left frontal lobe lesions that spared posterior speech production areas in lateral inferior parietal and superior temporal association cortices. Collectively, these patients had varying degrees of damage to one or more of nine atlas-based grey or white matter regions: Brodmann areas 44 and 45 (together known as Broca's area), ventral premotor cortex, primary motor cortex, insula, putamen, the anterior segment of the arcuate fasciculus, uncinate fasciculus and frontal aslant tract. Spoken picture description scores from the Comprehensive Aphasia Test were used as the outcome measure. Multiple regression analyses allowed us to tease apart the contribution of other variables influencing speech production abilities such as total lesion volume and time post-stroke. We found that, in our sample of patients with left frontal damage, long-term speech production impairments (lasting beyond 3 months post-stroke) were solely predicted by the degree of damage to white matter, directly above the insula, in the vicinity of the anterior part of the arcuate fasciculus, with no contribution from the degree of damage to Broca's area (as confirmed with Bayesian statistics). 
The effect of white matter damage cannot be explained by a disconnection of Broca's area, because speech production scores were worse after damage to the anterior arcuate fasciculus with relative sparing of Broca's area than after damage to Broca's area with relative sparing of the anterior arcuate fasciculus. Our findings provide evidence for three novel conclusions: (i) Broca's area damage does not contribute to long-term speech production outcome after left frontal lobe strokes; (ii) persistent speech production impairments after damage to the anterior arcuate fasciculus cannot be explained by a disconnection of Broca's area; and (iii) the prior association between persistent speech production impairments and Broca's area damage can be explained by co-occurring white matter damage, above the insula, in the vicinity of the anterior part of the arcuate fasciculus. © The Author(s) (2021). Published by Oxford University Press on behalf of the Guarantors of Brain.
The neural correlates of inner speech defined by voxel-based lesion-symptom mapping
DOI:10.1093/brain/awr232 [Cited in this article: 3]
An investigation of the neural association between auditory imagery and perception of complex sounds
DOI:10.1007/s00429-019-01948-z
PMID:31468120
[Cited in this article: 1]
Neuroimaging studies have demonstrated that mental imagery and perception share similar neural substrates, however, there are still ambiguities according to different auditory imagery content. In addition, there is still a lack of information regarding the underlying neural correlation between the two modalities. In the present study, we adopted functional magnetic resonance imaging to explore the neural representation during imagery and perception of actual sounds in our surroundings. Univariate analysis was used to assess the differences between the modalities of average activation intensity, and stronger imagery activation was found in sensorimotor regions but weaker activation in auditory association cortices. Additionally, multi-voxel pattern analysis with a support vector machine classifier was implemented to decode environmental sounds within- or cross-modality. Significant above-chance accuracies were found in all overlapping regions in the classification of within-modality, while successful cross-modality classification only was found in sensorimotor regions. Both univariate and multivariate analyses found distinct representation between auditory imagery and perception in the overlapping regions, including superior temporal gyrus and inferior frontal sulcus as well as the precentral cortex and pre-supplementary motor area. Our results confirm the overlapping activation regions between auditory imagery and perception reported by previous studies and suggest that activation regions showed dissociable representation pattern in imagery and perception of sound categories.
A neural network model of speech acquisition and motor equivalent speech production
This article describes a neural network model that addresses the acquisition of speaking skills by infants and subsequent motor equivalent production of speech sounds. The model learns two mappings during a babbling phase. A phonetic-to-orosensory mapping specifies a vocal tract target for each speech sound; these targets take the form of convex regions in orosensory coordinates defining the shape of the vocal tract. The babbling process wherein these convex region targets are formed explains how an infant can learn phoneme-specific and language-specific limits on acceptable variability of articulator movements. The model also learns an orosensory-to-articulatory mapping wherein cells coding desired movement directions in orosensory space learn articulator movements that achieve these orosensory movement directions. The resulting mapping provides a natural explanation for the formation of coordinative structures. This mapping also makes efficient use of redundancy in the articulator system, thereby providing the model with motor equivalent capabilities. Simulations verify the model's ability to compensate for constraints or perturbations applied to the articulators automatically and without new learning and to explain contextual variability seen in human speech production.
Speech sound acquisition, coarticulation, and rate effects in a neural network model of speech production
DOI:10.1037/0033-295x.102.3.594
PMID:7624456
[Cited in this article: 3]
This article describes a neural network model of speech motor skill acquisition and speech production that explains a wide range of data on variability, motor equivalence, coarticulation, and rate effects. Model parameters are learned during a babbling phase. To explain how infants learn language-specific variability limits, speech sound targets take the form of convex regions, rather than points, in orosensory coordinates. Reducing target size for better accuracy during slower speech leads to differential effects for vowels and consonants, as seen in experiments previously used as evidence for separate control processes for the 2 sound types. Anticipatory coarticulation arises when targets are reduced in size on the basis of context; this generalizes the well-known look-ahead model of coarticulation. Computer simulations verify the model's properties.
Cortical interactions underlying the production of speech sounds
DOI:10.1016/j.jcomdis.2006.06.013
PMID:16887139
[Cited in this article: 4]
Speech production involves the integration of auditory, somatosensory, and motor information in the brain. This article describes a model of speech motor control in which a feedforward control system, involving premotor and primary motor cortex and the cerebellum, works in concert with auditory and somatosensory feedback control systems that involve both sensory and motor cortical areas. New speech sounds are learned by first storing an auditory target for the sound, then using the auditory feedback control system to control production of the sound in early repetitions. Repeated production of the sound leads to tuning of feedforward commands which eventually supplant the feedback-based control signals. Although parts of the model remain speculative, it accounts for a wide range of kinematic, acoustic, and neuroimaging data collected during speech production and provides a framework for investigating communication disorders that involve malfunction of the cerebral cortex and interconnected subcortical structures. Readers will be able to: (1) describe several types of learning that occur in the sensory-motor system during babbling and early speech, (2) identify three neural control subsystems involved in speech production, (3) identify regions of the brain involved in monitoring auditory and somatosensory feedback during speech production, and (4) identify regions of the brain involved in feedforward control of speech.
A wireless brain-machine interface for real-time speech synthesis
Neural modeling and imaging of the cortical interactions underlying syllable production
DOI:10.1016/j.bandl.2005.06.001
PMID:16040108
[Cited in this article: 1]
This paper describes a neural model of speech acquisition and production that accounts for a wide range of acoustic, kinematic, and neuroimaging data concerning the control of speech movements. The model is a neural network whose components correspond to regions of the cerebral cortex and cerebellum, including premotor, motor, auditory, and somatosensory cortical areas. Computer simulations of the model verify its ability to account for compensation to lip and jaw perturbations during speech. Specific anatomical locations of the model's components are estimated, and these estimates are used to simulate fMRI experiments of simple syllable production.
A neural theory of speech acquisition and production
This article describes a computational model, called DIVA, that provides a quantitative framework for understanding the roles of various brain regions involved in speech acquisition and production. An overview of the DIVA model is first provided, along with descriptions of the computations performed in the different brain regions represented in the model. Particular focus is given to the model's speech sound map, which provides a link between the sensory representation of a speech sound and the motor program for that sound. Neurons in this map share with "mirror neurons" described in monkey ventral premotor cortex the key property of being active during both production and perception of specific motor actions. As the DIVA model is defined both computationally and anatomically, it is ideal for generating precise predictions concerning speech-related brain activation patterns observed during functional imaging experiments. The DIVA model thus provides a well-defined framework for guiding the interpretation of experimental results related to the putative human speech mirror system.
Therapeutic instrumental music training and motor imagery in post-stroke upper-extremity rehabilitation: A randomized-controlled pilot study
Motor imagery ability in patients with early- and mid-stage Parkinson disease
DOI:10.1177/1545968310370750
PMID:21239707
[Cited in this article: 1]
Motor imagery has recently gained attention as a promising new rehabilitation method for patients with neurological disorders. Up to now, however, it has been unclear whether this practice method can also be successfully applied in the rehabilitation of patients with Parkinson disease (PD). This study aimed to investigate whether the motor imagery ability of patients with PD is still intact despite basal ganglia dysfunctioning. A total of 14 patients with early- and mid-stage PD (Hoehn and Yahr 1-3) and 14 healthy controls were evaluated by means of an extensive imagery ability assessment battery, consisting of 2 questionnaires, the Chaotic Motor Imagery Assessment battery, and a test based on mental chronometry. PD patients performed the imagery tasks more slowly than controls, but the motor imagery vividness and accuracy of most patients were well preserved. These results are promising regarding the potential use of motor imagery practice in the rehabilitation of patients with PD.
Computational neuroanatomy of speech production
Sensorimotor integration in speech processing: Computational basis and neural organization
DOI:10.1016/j.neuron.2011.01.019
PMID:21315253
[Cited in this article: 2]
Sensorimotor integration is an active domain of speech research and is characterized by two main ideas, that the auditory system is critically involved in speech production and that the motor system is critically involved in speech perception. Despite the complementarity of these ideas, there is little crosstalk between these literatures. We propose an integrative model of the speech-related "dorsal stream" in which sensorimotor interaction primarily supports speech production, in the form of a state feedback control architecture. A critical component of this control system is forward sensory prediction, which affords a natural mechanism for limited motor influence on perception, as recent perceptual research has suggested. Evidence shows that this influence is modulatory but not necessary for speech perception. The neuroanatomy of the proposed circuit is discussed as well as some probable clinical correlates including conduction aphasia, stuttering, and aspects of schizophrenia. Copyright © 2011 Elsevier Inc. All rights reserved.
Speech production as state feedback control
DOI:10.3389/fnhum.2011.00082
PMID:22046152
[Cited in this article: 2]
Spoken language exists because of a remarkable neural process. Inside a speaker's brain, an intended message gives rise to neural signals activating the muscles of the vocal tract. The process is remarkable because these muscles are activated in just the right way that the vocal tract produces sounds a listener understands as the intended message. What is the best approach to understanding the neural substrate of this crucial motor control process? One of the key recent modeling developments in neuroscience has been the use of state feedback control (SFC) theory to explain the role of the CNS in motor control. SFC postulates that the CNS controls motor output by (1) estimating the current dynamic state of the thing (e.g., arm) being controlled, and (2) generating controls based on this estimated state. SFC has successfully predicted a great range of non-speech motor phenomena, but as yet has not received attention in the speech motor control community. Here, we review some of the key characteristics of speech motor control and what they say about the role of the CNS in the process. We then discuss prior efforts to model the role of CNS in speech motor control, and argue that these models have inherent limitations - limitations that are overcome by an SFC model of speech motor control which we describe. We conclude by discussing a plausible neural substrate of our model.
Exploring the ecological validity of thinking on demand: Neural correlates of elicited vs. spontaneously occurring inner speech
The contribution of different frequency bands in class separability of covert speech tasks for BCIs
The relative contribution of high-gamma linguistic processing stages of word production, and motor imagery of articulation in class separability of covert speech tasks in EEG data
DOI:10.1007/s10916-018-1137-9
PMID:30564961
[Cited in this article: 1]
Word production begins with high-Gamma automatic linguistic processing functions followed by speech motor planning and articulation. Phonetic properties are processed in both linguistic and motor stages of word production. Four phonetically dissimilar phonemic structures "BA", "FO", "LE", and "RY" were chosen as covert speech tasks. Ten neurologically healthy volunteers with the age range of 21-33 participated in this experiment. Participants were asked to covertly speak a phonemic structure when they heard an auditory cue. EEG was recorded with 64 electrodes at 2048 samples/s. Initially, one-second trials were used, which contained linguistic and motor imagery activities. The four-class true positive rate was calculated. In the next stage, 312 ms trials were used to exclude covert articulation from analysis. By eliminating the covert articulation stage, the four-class grand average classification accuracy dropped from 96.4% to 94.5%. The most valuable features emerge after Auditory cue recognition (~100 ms post onset), and within the 70-128 Hz frequency range. The most significant identified brain regions were the Prefrontal Cortex (linked to stimulus driven executive control), Wernicke's area (linked to Phonological code retrieval), the right IFG, and Broca's area (linked to syllabification). Alpha and Beta band oscillations associated with motor imagery do not contain enough information to fully reflect the complexity of speech movements. Over 90% of the most class-dependent features were in the 30-128 Hz range, even during the covert articulation stage. As a result, compared to linguistic functions, the contribution of motor imagery of articulation in class separability of covert speech tasks from EEG data is negligible.
Thought as action: Inner speech, self-monitoring, and auditory verbal hallucinations
DOI:10.1016/j.concog.2005.12.003 [Cited in this article: 1]
Articulating: The neural mechanisms of speech production
Predictive processing: A canonical cortical computation
DOI:S0896-6273(18)30857-2
PMID:30359606
[Cited in this article: 2]
This perspective describes predictive processing as a computational framework for understanding cortical function in the context of emerging evidence, with a focus on sensory processing. We discuss how the predictive processing framework may be implemented at the level of cortical circuits and how its implementation could be falsified experimentally. Lastly, we summarize the general implications of predictive processing on cortical function in healthy and diseased states. Copyright © 2018 Elsevier Inc. All rights reserved.
Motor imagery involves predicting the sensory consequences of the imagined movement
Resting-state functional connectivity: An emerging method for the study of language networks in post-stroke aphasia
DOI:S0278-2626(17)30107-0
PMID:28865994
[Cited in this article: 1]
Aphasia results both from direct effects of focal damage to eloquent cortical areas as well as dysfunction of interconnected remote areas within the language network. Resting-state functional MRI (rsfMRI) can be used to examine functional connectivity (FC) within these networks. Herein we review publications, which applied rsfMRI to understand network pathology in post stroke aphasia. A common finding in this research is an acute disruption of connectivity within the language network, which is correlated with loss of language function and tends to resolve with recovery from aphasia. All studies are limited by small sample sizes, heterogeneous patient characteristics and a wide range of analytical approaches, which further hinder deduction of common patterns across studies. One recent large-scale study examining FC and behavior across various cognitive domains, however, has made substantial progress with the description of a "network phenotype of stroke injury", which consists of a disruption of interhemispheric connectivity and reduced segregation of intrahemispheric networks. Unlike in other domains, language functions showed substantial dependence on intact left intrahemispheric connectivity (Siegel, Ramsey et al., 2016). In the future, such analyses of network pathology might support prognosis and development of effective treatment strategies in individual patients with aphasia. Copyright © 2017 Elsevier Inc. All rights reserved.
Inner speech deficits in people with aphasia
DOI:10.3389/fpsyg.2015.00528
PMID:25999876
[Cited in this article: 3]
Despite the ubiquity of inner speech in our mental lives, methods for objectively assessing inner speech capacities remain underdeveloped. The most common means of assessing inner speech is to present participants with tasks requiring them to silently judge whether two words rhyme. We developed a version of this task to assess the inner speech of a population of patients with aphasia and corresponding language production deficits. Patients' performance on the silent rhyming task was severely impaired relative to controls. Patients' performance on this task did not, however, correlate with their performance on a variety of other standard tests of overt language and rhyming abilities. In particular, patients who were generally unimpaired in their abilities to overtly name objects during confrontation naming tasks, and who could reliably judge when two words spoken to them rhymed, were still severely impaired (relative to controls) at completing the silent rhyme task. A variety of explanations for these results are considered, as a means to critically reflecting on the relations among inner speech, outer speech, and silent rhyme judgments more generally.
App-based data collection, mental imagery, and naming performance in adults with aphasia
DOI:10.1016/j.ctcp.2021.101422 [Cited in this article: 2]
Modulation of EMG power spectrum frequency during motor imagery
DOI:10.1016/j.neulet.2008.02.033
PMID:18343579
[Cited in this article: 1]
To provide evidence that motor imagery (MI) is accompanied by improvement of intramuscular conduction velocity (CV), we investigated surface electromyographic (EMG) activity of 3 muscles during the elbow flexion/extension. Thirty right-handed participants were asked to lift or to imagine lifting a weighted dumbbell under 3 types of muscular contractions, i.e. concentric, isometric and eccentric, taken as independent variables. The EMG activity of the agonist (long and short heads of biceps brachii) and the antagonist (long portion of triceps brachii) muscles was recorded and processed to determine the median frequency (MF) of EMG power spectrum as dependent variable. The MF was significantly higher during the MI sessions than during the resting condition while the participants remained strictly motionless. Moreover, the MF during imagined concentric contraction was significantly higher than during the eccentric. Thus, the MF variation was correlated to the type of contraction the muscle produced. During MI, the EMG patterns corresponding to each type of muscle contraction remained comparable to those observed during actual movement. In conclusion, specific motor programming is hypothesized to be performed as a function of muscle contraction type during MI.
Motor imagery training induces changes in brain neural networks in stroke patients
DOI:10.4103/1673-5374.238616
PMID:30136692
[Cited in this article: 2]
Motor imagery is the mental representation of an action without overt movement or muscle activation. However, the effects of motor imagery on stroke-induced hand dysfunction and brain neural networks are still unknown. We conducted a randomized controlled trial in the China Rehabilitation Research Center. Twenty stroke patients, including 13 males and 7 females, 32-51 years old, were recruited and randomly assigned to the traditional rehabilitation treatment group (PP group, n = 10) or the motor imagery training combined with traditional rehabilitation treatment group (MP group, n = 10). All patients received rehabilitation training once a day, 45 minutes per session, five times per week, for 4 consecutive weeks. In the MP group, motor imagery training was performed for 45 minutes after traditional rehabilitation training, daily. Action Research Arm Test and the Fugl-Meyer Assessment of the upper extremity were used to evaluate hand functions before and after treatment. Transcranial magnetic stimulation was used to analyze motor evoked potentials in the affected extremity. Diffusion tensor imaging was used to assess changes in brain neural networks. Compared with the PP group, the MP group showed better recovery of hand function, higher amplitude of the motor evoked potential in the abductor pollicis brevis, greater fractional anisotropy of the right dorsal pathway, and an increase in the fractional anisotropy of the bilateral dorsal pathway. Our findings indicate that 4 weeks of motor imagery training combined with traditional rehabilitation treatment improves hand function in stroke patients by enhancing the dorsal pathway. This trial has been registered with the Chinese Clinical Trial Registry (registration number: ChiCTR-OCH-12002238).
Motor imagery difficulties in children with cerebral palsy: A specific or general deficit?
DOI:10.1016/j.ridd.2016.06.010
PMID:27399206
[Cited in this article: 1]
The aim of this study was to examine the specificity of motor imagery (MI) difficulties in children with CP. Performance of 22 children with CP was compared to a gender and age matched control group. MI ability was measured with the Hand Laterality Judgment (HLJ) task, examining specifically the direction of rotation (DOR) effect, and the Praxis Imagery Questionnaire (PIQ). In the back view condition of the HLJ task both groups used MI, as evidenced by longer response times for lateral compared with medial rotational angles. In the palm view condition children with CP did not show an effect of DOR, unlike controls. Error scores did not differ between groups. Both groups performed well on the PIQ, with no significant difference between them in response pattern. The present study suggests that children with CP show deficits on tasks that trigger implicit use of MI, whereas explicit MI ability was relatively preserved, as assessed using the PIQ. These results suggest that employing more explicit methods of MI training may well be more suitable for children with CP in rehabilitation of motor function. Copyright © 2016 Elsevier Ltd. All rights reserved.
Decoding inner speech using electrocorticography: Progress and challenges toward a speech prosthesis
DOI:10.3389/fnins.2018.00422
PMID:29977189
[Cited in this article: 2]
Certain brain disorders resulting from brainstem infarcts, traumatic brain injury, cerebral palsy, stroke, and amyotrophic lateral sclerosis, limit verbal communication despite the patient being fully aware. People that cannot communicate due to neurological disorders would benefit from a system that can infer internal speech directly from brain signals. In this review article, we describe the state of the art in decoding inner speech, ranging from early acoustic sound features, to higher order speech units. We focused on intracranial recordings, as this technique allows monitoring brain activity with high spatial, temporal, and spectral resolution, and therefore is a good candidate to investigate inner speech. Despite intense efforts, investigating how the human cortex encodes inner speech remains an elusive challenge, due to the lack of behavioral and observable measures. We emphasize various challenges commonly encountered when investigating inner speech decoding, and propose potential solutions in order to get closer to a natural speech assistive device.
Internally simulated movement sensations during motor imagery activate cortical motor areas and the cerebellum
It has been proposed that motor imagery contains an element of sensory experiences (kinesthetic sensations), which is a substitute for the sensory feedback that would normally arise from the overt action. No evidence has been provided about whether kinesthetic sensation is centrally simulated during motor imagery. We psychophysically tested whether motor imagery of palmar flexion or dorsiflexion of the right wrist would influence the sensation of illusory palmar flexion elicited by tendon vibration. We also tested whether motor imagery of wrist movement shared the same neural substrates involving the illusory sensation elicited by the peripheral stimuli. Regional cerebral blood flow was measured with H₂¹⁵O and positron emission tomography in 10 right-handed subjects. The right tendon of the wrist extensor was vibrated at 83 Hz ("illusion") or at 12.5 Hz with no illusion ("vibration"). Subjects imagined doing wrist movements of alternating palmar and dorsiflexion at the same speed as the experienced illusory movements ("imagery"). A "rest" condition with eyes closed was included. We identified common active fields between the contrasts of imagery versus rest and illusion versus vibration. Motor imagery of palmar flexion psychophysically enhanced the experienced illusory angles of palmar flexion, whereas dorsiflexion imagery reduced it in the absence of overt movement. Motor imagery and the illusory sensation commonly activated the contralateral cingulate motor areas, supplementary motor area, dorsal premotor cortex, and ipsilateral cerebellum. We conclude that kinesthetic sensation associated with imagined movement is internally simulated during motor imagery by recruiting multiple motor areas.
Structural connectivity of right frontal hyperactive areas scales with stuttering severity
DOI:10.1093/brain/awx316
PMID:29228195
[Cited in this article: 1]
A neuronal sign of persistent developmental stuttering is the magnified coactivation of right frontal brain regions during speech production. Whether and how stuttering severity relates to the connection strength of these hyperactive right frontal areas to other brain areas is an open question. Scrutinizing such brain-behaviour and structure-function relationships aims at disentangling suspected underlying neuronal mechanisms of stuttering. Here, we acquired diffusion-weighted and functional images from 31 adults who stutter and 34 matched control participants. Using a newly developed structural connectivity measure, we calculated voxel-wise correlations between connection strength and stuttering severity within tract volumes that originated from functionally hyperactive right frontal regions. Correlation analyses revealed that with increasing speech motor deficits the connection strength increased in the right frontal aslant tract, the right anterior thalamic radiation, and in U-shaped projections underneath the right precentral sulcus. In contrast, with decreasing speech motor deficits connection strength increased in the right uncinate fasciculus. Additional group comparisons of whole-brain white matter skeletons replicated the previously reported reduction of fractional anisotropy in the left and right superior longitudinal fasciculus as well as at the junction of right frontal aslant tract and right superior longitudinal fasciculus in adults who stutter compared to control participants. Overall, our investigation suggests that right fronto-temporal networks play a compensatory role as a fluency enhancing mechanism. In contrast, the increased connection strength within subcortical-cortical pathways may be implied in an overly active global response suppression mechanism in stuttering. 
Altogether, this combined functional MRI-diffusion tensor imaging study disentangles different networks involved in the neuronal underpinnings of the speech motor deficit in persistent developmental stuttering. © The Author (2017). Published by Oxford University Press on behalf of the Guarantors of Brain.
Left posterior-dorsal area 44 couples with parietal areas to promote speech fluency, while right area 44 activity promotes the stopping of motor responses
DOI:S1053-8119(16)30415-3
PMID:27542724
[Cited in this article: 2]
Area 44 is a cytoarchitectonically distinct portion of Broca's region. Parallel and overlapping large-scale networks couple with this region thereby orchestrating heterogeneous language, cognitive, and motor functions. In the context of stuttering, area 44 frequently comes into focus because structural and physiological irregularities affect developmental trajectories, stuttering severity, persistency, and etiology. A remarkable phenomenon accompanying stuttering is the preserved ability to sing. Speaking and singing are connatural behaviours recruiting largely overlapping brain networks including left and right area 44. Analysing which potential subregions of area 44 are malfunctioning in adults who stutter, and what effectively suppresses stuttering during singing, may provide a better understanding of the coordination and reorganization of large-scale brain networks dedicated to speaking and singing in general. We used fMRI to investigate functionally distinct subregions of area 44 during imagery of speaking and imagery of humming a melody in 15 dextral males who stutter and 17 matched control participants. Our results are fourfold. First, stuttering was specifically linked to a reduced activation of left posterior-dorsal area 44, a subregion that is involved in speech production, including phonological word processing, pitch processing, working memory processes, sequencing, motor planning, pseudoword learning, and action inhibition. Second, functional coupling between left posterior area 44 and left inferior parietal lobule was deficient in stuttering. Third, despite the preserved ability to sing, males who stutter showed bilaterally a reduced activation of area 44 when imagining humming a melody, suggesting that this fluency-enhancing condition seems to bypass posterior-dorsal area 44 to achieve fluency.
Fourth, time courses of the posterior subregions in area 44 showed delayed peak activations in the right hemisphere in both groups, possibly signaling the offset response. Because these offset response-related activations in the right hemisphere were comparably large in males who stutter, our data suggest a hyperactive mechanism to stop speech motor responses and thus possibly reflect a pathomechanism, which, until now, has been neglected. Overall, the current results confirmed a recently described co-activation based parcellation supporting the idea of functionally distinct subregions of left area 44. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.
Inferring imagined speech using EEG signals: A new approach using Riemannian manifold features
Motor movement matters: The flexible abstractness of inner speech
Speech imagery decoding as a window to speech planning and production
Statistical learning as reinforcement learning phenomena
An event-related fMRI study of overt and covert word stem completion
DOI:10.1006/nimg.2001.0779 [Cited in this article: 2]
Impaired feedforward control and enhanced feedback control of speech in patients with cerebellar degeneration
DOI:10.1523/JNEUROSCI.3363-16.2017
PMID:28842410
[Cited in this article: 1]
The cerebellum has been hypothesized to form a crucial part of the speech motor control network. Evidence for this comes from patients with cerebellar damage, who exhibit a variety of speech deficits, as well as imaging studies showing cerebellar activation during speech production in healthy individuals. To date, the precise role of the cerebellum in speech motor control remains unclear, as it has been implicated in both anticipatory (feedforward) and reactive (feedback) control. Here, we assess both anticipatory and reactive aspects of speech motor control, comparing the performance of patients with cerebellar degeneration and matched controls. Experiment 1 tested feedforward control by examining speech adaptation across trials in response to a consistent perturbation of auditory feedback. Experiment 2 tested feedback control, examining online corrections in response to inconsistent perturbations of auditory feedback. Both male and female patients and controls were tested. The patients were impaired in adapting their feedforward control system relative to controls, exhibiting an attenuated anticipatory response to the perturbation. In contrast, the patients produced even larger compensatory responses than controls, suggesting an increased reliance on sensory feedback to guide speech articulation in this population. Together, these results suggest that the cerebellum is crucial for maintaining accurate feedforward control of speech, but relatively uninvolved in feedback control. Speech motor control is a complex activity that is thought to rely on both predictive, feedforward control as well as reactive, feedback control. While the cerebellum has been shown to be part of the speech motor control network, its functional contribution to feedback and feedforward control remains controversial. 
Here, we use real-time auditory perturbations of speech to show that patients with cerebellar degeneration are impaired in adapting feedforward control of speech but retain the ability to make online feedback corrections; indeed, the patients show an increased sensitivity to feedback. These results indicate that the cerebellum forms a crucial part of the feedforward control system for speech but is not essential for online, feedback control. Copyright © 2017 the authors.
The human imagination: The cognitive neuroscience of visual mental imagery
DOI:10.1038/s41583-019-0202-9
PMID:31384033
[Cited in this article: 1]
Mental imagery can be advantageous, unnecessary and even clinically disruptive. With methodological constraints now overcome, research has shown that visual imagery involves a network of brain areas from the frontal cortex to sensory areas, overlapping with the default mode network, and can function much like a weak version of afferent perception. Imagery vividness and strength range from completely absent (aphantasia) to photo-like (hyperphantasia). Both the anatomy and function of the primary visual cortex are related to visual imagery. The use of imagery as a tool has been linked to many compound cognitive processes and imagery plays both symptomatic and mechanistic roles in neurological and mental disorders and treatments.
Spatiotemporal dynamics of electrocorticographic high gamma activity during overt and covert word repetition
DOI:10.1016/j.neuroimage.2010.10.029
PMID:21029784
[Cited in this article: 2]
Language is one of the defining abilities of humans. Many studies have characterized the neural correlates of different aspects of language processing. However, the imaging techniques typically used in these studies were limited in either their temporal or spatial resolution. Electrocorticographic (ECoG) recordings from the surface of the brain combine high spatial with high temporal resolution and thus could be a valuable tool for the study of neural correlates of language function. In this study, we defined the spatiotemporal dynamics of ECoG activity during a word repetition task in nine human subjects. ECoG was recorded while each subject overtly or covertly repeated words that were presented either visually or auditorily. ECoG amplitudes in the high gamma (HG) band confidently tracked neural changes associated with stimulus presentation and with the subject's verbal response. Overt word production was primarily associated with HG changes in the superior and middle parts of temporal lobe, Wernicke's area, the supramarginal gyrus, Broca's area, premotor cortex (PMC), and primary motor cortex. Covert word production was primarily associated with HG changes in superior temporal lobe and the supramarginal gyrus. Acoustic processing from both auditory stimuli as well as the subject's own voice resulted in HG power changes in superior temporal lobe and Wernicke's area. In summary, this study represents a comprehensive characterization of overt and covert speech using electrophysiological imaging with high spatial and temporal resolution. It thereby complements the findings of previous neuroimaging studies of language and thus further adds to current understanding of word processing in humans. Published by Elsevier Inc.
What is that little voice inside my head? Inner speech phenomenology, its role in cognitive performance, and its relation to self-monitoring
DOI:10.1016/j.bbr.2013.12.034
PMID:24412278
[Cited in this article: 2]
The little voice inside our head, or inner speech, is a common everyday experience. It plays a central role in human consciousness at the interplay of language and thought. An impressive host of research works has been carried out on inner speech over the last fifty years. Here we first describe the phenomenology of inner speech by examining five issues: common behavioural and cerebral correlates with overt speech, different types of inner speech (wilful verbal thought generation and verbal mind wandering), presence of inner speech in reading and in writing, inner signing and voice-hallucinations in deaf people. Secondly, we review the role of inner speech in cognitive performance (i.e., enhancement vs. perturbation). Finally, we consider agency in inner speech and how our inner voice is known to be self-generated and not produced by someone else. Copyright © 2014 Elsevier B.V. All rights reserved.
DOI:10.1101/2021.01.26.428315 [Cited in this article: 7]
Assessing and mapping language, attention and executive multidimensional deficits in stroke aphasia
DOI:10.1093/brain/awz258
PMID:31504247
[Cited in this article: 1]
There is growing awareness that aphasia following a stroke can include deficits in other cognitive functions and that these are predictive of certain aspects of language function, recovery and rehabilitation. However, data on attentional and executive (dys)functions in individuals with stroke aphasia are still scarce and the relationship to underlying lesions is rarely explored. Accordingly in this investigation, an extensive selection of standardized non-verbal neuropsychological tests was administered to 38 individuals with chronic post-stroke aphasia, in addition to detailed language testing and MRI. To establish the core components underlying the variable patients' performance, behavioural data were explored with rotated principal component analyses, first separately for the non-verbal and language tests, then in a combined analysis including all tests. Three orthogonal components for the non-verbal tests were extracted, which were interpreted as shift-update, inhibit-generate and speed. Three components were also extracted for the language tests, representing phonology, semantics and speech quanta. Individual continuous scores on each component were then included in a voxel-based correlational methodology analysis, yielding significant clusters for all components. The shift-update component was associated with a posterior left temporo-occipital and bilateral medial parietal cluster, the inhibit-generate component was mainly associated with left frontal and bilateral medial frontal regions, and the speed component with several small right-sided fronto-parieto-occipital clusters. Two complementary multivariate brain-behaviour mapping methods were also used, which showed converging results. Together the results suggest that a range of brain regions are involved in attention and executive functioning, and that these non-language domains play a role in the abilities of patients with chronic aphasia. 
In conclusion, our findings confirm and extend our understanding of the multidimensionality of stroke aphasia, emphasize the importance of assessing non-verbal cognition in this patient group and provide directions for future research and clinical practice. We also briefly compare and discuss univariate and multivariate methods for brain-behaviour mapping. © The Author(s) (2019). Published by Oxford University Press on behalf of the Guarantors of Brain.
Inner speech captures the perception of external speech
Online classification of imagined speech using functional near-infrared spectroscopy signals
A functional study of auditory verbal imagery
We used functional MRI to examine the functional anatomy of inner speech and different forms of auditory verbal imagery (imagining speech) in normal volunteers. We hypothesized that generating inner speech and auditory verbal imagery would be associated with left inferior frontal activation, and that generating auditory verbal imagery would involve additional activation in the lateral temporal cortices. Subjects were scanned, while performing inner speech and auditory verbal imagery tasks, using a 1.5 Tesla magnet. The generation of inner speech was associated with activation in the left inferior frontal/insula region, the left temporo-parietal cortex, right cerebellum and the supplementary motor area. Auditory verbal imagery in general, as indexed by the three imagery tasks combined, was associated with activation in the areas engaged during the inner speech task, plus the left precentral and superior temporal gyri (STG), and the right homologues of all these areas. These results are consistent with the use of the 'articulatory loop' during both inner speech and auditory verbal imagery, and the greater engagement of verbal self-monitoring during auditory verbal imagery.
Dreaming and REM sleep are controlled by different brain mechanisms
The paradigmatic assumption that REM sleep is the physiological equivalent of dreaming is in need of fundamental revision. A mounting body of evidence suggests that dreaming and REM sleep are dissociable states, and that dreaming is controlled by forebrain mechanisms. Recent neuropsychological, radiological, and pharmacological findings suggest that the cholinergic brain stem mechanisms that control the REM state can only generate the psychological phenomena of dreaming through the mediation of a second, probably dopaminergic, forebrain mechanism. The latter mechanism (and thus dreaming itself) can also be activated by a variety of nonREM triggers. Dreaming can be manipulated by dopamine agonists and antagonists with no concomitant change in REM frequency, duration, and density. Dreaming can also be induced by focal forebrain stimulation and by complex partial (forebrain) seizures during nonREM sleep, when the involvement of brainstem REM mechanisms is precluded. Likewise, dreaming is obliterated by focal lesions along a specific (probably dopaminergic) forebrain pathway, and these lesions do not have any appreciable effects on REM frequency, duration, and density. These findings suggest that the forebrain mechanism in question is the final common path to dreaming and that the brainstem oscillator that controls the REM state is just one of the many arousal triggers that can activate this forebrain mechanism. The "REM-on" mechanism (like its various NREM equivalents) therefore stands outside the dream process itself, which is mediated by an independent, forebrain "dream-on" mechanism.
Motor imagery in children with unilateral cerebral palsy: A case-control study
Inner speech's relationship with overt speech in poststroke aphasia
Dynamic causal modelling for functional near-infrared spectroscopy
DOI:10.1016/j.neuroimage.2015.02.035
PMID:25724757
[Cited in this article: 1]
Functional near-infrared spectroscopy (fNIRS) is an emerging technique for measuring changes in cerebral hemoglobin concentration via optical absorption changes. Although there is great interest in using fNIRS to study brain connectivity, current methods are unable to infer the directionality of neuronal connections. In this paper, we apply Dynamic Causal Modelling (DCM) to fNIRS data. Specifically, we present a generative model of how observed fNIRS data are caused by interactions among hidden neuronal states. Inversion of this generative model, using an established Bayesian framework (variational Laplace), then enables inference about changes in directed connectivity at the neuronal level. Using experimental data acquired during motor imagery and motor execution tasks, we show that directed (i.e., effective) connectivity from the supplementary motor area to the primary motor cortex is negatively modulated by motor imagery, and this suppressive influence causes reduced activity in the primary motor cortex during motor imagery. These results are consistent with findings of previous functional magnetic resonance imaging (fMRI) studies, suggesting that the proposed method enables one to infer directed interactions in the brain mediated by neuronal dynamics from measurements of optical density changes. Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.
Mental imagery of speech and movement implicates the dynamics of internal forward models
DOI:10.3389/fpsyg.2010.00166
PMID:21897822
[Cited in this article: 3]
The classical concept of efference copies in the context of internal forward models has stimulated productive research in cognitive science and neuroscience. There are compelling reasons to argue for such a mechanism, but finding direct evidence in the human brain remains difficult. Here we investigate the dynamics of internal forward models from an unconventional angle: mental imagery, assessed while recording high temporal resolution neuronal activity using magnetoencephalography. We compare two overt and covert tasks; our covert, mental imagery tasks are unconfounded by overt input/output demands - but in turn necessitate the development of appropriate multi-dimensional topographic analyses. Finger tapping (studies 1 and 2) and speech experiments (studies 3-5) provide temporally constrained results that implicate the estimation of an efference copy. We suggest that one internal forward model over parietal cortex subserves the kinesthetic feeling in motor imagery. Secondly, observed auditory neural activity approximately 170 ms after motor estimation in speech experiments (studies 3-5) demonstrates the anticipated auditory consequences of planned motor commands in a second internal forward model in imagery of speech production. Our results provide neurophysiological evidence from the human brain in favor of internal forward models deploying efference copies in somatosensory and auditory cortex, in finger tapping and speech production tasks, respectively, and also suggest the dynamics and sequential updating structure of internal forward models.
Mental imagery of speech: Linking motor and perceptual systems through internal simulation and estimation
DOI:10.3389/fnhum.2012.00314
PMID:23226121
[Cited in this article: 5]
The neural basis of mental imagery has been investigated by localizing the underlying neural networks, mostly in motor and perceptual systems, separately. However, how modality-specific representations are top-down induced and how the action and perception systems interact in the context of mental imagery is not well understood. Imagined speech production ("articulation imagery"), which induces the kinesthetic feeling of articulator movement and its auditory consequences, provides a new angle because of the concurrent involvement of motor and perceptual systems. On the basis of previous findings in mental imagery of speech, we argue for the following regarding the induction mechanisms of mental imagery and the interaction between motor and perceptual systems: (1) Two distinct top-down mechanisms, memory retrieval and motor simulation, exist to induce estimation in perceptual systems. (2) Motor simulation is sufficient to internally induce the representation of perceptual changes that would be caused by actual movement (perceptual associations); however, this simulation process only has modulatory effects on the perception of external stimuli, which critically depends on context and task demands. Considering the proposed simulation-estimation processes as common mechanisms for interaction between motor and perceptual systems, we outline how mental imagery (of speech) relates to perception and production, and how these hypothesized mechanisms might underpin certain neural disorders.
Dynamics of self-monitoring and error detection in speech production: Evidence from mental imagery and MEG
DOI:10.1162/jocn_a_00692
PMID:25061925
[Cited in this article: 1]
A critical subroutine of self-monitoring during speech production is to detect any deviance between expected and actual auditory feedback. Here we investigated the associated neural dynamics using MEG recording in mental-imagery-of-speech paradigms. Participants covertly articulated the vowel /a/; their own (individually recorded) speech was played back, with parametric manipulation using four levels of pitch shift, crossed with four levels of onset delay. A nonmonotonic function was observed in early auditory responses when the onset delay was shorter than 100 msec: Suppression was observed for normal playback, but enhancement for pitch-shifted playback; however, the magnitude of enhancement decreased at the largest level of pitch shift that was out of pitch range for normal conversion, as suggested in two behavioral experiments. No difference was observed among different types of playback when the onset delay was longer than 100 msec. These results suggest that the prediction suppresses the response to normal feedback, which mediates source monitoring. When auditory feedback does not match the prediction, an "error term" is generated, which underlies deviance detection. We argue that, based on the observed nonmonotonic function, a frequency window (addressing spectral difference) and a time window (constraining temporal difference) jointly regulate the comparison between prediction and feedback in speech.
Mental imagery of speech implicates two mechanisms of perceptual reactivation
DOI:10.1016/j.cortex.2016.01.002 [Cited in this article: 3]
Implementing a fuzzy inference system in a multi-objective EEG channel selection model for imagined speech classification
DOI:10.1016/j.eswa.2016.04.011 [Cited in this article: 1]
Analysis and classification of speech imagery EEG for BCI
DOI:10.1016/j.bspc.2013.07.011 [Cited in this article: 2]
Visual and speech imagery
Decoding spoken English from intracortical electrode arrays in dorsal precentral gyrus
Neural substrates of tactile imagery: A functional MRI study
DOI:10.1097/00001756-200303240-00011 [Cited in this article: 2]
DOI:10.1109/EMBC44109.2020.9176608 [Cited in this article: 1]
